Security exploits, Java vs. C / C++



  • In the other thread the discussion turned to this.  I'm starting a separate thread on this issue.  I'm interested in remote exploits.  Which are possible, likely, common, or impossible in Java vs. C / C++?

    #1 real-world exploit: Buffer overflow exploits (including format-string bugs, etc.)

    This is one type of vulnerability which directly allows execution of arbitrary code.

    C: It's easy to make these types of mistakes in C.  These are the most common exploits over the past ten to twenty years.  Even good programmers make mistakes in this area.

    Java: There are no buffer overflow exploits in Java, except in cases where Java calls native code and the native code has a buffer overflow problem.  There have been a few of these situations.
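
    To make the difference concrete, here is a minimal, illustrative sketch (class and method names are mine, not from any post above) showing what happens when code tries to write past the end of a Java array: the runtime throws an exception instead of silently overwriting adjacent memory, which is exactly why the classic overflow-to-code-execution path doesn't exist in pure Java.

```java
// Minimal sketch: in Java, writing past the end of a buffer throws an
// exception instead of silently corrupting adjacent memory.
public class BoundsCheckDemo {
    public static boolean writePastEnd() {
        byte[] buffer = new byte[8];
        try {
            for (int i = 0; i < 16; i++) {
                buffer[i] = (byte) 0x41;  // pretend attacker-controlled data
            }
            return false;                 // never reached
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;                  // the runtime caught the overflow
        }
    }

    public static void main(String[] args) {
        System.out.println(writePastEnd()); // prints "true"
    }
}
```

    The equivalent loop over a stack buffer in C would happily keep writing, which is the raw material of the exploits discussed above.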

    TOCTTOU race conditions:

    C and Java: These errors are possible in both C and Java, and have come up in both.  In C, the solution is to always copy arguments before doing the check.  In Java, the best solution is to use immutable arguments.  Copying arguments before checking is also an option.  Java programmers tend to use threads more often, and threads are tricky.
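
    The copy-before-check idea can be sketched as follows (a hypothetical validation routine, assuming a mutable byte[] argument that a racing thread could rewrite between the check and the use):

```java
import java.util.Arrays;

// Sketch of the copy-before-check pattern against TOCTTOU races.
public class DefensiveCopy {
    // Safe: snapshot the argument first, then check and use only the copy.
    // A racing thread can still mutate the caller's array, but not ours.
    static byte[] copyThenCheck(byte[] path) {
        byte[] copy = Arrays.copyOf(path, path.length);
        if (new String(copy).contains("..")) {
            throw new IllegalArgumentException("path traversal");
        }
        return copy;  // later use sees exactly what was checked
    }

    public static void main(String[] args) {
        byte[] arg = "data/file.txt".getBytes();
        byte[] checked = copyThenCheck(arg);
        // Simulate a racing writer mutating the caller's array afterwards:
        byte[] evil = "../../etc/pw".getBytes();
        System.arraycopy(evil, 0, arg, 0, evil.length);
        // The defensive copy is unaffected by the mutation:
        System.out.println(new String(checked)); // prints "data/file.txt"
    }
}
```

    Immutable arguments (e.g. String instead of byte[]) make the copy unnecessary, which is why they are the preferred fix in Java.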

    SQL injections:

    Possible in both C and Java and other languages.  These are common on websites.  The way to prevent them is to use parameterized queries through the right DB access library, rather than building SQL strings by hand.
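
    A small sketch of the difference (the table and column names are invented for illustration; the safe variant assumes any ordinary JDBC connection and is shown but not executed here):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch: why string concatenation is injectable and how placeholders avoid it.
public class SqlInjectionDemo {
    // Vulnerable: attacker input becomes part of the SQL text itself.
    static String naiveQuery(String user) {
        return "SELECT * FROM users WHERE name = '" + user + "'";
    }

    // Safe (assumes a JDBC Connection): the driver sends the value
    // separately from the SQL, so it can never change the query's structure.
    static ResultSet safeQuery(Connection conn, String user) throws SQLException {
        PreparedStatement ps =
            conn.prepareStatement("SELECT * FROM users WHERE name = ?");
        ps.setString(1, user);
        return ps.executeQuery();
    }

    public static void main(String[] args) {
        String attack = "x' OR '1'='1";
        // The concatenated query now matches every row in the table:
        System.out.println(naiveQuery(attack));
    }
}
```

    The same split between "query text" and "data" exists in every language's DB libraries, which is why this class of bug is language-independent.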

    Other escaping bugs:

    Possible in any language.  It is best to avoid using the shell, or other situations where escaping is needed, when possible.  Incorrectly escaping a file name before passing it to a shell or interpreter is another bug that can lead to execution of arbitrary code.
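
    One way to avoid the problem entirely, sketched here in Java (the command and file name are made up for illustration): pass arguments as a list via ProcessBuilder, so no shell ever parses the file name and there is nothing to escape.

```java
import java.util.List;

// Sketch: avoid shell escaping by never involving a shell.
public class NoShellDemo {
    // Dangerous: the file name is interpolated into a shell command line,
    // so a name like "x; rm -rf ~" would need (error-prone) escaping.
    static String shellCommand(String fileName) {
        return "sh -c 'gzip " + fileName + "'";
    }

    // Safer: each argument is passed to the program verbatim; no shell
    // is involved, so metacharacters in the name are inert.
    static ProcessBuilder directCommand(String fileName) {
        return new ProcessBuilder(List.of("gzip", fileName));
    }

    public static void main(String[] args) {
        String evil = "x; rm -rf ~";
        ProcessBuilder pb = directCommand(evil);
        // The malicious name stays one single, harmless argument:
        System.out.println(pb.command()); // [gzip, x; rm -rf ~]
    }
}
```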

    Program logic exploits:

    These exist in all languages.  If your program doesn't have the right logic it will be attackable.  These include integer overflows, which wrap around in both Java and C, and other logic problems.
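
    The integer-overflow point is easy to demonstrate: Java int arithmetic wraps silently just like C's, but since Java 8 the standard library offers Math.addExact to turn the wrap into a detectable error (a sketch, with illustrative helper names):

```java
// Sketch: int arithmetic wraps modulo 2^32 in Java, as in C, but
// Math.addExact (Java 8+) raises an error instead of wrapping.
public class OverflowDemo {
    static int wrappingAdd(int a, int b) {
        return a + b;                 // wraps silently on overflow
    }

    static boolean overflows(int a, int b) {
        try {
            Math.addExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;              // overflow was detected, not ignored
        }
    }

    public static void main(String[] args) {
        System.out.println(wrappingAdd(Integer.MAX_VALUE, 1)); // -2147483648
        System.out.println(overflows(Integer.MAX_VALUE, 1));   // true
    }
}
```

    So memory safety does not save you here; the logic has to be right in either language.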



  • I get the sinking feeling that you'll continue to try to turn TDWTF into an instrument of your personal vendetta against C/C++. Again you make a post, thinly disguised as objective and inquisitive, to make the point that anything ever made in C/C++ is dangerous, bad, bad and evil. Re-read the definition of "tendentious" I gave you in your other thread and get it into your head that language wars are silly, even when camouflaged, and so are you!

    *annoyed* 



  • Earlier this week, I explained to someone that exploits/bugs due to uninitialized memory are impossible in Java, because all fields are required to be initialized to zero/false/null (unless explicitly initialized to something else).
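
    That guarantee is easy to see in a sketch (field names are illustrative; note it applies to fields and array elements, while local variables must instead be definitely assigned before use or the code won't compile):

```java
// Sketch: Java fields always get a defined default value, so reading a
// never-assigned field yields 0/false/null, never leftover memory contents.
public class DefaultsDemo {
    int count;     // defaults to 0
    boolean flag;  // defaults to false
    String name;   // defaults to null

    public static void main(String[] args) {
        DefaultsDemo d = new DefaultsDemo();
        System.out.println(d.count + " " + d.flag + " " + d.name);
        // prints "0 false null"
    }
}
```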



  • "I get the sinking feeling that you'll continue to try to turn TDWTF into an instrument of your personal vendetta against C/C++."

    No, C and C++ are powerful general-purpose programming languages.  With the right tools they can be used fairly safely.  All the code I'm using to post this message, from the OS to the browser, is written in C or C++.  Anyway, Java is a derivative of C++, both in terms of syntax and ideas.  They took C++, removed a bunch of stuff, like MI, memory access, procedural files (code outside classes), the preprocessor, etc, and put it in a virtual machine, and that's Java.

    "Earlier this week, I explained to someone that exploits/bugs due to uninitialized memory are impossible in Java, because all fields are required to be initialized to zero/false/null (unless explicitly initialized to something else)."

    Right.  Bugs related to uninitialized memory just can't happen in Java, and they can happen in C.  It's pretty cool when a certain class of bugs is impossible.  That means something.



  • @JavaCoder said:

    Anyway, Java is a derivative of C++, both in terms of syntax and ideas. They took C++, removed a bunch of stuff, like MI, memory access, procedural files (code outside classes), the preprocessor, etc, and put it in a virtual machine, and that's Java.

    IMO they created a completely new language and decided to use a C++-like syntax where possible. The relationship between C++ and Java is not closer than the relationship between Java and JavaScript.



  • "IMO they created a completely new language and decided to use a C++-like syntax where possible. The relationship between C++ and Java is not closer than the relationship between Java and JavaScript."

    It's way closer.  They took not just the syntax, but all the ideas they could.  I'm not saying that C++ invented OO programming, but it sure did put it into the form that is the same as the form in Java.  Java just went wild and decided to strip out some complicated features, and also put the entire thing into a managed environment from the very beginning.  That doesn't invalidate the ideas of C++, it is just expressing them in a different way.



  • @Mikademus said:

    I get the sinking feeling that you'll continue to try to turn TDWTF into an instrument of your personal vendetta against C/C++. Again you make a post, thinly disguised as objective and inquisitive, to make the point that anything ever made in C/C++ is dangerous, bad, bad and evil. Re-read the definition of "tendentious" I gave you in your other thread and get it into your head that language wars are silly, even when camouflaged, and so are you!

    annoyed 

    Well, we all can read his first post. If he really tried to camouflage anything, he did so only very thinly. Of course his post is pro-Java, but it's not like it was ever meant to be objective. (Heck, just read his nick.) This was an invitation to a discussion, not the end of one.

    Also, language wars are of course silly. But they won't become any smarter through flaming. At least try to stay calm.

    As for me, I believe that Java and managed code in general is a lot more secure than unmanaged code. That does not mean unmanaged code no longer has a place. I don't, however, have enough experience to prove anything.
     



    Read my first post.  I list a series of vulnerabilities which can exist, and guess what, most of them can exist in Java, C and other languages.  There's one that is impossible in Java, which is the buffer overflow exploit, which happens to be the cause of most vulnerabilities.  That's a simple fact and it is an advantage of Java.  The other vulnerabilities, like race conditions and SQL injections, are things we need to worry about in Java, C and most other languages.  In fact I posted a blog entry about [url=http://chiralsoftware.com/blog/Race-condition-vulnerability-in-syscall-wrappers-fa3e57c594119803.html]race condition vulnerabilities in Java[/url], complete with demonstration source code.  I'll be posting a follow-up where I demonstrate some even more subtle race situations that exist in Java (and in C) and which are exploitable.

    If someone is claiming all languages are equal, that doesn't make sense.  Direct memory access is dangerous, that should be obvious from the fact that it is the basis for most vulnerabilities.



  • @JavaCoder said:

    Direct memory access is dangerous, that should be obvious from the fact that it is the basis for most vulnerabilities.

    I think that a statement like that, however true, should be followed up with a conclusion that it is also a powerful tool that can be used for Good, and should not be "outlawed".



  • @ammoQ said:

    IMO they created a completely new language and decided to use a C++-like syntax where possible. The relationship between C++ and Java is not closer than the relationship between Java and JavaScript.

    I am pretty sure you are exactly correct. C++-like syntax was chosen with the knowledge that only a familiar syntax would give a strange new (interpreted!) language any chance of adoption.

    It is my understanding that some of the creators of Java had been involved in the design of ObjectiveC, and for a while entertained the idea of using that syntax for Java (and I can't blame them), but its chance at adoption would have been virtually nil if they had.



  • I think that a statement like that, however true, should be followed up with a conclusion that it is also a powerful tool that can be used for Good, and should not be "outlawed".


    Dangerous stuff should be outlawed except when it is needed.  And direct memory access needs to happen somewhere.  The key with dangerous stuff like that is to narrow down the points where it can happen.  Think of a firewall: you define a set of ports that you want to allow access to, you define the software that can listen on those ports, you define which permissions that software has (not all FW software can do all these things).  The same with dangerous language features.  Does an Ethernet driver or a kernel need direct memory access?  Yes.  Does a web server?  No.  Do most user programs?  No.

    I'm all in favor of using the minimum access / power / capability needed to get the job done.  That's the basis of good security.


    As I said in my original post, the #1 remotely exploitable security hole in software today is the buffer overflow, and this is an inherent problem in platforms that allow the code to manipulate memory directly.  You send in data that overflows a buffer, and then parts of the buffer / data get executed.  Most of the remote exploits over the past ten years or so have been of this type.  Java is immune to this type of exploit, because Java bytecode can't access memory directly, so no matter how broken the bytecode is, an attacker can't send in data which overflows and gets executed.  This is a simple fact.  The one exception is, there almost certainly are exploits lurking in the places where Sun's JRE uses native code for various things.  Most recent was the JPEG handling vulnerability, where a native JPEG library was used.  Again this shows, don't handle dangerous data using unmanaged code.

    And today we have yet another remote buffer overflow exploit, which caused significant harm to a huge company (Ebay / Skype), in modern code written by competent programmers:


    (NB, it is not confirmed and in fact Skype is denying it, but the point is, these types of buffer overflows are common.)



  • What the fuck kind of thread is this?

    C/C++ is a fine language to write a networked application in if it is performance-critical.

    It's simple.

    1. Never allocate a variably-sized structure on the stack (local variables).
    2. Don't conditionally allocate on the stack either. You allocate fixed-sized structures, and you use them LOCALLY. Do not monkey around with the stack, assholes.
    3. Don't use longjmp().
    4. If for some function <foo> a  <foo>_r function exists in your standard library, USE IT.
    5. This is the 21st century. Prefer AF_UNIX and mmap'd anonymous memory to SYSV IPC (or god forbid, shared/locking files on disk) where appropriate.
    6. Libraries like glib exist for you to do things like deal with data structures, thread-local storage, pipes and internet sockets without having to know much about the ins/outs of the target system. Please don't reinvent the wheel badly. Thanks.
    7. Drop privs early before you enter that accept/select loop.
    Have fun with your networked C/C++ application!


  • @kirchhoff said:

    What the fuck kind of thread is this?

    It is a "bash C/C++" thread, regardless of apparent attempts at objectivity by the OP.  

    @JavaCoder said:

    Dangerous stuff should be outlawed except when it is needed.

    And this quite excellently illustrates the "Java mindset" and why Java users and C++ programmers come from different ideological planets (to spell it out, Java users happily accept that Sun defines "danger" and "inappropriate" etc. (in media this would be called "censorship"), while C++ trusts programmers to decide for themselves). However, most seasoned programmers accept the two languages as tools, know the advantages and shortcomings of both, use them for suitable tasks, and don't try to clandestinely propagandise their convictions in serious fora.



  • One thing that's interesting is the level of indirection in accessing "raw" memory in a user program. When dereferencing virtual memory, a table must be consulted first to determine the physical memory location. IIRC, the CPU performs the table lookup, but the OS must map memory, set access restrictions, retrieve swapped data, etc.

    Wikipedia: Page table

     

     



  • @JavaCoder said:

    I think that a statement like that, however true, should be followed up with a conclusion that it is also a powerful tool that can be used for Good, and should not be "outlawed".


    Dangerous stuff should be outlawed except when it is needed.

    Say, are you in favor of outlawing destructive updates? I know I am, mostly because I don't like them, but they're also dangerous and a potential source of bugs, by introducing mutable state into software.



  • http://www.codinghorror.com/blog/archives/000841.html

    As another user has already implied, the security of a program really depends on the coders; hopefully quality standards will reduce that risk.



  • Please prove or cite a good source that says that buffer overflows are the #1 exploitable security flaw in software.



  • @tster said:

    Please prove or cite a good source that says that buffer overflows are the #1 exploitable security flaw in software.

    The obvious question would be, #1 on what measurement? It's really a pretty meaningless statement. You might as well say "blue is the #1 colour".

    One notable criterion that they meet is that buffer overflows are by a large margin the most easily detectable and correctable exploitable security flaw in software - but I don't think "easiest major bug to fix" is a desperately significant observation. It does mean that you would expect them to form a large fraction of the discovered, reported, corrected, and unexploited security flaws in software of medium to low significance.



  • @asuffield said:

    @tster said:

    Please prove or cite a good source that says that buffer overflows are the #1 exploitable security flaw in software.

    The obvious question would be, #1 on what measurement? It's really a pretty meaningless statement. You might as well say "blue is the #1 colour".

    One notable criterion that they meet is that buffer overflows are by a large margin the most easily detectable and correctable exploitable security flaw in software - but I don't think "easiest major bug to fix" is a desperately significant observation. It does mean that you would expect them to form a large fraction of the discovered, reported, corrected, and unexploited security flaws in software of medium to low significance.

     

    Sorry, you're wrong.   Yellow is the #1 color.  And what you said was pretty much exactly what I was getting at, but I don't have the patience to type up an argument as elegantly as you do in response to a Java fanboy who thinks that C++ is bad because pointers lead you to scary places which you should not be able to go to in great languages like Java.

     

    However, in response to professor Java:

     

    Java is fine, it's a fairly good language.  However, it is not spectacular, and it is not ground breaking or revolutionary.

     Some things in the JVM are pretty cool, and webstart is neato.   However, that isn't a new programming language concept or idea.  It's just another way to use the language (a nice way I might add).

     
    Some facts:

    Java is not the first language to be cross-platform
    Java is not the first OO language.
    Java is not even as OO as other languages.
    Java is not the first interpreted language.  It is also not possible to distribute a java program and guarantee the source code is safe.  Java decompilers exist that leave the code in a good enough state to make modifications to.
    Java is not the first language with garbage collection.
    Java is missing tons of powerful language constructs that have been around for a long time.
    Java is not the only language with an extensive library (which really has nothing to do with the design of the actual language.)
    Java is not the first language that had users that think that it is the end all be all of programming languages.

     

    Think about that last one for a while.
     



  • Both suck, C# wins everything! </fuelThefire>

    There's something that confuses me in this argument.

    Managed languages at some point have to interact directly with memory, because they're basically putting a checked abstraction layer over it. Ultimately, it's the same as using checked memory access functions in your unmanaged language of choice. So why is it so hard for people to simply check the size of their buffers, or if they're lazy, use library functions that will do that for them? Everyone's talking about unmanaged buffers as if they're something completely impossible to tame and control without deploying a special buffer-controlling robot to do it for you.

    I'm guessing that it's really not and they do. The whole argument seems to stem from different approaches to coding discipline. The people that like having it enforced on them will suffer the overhead, and the people that don't will sometimes make mistakes. It's up to the coder to decide according to their own skill level what's the best approach. 

    The main downside I see for managed languages is that they lower the bar for newbie entry significantly. Overall code quality lowers because people don't learn the danger of blowing your lungs off through your foot. If anything, unmanaged languages should always be taught and used earliest in one's programming life, and only when one gets fairly competent with that, move on, or not, depending on the needs, to the safer ones. Not only a good background closer to the memory helps understand the underlying nature and caveats of managed languages, but most mistakes are learned before you get the chance of blowing up something important.



  • @Sunstorm said:

    If anything, unmanaged languages should always be taught and used earliest in one's programming life, and only when one gets fairly competent with that, move on, or not, depending on the needs, to the safer ones. Not only a good background closer to the memory helps understand the underlying nature and caveats of managed languages, but most mistakes are learned before you get the chance of blowing up something important.

    I agree with your case, but not quite with the order of learning things. 

    I'd say a new learner needs to get comfortable dishing out a logical, comprehensible and (most of all!) maintainable script, before even knowing what the hell malloc() does. If you teach someone about memory management first, and good coding later, there's an increased risk of blowing out not only the lungs, but also the eyes and eardrums -- through the little toe.

    Or are you arguing that they learn to appreciate a katana's sharpness, without actually being given a chance to swing it around, chopping off bits left, right & center?

     

    Hm.

    This might be a new thread.
     



  • @dhromed said:

    This might be a new thread.

    I support this plan. Arguments of the educated about how those who aren't educated should be educated are always entertaining if not educational.



  • @Welbog said:

    @dhromed said:
    This might be a new thread.

    I support this plan. Arguments of the educated about how those who aren't educated should be educated are always entertaining if not educational.

    Done.

