Two questions: phpBB security holes and (unrelated) md5



  • @nick8325 said:

    @Javariel said:
    On #2



    md5 is a 128 bit signature.  So it's 16 bytes, not 32.  The
    chance of generating the same string with 2 files is 1/2^16, or approx
    1/65000.  So the chance of 2 random files having the same md5 is
    unlikely.  Md5 does have some attacks though, so use of it for
    cryptographic purposes is not recommended.  Using it to id file
    versions or the like is just fine.



    If your original text is too short for the algorithm, you generally
    fill the extra bits with a known pattern (all 0s) and take the md5 sum
    of that.


    Should that not be 1/2^128, or 1/256^16, or approximately 1 in 3x10^38? Although... there are recently-discovered attacks which can increase that chance a fair bit.

    Sort of.  For an ideal 128-bit hash function:
    • 1/2^128 is the probability of a random string having a given hash value; finding a match in fewer attempts than expected is a pre-image attack
    • 2^64 is (approximately) the number of random strings you need to hash before two of them are likely to share the same hash value, due to the birthday problem; finding a match in fewer attempts than expected is a collision attack
    MD5 is, however, not as strong as an ideal 128-bit hash - a collision attack that runs in a few hours on a laptop PC has been shown (see http://eprint.iacr.org/2005/075), but I'm not aware of any pre-image attacks having been published (although I wouldn't be surprised if there are attacks waiting to be found, and who knows what the alphabet agencies have hidden away.)
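
    To put rough numbers on the two bullet points above, here is a minimal Python sketch. It assumes an ideal 128-bit hash (so it says nothing about MD5's actual weaknesses), uses the standard birthday approximation 1 - exp(-k(k-1)/2N), and the helper name collision_probability is made up for illustration.

```python
import math
from fractions import Fraction

BITS = 128          # output size of the idealised hash
N = 2 ** BITS       # number of possible hash values

# Chance that one random string hits one given hash value (pre-image case).
p_preimage = Fraction(1, N)


def collision_probability(k: int, n: int = N) -> float:
    """Birthday approximation: chance that k random hashes contain a repeat."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-k * (k - 1) / (2 * n))


print(f"possible hash values:       {N:.3e}")
print(f"collision among 2 strings:  {collision_probability(2):.3e}")
print(f"collision among 2**64:      {collision_probability(2 ** 64):.3f}")
```

    With only two strings the collision chance is still 1/2^128; it only climbs to roughly 39% once about 2^64 strings have been hashed, which is where the 2^64 figure for collision attacks comes from.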



  • @tofu said:

    Think of it as being like a gun. If I
    buy a gun and shoot myself in the foot, whose fault is that? Most
    people will agree that it's my fault. Same situation here. If I dive
    into PHP and never think about safety, that's my fault.

    It's a bit like a gun with a hair trigger, no trigger guard and no safety that's marketed to beginners. Ultimately, it's always the user's responsibility to use the tool properly, but some tools are accidents waiting to happen.



  • @Zak said:

    It's a bit like a gun with a hair trigger, no trigger guard and no safety that's marketed to beginners.


    mar·ket·ed, mar·ket·ing, mar·kets

    v. tr.
    1. To offer for sale.
    2. To sell.
    PHP isn't for sale therefore it can't be marketed to anyone.  Sure, it's dangerous.  But if you want something safer you have to pay for it.  People who aren't willing or able to pay fall back on the free and dangerous tool.

    I don't see what the problem is?

    "How dare those mean evil PHP people give this away for free!"  huh?



  • @tofu said:

    But if you want something safer you have to pay for it.  People who aren't willing or able to pay fall back on the free and dangerous tool.

    I don't see what the problem is?

    "How dare those mean evil PHP people give this away for free!"  huh?

    I'm not making moral judgements about the authors of PHP; I'm making a correction to the gun analogy. Someone pointed out that guns, being weapons, are inherently dangerous. Programming languages, being intended to control computers, are also dangerous - any programming language that you can't seriously screw something up with is probably not worth using.

    My point is not to say that the designers of PHP are bad people, but to suggest that some features should work differently, especially in a language intended for use by beginners. You don't have to pay for a better web scripting language: eruby can run inside HTML just like PHP, and I think there's something similar for Python. Of course, for a large project, you'd probably restrict code embedded in HTML to simple templating and put your logic somewhere else. There are far too many free frameworks for that for me to enumerate here. Some of them are good (Rails comes to mind).



  • I cannot speak for phpbb, and I believe most of what needs to be said has been said.

    md5, on the other hand...

    Any hash function will have only 2^size (and that's not necessarily the output size) possible results, and that means that, for 'plaintext' values larger than size bits, there are bound to be collisions. (I heard SHA-1 was 160 bits in output but 128 bits in effective size... Meaning plenty of values will never occur. I don't really know if it's true.)

    Oh, and, I think the birthday paradox starts when there are 3 entries, not 2.
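
    The "bound to be collisions" part is easy to see on a shrunken hash. A rough Python sketch, assuming we keep only the top 24 bits of MD5 (an arbitrary choice so it finishes quickly; truncated_md5 and find_collision are made-up helper names):

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, bits: int) -> int:
    """Return the top `bits` bits of the MD5 digest as an integer."""
    digest = int.from_bytes(hashlib.md5(data).digest(), "big")
    return digest >> (128 - bits)


def find_collision(bits: int):
    """Hash b"0", b"1", b"2", ... until two inputs share a truncated hash."""
    seen = {}  # truncated hash -> first input that produced it
    for i in count():
        msg = str(i).encode()
        h = truncated_md5(msg, bits)
        if h in seen:
            return seen[h], msg, i + 1  # both inputs and the number hashed
        seen[h] = msg


a, b, tries = find_collision(24)
print(a, b, tries)
```

    With only 2^24 possible values the loop cannot run past 2^24 + 1 inputs, and by the birthday argument it should typically stop after a few thousand, i.e. around 2^12.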


    As far as passwords are concerned, the attacker needs to be able to find a sequence of characters that will generate the same hash value. (Though circumventing it, like brute force or rainbow tables[1], is far more popular.) Preferably they find the value actually used, so that it also works at places that use other hashes. (Presuming the user has the same password everywhere.)
    As far as hashes for data integrity purposes go, you have one sequence and need to be able to find another that hashes out the same.
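
    For the password case, the "circumventing" route looks roughly like this in Python. This is a plain precomputed lookup table rather than a real rainbow table (which stores hash chains to save space), and the word list and helper names are hypothetical:

```python
import hashlib


def md5_hex(text: str) -> str:
    """Hex MD5 of a string, as an unsalted password store might keep it."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical word list; real tables hold millions of candidate passwords.
COMMON_PASSWORDS = ["123456", "password", "letmein", "qwerty"]

# Precompute hash -> password once, then reuse it for every leaked hash.
REVERSE_LOOKUP = {md5_hex(p): p for p in COMMON_PASSWORDS}


def crack(hash_hex: str):
    """Return the matching password, or None if it is not in the table."""
    return REVERSE_LOOKUP.get(hash_hex.lower())


print(crack(md5_hex("letmein")))        # a common password is found
print(crack(md5_hex("xK#9!vRq2$Lm")))   # a random one is not in the table
```

    No weakness in md5 itself is needed for this; it only works because the passwords are guessable and the hashes are unsalted.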

    Both may have additional constraints complicating the task; passwords could be required to have all normal printable characters (I once tried having the non-breaking-space in my password. The system accepted setting it, but couldn't handle it when I tried to log back in.)  The data corrupted may have to remain the same length. Or if you're trying to insert something in it, then you need to make sure it follows the rules of the format - and when trying to make it match both hash and the format, your options are more restricted. (On this basis, two different hashes in combination (by concatenation), md5 and sha1, even if both are broken, may remain secure. That's what I think anyway.)
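
    Mechanically, the concatenation idea is just this (a small Python sketch; combined_digest and the sample data are made up, and it only shows how the combined value is built, not whether it is actually any stronger):

```python
import hashlib


def combined_digest(data: bytes) -> str:
    """Concatenate the MD5 and SHA-1 hex digests of the same data."""
    return hashlib.md5(data).hexdigest() + hashlib.sha1(data).hexdigest()


original = b"release-1.0 contents"
tampered = b"release-1.0 contents!"

# 32 hex chars of MD5 followed by 40 hex chars of SHA-1.
print(len(combined_digest(original)))
# A forged file would have to match both halves at once.
print(combined_digest(original) == combined_digest(tampered))
```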

    The only attack I've heard of so far is the ability to generate two values that share the same hash value, where neither is predetermined (that is, you can't target a given hash value or a given file to be corrupted/infected/changed). But the assumption is that the rest isn't all that far behind, and so "any cryptographer who doesn't live in a cave" has declared md5 deprecated.

    I think I heard something similar was going on with sha-1, but I'm not so sure - verify that on your own before you act on it.




    I have a unique password for every site I'm on. And they're all randomly generated, as opposed to something like mahuja@thedailywtf.
    And no, I don't have a supermemory for them. I use an encrypted password database, so that I only need to remember one password - and it's 20+ characters long, including lowercase, uppercase, numbers and symbols... http://passwordsafe.sourceforge.net/

    [1]http://en.wikipedia.org/wiki/Rainbow_tables



  • @Alex Papadimoulis said:

    You know that's a good point. Everyone knows that network sniffers can only read binary and hex. A MD5 sum is hex-looking, and probably hex-like. But passwords are plain text, so if you send the password over the network in plain text, the sniffer is worthless.

    Er.... WTF? I assume you're being ironic here, but I honestly can't tell. Just in case you're really serious (and I really hope you aren't), plain text can easily be sniffed off a network.

    That said, I agree with murphyman, and the rest of this flamefest is.... saddening, really.



  • This may interest you: a reverse-lookup database of common md5 hashes:



  • @RiX0R said:

    This may interest you: a reverse-lookup database of common md5 hashes:


    @MaHuJa said:
    http://en.wikipedia.org/wiki/Rainbow_tables




