• I had a small argument about this and I'm interested in how practical it would be to crack a single password of a high value target given the following conditions:

1. its sum is either MD5 or SHA-1 (both declared shit by now)
2. it has some salt so it can't be directly pulled from a rainbow table but you have access to it and know how it's applied
3. the length of the password is, say, 12 characters, in this case we would know the exact length for some reason
4. you know the service password charset limitations, say, printable ascii characters (alnum + special but no space)
5. a) you have access to the regular consumer crap (high end GPUs and such)
b) you run a criminal organization and have enough money to buy shittons of consumer crap

How practical would it be to get the original string out? How about the vulnerabilities in MD5 and SHA-1, can brute forcing be accelerated enough for practical attacks? Getting a collision is not that useful if your intention is to reuse the password to gain access to something else than the service it was leaked from.

My argument is that given these conditions it's probably impractical even with a lot of money to crack a password even when hashed with these algorithms if you have length and salt.

Now tell me I'm wrong about the world not ending.

• @hifi Say 90 possible values per character, 12 characters long, that's 90^12=2.8e23 passwords to brute-force.

I'm having trouble finding a good speed number, but this appears to be 1e10 to 1e11 per second. Assuming 1e11, it will take about 1e12 seconds or 30k years to crack the password.

This can probably be reduced by making assumptions on the size of the character set - for example, if it's known the password contains 11 alnums and 1 special that brings you down to 62^112812=1.7e22 possible passwords.

• 62^11*28*12

FTFY.

• How practical would it be to get the original string out?

The weaknesses I remember about these hashes are about collisions. An attacker doesn't get the original string, but something else that hashes to the same md5.

Wikipedia said:

A collision attack exists that can find collisions within seconds on a computer with a 2.6 GHz Pentium 4 processor (complexity of 224.1)

• Now tell me I'm wrong about the world not ending.

The world isn't ending… but it is time to move to the exits in relation to MD5 and SHA1. We don't need to panic over it, but we shouldn't just shrug and do nothing. Instead, switch to something like bcrypt, which is quite a strong algorithm and comes with built-in per-password salting. (I'm assuming you're using session tokens as well; bcrypt is actually too computationally expensive for use on a per-request basis in any site or API that requires many calls to be made in quick session.)

• @hifi If you know the salt and how the salt is applied (or can make a reasonable guess at how it's applied) it's as simple as building a custom rainbow table.

This is why, when a salted password database using high end hash algorithms is compromised, you still change your passwords. Too much work to compromise ALL the accounts, but relatively trivial to take on a few targeted ones.

• Instead, switch to something like bcrypt, which is quite a strong algorithm and comes with built-in per-password salting. (I'm assuming you're using session tokens as well; bcrypt is actually too computationally expensive for use on a per-request basis in any site or API that requires many calls to be made in quick session.)

I've heard good things about bcrypt from multiple sources so it seems to be the obvious choice.

@hifi If you know the salt and how the salt is applied (or can make a reasonable guess at how it's applied) it's as simple as building a custom rainbow table.

Isn't building a custom rainbow table essentially the same as brute forcing?

This is why, when a salted password database using high end hash algorithms is compromised, you still change your passwords. Too much work to compromise ALL the accounts, but relatively trivial to take on a few targeted ones.

What's relatively trivial? If the length of the password is anything reasonable, is it actually trivial to brute 12+ character MD5 or SHA-1 password? @PleegWat threw some numbers on the table which seem to speak on the behalf of "not practical" as of today.

• @hifi But if it's trivial to find a collision for a single, targeted account, then you're in anyway.

• Isn't building a custom rainbow table essentially the same as brute forcing?

Anyway, MD5 collisions are fucking easy.

SHA1 is more interesting. A determined, resourced attacker has a number of options available to them.

Throwing a shitload of CPU cores at the problem using off the shelf software is far from impossible. A 2012-era midrange Core i5 quad core pushes 1E11 hashes per second (which meshes with PleegWat's figures).

SHA1 running on a high end 2015-era GPU is in the 1E13 hashes per second range.

And then there are FPGA's... A random guy built one of these FOR FUN that runs at 1E8 hashes per second. In a 1u space. Using GARBAGE.

There are FPGA designs that run at 2900mbps of hashing input per chip. That's 3E11 96-bit (12 character) hashes per second per chip in just a couple of watts. Hilariously, these chips are designed for use in IT security appliances and appear in sets of DOZENS in high end units if you're into scavenging for parts. Or you can just buy them if you're into high end corporate espionage or the nation state gig.

• A 2012-era midrange Core i5 quad core pushes 1E11 hashes per second (which meshes with PleegWat's figures).

Yeah, I didn't think of that bit when I was looking - all of those will have been for hardware of a couple years ago when people were actively driving out md5. Current hardware will be significantly faster.

If you've got some particular hardware in mind, it is (in most jurisdictions) perfectly fine to obtain a copy of john the ripper and try it out on your own hardware.

• Salted vs. unsalted only makes a difference when cracking lots of passwords with different salts.

If you're only cracking one password, or if all the passwords have the same salt (!!!), it makes no difference whatsoever.

• Salted vs. unsalted only makes a difference when cracking lots of passwords with different salts.

Doesn't it make a difference if you can find collisions easily but not brute-force easily? Without salt you can use your colliding password instead of the real password; with salt you have to find a collision that has the same salt embedded in it that the real password will get, which is much harder.

• If you're only cracking one password, or if all the passwords have the same salt (!!!), it makes no difference whatsoever.

• Now tell me I'm wrong about the world not ending.

The world isn't ending… but it is time to move to the exits in relation to MD5 and SHA1. We don't need to panic over it, but we shouldn't just shrug and do nothing. Instead, switch to something like bcrypt, which is quite a strong algorithm and comes with built-in per-password salting. (I'm assuming you're using session tokens as well; bcrypt is actually too computationally expensive for use on a per-request basis in any site or API that requires many calls to be made in quick session.)Oauth, or whatever equivalent technology replaces it. Storing passwords securely is too fucking complicated, let the likes of Google do it instead

FTFY

• Now tell me I'm wrong about the world not ending.

The world isn't ending… but it is time to move to the exits in relation to MD5 and SHA1. We don't need to panic over it, but we shouldn't just shrug and do nothing. Instead, switch to something like bcrypt, which is quite a strong algorithm and comes with built-in per-password salting. (I'm assuming you're using session tokens as well; bcrypt is actually too computationally expensive for use on a per-request basis in any site or API that requires many calls to be made in quick session.)Oauth, or whatever equivalent technology replaces it. Storing passwords securely is too fucking complicated, let the likes of Google do it instead

FTFY

Is OAuth still fucking unpossible to implement without a PhD in webtardanese?

• Now tell me I'm wrong about the world not ending.

The world isn't ending… but it is time to move to the exits in relation to MD5 and SHA1. We don't need to panic over it, but we shouldn't just shrug and do nothing. Instead, switch to something like bcrypt, which is quite a strong algorithm and comes with built-in per-password salting. (I'm assuming you're using session tokens as well; bcrypt is actually too computationally expensive for use on a per-request basis in any site or API that requires many calls to be made in quick session.)Oauth, or whatever equivalent technology replaces it. Storing passwords securely is too fucking complicated, let the likes of Google do it instead

FTFY

Is OAuth still fucking unpossible to implement without a PhD in webtardanese?

-shrug-

still better than dealing with the ulcers from knowing that you are storing passwords and might be doing it wrong......

• Now tell me I'm wrong about the world not ending.

The world isn't ending… but it is time to move to the exits in relation to MD5 and SHA1. We don't need to panic over it, but we shouldn't just shrug and do nothing. Instead, switch to something like bcrypt, which is quite a strong algorithm and comes with built-in per-password salting. (I'm assuming you're using session tokens as well; bcrypt is actually too computationally expensive for use on a per-request basis in any site or API that requires many calls to be made in quick session.)Oauth, or whatever equivalent technology replaces it. Storing passwords securely is too fucking complicated, let the likes of Google do it instead

FTFY

Is OAuth still fucking unpossible to implement without a PhD in webtardanese?

Based on my cow-orkers' difficulties, I'm gonna go with yes, yes it is.

• Is OAuth still fucking unpossible to implement without a PhD in webtardanese?

AIUI, OAuth is not too hard to implement if you're only supporting one provider. It's only when you want to support many providers that it gets truly miserable, because they each implement OAuth a bit differently and the big players (Google, Facebook, etc.) don't have much reason to change that.

• Salted vs. unsalted only makes a difference when cracking lots of passwords with different salts.

Doesn't it make a difference if you can find collisions easily but not brute-force easily? Without salt you can use your colliding password instead of the real password; with salt you have to find a collision that has the same salt embedded in it that the real password will get, which is much harder.

I think you are confusing a few concepts.

Collision attacks are about finding two inputs to a hash function which yield the same output. In case of a password database you don't know what the input looks like so you actually need brute-force first. Since secure hash functions try to minimize collisions, any "hit" will likely be the actual password and not a collision.

Meanwhile, salts help to hide information from an attacker. Because hashing functions are deterministic, having the same input to the function means you get the same output, thus revealing how often a given input is used by counting duplicates in hash outputs. If a non-salted password hash output occurs frequently, an attacker would know that the password is either a lazy-ass password which is easy to guess, or it's a common password which might be the default "Welcome" password or even one used by all accounts from a given company.

When a random salt is added and combined with the input, each stored hash should be unique. Any stored brute-force results quickly become meaningless unless you add an extra dimension "salt" to your rainbow table, suddenly increasing its size by some exponent.

• Doesn't it make a difference if you can find collisions easily but not brute-force easily? Without salt you can use your colliding password instead of the real password; with salt you have to find a collision that has the same salt embedded in it that the real password will get, which is much harder.

I think you are confusing a few concepts.

Collision attacks are about finding two inputs to a hash function which yield the same output. In case of a password database you don't know what the input looks like so you actually need brute-force first.

Hmm, if that's the case, then fair enough. I'm not at all up on the details but I assumed the collision attack was along the lines of "find a string that matches this hash even if it's not the original string". In that case what I said above would be correct, wouldn't it? If no salt is applied, your new string would have the correct hash and be accepted as the password. If salt is applied, you'd have to find a colliding string that contains that specific salt, which would probably wind up essentially as brute-forcing (unless the salt is really small).

If it's just that for a given source string, you can find another source string that has the same hash, or the even weaker condition of being able to find any two source stings that have the same hash, then the collision attack doesn't seem to make breaking passwords any easier as far as I can see, whether salted or not. Obviously salt is still preferable for the reasons you stated.

• I'm not at all up on the details but I assumed the collision attack was along the lines of "find a string that matches this hash even if it's not the original string". In that case what I said above would be correct, wouldn't it?

What you are describing is called a preimage attack: given only the output, find an input which hashes to the given output.

Since the input was unknown to you it is possible that you found a collision but you cannot know if it was the original string or not. Either way, if you found such an input then you should be able to use it to log in.

If no salt is applied, your new string would have the correct hash and be accepted as the password. If salt is applied, you'd have to find a colliding string that contains that specific salt, which would probably wind up essentially as brute-forcing (unless the salt is really small).

You start out by saying that you have "your new string". How did you find it if not through brute force? Remember that you know nothing about the input. Guessing or using a dictionary is also considered a brute force attack, it's just being smart about picking input values.

Let me give a contrived example. Let's say passwords can only be the numbers 1 through 9, the hash algo is `(i * 3) mod 10` and the output hash is `7`. You would then run the following operations:

• `hash(1) -> 3`
• `hash(2) -> 6`
• `hash(3) -> 9`
• `hash(4) -> 2`
• `hash(5) -> 5`
• `hash(6) -> 8`
• `hash(7) -> 1`
• `hash(8) -> 4`
• `hash(9) -> 7`

From this we can gather that `9` was the correct input.

Suppose we add a salt with value `2` and the output is `3` you would also need the following operations:

• `hash(2 + 1) -> 9`
• `hash(2 + 2) -> 2`
• `hash(2 + 3) -> 5`
• `hash(2 + 4) -> 8`
• `hash(2 + 5) -> 1`
• `hash(2 + 6) -> 4`
• `hash(2 + 7) -> 7`
• `hash(2 + 8) -> 0`
• `hash(2 + 9) -> 3`

In both cases you need brute force. Salting just makes sure that a new brute force operation needs to be run for that particular salt value, but even having no salt needs a brute force operation unless you can reuse cached results.

If it's just that for a given source string, you can find another source string that has the same hash, or the even weaker condition of being able to find any two source stings that have the same hash, then the collision attack doesn't seem to make breaking passwords any easier as far as I can see, whether salted or not. Obviously salt is still preferable for the reasons you stated.

In the previous example collisions were not possible. What if we allow numbers up to 20 and change the hash algo to `(i * 3) mod 11`? Guess the password for salt `3` and output `03`:

• `hash(3 + 01) -> 01`
• `hash(3 + 02) -> 04`
• `hash(3 + 03) -> 07`
• `hash(3 + 04) -> 10`
• `hash(3 + 05) -> 02`
• `hash(3 + 06) -> 05`
• `hash(3 + 07) -> 08`
• `hash(3 + 08) -> 00`
• `hash(3 + 09) -> 03`
• `hash(3 + 10) -> 06`
• `hash(3 + 11) -> 09`
• `hash(3 + 12) -> 01`
• `hash(3 + 13) -> 04`
• `hash(3 + 14) -> 07`
• `hash(3 + 15) -> 10`
• `hash(3 + 16) -> 02`
• `hash(3 + 17) -> 05`
• `hash(3 + 18) -> 08`
• `hash(3 + 19) -> 00`
• `hash(3 + 20) -> 03`

Both passwords `9` and `20` match, but keep in mind that to find this collision we had to run 20 operations. A real cracker would simply call it a day after finding `9` because all those other calculations are wasted.

• You start out by saying that you have "your new string". How did you find it if not through brute force?

As you noted, in that section I was assuming a pre-image attack was possible. That's how I was thinking of it, actually, but I didn't know it was a term of art in cryptography. It's not usual that using mathematical jargon makes one's posts easier to understand so I usually avoid it - though in retrospect I should have realised cryptography would be a likely exception. (I should also have thought about it a bit more deeply to realise the large gulf in strength between the two things.) I also said

the collision attack doesn't seem to make breaking passwords any easier as far as I can see, whether salted or not

which agrees with your post. In both cases you need to brute-force it; the existence of collisions "merely" reduces the expected number of attempts to get a solution. (Air quotes because this could be a large reduction.) But this is independent of whether or not you have an attack, so the existence of the attack is not important in this context.

On the other hand, there could be more subtle consequences of an attack than merely the ability to find random collisions. For instance, if there was an identifiable set of hashes that had many more collisions than average, an attacker could check to see which hashes were particularly vulnerable and break those ones more quickly. I have no idea whether this is true in the current cases, but that would be a bad feature for a hashing algorithm anyway, so I'd hope not. But it's pretty certain that someone somewhere can come up with a better way to exploit the attack than what I can think of off the top of my head.

Of course in most cases an attacker will probably get better results from dictionary attacks etc, unless they're targeting a particular account.

• If it's just that for a given source string, you can find another source string that has the same hash, or the even weaker condition of being able to find any two source stings that have the same hash, then the collision attack doesn't seem to make breaking passwords any easier as far as I can see, whether salted or not.

There's a big difference between finding collisions with a document vs with a relatively tiny password. Consider the range of possible hashes and note that a 1MB document has a much bigger space being mapped onto the same hash output as a password with (say) a max length of 16 characters.

• This is why, when a salted password database using high end hash algorithms is compromised, you still change your passwords. Too much work to compromise ALL the accounts, but relatively trivial to take on a few targeted ones.

I still don't get why there's so much concern over which hashing algorithm is used when the attacker already has direct access to your database (and therefore, all the data therein). Why would an attacker care about your credentials when s/he has access to all the underlying data regardless?

The only thing a more expensive hash would accomplish in that case is protect idiots who use the same password on multiple sites.

• @Groaner Maybe they didn't get access to the whole thing, but just the Users table. Maybe it was previously dumped using a no longer working exploit. Maybe the real data is in a different schema from the user logins.

But even if it's none of those and the hacker has the whole database, there's still value in

protecting idiots who use the same password on multiple sites

• @Groaner Maybe they didn't get access to the whole thing, but just the Users table. Maybe it was previously dumped using a no longer working exploit. Maybe the real data is in a different schema from the user logins.

Perhaps. All the companies I've worked for use a database user in the `sysadmin` role, though. I was even in a DBA interview at a fairly large company and when asked about security, I proposed that the app user should only get `GRANT EXECUTE` on procs and maybe `SELECT` on views, and the interviewer said that that would be too much management overhead. So it's probably likely that in a randomly-selected information system, the DB user is going to have unrestricted access.

But even if it's none of those and the hacker has the whole database, there's still value in

protecting idiots who use the same password on multiple sites

Sometimes, a child has to touch a hot stove before s/he learns not to.

• I still don't get why there's so much concern over which hashing algorithm is used when the attacker already has direct access to your database (and therefore, all the data therein). Why would an attacker care about your credentials when s/he has access to all the underlying data regardless?

• I still don't get why there's so much concern over which hashing algorithm is used when the attacker already has direct access to your database (and therefore, all the data therein). Why would an attacker care about your credentials when s/he has access to all the underlying data regardless?

Addressed above with the fact that many companies use `sa` as the app database login.

If your app user can only call stored procedures and select from views, then you have a point. Also, last I heard, stored procedures are so 1990's and the new fad is to use an ORM which requires write access to each of the tables anyway.

Addressed above with the fact that many companies use `sa` as the app database login.

I don't follow. If the server side of the application uses a superuser login to the database, how does that provide access to someone out on the Internet? The database server shouldn't even be reachable to anyone outside the local network.

Addressed above with the fact that many companies use `sa` as the app database login.

I don't follow. If the server side of the application uses a superuser login to the database, how does that provide access to someone out on the Internet? The database server shouldn't even be reachable to anyone outside the local network.

Exactly. At this point, we're assuming that several layers of the defense have already been breached.

The most plausible vector I can think of in which all those layers are simultaneously breached and the attacker has access to a large chunk of the data is a SQL injection vulnerability, or some other exploit. Being able to run ad hoc queries at will mean it's only a matter of time before an attacker can figure out enough about your schema to glean all the useful information from it. I've worked on systems that store unencrypted full addresses and SSNs. Who cares about passwords when that kind of info is readily available in the same database?

Other vectors (like a DB backup being uploaded somewhere it shouldn't) can be mitigated by other means. I've only started playing with encryption in SQL Server, but one can make it difficult to restore a database or read encrypted columns without having the server's master key.

Is there something I'm missing?

• The most plausible vector I can think of in which all those layers are simultaneously breached and the attacker has access to a large chunk of the data is a SQL injection vulnerability...

Sure, SQL injection is a Bad Thing (TM), and we've seen plenty of stories about it over on the front page. But we're talking about password hashing here.

• Exactly. At this point, we're assuming that several layers of the defense have already been breached.

Or that there's a strong correlation between the kind of incompetent twats that would use `sa` as their application user and the kind of incompetent twats that would make the database server Internet-accessible.

• The most plausible vector I can think of in which all those layers are simultaneously breached and the attacker has access to a large chunk of the data is a SQL injection vulnerability...

Sure, SQL injection is a Bad Thing (TM), and we've seen plenty of stories about it over on the front page. But we're talking about password hashing here.

Right, and apart from saving idiots who use the same password on multiple sites, what value-add is an expensive hash on those passwords over a less-expensive one, assuming that the attacker already has full DB access?

• Exactly. At this point, we're assuming that several layers of the defense have already been breached.

Or that there's a strong correlation between the kind of incompetent twats that would use `sa` as their application user and the kind of incompetent twats that would make the database server Internet-accessible.

Indeed. But that still doesn't answer my question as to why application user account credentials are so valuable when the attacker already has the entire database.

• The most plausible vector I can think of in which all those layers are simultaneously breached and the attacker has access to a large chunk of the data is a SQL injection vulnerability...

Sure, SQL injection is a Bad Thing (TM), and we've seen plenty of stories about it over on the front page. But we're talking about password hashing here.

Right, and apart from saving idiots who use the same password on multiple sites, what value-add is an expensive hash on those passwords over a less-expensive one, assuming that the attacker already has full DB access?

I don't think it's valid to "[assume] that the attacker already has full DB access". I would guess that the majority of data breaches are either vulnerabilities that allow an attacker to view the contents of a database (e.g. it's much easier to get SQL injection to show you an entire table than to modify a table) or someone getting something like a backup copy of the database.

I'm certainly not saying that your entire point is ridiculous. The only part that I disagree with is your assumption that most or all vulnerabilities lead to admin-level access to the live database. All the other times, using a strong hashing algorithm will help prevent additional problems.

• that still doesn't answer my question as to why application user account credentials are so valuable when the attacker already has the entire database

Because many people use the same username and password for many services. The breach in one place may allow an attack in another more valuable location, despite that other service following many more proper security practices.

• that still doesn't answer my question as to why application user account credentials are so valuable when the attacker already has the entire database

Because many people use the same username and password for many services. The breach in one place may allow an attack in another more valuable location, despite that other service following many more proper security practices.

This thread has wandered a bit off from the topic. The question is how you can get the original non-salted password brute forced so you can reuse it in another service, like @dkf said here.

The original database that would have been already breached is irrelevant as that's already a lost cause but the damage can be extended by actually reusing a weak password that is weakly hashed on some other service, say, GMail or Hotmail which gives you a lot more valuable data.

People are now crazy about using SHA-82938 or something else but the question is do MD5 and SHA-1 actually pose the kind of security issue for sane salted passwords that they would be as insecure as being in plain text. Can we actually attack them in that way and if Moore's Law applies, how long would it take to get there?

Regarding the topic, because we assume salt we can rule out collisions as it would be very unlikely to hit a collision of `saltFoo` when the actual salted plain text password was `saltBar` so we do know when we find the real password by comparing the salt to the brute forced string and that's the real value of the attack.

• If your app user can only call stored procedures and select from views, then you have a point. Also, last I heard, stored procedures are so 1990's and the new fad is to use an ORM which requires write access to each of the tables anyway.

You're assuming that the attacker has live network access to the DB as opposed to, e.g., a stolen backup tape or some other data dump.

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.