Just saw this popup on Twitter

lusis

I had to create a forum account just to post this. I have some search columns in TweetDeck for things like #mysql, #ruby, #nagios. Well this gem just popped up in my feed:

from @nwkr:

"Make use of foreign keys and cascading delete in #MySQL. I didn't. A buyer had more products in an order than he ordered. And got it 4 free."

I really want to reply and point out that MySQL isn't the problem here.

*sigh*

DOA

See if you can mesage him: "Less twittering, more learning to code"

DescentJS

Twits are always twits

lolwtf

If he thinks that's the problem, he's probably never heard of SQL injection. Where did you say this website was?

lusis

@lolwtf said:

If he thinks that's the problem, he's probably never heard of SQL injection. Where did you say this website was?

Wish I knew. Depending on what they sell, might be a new career in my future ;)

I can imagine a possible scenario of what could have happened:

# "Buyer 2" buys some products:

 Buyers Products
| id | name | | buyer_id | product_id | quantity |
| 1 | Buyer 1 | | 2 | 42 | 5 |
| 2 | Buyer 2 | | 2 | 69 | 3 |
| 3 | Buyer 3 |

# "Buyer 2" gets deleted from the database

 Buyers Products
| id | name | | buyer_id | product_id | quantity |
| 1 | Buyer 1 | | 2 | 42 | 5 |
| 3 | Buyer 3 | | 2 | 69 | 3 |

# "Buyer 4" is added to the database, and id = 2 is reused

 Buyers Products
| id | name | | buyer_id | product_id | quantity |
| 1 | Buyer 1 | | 2 | 42 | 5 |
| 3 | Buyer 3 | | 2 | 69 | 3 |
| 2 | Buyer 4 |

# "Buyer 4" already has more products than really bought

Of course, this doesn't really explain why the buyer got them for free. But I'd bet that Products table was actually a ProductsToShip table.

Auction_God

@Zecc said:

# "Buyer 4" is added to the database, and id = 2 is reused

You'd have to be completely WTF'd to reuse an ID:

1) Column for id should be an identity. This will be unique (unless you purposely reset your identity to some value less than the max in the table).

2) Selecting Max(id) for self assigned id's is so much easier than looking for a missing number. I can't imagine a clueless person spending the time to do that.

It is much more likely that it is a race condition where they insert in the buyer's table, then before adding products doing a Select Max(id) from Buyers to find out what the just inserted id was....This is a much more likely failure mode for a clueless coder to make.

lusis

I actually got a followup tweet from the guy:

" I didn't delete children (from many-side) when deleting a parent. So children remained. Using ON DELETE CASCADE is the simplest way."

I think he's missing the point of my WTF. The database still isn't the problem. It's more systemic than that. I can understand in GENERAL (regardless of application) where not deleting children can happen.

random_next

@Auction_God said:

2) Selecting Max(id) for self assigned id's is so much easier than looking for a missing number. I can't imagine a clueless person spending the time to do that.

But it's sooooo easy...

[code] i = 1;
while("SELECT COUNT(*) FROM table WHERE id = i" > 0) { i++; }
"INSERT INTO table (id) VALUES(i)";
[/code]
See?

realmerlyn

Selecting max(ID)+1 for your next ID will rightfully get you on thedailywtf.com. Only the hopelessly naive do that, because it doesn't scale for multiple transactions.

But then, we're talking MySQL, where transactions until recently were either (a) very slow or (b) not used (see a).

Friends don't let friends start new projects with MySQL, now that PostgreSQL is mature.

morbiuswilters

@realmerlyn said:

Selecting max(ID)+1 for your next ID will rightfully get you on thedailywtf.com. Only the hopelessly naive do that, because it doesn't scale for multiple transactions.
But then, we're talking MySQL, where transactions until recently were either (a) very slow or (b) not used (see a).

Friends don't let friends start new projects with MySQL, now that PostgreSQL is mature.

Right, who needs the speed, stability, simple maintenance and high-availability of MySQL when you can have Postgres! Hope you like staring at an unresponsive console for a couple of hours while Postgres does its daily vacuum routine on a 100GB database.

Postgres is a fine DB if you don't expect anyone to actually use your product.

lusis

The database platform isn't the problem. I've dealt with LARGE databases on BOTH platforms for both OLTP and OLAP and the both have issues. Then again even the commercial dbs have issues too. Whether it's VACUUMing a huge DB on pgsql, performing a reorg on DB2 or running an optimize on MySQL.

Did you know indices are case sensitive on DB2? Did you know that innodb prefaces all indices with a subset of the PK value? Did you know that PGSQL can't (or at least couldn't) optimize queries with aggregate functions?

All of these things are "gotchas" that have no bearing on the validity of DB choice. In the case of my OP, the fucktard probably would have had the same issue regardless of database ;)

savar

This makes me feel better about not being on twitter. I have no idea what @nwkr is trying to say.

morbiuswilters

@lusis said:

The database platform isn't the problem. I've dealt with LARGE databases on BOTH platforms for both OLTP and OLAP and the both have issues. Then again even the commercial dbs have issues too. Whether it's VACUUMing a huge DB on pgsql, performing a reorg on DB2 or running an optimize on MySQL.
Did you know indices are case sensitive on DB2? Did you know that innodb prefaces all indices with a subset of the PK value? Did you know that PGSQL can't (or at least couldn't) optimize queries with aggregate functions?

All of these things are "gotchas" that have no bearing on the validity of DB choice. In the case of my OP, the fucktard probably would have had the same issue regardless of database ;)

Blah blah... postgres vacuum fulls take way, way longer than optimize on similarly-sized MySQL. Also, running optimize on MySQL speeds things up but isn't required. Not running vacuum on postgres will eventually result in the DB ovewriting existing rows at near-random before taking dropping trow, taking a massive shit all down the back of its legs, curling up in a fetal position with its shit-smeared buttocks still exposed to the air, and dying.

That's not even to mention the lack of good replication for postgres (fucking triggers? you fucking kidding me?) and the complete non-existence of any features similar to ndb. Postgres is also slow as fuck on queries, can't optimize queries for shit, spawns a process for every connect which takes for-fucking-ever (by web app standards, at least) and is prone to getting worker processes hung which ends up taking the whole DB down... Also, I'm pretty sure the Postgres devs are ugly.