I work for a company that shall remain anonymous, and I've run into a problem with the PHP/MySQL app I work on.
It's a big tangle of legacy code with a mess of a database behind it. Digging through the database, I've found no less than thirteen tables that have no primary key, and my particular problem involves one of them.
We have a table (let's call it Foo_Table) which is too big to alter. In MySQL just about any table alteration requires rebuilding the entire table, and this would be prohibitively expensive. At some point in the past, someone came up with the idea of extending the table using an attribute table (Foo_Attr). Foo_Attr is structured like this.
Foo_ID (int), Name (str), Value (str)
It would make sense to make the primary key of this table (Foo_ID, Name), but the developer who implemented the table decided to go without it. Now we have duplicate rows galore. I'm writing functionality to insert/update an attribute (let's say, 'Bar') and have the requirement that I only leave one copy of the attribute in the table.
Given that there's no primary key, the most straight forward way of doing this seems to be to delete, then insert a single row. Another developer on the team wants to query for the number of instances of attribute 'Bar', and if only one exists, update it. She makes an argument based on MySQL internals, but I think it's an unnecessarily complex from a coding standpoint.
What are the drawbacks/advantages of the two different approaches?
Both have the chance to be race conditions. The DELETE version has a lower chance of leaving duplicates though.
There is a third method, IMO better than the two already mentioned:
Do an update, then check how many rows have been updated (using mysql_affected_rows); if 0, do an insert. (If >1, you might want to do a little cleanup).
Oviously, this is still prone to race conditions, like any other method.
Yes!!! The infinitely extensible table. It's brilliant, and gets around the problem of being unable to ALTER TABLEs or CREATE TABLEs in the database later on.
Both approaches are awful. I suggest going to the datacenter, grabbing the server, and going Still on it.
Oops. Who woke him up?