SQL Delete



  • I recently discovered a bug in a web application in which some data was not deleted in a delete procedure.

    Let me fabricate an equivalent example: Consider a database which has two tables, recipies and ingredient_lists. Each entry in recipies has one or more associated entries in ingredient_lists which are related via Foreign Key.

    Now the buggy delete procedure was deleting entries from recipies, but not from ingredient_lists, so now there are 'orphaned' entries in the ingredient_lists table which will need to be removed.

     
    I tried the following query (working with mySQL 4.1.20, both tables on the MyISAM engine):
    DELETE FROM ingredient_lists il WHERE NOT EXISTS (SELECT * FROM recipies rp WHERE rp.id = il.rid)

    mySQL says it's a syntax error, but I can't see the problem. Am I not allowed to do this complicated where statement for my delete query?

    I'd appreciate any suggestions on the best way to clear out this data. The database in question is quite large with some 6,000 orphaned rows in one table responsible for 2.6GB of that database. 



  • The query looks correct and it does seem that 4.1 allows the EXISTS keyword.  My only thought is perhaps the front end is parsing the query and not the engine itself, so the front end might be using an older dictionary of valid sql.  I seem to remember having that issue once.

    Personally, I don't like the EXISTS key word except when using build scripts and conditionally checking for objects so I don't attempt to recreate them.  For data checks like this, I use the IN keyword as it seems more clear to what it actually does. That's just the way my mind works though.

    In this case I'd issue the following

    DELETE FROM ingrediate_lists WHERE rid NOT IN (SELECT id FROM recipes)

    Going forward, I'd consider switching to InnoDB and setting up a foreign key between the tables and setting the ON DELETE behavior to CASCADE.  That way when a recipe is deleted all referenced ingredients are deleted as well.

     



  • @lpope187 said:

    The query looks correct and it does seem that 4.1 allows the EXISTS keyword.  My only thought is perhaps the front end is parsing the query and not the engine itself, so the front end might be using an older dictionary of valid sql.  I seem to remember having that issue once.

    Personally, I don't like the EXISTS key word except when using build scripts and conditionally checking for objects so I don't attempt to recreate them.  For data checks like this, I use the IN keyword as it seems more clear to what it actually does. That's just the way my mind works though.

    In this case I'd issue the following

    DELETE FROM ingrediate_lists WHERE rid NOT IN (SELECT id FROM recipes)

    Going forward, I'd consider switching to InnoDB and setting up a foreign key between the tables and setting the ON DELETE behavior to CASCADE.  That way when a recipe is deleted all referenced ingredients are deleted as well.

    In some cases (and this could be one of them), EXISTS is significantely faster than IN. You should chose between them depending on the situation, not because you like one more than the other.
     



  • Using the IN keyword worked. Perhaps you were correct about the frontend being the problem.

    Thanks for the assistance. 



  • @Cattlyst said:

    I recently discovered a bug in a web application in which some data was not deleted in a delete procedure.

    Let me fabricate an equivalent example: Consider a database which has two tables, recipies and ingredient_lists. Each entry in recipies has one or more associated entries in ingredient_lists which are related via Foreign Key.

    Now the buggy delete procedure was deleting entries from recipies, but not from ingredient_lists, so now there are 'orphaned' entries in the ingredient_lists table which will need to be removed.


    I tried the following query (working with mySQL 4.1.20, both tables on the MyISAM engine):
    DELETE FROM ingredient_lists il WHERE NOT EXISTS (SELECT * FROM recipies rp WHERE rp.id = il.rid)

    mySQL says it's a syntax error, but I can't see the problem. Am I not allowed to do this complicated where statement for my delete query?

    I'd appreciate any suggestions on the best way to clear out this data. The database in question is quite large with some 6,000 orphaned rows in one table responsible for 2.6GB of that database. 

    Just for a followup...

    I'm a SQL 2005 guy, and not a mySQL guy, so please let me know if this is a platform difference; but...  Shouldn't the foreign key constraint prohibit removing records from the parent table while the child table still has references to it?  You may want to make sure the key is set up correctly to prevent having orphaned data in the future.

    On a side note, another way to delete these records would be:  (in sql2k5)

    DELETE     il
    FROM        ingredient_lists il
    LEFT JOIN  recipies rp
         ON        il.rid = rp.id
    WHERE     rp.id IS NULL

     



  • myISAM doesn't enforce foreign key constraints, or at least, it didn't.  I haven't used it recently.  I tend to stick with InnoDB, or better yet, Postgres.



  • I can't check right now, but maybe the problem was the alias you gave to the table you tried to delete from? So it should probably be more like

    DELETE FROM ingredient_lists WHERE NOT EXISTS (SELECT * FROM recipies rp WHERE rp.id = ingredient_lists.rid)

    I'm pretty sure you can't alias UPDATEd tables, but I'm not sure about DELETE.


Log in to reply