Select from two tables vs. Join



  • There's something I've come across a few times in the codebase that I'm working on now, and I'm really not certain if it's a bad thing or not.  Putting aside the fact that it's inline SQL rather than a stored proc, I just don't know enough about the MySQL engine to make this call.

     I see all the time this:

     SELECT * from First_Table ft, Second_Table st
    WHERE ft.id = st.id...

    Whereas I would do:

    SELECT * From First_Table ft INNER JOIN Second_Table st on ft.id = st.id

    Is their syntax just another way of doing a join, or is it doing something less than optimal?



  • It's called an implicit join.



  •  Thanks.  I've got some reading to do. =)



  • The first example is "old school" syntax that predates ANSI SQL. They will compile to the same thing.

    Left and Right joins were accomplished using a similar language construct, depending on the database server...

    INNER JOIN:  =
    LEFT JOIN:  *= or +=
    RIGHT JOIN:  =* or =+
    

    -RJ-



  • Yes, it's a bad thing.  This join syntax was old ten years ago.  The biggest problem with it is that is doesn't fail gracefully.  With an ANSI join, if you forget the join criteria, you get an error and you fix it.  With the non-ANSI syntax, you get a cartesian product.  If your lucky, you kill the query because it is taking so long and fix it.  If you are unlucky, your sample data one has one row in one of the tables and you release it to production.  Once in production, it proceeds to return inaccurate data and if you are really unlucky, renders the server unusable due to too many trillion-row result sets being returned.



  • @Jaime said:

    Yes, it's a bad thing.  This join syntax was old ten years ago.  The biggest problem with it is that is doesn't fail gracefully.  With an ANSI join, if you forget the join criteria, you get an error and you fix it.  With the non-ANSI syntax, you get a cartesian product.  If your lucky, you kill the query because it is taking so long and fix it.  If you are unlucky, your sample data one has one row in one of the tables and you release it to production.  Once in production, it proceeds to return inaccurate data and if you are really unlucky, renders the server unusable due to too many trillion-row result sets being returned.
     

    Been there, done that, haven't you?


Log in to reply