Join? Who needs a join?



  • As WTFs go, this one ranks pretty low I guess, partly because it is an
    easy mistake to make and partly because it comes from a setting where
    WTFs are the rule, not the exception. Said setting is  "extending
    ancient PHP code" (beware of this. depending on the quality of the
    comments, it's frequently easier to rewrite the thing).



    ---



    The thing about relational databases is that the information you want
    is usually stored in several different tables. So how do you get at
    this information?



    Simple. You first query the first table that has all the entries: (note: table/field names changed)



    $sql = "SELECT * FROM table1 WHERE 1 ORDER BY Db";



    Then, you use a simple while loop to go through the resulting rows,
    sending additional queries to the database to get the other table's
    data for each of the rows...



     while ($row = mysql_fetch_object($result)) {

    ...

       
              $sql2 = "SELECT nname,
    vname FROM table2 WHERE t_id=$row->table2_foreign_key";

    ...

    }



    Clear, no?



    Now if only you could get around the tremendous increase in loading
    time that comes from the repeated passage of the bottleneck between the
    web server and the mysql database. Clearly, normalization is overrated
    with its way of dividing your information over so many different
    tables...



  • Its even weaker than your WTF, but today I came across something alike. I didn't say wtf, but I thought it and looked up who wrote it :)

    It was along the lines of

    while select * from mytable
        if mytable.field = x
            all code


    I replaced it with a where clause immediately and went on with the real issue I was going for.
    Your post just reminded me :)



  • @t-bone said:

    Its even weaker than your WTF, but today I came across something alike. I didn't say wtf, but I thought it and looked up who wrote it :)

    It was along the lines of

    while select * from mytable
        if mytable.field = x
            all code


    I replaced it with a where clause immediately and went on with the real issue I was going for.
    Your post just reminded me :)


    those wondering what language that is, http://en.wikipedia.org/wiki/X++

    jup sql directly in your code, and tables react like objects, sweetness (though i left out the while and if bracket delimiters)



  • I feel your pain...

    I inherited a "Classic ASP" project that had reports generated as HTML.  One of the reports looped over a recordset and then opened up more recordsets based on the current row inside the loop very similar to your situation.

    I'm not trying to one-up you, but there were a few differences:

    1) There were a total of NINE recordsets opened at each loop iteration.  The main table had abot 2000 records in it, so do that math...that's 18,000 recordsets opened and closed by one ASP page.  Needless to say, the page had one of those "Please wait for the report to open; this may take several minutes" messages at the top.

    2) All the tables in question had no primary keys or indexes defined.  I wanted to add them, but that was impossible because the application created duplicate records due to coding errors so I couldn't define primary key fields without a massive data cleanup.

    3) The original coder called the recordset variables "rsRec1" through "rsRec9", even though they opened specific tables containing a specific class of data.  That was always fun to debug; I had a post-it on my monitor with a key, like "rsRec3 = Owners, rsRec5 = Parcels" etc.

    4) The recordsets were not explicitly closed inside the loop; they stayed in memory in "EOF" state until they were garbage collected.



  • I learned the hard way why most PHP coders avoid SQL joins. A week ago,
    one of our Web applications stopped working properly. The culprit turns
    out  to be an otherwise innocuous double-join statement (select
    ... from (x join y) join z ...). All of a sudden, it was returning
    garbage. Something had happened with the server, which in not under our
    control. I remembered the MySQL documentation saying that if the client
    library doesn't match the server version exactly,
    strange things will happen. Obviously, the server's admin had upgraded
    part of the software - the wrong part, that is (and of course, they
    wouldn't admit it). And nobody else had noticed the problem because the
    simpler SQL queries were working just fine!



    So, let's count the WTFs:


    1. MySQL having such a compatibility issue.
    2. Server administrators not knowing it, though it's clearly documented.
    3. Most PHP coders not knowing it either because they never go beyond the basics of  SQL anyway?



      Web development. Aarrgh...




  • I just thought I had found a legitimate reason for why one might do the
    join in this fashion - making a comma-separated list of all attached
    entries for an 1:n relationship (say, multiple persons working on a
    project).



    This was because the programmer had used "while" to loop through all
    the attached entries, so if there were several of them, all would be
    displayed.



    Then I noticed that according to the specification, it was a 1:1
    relation. There would only ever be ONE entry; the while loop was
    completely obsolete.



    Ah well, it is hardly a miracle anymore that all the forms have the
    attribute "Methode=GET" (Methode is German for "method"). The only
    thing the author probably wondered about was why the attribute
    "Methode=POST" would have no effect and the form would always be sent
    with get.



    Post forms were disabled on the server, he must have reasoned..



  • Sorry for the double post.



    This is it, I'm tossing the whole thing and rewriting from scratch. I'm
    almost in tears from reading that code. This program manages the
    requests for databases on our server. The part I'm looking at is an
    intermediate form between entering the data and executing the generated
    sql.



    Apparently, someone thought that generating the sql twice - once for
    displaying and once for executing - would be a waste, and instead we
    have this.



    Excerpt:


    <input type='hidden' name=root_pass value=xxxx>
    <input type="hidden" name="sql" value=CREATE+DATABASE+db_name%3BGRANT+ALL+ON+ [...]>

    (and yes, the lack of quotes around the value is real. Why do you think the special characters are masked?)



    Oh, and the mysql connection id is also passed through a hidden field. Because reconnecting to the server would be a waste.



  • I'll admit that this isn't a great way to do this, but if you don't know how joins work, you obviously won't use them.  Many PHP/MySQL tutorials skip joins, and once something works, why change it? (Not my line of thought, but one that is undoubtedly employed here).

    Last year, I wrote a Web site for my school.  It was my first PHP/MySQL project, so I just hacked things together however they happened to go.  It was much later that I realized how bad my code was (looked like the stuff here), and I went back and re-wrote the whole thing. ;)

    Another question: why put a WHERE clause in there at all, if you're going to select all WHERE 1 (always true)?  MySQL automatically picks everything if you skip the WHERE clause, no?  Waste of 7 bytes, and probably a little processing power...



  • "where 1" is a bit like "PleaseCompile", I guess. Once something didn't
    work as expected, and they added "where 1" and perhaps fixed some typo
    along the way, and now it is standard procedure to add this voodoo
    charm.



    Or perhaps it is supposed to make the query more understandable. But that's what comments are for.



  • @Arancaytar said:

    "where 1" is a bit like "PleaseCompile", I guess. Once something didn't
    work as expected, and they added "where 1" and perhaps fixed some typo
    along the way, and now it is standard procedure to add this voodoo
    charm.



    Or perhaps it is supposed to make the query more understandable. But that's what comments are for.


    Another possibility: They build a query string based on search parameters.It's a bit easier if you always start with a where clause, so you can always add " and ...".
    Pseudocode:
    query = "select * from foobar where 1"
    if searchblaa<>"" then query += " and blaa='"+searchblaa+"'"
    if searchbluu<>"" then query += " and bluu='"+searchbluu+"'"
    if searchblii<>"" then query += " and blii='"+searchblii+"'"

    query += " order by barfoo"




  • @warmachine said:

    Due to a sucky table design I inherited, the screen to display the list of outstanding transactions links in the parent account and 3, other tables. The performance of the JOIN for 20 records is not lightning fast.


    Ummm.  You must have an exceptionally sucky design, because any sensible database should shrug a situation like that off like water off a duck's back.  As an example, on one project I worked on, getting a list of transactions for a customer for a particular query involved joining upwards of 10 table, many of which were left joins, and two of which were on computed columns.  The source data consisted of millions of rows on the parent table.  But webserver and net performance outscaled the database lag - clients never noticed.  Yes, it involved some careful query optimising, but performance was more than acceptable, which is what counts.

    So I'd have to suggest that what you've probably done is to replace something relatively easily fixed by a DBA with something difficult to maintain.  But I can't be sure without seeing an example.

    Simon


Log in to reply