Join sandwich



  • So, I have a table of objects (widgets, what-have-you) that has a foreign key relationship to a table of 1-D ranges. These 1-D ranges, in turn, do not define their endpoints directly, but by a foreign key relationship to a table of endpoint-objects.

    As SQL:
    [code]
    create table widgets (
    id varchar primary key,
    endpoints varchar foreign key ranges (id),
    widget_data varchar);
    create table ranges (
    id varchar primary key,
    left varchar foreign key endpoints (id),
    right varchar foreign key endpoints (id));
    create table endpoints (
    id varchar primary key,
    endpoint_data varchar);
    [/code]

    What I need is each widget's left and right endpoints along with the widget ID and widget data, with one row per widget. Unfortunately, I don't even know what to call this sort of join, and writing it in the obvious-to-me way, namely SELECT w.id w.widget_data, l.endpoint_data, r.endpoint_data FROM widgets w, ranges, endpoints l, endpoints r WHERE w.endpoints = ranges.id AND (ranges.left = l.id OR ranges.right = r.id); just causes the DBMS to bog down (I suspect it returns an excess of rows, which is why the database coughs and sputters).



  • Almost. You need AND between the two ranges joins. I find this notation is easier to read for complicated joins:

    SELECT w.id, w.widget_data, l.endpoint_data, r.endpoint_data
    FROM widgets w
    INNER JOIN ranges ra ON w.endpoints = ra.id
    INNER JOIN endpoints l ON ra.left = l.id
    INNER JOIN endpoints r ON ra.right = r.id



  • Lo and behold:

    1. I was dead on right with the suspicion that my initial query returned way too many rows (Cartesian products are a fantastic way to get 3 billion rows out of 30k-100k row tables).
    2. Changing the OR to an AND worked.


  • This is why you should always use ANSI style joins (as @PleegWat says) . And check explain plans, because shit like Cartesian joins tend to jump out at you for nontrivial data.



  • @boomzilla said:

    This is why you should always use ANSI style joins (as @PleegWat says) . And check explain plans, because shit like Cartesian joins tend to jump out at you for nontrivial data.

    Yeah, you don't even need an explain plan to tell you that you're doing an unwanted Cartesian join. A count(*) version of the query usually suffices.

    Agreed that I probably should be using ANSI join syntax (I have in the past, it seems like I wobble between it and old-style FROM joins rather aimlessly though).



  • @tarunik said:

    Yeah, you don't even need an explain plan to tell you that you're doing an unwanted Cartesian join. A count() version of the query usually suffices.

    The explain plan will tell you before you even need to run it for the count(
    ) though.

    I tend to use ANSI except we use Oracle 10g and there's a bug in our specific version where if you use ANSI and the sum of the amount of columns in the joined tables (whether the columns are actually selected or not) is higher than some number then it throws an error, and you need to do it the other way.



  • @loopback0 said:

    except we use Oracle 10g

    The I-hate Oracle club is crying gathering over there :twisted_rightwards_arrows:


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.