Some SQL advice, maybe?



  • So, my sql skills are sub-par.  I'm learning.  Scripting in
    ColdFusion MX 6.1 with MSSQL 2000.  I'm hoping some of you more
    experienced developers can point me in the right direction. 
    Here's a query I've been working with.  The intention is to select
    users that belong to a specific forum but not to a specific group
    within that forum.  Is there a better/faster/more efficient way to
    do this?

    SELECT a.iuserid AS iuserid, b.vchfirstname AS first, b.vchlastname AS last, b.vchnickname as nickname  
    FROM ftdb_forumusers AS a
    LEFT JOIN ftdb_users AS b ON (a.iuserid = b.iuserid)
    WHERE a.iforumid = #form.forum# AND
    NOT (a.iuserid IN (SELECT iuserid FROM ftdb_groupusers WHERE igroupid = #form.group#))
    ORDER BY b.vchlastname, b.vchfirstname







  • Sorry for the double post, but the post came out unintentionally screwey.  Is there not an edit button?



  • Hey Jeremy, I think the query engine will optimize this pretty much any way you go. But, two comments for ya:

    1) in T-SQL you can do "AND a.iuserid NOT IN (SELECT ...)" which makes it easier to read
    2) you could do an additional join instead of the IN, but I'd bet that wouldn't make a lick of difference ... and I'm not sure it's better ... go with whatever makes the most sense to you ... it's just another way to do it.
    SELECT
      a.iuserid AS iuserid,
      b.vchfirstname AS first,
      b.vchlastname AS last,
      b.vchnickname as nickname 
    FROM ftdb_forumusers AS a
      LEFT JOIN ftdb_users AS b ON (a.iuserid = b.iuserid)
      LEFT JOIN ftdb_groupusers g ON (a.iuserid <> g.iuserid)
    WHERE a.iforumid = #form.forum#
      AND g.igroupid = #form.group#
    ORDER BY b.vchlastname, b.vchfirstname

    When you come down to it, there's three index scans (a.iuserid, b.iuserid, and igroupid), a merge of the indexed results, and a bookmark lookup to ftdb_users. Pretty simple query.

    When in doubt, run the query in Query Analyzer to see the execute plan. It's an option (show explain plan). Just know full table scans are bad (unless you actually need to use every single row).



  • @Alex Papadimoulis said:

    SELECT
      a.iuserid AS iuserid,
      b.vchfirstname AS first,
      b.vchlastname AS last,
      b.vchnickname as nickname 
    FROM ftdb_forumusers AS a
      LEFT JOIN ftdb_users AS b ON (a.iuserid = b.iuserid)
      LEFT JOIN ftdb_groupusers g ON (a.iuserid <> g.iuserid)
    WHERE a.iforumid = #form.forum#
      AND g.igroupid = #form.group#
    ORDER BY b.vchlastname, b.vchfirstname

    Uhh...

    This alternate query may not produce correct results if a group can span multiple forums.  Consider the case where a single user exists for the given forum, and there are 10000 users in the given group, one of which is this user.

    As I read it, the query should return 0 rows, but instead it will return 9999. Or am I going crazy?



  • @Sean Reilly said:

    This alternate query may not produce correct results if a group can span multiple forums.  Consider the case where a single user exists for the given forum, and there are 10000 users in the given group, one of which is this user.

    As I read it, the query should return 0 rows, but instead it will return 9999. Or am I going crazy?

    #forum.group# is a constant expression, I'm assuming sent from the middle tier. I won't guarntee that the alternate query will actually work .. .actually I wouldn't even bet money on it. But, you get the idea; you can usually avoid an IN subquery with another join .... but in SQL2k, it doesn't really make a difference how ya do it ... now Oracle ... don't even get me started!



  • Please post DDL for these tables.

    Mr. Angry DBA



  • I think that is close, Alex, but beware of using inequalities in a join.   Also, you said the group belongs to a forum, so I will assume there needs to be a relationship expressed between those two as well.  As the mean DBA says, DDL of the tables involved (just relevant columns to the problem)  and some sample data would help a lot.  

    (I also want to plug http://www.sqlteam.com as the best place to get help with MS SQL.)

    Try:

    SELECT
      a.iuserid AS iuserid,
      b.vchfirstname AS first,
      b.vchlastname AS last,
      b.vchnickname as nickname 
    FROM ftdb_forumusers a 
      INNER JOIN ftdb_users b ON (a.iuserid = b.iuserid)
      LEFT JOIN ftdb_groupusers g ON a.iuserid = g.iuserid AND g.iGroupID = #Form.Group# and g.iForumID = a.iForumdID
    WHERE a.iforumid = #form.forum# and g.IUserID IS NULL
    ORDER BY b.vchlastname, b.vchfirstname

    The keys are:

    1) do an INNER JOIN between uers and forumusers, since the whole point is to return users in that particular forum.  (unless you don't have good RI and not all of your users are in the Users table)

    2) the second join is an OUTER JOIN just to that one specific group within that one specific forum for that 1 specific user.  If it returns a value (not null), the user is in that group.  If it returns NULL, then the user is NOT in that group.   Note the inclusion of the criteria for that one group in the join itself, not in the WHERE clause.

    by the way -- don't prefix your column names with the datatypes .... that's SO 1997 ....



  • Thanks for the input guys.  The tables were put together by a
    forum package, so I'm not terribly sure how to dig up the ddl; forgive
    my ignorance.



  • I think I can guess which part of the query is taking up the time, so I thought I'd take a crack at it:

    <FONT face="Courier New" size=2></FONT> 

    <FONT face="Courier New" size=2>[code language="sql"]</FONT>

    <FONT face="Courier New" size=2>SELECT a.iuserid AS iuserid, b.vchfirstname AS first, b.vchlastname AS last, b.vchnickname as nickname 
    FROM ftdb_forumusers AS a
    LEFT JOIN ftdb_users AS b ON (a.iuserid = b.iuserid)
    WHERE a.iforumid = #form.forum#
    AND NOT EXISTS (SELECT 1 FROM ftdb_groupusers WHERE iuserid=a.iuserid AND igroupid = #form.group#)
    ORDER BY b.vchlastname, b.vchfirstname
    [/code]

    </FONT>


  • How the heck do you do quoting with this editor?



    <font face="Courier New" size="2"><font face="Courier New" size="2">

    [code language="c#"]... AND NOT EXISTS (SELECT 1 FROM ...[/code]
    </font></font>



    The most commonly used idiom for the EXISTS clause is ... EXISTS
    (SELECT * FROM ...). The database isn't going to retrieve ANY rows for
    that subquery when it sees an EXISTS clause, so SELECT * is as good as
    anything else.



  • <FONT size=2>The most commonly used idiom for the EXISTS clause is ... EXISTS (SELECT * FROM ...). The database isn't going to retrieve ANY rows for that subquery when it sees an EXISTS clause, so SELECT * is as good as anything else.</FONT>

    Good point.  Made me want to dig up my Joe Celko book, but apparently an officemate has made off with it.  Curse you, officemate!  Curse youuu!!!



  • It depends on how MSSQL interprets this statement but NOT IN can be slow.  Look into not exists for another option.

    NOT EXISTS (SELECT iuserid FROM ftdb_groupusers WHERE igroupid = #form.group# AND iuserid = a.iforumid)

    Anothe approach (one that is less readable IMO) is to do an outer join on the 'not' table and add a and field is null on a column from the table you don't want.

    What is better depends on the DB and the relative size of the tables and whether fields are indexed.  You should consult your DBA if you really need help.

    Also, be careful building SQL as you have below.  I'm not sure about the syntax, but if somone can get form.fourm to evaluate to something like:

    1 AND 1 = 1

    They can change the structure of your SQL and get it to return things you did not want.  Generally it's best to use prepared statements.


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.