SQL aggrivating two seperate table results in one query



  • I'm quite sure i'm asking for a WTF, but at the moment i just can't rewrite the surrounding code to allow for multiple queries.



    The problem is like this.

    I have a basic user table with some user data, and two tables that i need to count.



    let's call them branches and regions.

    Now the query kinda looks as follows:

    SELECT
    user.*,
    COUNT(`koppeling`.`region_id`) as total_regions,
    SUM(IF(`branches_user_koppel`.`branche_id` IN ('7'),true,false)) as `branches_match`
    FROM
    `user`
    LEFT JOIN `region_user_koppel` ON `user`.`id`=`region_user_koppel`.`user_id`
    LEFT JOIN `branches_user_koppel` ON `user`.`id`=`branches_user_koppel`.`user_id`
    GROUP BY user.id

    the table layout could be thought of as:

    user
    idname
    1tester
    region_user_koppel
    user_idregion_id
    11
    12
    branches_user_koppel
    user_idbranche_id
    11
    17

    the result set would look like this without the group by and all the fields

    user.id user.name region.user_id region.region_id branches.user_id branches.branche_id
    1 tester 1 1 1 1
    1 tester 1 2 1 1
    1 tester 1 1 1 7
    1 tester 1 2 1 7

    So the aggrivation will tell me that there are 4 regions instead of 2.

    Now i'm hoping it's possible to get a result set that looks like the following.

    user.id user.name region.user_id region.region_id branches.user_id branches.branche_id
    1 tester 1 1 NULL NULL
    1 tester 1 2 NULL NULL
    1 tester NULL NULL 1 7
    1 tester NULL NULL 1 7

    Please note that this is a simplified version of what i'm actually doing,



  • I think the word you are looking for is 'aggregating'. :)

    But...why do you want to do that?  I mean, it seems like you are asking for a single result set that has two distinct queries embedded in it that you want to process separately (you're even going to need 'if region_id is NULL' logic in the code that handles the result set)...so why not do two queries that you process separately?

    -cw



  • As far as I know you can't do what you're asking without using subselects.  You want to aggregate your data based on two different sets of criteria, which you can't do in a single select.

    Personally, I would just write 2 queries.  However you could write something like:

    SELECT user.,

        (SELECT COUNT() FROM Regions R WHERE R.userID = U.userID) as totalRegions,

        (SELECT COUNT(*) FROM Branches B WHERE B.userID = U.userID) as branchesMatch

    FROM user U

    WHERE userID = @userID 


     



  • Ohh, I missed where he said he wanted to count them.  There you go :)



  • @rmr said:

    As far as I know you can't do what you're asking without using subselects.  You want to aggregate your data based on two different sets of criteria, which you can't do in a single select.

    Personally, I would just write 2 queries.  However you could write something like:

    SELECT user.,

        (SELECT COUNT() FROM Regions R WHERE R.userID = U.userID) as totalRegions,

        (SELECT COUNT(*) FROM Branches B WHERE B.userID = U.userID) as branchesMatch

    FROM user U

    WHERE userID = @userID 

    Not recommended, but possible (just to show that subqueries are not inevitably necessary):

    select u.id, u.name, count(distinct r.region_id), count(distinct b.branche_id)
      from user u, region_user_koppel r, branche_user_koppel b
      where u.id = r.userId
         and u.id = b.userId
      group by u.id, u.name;

    That said, after you have confirmed it does what I promised, consider how it does it, then use subqueries ;-)
     

     



  • Ah, I see, very nice.  I guess you can't select * from the user table though, you have to group by each column that you're selecting in the table right?



  • @rmr said:

    Ah, I see, very nice.  I guess you can't select * from the user table though, you have to group by each column that you're selecting in the table right?

    Either that, or you use some aggregate function, e.g. max, on all columns but the key column(s).

    E.g. if "id" is the primary key column,

    select u.id, u.name from user u, ...
      group by u.id, u.name;

    gives the same result like

     

    select u.id, max(u.name) from user u, ...

      group by u.id;



     



  • Yeah, i ended up solving it with a subquery.

    SELECT
    	user.*,
    	COUNT(`region`.`user_id`) as total_matches
    FROM 
    	(
    	SELECT 
    		*,
    		SUM(IF(`branches_user_koppel`.`branche_id` IN (1,2,3),true,false)) as `branches_match`
    	FROM
    		`user`
    		LEFT JOIN `branches_user_koppel` ON `user`.`id`=`branches_user_koppel`.`user_id` 
    	GROUP BY 
    		`user`.`id`
    	) as `user`
    	LEFT JOIN region  ON region.user_id = user.id
    GROUP BY `user`.`id`
    
    

    it's quite a bit anonomized though, the real query has quite some more stuff in it, and is something like 90 lines or so :(

    I also planned some time in, to re-write the entire code piece.

    Because it started out as a little function to do a non-trival query, however it feature creeped more then your average office application.
    The surrounding code however did not let me do it in smaller steps, nor did i have time to rewrite the entire function. 

    I also was a bit mentally tired at the time. (and the query was quite aggravating :P )

     


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.