SQL aggrivating two seperate table results in one query

stratos

I'm quite sure i'm asking for a WTF, but at the moment i just can't rewrite the surrounding code to allow for multiple queries.

The problem is like this.
I have a basic user table with some user data, and two tables that i need to count.

let's call them branches and regions.
Now the query kinda looks as follows:

SELECT
	user.*,
	COUNT(`koppeling`.`region_id`) as total_regions,
	SUM(IF(`branches_user_koppel`.`branche_id` IN ('7'),true,false)) as `branches_match`
FROM 
	`user`
	LEFT JOIN `region_user_koppel`   ON `user`.`id`=`region_user_koppel`.`user_id`
	LEFT JOIN `branches_user_koppel` ON `user`.`id`=`branches_user_koppel`.`user_id`
GROUP BY user.id

the table layout could be thought of as:

user
id	name
1	tester

region_user_koppel
user_id	region_id
1	1
1	2

branches_user_koppel
user_id	branche_id
1	1
1	7

the result set would look like this without the group by and all the fields

user.id	user.name	region.user_id	region.region_id	branches.user_id	branches.branche_id
1	tester	1	1	1	1
1	tester	1	2	1	1
1	tester	1	1	1	7
1	tester	1	2	1	7

So the aggrivation will tell me that there are 4 regions instead of 2.

Now i'm hoping it's possible to get a result set that looks like the following.

user.id	user.name	region.user_id	region.region_id	branches.user_id	branches.branche_id
1	tester	1	1	NULL	NULL
1	tester	1	2	NULL	NULL
1	tester	NULL	NULL	1	7
1	tester	NULL	NULL	1	7

Please note that this is a simplified version of what i'm actually doing,

CodeWhisperer

I think the word you are looking for is 'aggregating'. :)

But...why do you want to do that? I mean, it seems like you are asking for a single result set that has two distinct queries embedded in it that you want to process separately (you're even going to need 'if region_id is NULL' logic in the code that handles the result set)...so why not do two queries that you process separately?

-cw

rmr

As far as I know you can't do what you're asking without using subselects. You want to aggregate your data based on two different sets of criteria, which you can't do in a single select.

Personally, I would just write 2 queries. However you could write something like:

SELECT user.*,

(SELECT COUNT(*) FROM Regions R WHERE R.userID = U.userID) as totalRegions,

(SELECT COUNT(*) FROM Branches B WHERE B.userID = U.userID) as branchesMatch

FROM user U

WHERE userID = @userID

CodeWhisperer

Ohh, I missed where he said he wanted to count them. There you go :)

ammoQ

@rmr said:

As far as I know you can't do what you're asking without using subselects. You want to aggregate your data based on two different sets of criteria, which you can't do in a single select.
Personally, I would just write 2 queries. However you could write something like:
SELECT user.,
(SELECT COUNT() FROM Regions R WHERE R.userID = U.userID) as totalRegions,
(SELECT COUNT(*) FROM Branches B WHERE B.userID = U.userID) as branchesMatch
FROM user U
WHERE userID = @userID

Not recommended, but possible (just to show that subqueries are not inevitably necessary):

select u.id, u.name, count(distinct r.region_id), count(distinct b.branche_id)
from user u, region_user_koppel r, branche_user_koppel b
where u.id = r.userId
and u.id = b.userId
group by u.id, u.name;

That said, after you have confirmed it does what I promised, consider how it does it, then use subqueries ;-)

rmr

Ah, I see, very nice. I guess you can't select * from the user table though, you have to group by each column that you're selecting in the table right?

ammoQ

@rmr said:

Ah, I see, very nice. I guess you can't select * from the user table though, you have to group by each column that you're selecting in the table right?

Either that, or you use some aggregate function, e.g. max, on all columns but the key column(s).

E.g. if "id" is the primary key column,

select u.id, u.name from user u, ...
group by u.id, u.name;

gives the same result like

select u.id, max(u.name) from user u, ...
group by u.id;

stratos

Yeah, i ended up solving it with a subquery.

SELECT
	user.*,
	COUNT(`region`.`user_id`) as total_matches
FROM 
	(
	SELECT 
		*,
		SUM(IF(`branches_user_koppel`.`branche_id` IN (1,2,3),true,false)) as `branches_match`
	FROM
		`user`
		LEFT JOIN `branches_user_koppel` ON `user`.`id`=`branches_user_koppel`.`user_id` 
	GROUP BY 
		`user`.`id`
	) as `user`
	LEFT JOIN region  ON region.user_id = user.id
GROUP BY `user`.`id`

it's quite a bit anonomized though, the real query has quite some more stuff in it, and is something like 90 lines or so :(

I also planned some time in, to re-write the entire code piece.

Because it started out as a little function to do a non-trival query, however it feature creeped more then your average office application.
The surrounding code however did not let me do it in smaller steps, nor did i have time to rewrite the entire function.

I also was a bit mentally tired at the time. (and the query was quite aggravating :P )