SQL - Multiple counts
-
I'm responsible for a report the gets all of the demographic counts of our clients in a year.
I have to group the data in various ways.
There has to be better way than:
/* gender */ update d set total = (select count(ID) from #tempCust c where c.group = d.group and gender = d.label) from #demoCts where category = 'Gender' update d set total = (select count(ID) from #tempCust c where c.subgrouptype1= d.subgrouptype1 and gender = d.label) from #demoCts where category = 'Gender'
Repeat with other demographic data and subgroups.
#tempCust is a table basically , ID, Group, SubgroupType1, SubgroupType2, Gender, Hispanic, Disabled, etc
sample data:
ID,Group, SubgroupType1, SubgroupType2, Gender, Hispanic, Disabled.
1,Green,Apple,Car,Male,Yes,No
2,Yellow,Apple,Train,Female,No,No
3,Red,Kiwi,Car,Male,Yes,Yes
4,Green,Grape,Train,Female,No,Yes#demoCts holds the totals for each demographic point for all, for groups, and for subgroups. (There is a second table for the other subgrouptype)
total,category_id,label,group,subgrouptype1
XX,Gender,Male,Green,Apple
XX,Gender,Male,Green,Kiwi
XX,Gender,Male,Green,Grape
XX,Gender,Male,Green,ALL
XX,Gender,Female,Green,Apple
XX,Gender,Female,Green,Kiwi
XX,Gender,Female,Green,Grape
XX,Gender,Female,Green,ALLEDIT: Clarification, sample data
EDIT2: There are 20 or so categories of demographic data.
-
Have you tried:
update d set total = (select count(ID) from #tempCust c where c.group = d.group and gender = d.Label) from #demoCts where category = 'Gender'
???
I do that sort of thing all the time in Oracle, although it would look more like
update table_d d ...
. That is, I alias the updating table and use that alias inside the query.
-
Might be T-SQL specific.
WITH GroupedCounts AS (SELECT COUNT(*) AS Total, c.Group, c.Gender FROM #tempCust c GROUP BY c.Group, c.Gender) UPDATE d SET d.Total = c.Total FROM #demoCts d JOIN GroupedCounts c ON c.Group = d.Group and c.Subgroup = d.Gender
-
Not sure this is what you need, but something like:
update #demoCts set genderMale = count(*) from genderMale, genderFemale = count(*) from genderFemale (etc) from #demoCts d join #tempCust genderMale on 1=1 and genderMale.gender = 'male' join #tempCust genderFemale on 1=1 and genderFemale.gender = 'female' (etc)
I didn't run it but it should work fine.
As a note, questions like this would be a lot easier to answer if you could give us a few rows of sample data in both the temp tables, because I'm confused about what the original example is doing.
-
@boomzilla said in SQL - Multiple counts:
Have you tried:
update d set total = (select count(ID) from #tempCust c where c.group = d.group and gender = d.Label) from #demoCts where category = 'Gender'
???
I do that sort of thing all the time in Oracle, although it would look more like
update table_d d ...
. That is, I alias the updating table and use that alias inside the query.I'm doing that. Screw up on the attempting to anon. It is the 50 repetitive statements i'm trying to avoid.
-
@karla said in SQL - Multiple counts:
There has to be better way than:
I would consider something like:
insert into #demoCtsResultInterrim values ( select group, gender, SUM(CASE WHEN c.group = d.group and gender = d.gender THEN 1 ELSE 0 END) total FROM #tempCust c GROUP BY group, gender )
And I probably botched everything because I'm not reading comprehensively...
-
@unperverted-vixen said in SQL - Multiple counts:
Might be T-SQL specific.
WITH GroupedCounts AS (SELECT COUNT(*) AS Total, c.Group, c.Gender FROM #tempCust c GROUP BY c.Group, c.Gender) UPDATE d SET d.Total = c.Total FROM #demoCts d JOIN GroupedCounts c ON c.Group = d.Group and c.Subgroup = d.Gender
Subgroups are completely unrelated to demographic categories. I will need counts of all genders for each group and each subgroup.
-
So it's a full combinatoric? You need a count for every possible combination of the source data?
You could put every column of the source data into a string, then count distinct strings. (Pick a string delimiter that won't conflict with the source data.)
-
@karla This almost sounds like a job for
WITH CUBE
orWITH ROLLUP
.
-
-
@masonwheeler That's a cheerful image for
rollup
, I was picturing a cheap rented office being stripped to the walls and some people going for a long swim with carpets.
-
@karla said in SQL - Multiple counts:
@boomzilla said in SQL - Multiple counts:
Have you tried:
update d set total = (select count(ID) from #tempCust c where c.group = d.group and gender = d.Label) from #demoCts where category = 'Gender'
???
I do that sort of thing all the time in Oracle, although it would look more like
update table_d d ...
. That is, I alias the updating table and use that alias inside the query.I'm doing that. Screw up on the attempting to anon. It is the 50 repetitive statements i'm trying to avoid.
You're not doing that in your statement. You're repeating the statements for each gender instead of letting the query do it for you. Here, I'll reformat and maybe it will be clearer:
update d set total = ( select count(ID) from #tempCust c where c.group = d.group and gender = d.Label) from #demoCts where category = 'Gender'
Note this bit in particular:
and gender = d.Label
. Which looking at it again, you must be able to do since you were already doing it with thegroup
columns.
-
@blakeyrat said in SQL - Multiple counts:
So it's a full combinatoric? You need a count for every possible combination of the source data?
Ultimately, I only need overall totals but in order to validate the numbers, from other reports I need to see the groups/subgroups numbers.
You could put every column of the source data into a string, then count distinct strings. (Pick a string delimiter that won't conflict with the source data.)
Trying to wrap my head around this.
-
@karla Well I'm frankly still unclear what exactly the requirements are (sorry maybe I'm dense), but I was thinking something like this:
select d.*, count( d.Group + '|||' + SubgroupType1 + '|||' + SubgroupType2 + '|||' + Gender + '|||' + Hispanic + '|||' + Disabled ) as 'count' group by d.Group + '|||' + SubgroupType1 + '|||' + SubgroupType2 + '|||' + Gender + '|||' + Hispanic + '|||' + Disabled
That gives you a count of each combination of columns you listed in your string.
(Again I haven't run this myself, syntax may be wrong. My T-SQL is rusty as shit. Etc.)
-
@blakeyrat said in SQL - Multiple counts:
@karla Well I'm frankly still unclear what exactly the requirements are (sorry maybe I'm dense), but I was thinking something like this:
select d.*, count( distinct( d.Group + '|||' + SubgroupType1 + '|||' + SubgroupType2 + '|||' + Gender + '|||' + Hispanic + '|||' + Disabled )) group by d.*
That gives you a count of each combination of columns you listed in your string.
(Again I haven't run this myself, syntax may be wrong.)
OK, now I see what you are saying.
I need counts for how many Males, Females, Disabled, Not Disabled, Hispanic, Not Hispanic, etc
(individually).Groups and subgroups to help validate the numbers. As we have no other report that goes across all groups.
-
@boomzilla said in SQL - Multiple counts:
@karla said in SQL - Multiple counts:
@boomzilla said in SQL - Multiple counts:
Have you tried:
update d set total = (select count(ID) from #tempCust c where c.group = d.group and gender = d.Label) from #demoCts where category = 'Gender'
???
I do that sort of thing all the time in Oracle, although it would look more like
update table_d d ...
. That is, I alias the updating table and use that alias inside the query.I'm doing that. Screw up on the attempting to anon. It is the 50 repetitive statements i'm trying to avoid.
You're not doing that in your statement. You're repeating the statements for each gender instead of letting the query do it for you. Here, I'll reformat and maybe it will be clearer:
update d set total = ( select count(ID) from #tempCust c where c.group = d.group and gender = d.Label) from #demoCts where category = 'Gender'
Note this bit in particular:
and gender = d.Label
. Which looking at it again, you must be able to do since you were already doing it with thegroup
columns.I realized that and corrected. But I need like 20 different demographic counts (gender being only one).
-
-
@karla said in SQL - Multiple counts:
@blakeyrat said in SQL - Multiple counts:
@karla Well I'm frankly still unclear what exactly the requirements are (sorry maybe I'm dense), but I was thinking something like this:
select d.*, count( distinct( d.Group + '|||' + SubgroupType1 + '|||' + SubgroupType2 + '|||' + Gender + '|||' + Hispanic + '|||' + Disabled )) group by d.*
That gives you a count of each combination of columns you listed in your string.
(Again I haven't run this myself, syntax may be wrong.)
OK, now I see what you are saying.
I need counts for how many Males, Females, Disabled, Not Disabled, Hispanic, Not Hispanic, etc
(individually).Groups and subgroups to help validate the numbers. As we have no other report that goes across all groups.
Ah. I get it now. I read the code but not so much the words. Well, with this approach you only need one per category, at least. If you have a mapping of categories to columns you might be able to generate the list of queries automatically. Because I think it's the repetition that you're fighting, not having to run multiple queries which you could batch up into a single run.
-
@karla said in SQL - Multiple counts:
I need counts for how many Males, Females, Disabled, Not Disabled, Hispanic, Not Hispanic, etc.
So you need a count of:
Group + Subgroup + Male
Group + Subgroup + Female
Group + Subgroup + Disabled
etc?Or phrased more generally, a count of each:
Group + Subgroup + (Each option in each column other than ID, Group, Subgroup)?
In T-SQL there's a
PIVOT
operator that turns columns into rows. You could probably automatically generate the desired table using that functionality.
-
@boomzilla said in SQL - Multiple counts:
@karla said in SQL - Multiple counts:
@blakeyrat said in SQL - Multiple counts:
@karla Well I'm frankly still unclear what exactly the requirements are (sorry maybe I'm dense), but I was thinking something like this:
select d.*, count( distinct( d.Group + '|||' + SubgroupType1 + '|||' + SubgroupType2 + '|||' + Gender + '|||' + Hispanic + '|||' + Disabled )) group by d.*
That gives you a count of each combination of columns you listed in your string.
(Again I haven't run this myself, syntax may be wrong.)
OK, now I see what you are saying.
I need counts for how many Males, Females, Disabled, Not Disabled, Hispanic, Not Hispanic, etc
(individually).Groups and subgroups to help validate the numbers. As we have no other report that goes across all groups.
Ah. I get it now. I read the code but not so much the words. Well, with this approach you only need one per category, at least.
Plus the WITH ROLLUP
If you have a mapping of categories to columns you might be able to generate the list of queries automatically. Because I think it's the repetition that you're fighting, not having to run multiple queries which you could batch up into a single run.
Yeah, because obviously that is mistake prone and this report is hard to validate and now that our BA is retiring...I need to be very careful.
Generation was what I was thinking before I posted the thread. I was hoping I was missing something else to make it a easier.
-
@blakeyrat said in SQL - Multiple counts:
PIVOT
Yeah, I think this may do it (or at least do much of it).
-
I have to admit I made a scratch table for this and I'm having a lot of trouble getting it to work without using "reflection" (aka. information_schema table queries) to yank out the column names. It's trickier than it seems.
-
Scratch table:
/****** Object: Table [dbo].[scratch] Script Date: 7/18/2018 2:40:10 PM ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[scratch]( [id] [bigint] IDENTITY(1,1) NOT NULL, [group] [nvarchar](50) NOT NULL, [subgroup] [nvarchar](50) NOT NULL, [gender] [nvarchar](50) NOT NULL, [disabled] [nvarchar](50) NOT NULL, CONSTRAINT [PK_scratch] PRIMARY KEY CLUSTERED ( [id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO SET IDENTITY_INSERT [dbo].[scratch] ON GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (1, N'animal', N'human', N'male', N'no') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (2, N'animal', N'human', N'female', N'no') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (3, N'animal', N'dog', N'female', N'yes') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (4, N'mineral', N'graphite', N'male', N'no') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (5, N'mineral', N'diamond', N'male', N'no') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (6, N'animal', N'human', N'male', N'no') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (7, N'animal', N'dog', N'female', N'yes') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (8, N'animal', N'dog', N'female', N'yes') GO SET IDENTITY_INSERT [dbo].[scratch] OFF GO
Query:
select s.[group], s.subgroup, s.colname, s.colval, count(*) as 'total' from scratch s unpivot ( colval for colname in ( gender, [disabled] ) ) s group by s.[group], s.subgroup, s.colname, s.colval
Output of query:
group subgroup colname colval total animal dog disabled yes 3 animal dog gender female 3 animal human disabled no 3 animal human gender female 1 animal human gender male 2 mineral diamond disabled no 1 mineral diamond gender male 1 mineral graphite disabled no 1 mineral graphite gender male 1
Turns out
PIVOT
wasn't the answer, butUNPIVOT
was. You still need to list the names of your columns if the query (although if you wanted to get really fancy you could grab those from a dynamic SQL statement).
-
@blakeyrat said in SQL - Multiple counts:
Scratch table:
/****** Object: Table [dbo].[scratch] Script Date: 7/18/2018 2:40:10 PM ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[scratch]( [id] [bigint] IDENTITY(1,1) NOT NULL, [group] [nvarchar](50) NOT NULL, [subgroup] [nvarchar](50) NOT NULL, [gender] [nvarchar](50) NOT NULL, [disabled] [nvarchar](50) NOT NULL, CONSTRAINT [PK_scratch] PRIMARY KEY CLUSTERED ( [id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO SET IDENTITY_INSERT [dbo].[scratch] ON GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (1, N'animal', N'human', N'male', N'no') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (2, N'animal', N'human', N'female', N'no') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (3, N'animal', N'dog', N'female', N'yes') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (4, N'mineral', N'graphite', N'male', N'no') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (5, N'mineral', N'diamond', N'male', N'no') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (6, N'animal', N'human', N'male', N'no') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (7, N'animal', N'dog', N'female', N'yes') GO INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (8, N'animal', N'dog', N'female', N'yes') GO SET IDENTITY_INSERT [dbo].[scratch] OFF GO
Query:
select s.[group], s.subgroup, s.colname, s.colval, count(*) as 'total' from scratch s unpivot ( colval for colname in ( gender, [disabled] ) ) s group by s.[group], s.subgroup, s.colname, s.colval
Output of query:
group subgroup colname colval total animal dog disabled yes 3 animal dog gender female 3 animal human disabled no 3 animal human gender female 1 animal human gender male 2 mineral diamond disabled no 1 mineral diamond gender male 1 mineral graphite disabled no 1 mineral graphite gender male 1
Turns out
PIVOT
wasn't the answer, butUNPIVOT
was. You still need to list the names of your columns if the query (although if you wanted to get really fancy you could grab those from a dynamic SQL statement).This helps because I was having trouble with it.
I'll probably have a better idea tomorrow after having slept on it.
-
@karla One interesting quirk about this query is it breaks
group by all
functionality in SQL Server. And you can't unpivotby all
apparently.If a particular combination has a total of zero, you just can't get it in the output. AFAICT. Hopefully that's not a problem for you.
-
@blakeyrat said in SQL - Multiple counts:
@karla One interesting quirk about this query is it breaks
group by all
functionality in SQL Server. And you can't unpivotby all
apparently.If a particular combination has a total of zero, you just can't get it in the output. AFAICT. Hopefully that's not a problem for you.
Never used "group by all" and since I'm doing an update rather than insert zeros shouldn't be a problem.
-
If your database software supports
COUNT(*) OVER (PARTITION BY ...)
, you can do this without the subqueries.Here's the PostgreSQL documentation:
I'm pretty sure it's also supported in Oracle and MSSQL.
-
@ben_lubar said in SQL - Multiple counts:
I'm pretty sure it's also supported in ... MSSQL.
Can confirm.
-
@ben_lubar Oracle has window functions as well. It also has unpivot which is probably a better fit here.