SQL - Multiple counts



  • I'm responsible for a report the gets all of the demographic counts of our clients in a year.

    I have to group the data in various ways.

    There has to be better way than:

    /* gender */
    update d set total = (select count(ID) from #tempCust c where  c.group = d.group and gender = d.label)
    from #demoCts 
    where  category = 'Gender' 
    
    update d set total = (select count(ID) from #tempCust c where  c.subgrouptype1= d.subgrouptype1 and gender = d.label)
    from #demoCts 
    where  category = 'Gender' 
    
    

    Repeat with other demographic data and subgroups.

    #tempCust is a table basically , ID, Group, SubgroupType1, SubgroupType2, Gender, Hispanic, Disabled, etc
    sample data:
    ID,Group, SubgroupType1, SubgroupType2, Gender, Hispanic, Disabled.
    1,Green,Apple,Car,Male,Yes,No
    2,Yellow,Apple,Train,Female,No,No
    3,Red,Kiwi,Car,Male,Yes,Yes
    4,Green,Grape,Train,Female,No,Yes

    #demoCts holds the totals for each demographic point for all, for groups, and for subgroups. (There is a second table for the other subgrouptype)
    total,category_id,label,group,subgrouptype1
    XX,Gender,Male,Green,Apple
    XX,Gender,Male,Green,Kiwi
    XX,Gender,Male,Green,Grape
    XX,Gender,Male,Green,ALL
    XX,Gender,Female,Green,Apple
    XX,Gender,Female,Green,Kiwi
    XX,Gender,Female,Green,Grape
    XX,Gender,Female,Green,ALL

    EDIT: Clarification, sample data

    EDIT2: There are 20 or so categories of demographic data.


  • ♿ (Parody)

    Have you tried:

    update d set total = (select count(ID) from #tempCust c where  c.group = d.group and gender = d.Label)
    from #demoCts 
    where  category = 'Gender'
    

    ???

    I do that sort of thing all the time in Oracle, although it would look more like update table_d d .... That is, I alias the updating table and use that alias inside the query.


  • And then the murders began.

    Might be T-SQL specific.

    WITH GroupedCounts AS (SELECT COUNT(*) AS Total, c.Group, c.Gender FROM #tempCust c GROUP BY c.Group, c.Gender)
    UPDATE d
    SET d.Total = c.Total
    FROM #demoCts d
    JOIN GroupedCounts c ON c.Group = d.Group and c.Subgroup = d.Gender
    


  • Not sure this is what you need, but something like:

    
    update #demoCts
    set genderMale = count(*) from genderMale,
        genderFemale = count(*) from genderFemale
        (etc)
    from #demoCts d
    join #tempCust genderMale on 1=1 and genderMale.gender = 'male'
    join #tempCust genderFemale on 1=1 and genderFemale.gender = 'female'
    (etc)
    
    

    I didn't run it but it should work fine.

    As a note, questions like this would be a lot easier to answer if you could give us a few rows of sample data in both the temp tables, because I'm confused about what the original example is doing.



  • @boomzilla said in SQL - Multiple counts:

    Have you tried:

    update d set total = (select count(ID) from #tempCust c where  c.group = d.group and gender = d.Label)
    from #demoCts 
    where  category = 'Gender'
    

    ???

    I do that sort of thing all the time in Oracle, although it would look more like update table_d d .... That is, I alias the updating table and use that alias inside the query.

    I'm doing that. Screw up on the attempting to anon. It is the 50 repetitive statements i'm trying to avoid.


  • Notification Spam Recipient

    @karla said in SQL - Multiple counts:

    There has to be better way than:

    I would consider something like:

    insert into #demoCtsResultInterrim values (
    select group, gender, SUM(CASE WHEN c.group = d.group and gender = d.gender THEN 1 ELSE 0 END) total
    FROM #tempCust c
    GROUP BY group, gender
    )
    

    And I probably botched everything because I'm not reading comprehensively...



  • @unperverted-vixen said in SQL - Multiple counts:

    Might be T-SQL specific.

    WITH GroupedCounts AS (SELECT COUNT(*) AS Total, c.Group, c.Gender FROM #tempCust c GROUP BY c.Group, c.Gender)
    UPDATE d
    SET d.Total = c.Total
    FROM #demoCts d
    JOIN GroupedCounts c ON c.Group = d.Group and c.Subgroup = d.Gender
    

    Subgroups are completely unrelated to demographic categories. I will need counts of all genders for each group and each subgroup.



  • So it's a full combinatoric? You need a count for every possible combination of the source data?

    You could put every column of the source data into a string, then count distinct strings. (Pick a string delimiter that won't conflict with the source data.)



  • @karla This almost sounds like a job for WITH CUBE or WITH ROLLUP.


  • Impossible Mission - B

    @groaner said in SQL - Multiple counts:

    WITH ROLLUP

    ❓


  • Considered Harmful

    @masonwheeler That's a cheerful image for rollup, I was picturing a cheap rented office being stripped to the walls and some people going for a long swim with carpets.


  • ♿ (Parody)

    @karla said in SQL - Multiple counts:

    @boomzilla said in SQL - Multiple counts:

    Have you tried:

    update d set total = (select count(ID) from #tempCust c where  c.group = d.group and gender = d.Label)
    from #demoCts 
    where  category = 'Gender'
    

    ???

    I do that sort of thing all the time in Oracle, although it would look more like update table_d d .... That is, I alias the updating table and use that alias inside the query.

    I'm doing that. Screw up on the attempting to anon. It is the 50 repetitive statements i'm trying to avoid.

    You're not doing that in your statement. You're repeating the statements for each gender instead of letting the query do it for you. Here, I'll reformat and maybe it will be clearer:

    update d set total = (
      select count(ID)
      from #tempCust c 
      where  c.group = d.group 
        and gender = d.Label)
    from #demoCts 
    where  category = 'Gender'
    

    Note this bit in particular: and gender = d.Label. Which looking at it again, you must be able to do since you were already doing it with the group columns.



  • @blakeyrat said in SQL - Multiple counts:

    So it's a full combinatoric? You need a count for every possible combination of the source data?

    Ultimately, I only need overall totals but in order to validate the numbers, from other reports I need to see the groups/subgroups numbers.

    You could put every column of the source data into a string, then count distinct strings. (Pick a string delimiter that won't conflict with the source data.)

    Trying to wrap my head around this.



  • @karla Well I'm frankly still unclear what exactly the requirements are (sorry maybe I'm dense), but I was thinking something like this:

    
    select d.*, count( d.Group + '|||' + SubgroupType1 + '|||' + SubgroupType2 + '|||' + Gender + '|||' + Hispanic + '|||' + Disabled ) as 'count'
    group by d.Group + '|||' + SubgroupType1 + '|||' + SubgroupType2 + '|||' + Gender + '|||' + Hispanic + '|||' + Disabled
    
    

    That gives you a count of each combination of columns you listed in your string.

    (Again I haven't run this myself, syntax may be wrong. My T-SQL is rusty as shit. Etc.)



  • @blakeyrat said in SQL - Multiple counts:

    @karla Well I'm frankly still unclear what exactly the requirements are (sorry maybe I'm dense), but I was thinking something like this:

    
    select d.*, count( distinct( d.Group + '|||' + SubgroupType1 + '|||' + SubgroupType2 + '|||' + Gender + '|||' + Hispanic + '|||' + Disabled ))
    group by d.*
    
    

    That gives you a count of each combination of columns you listed in your string.

    (Again I haven't run this myself, syntax may be wrong.)

    OK, now I see what you are saying.

    I need counts for how many Males, Females, Disabled, Not Disabled, Hispanic, Not Hispanic, etc (individually).

    Groups and subgroups to help validate the numbers. As we have no other report that goes across all groups.



  • @boomzilla said in SQL - Multiple counts:

    @karla said in SQL - Multiple counts:

    @boomzilla said in SQL - Multiple counts:

    Have you tried:

    update d set total = (select count(ID) from #tempCust c where  c.group = d.group and gender = d.Label)
    from #demoCts 
    where  category = 'Gender'
    

    ???

    I do that sort of thing all the time in Oracle, although it would look more like update table_d d .... That is, I alias the updating table and use that alias inside the query.

    I'm doing that. Screw up on the attempting to anon. It is the 50 repetitive statements i'm trying to avoid.

    You're not doing that in your statement. You're repeating the statements for each gender instead of letting the query do it for you. Here, I'll reformat and maybe it will be clearer:

    update d set total = (
      select count(ID)
      from #tempCust c 
      where  c.group = d.group 
        and gender = d.Label)
    from #demoCts 
    where  category = 'Gender'
    

    Note this bit in particular: and gender = d.Label. Which looking at it again, you must be able to do since you were already doing it with the group columns.

    I realized that and corrected. But I need like 20 different demographic counts (gender being only one).



  • @groaner said in SQL - Multiple counts:

    WITH CUBE or WITH ROLLUP

    These will help.


  • ♿ (Parody)

    @karla said in SQL - Multiple counts:

    @blakeyrat said in SQL - Multiple counts:

    @karla Well I'm frankly still unclear what exactly the requirements are (sorry maybe I'm dense), but I was thinking something like this:

    
    select d.*, count( distinct( d.Group + '|||' + SubgroupType1 + '|||' + SubgroupType2 + '|||' + Gender + '|||' + Hispanic + '|||' + Disabled ))
    group by d.*
    
    

    That gives you a count of each combination of columns you listed in your string.

    (Again I haven't run this myself, syntax may be wrong.)

    OK, now I see what you are saying.

    I need counts for how many Males, Females, Disabled, Not Disabled, Hispanic, Not Hispanic, etc (individually).

    Groups and subgroups to help validate the numbers. As we have no other report that goes across all groups.

    Ah. I get it now. I read the code but not so much the words. Well, with this approach you only need one per category, at least. If you have a mapping of categories to columns you might be able to generate the list of queries automatically. Because I think it's the repetition that you're fighting, not having to run multiple queries which you could batch up into a single run.



  • @karla said in SQL - Multiple counts:

    I need counts for how many Males, Females, Disabled, Not Disabled, Hispanic, Not Hispanic, etc.

    So you need a count of:

    Group + Subgroup + Male
    Group + Subgroup + Female
    Group + Subgroup + Disabled
    etc?

    Or phrased more generally, a count of each:

    Group + Subgroup + (Each option in each column other than ID, Group, Subgroup)?


    In T-SQL there's a PIVOT operator that turns columns into rows. You could probably automatically generate the desired table using that functionality.



  • @boomzilla said in SQL - Multiple counts:

    @karla said in SQL - Multiple counts:

    @blakeyrat said in SQL - Multiple counts:

    @karla Well I'm frankly still unclear what exactly the requirements are (sorry maybe I'm dense), but I was thinking something like this:

    
    select d.*, count( distinct( d.Group + '|||' + SubgroupType1 + '|||' + SubgroupType2 + '|||' + Gender + '|||' + Hispanic + '|||' + Disabled ))
    group by d.*
    
    

    That gives you a count of each combination of columns you listed in your string.

    (Again I haven't run this myself, syntax may be wrong.)

    OK, now I see what you are saying.

    I need counts for how many Males, Females, Disabled, Not Disabled, Hispanic, Not Hispanic, etc (individually).

    Groups and subgroups to help validate the numbers. As we have no other report that goes across all groups.

    Ah. I get it now. I read the code but not so much the words. Well, with this approach you only need one per category, at least.

    Plus the WITH ROLLUP

    If you have a mapping of categories to columns you might be able to generate the list of queries automatically. Because I think it's the repetition that you're fighting, not having to run multiple queries which you could batch up into a single run.

    Yeah, because obviously that is mistake prone and this report is hard to validate and now that our BA is retiring...I need to be very careful.

    Generation was what I was thinking before I posted the thread. I was hoping I was missing something else to make it a easier.



  • @blakeyrat said in SQL - Multiple counts:

    PIVOT

    Yeah, I think this may do it (or at least do much of it).



  • I have to admit I made a scratch table for this and I'm having a lot of trouble getting it to work without using "reflection" (aka. information_schema table queries) to yank out the column names. It's trickier than it seems.



  • @blakeyrat

    Scratch table:

    
    /****** Object:  Table [dbo].[scratch]    Script Date: 7/18/2018 2:40:10 PM ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    CREATE TABLE [dbo].[scratch](
    	[id] [bigint] IDENTITY(1,1) NOT NULL,
    	[group] [nvarchar](50) NOT NULL,
    	[subgroup] [nvarchar](50) NOT NULL,
    	[gender] [nvarchar](50) NOT NULL,
    	[disabled] [nvarchar](50) NOT NULL,
     CONSTRAINT [PK_scratch] PRIMARY KEY CLUSTERED 
    (
    	[id] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO
    SET IDENTITY_INSERT [dbo].[scratch] ON 
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (1, N'animal', N'human', N'male', N'no')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (2, N'animal', N'human', N'female', N'no')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (3, N'animal', N'dog', N'female', N'yes')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (4, N'mineral', N'graphite', N'male', N'no')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (5, N'mineral', N'diamond', N'male', N'no')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (6, N'animal', N'human', N'male', N'no')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (7, N'animal', N'dog', N'female', N'yes')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (8, N'animal', N'dog', N'female', N'yes')
    GO
    SET IDENTITY_INSERT [dbo].[scratch] OFF
    GO
    
    

    Query:

    
    
    select s.[group], s.subgroup, s.colname, s.colval, count(*) as 'total'
    from scratch s
    unpivot
    (
        colval
    	for colname in ( gender, [disabled] )
    ) s
    group by s.[group], s.subgroup, s.colname, s.colval
    
    

    Output of query:

    
    group	subgroup	colname	        colval	total
    animal	dog	        disabled	yes	3
    animal	dog	        gender	        female	3
    animal	human	        disabled	no	3
    animal	human	        gender	        female	1
    animal	human	        gender	        male	2
    mineral	diamond	        disabled	no	1
    mineral	diamond	        gender	        male	1
    mineral	graphite	disabled	no	1
    mineral	graphite	gender	        male	1
    
    

    Turns out PIVOT wasn't the answer, but UNPIVOT was. You still need to list the names of your columns if the query (although if you wanted to get really fancy you could grab those from a dynamic SQL statement).



  • @blakeyrat said in SQL - Multiple counts:

    @blakeyrat

    Scratch table:

    
    /****** Object:  Table [dbo].[scratch]    Script Date: 7/18/2018 2:40:10 PM ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    CREATE TABLE [dbo].[scratch](
    	[id] [bigint] IDENTITY(1,1) NOT NULL,
    	[group] [nvarchar](50) NOT NULL,
    	[subgroup] [nvarchar](50) NOT NULL,
    	[gender] [nvarchar](50) NOT NULL,
    	[disabled] [nvarchar](50) NOT NULL,
     CONSTRAINT [PK_scratch] PRIMARY KEY CLUSTERED 
    (
    	[id] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO
    SET IDENTITY_INSERT [dbo].[scratch] ON 
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (1, N'animal', N'human', N'male', N'no')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (2, N'animal', N'human', N'female', N'no')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (3, N'animal', N'dog', N'female', N'yes')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (4, N'mineral', N'graphite', N'male', N'no')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (5, N'mineral', N'diamond', N'male', N'no')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (6, N'animal', N'human', N'male', N'no')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (7, N'animal', N'dog', N'female', N'yes')
    GO
    INSERT [dbo].[scratch] ([id], [group], [subgroup], [gender], [disabled]) VALUES (8, N'animal', N'dog', N'female', N'yes')
    GO
    SET IDENTITY_INSERT [dbo].[scratch] OFF
    GO
    
    

    Query:

    
    
    select s.[group], s.subgroup, s.colname, s.colval, count(*) as 'total'
    from scratch s
    unpivot
    (
        colval
    	for colname in ( gender, [disabled] )
    ) s
    group by s.[group], s.subgroup, s.colname, s.colval
    
    

    Output of query:

    
    group	subgroup	colname	        colval	total
    animal	dog	        disabled	yes	3
    animal	dog	        gender	        female	3
    animal	human	        disabled	no	3
    animal	human	        gender	        female	1
    animal	human	        gender	        male	2
    mineral	diamond	        disabled	no	1
    mineral	diamond	        gender	        male	1
    mineral	graphite	disabled	no	1
    mineral	graphite	gender	        male	1
    
    

    Turns out PIVOT wasn't the answer, but UNPIVOT was. You still need to list the names of your columns if the query (although if you wanted to get really fancy you could grab those from a dynamic SQL statement).

    This helps because I was having trouble with it.

    I'll probably have a better idea tomorrow after having slept on it.



  • @karla One interesting quirk about this query is it breaks group by all functionality in SQL Server. And you can't unpivot by all apparently.

    If a particular combination has a total of zero, you just can't get it in the output. AFAICT. Hopefully that's not a problem for you.



  • @blakeyrat said in SQL - Multiple counts:

    @karla One interesting quirk about this query is it breaks group by all functionality in SQL Server. And you can't unpivot by all apparently.

    If a particular combination has a total of zero, you just can't get it in the output. AFAICT. Hopefully that's not a problem for you.

    Never used "group by all" and since I'm doing an update rather than insert zeros shouldn't be a problem.



  • If your database software supports COUNT(*) OVER (PARTITION BY ...), you can do this without the subqueries.

    Here's the PostgreSQL documentation:

    I'm pretty sure it's also supported in Oracle and MSSQL.


  • Fake News

    @ben_lubar said in SQL - Multiple counts:

    I'm pretty sure it's also supported in ... MSSQL.

    Can confirm.


  • Java Dev

    @ben_lubar Oracle has window functions as well. It also has unpivot which is probably a better fit here.


Log in to reply