SQL: creating a query for recent activity in foreign key



  • I have a table structure with module rows, each of which has a one-to-zero-or-more relationship with completionRecords. Each completionRecord row has a completedOn date column. I would like to display all module rows that have at least one completionRecord, ordered by the recency of the last completionRecord. Instead of displaying all records, I want to summarize and only show the aggregate number of completionRecord rows for each displayed module. How should I go about structuring this query?

    My basic guess is something like

    SELECT title, 
         (SELECT COUNT(c.id) AS row_count WHERE c.assignment_id = m.id), 
         subject
    FROM module m
    LEFT JOIN completion 
         ON c.assignment_id = m.id 
    ORDER BY c.completedOn
    

    but I've probably mangled the syntax.

    Requirements:

    • All rows returned have at least one completionRecord.
    • The total number of completionRecords that match that module should be returned.
    • The returned rows should be ordered by the date of the last completionRecord for each row.

    What's the best way to do this? RDBMS is MySQL (a undefined, to be sure, but what I'm stuck with) if it matters.

    Thanks!



  • @benjamin-hall Direct from brain to keyboard with no testing:

    select m.title as Title,
      count( c.assignment_id ) as AssignmentCount,
      max( c.completedOn ) as LastCompleted
    from module m
      join completion c on c.assignment_id = m.id
    order by max( c.completedOn )
    group by m.title
    

    I'm rusty, also only know T-SQL, let me know if it works.

    Note your first requirement:

    All rows returned have at least one completionRecord.

    Is simply taken care of by doing an inner join.

    EDIT: I forgot to add subject. If you add subject to the select, make sure you also add it to the group by.

    EDIT EDIT: also you want to order by in desc order, if I'm reading your requirements right. So add desc. Assuming by "last" you mean "most recent", and not like... last by some other criteria.



  • @benjamin-hall Does MySQL support ROW_NUMBER and windowed functions in general? If so, I'd use a derived table like this:

    select *
    from modules m
    inner join
    (
    select m.*,
     row_number() over (partition by m.assignment_id order by m.completedon desc) as ordinal,
    count (*) over (partition by m.assignment_id) as completioncount
    from completion m
    ) t on
    (t.assignment_id = m.id)
    where t.ordinal = 1
    

    That lets you get at more data than just the timestamp for the most recent completion.



  • @blakeyrat Thanks. I'll try that. Yes, I mean "last" as "most recent", so desc is probably what I want.

    @Groaner , I'm not too concerned if the comparison is slightly inconsistent. If two completions come in with basically identical timestamps, I don't really care which comes first, just that they come before any that are much further back in time. This is for a reporting interface.



  • @blakeyrat said in SQL: creating a query for recent activity in foreign key:

    select m.title as Title,
      count( c.assignment_id ) as AssignmentCount,
      max( c.completedOn ) as LastCompleted
    from module m
      join completion c on c.assignment_id = m.id
    order by max( c.completedOn )
    group by m.title
    

    Turns out the only real difference was that I needed to rearrange the clauses--MySQL wants the ORDER BY to be last.

    SELECT
        m.title AS Title,
        COUNT(c.assignment_id) AS AssignmentCount,
        MAX(c.completed_on) AS LastCompleted,
        m.subject AS Subject
    FROM
        module m
    JOIN completion
        c
    ON
        c.assignment_id = m.id
    GROUP BY
        m.title,
        Subject
    ORDER BY
    	LastCompleted DESC
    

    Thanks @blakeyrat !



  • @benjamin-hall said in SQL: creating a query for recent activity in foreign key:

    Turns out the only real difference was that I needed to rearrange the clauses--MySQL wants the ORDER BY to be last.

    That actually might be true in SQL Server too, like I said I did it from memory.

    NP.

    Your syntax is a bit sloppy. Half of your group by refers to the column name, and the other half refers to the select alias for it. I'd always refer to column names, as it's quite likely you'll want to change the select aliases sooner or later and the way it's written now you could break the query if you decided the report should say "Subjecterino" instead of "Subject".

    Also SQL with all-caps keywords is cancer. But you've heard that rant before.

    And since I'm ranting and wasting a few minutes before I go home from work, why does "assignment_id" link to a "module" table and not an "assignment" table. Pfft!



  • @blakeyrat thanks for the guidance. I'm new to this whole programming (especially web and database) thing and entirely self-taught, so best-practices advice is always welcome.

    The module vs assignment is due to a nomenclature change half-way through the project. In reality, the actual table name is assignment, but I'm not calling them that anymore...and so forget what I originally set it up as. The formatting is due to the phpMyAdmin's auto-format button where I was trying it out.



  • @benjamin-hall You can rename tables and columns. SQL isn't that scary.



  • @blakeyrat but then I'd have to change my queries...and that requires effort. 🤷♂


  • Impossible Mission - B

    SQL is like chess: it's conceptually very simple, and you can literally pick up about 90% of the rules in a single day. But learning to grasp and effectively apply the implications thereof can take years and years.



  • I'd probably go for

    select m.title, m.subject, c.count, c.last_completed
    from module m
    inner join (
        select count(*) count, max(c.completedOn) last_completed
        from completion
        group by assignment_id
    ) c on m.id=c.assignment_id
    order by last_completed desc;
    

    This removes any concern that records from module may end up getting grouped together.



  • @pleegwat said in SQL: creating a query for recent activity in foreign key:

    This removes any concern that records from module may end up getting grouped together.

    If that's a valid concern, just group by module's primary key instead of module.title. Ludicrous to make a subquery to solve that.

    I swear to God, it's like there's a whole group of people who are like "the problem with SQL is that simple queries are too simple, we have to nest queries inside queries inside queries. Me love complicated!" Just write the simple version! When you need a subquery, and you won't for years and years and years, then you write one.

    You are right that my version would munge together two modules if they had the same title and subject, I blithely assumed those had unique constraints, but maybe they don't. Whatever. It's hard to write queries without the database definition, and when people invariably are super-vague in describing what they're supposed to actually do.


  • Impossible Mission - B

    @blakeyrat said in SQL: creating a query for recent activity in foreign key:

    When you need a subquery, and you won't for years and years and years, then you write one.

    You've never come across a situation that requires the rank() function? Because that's practically impossible to use without a subquery, and I've needed them 3 or 4 times in the last 4 months.



  • @masonwheeler Not since like... 2011? Ish.



  • @blakeyrat said in SQL: creating a query for recent activity in foreign key:

    I swear to God, it's like there's a whole group of people who are like "the problem with SQL is that simple queries are too simple, we have to nest queries inside queries inside queries. Me love complicated!" Just write the simple version! When you need a subquery, and you won't for years and years and years, then you write one.

    What is readable is pretty subjective though. I prefer the subquery over the overgrown group by clause, and unlike mysql, in oracle you can't get away with just a unique key of the non-aggregating table in group by.


  • Impossible Mission Players - A

    @pleegwat
    Also, the sub-query method is much more likely to scale performantly, since you only have a single group by column that doesn't cross tables, meaning that you can have a covering index (ordered by completion.assignment_id, and including completion.completedOn) for the sub-query, removing the need for SQL to sort your data in memory before it can do the aggregation functions (which the multi-table group by would inflict on you).

    Though your syntax won't work, without an AS c after the FROM completion in the sub-query 😛 (and yes, the AS is optional, but for the love of dog, it's so much more readable to just include the AS when you're aliasing table names people, seriously...)



  • @izzion Well, yes, but I know blakey well enough to not counter his readability argument with a performance argument.



  • @blakeyrat said in SQL: creating a query for recent activity in foreign key:

    When you need a subquery, and you won't for years and years and years, then you write one.

    And then write it as a common table expression (i.e., with...). So much clearer and useful!


  • Impossible Mission Players - A

    @pleegwat
    🤷♂ I would rather be subject to a blakeyrant than let someone take poorly written SQL code from a help thread, without understanding in what cases it's going to perform poorly.



  • @izzion said in SQL: creating a query for recent activity in foreign key:

    Though your syntax won't work, without an AS c after the FROM completion in the sub-query (and yes, the AS is optional, but for the love of dob, it's so much more readable to just include the AS when you're aliasing table names people, seriously...)

    I never use AS.


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.