Did Discourse just fall over?



  • So I was just reading the Likes thread, and scrolling stopped happening, I got this:

    and then nginx 502 errors when I tried to reload.

    Did I kill it by reading the Likes thread, or was that an unhappy coincidence?


  • Banned

    Yeah its me ... sorry I am doing some pg tuning at the moment, hold tight, I may need another minor outage soon.



  • Thanks, so it wasn't me. That's good then.
    it'd be really sad if a single user on a mobile device could kill forum software just by reading.



  • Thanks for digging into it!


  • Banned

    OK my minor outages are out, perf should be slightly better across the board even @boomzilla has a loading user page ... albeit slightly slow, but way better than it was.



  • Nice...about twice as fast as it was.


  • SockDev

    @lightsoff said:

    Did I kill it by reading the Likes thread, or was that an unhappy coincidence?

    probably not (given @sam's reply) but if you want to see if thes sort of things are just you or everyone there's an appwebsite for that:

    https://servercooties.com/


  • Banned

    @PJH I am just about to upgrade for

    which is very urgent here.

    Been noticing a flood of very expensive queries each time people refresh the front page. turns out we are issuing are "reload" for all the letter avatars and they were not bypassing some very expensive queries.

    For next release we are going to clean up our pattern of preloading so errors like this can not happen, but for current release this will have to do.


  • Banned

    @apapadimoulis we got to get a CDN here, it would have avoided a big class of issues we have seen here (which I have fixed)

    https://www.maxcdn.com/pricing/ would be fine and really would only add up to less than 10 bucks a month at current traffic.


  • Banned

    Sorry all, outage was slightly longer than I expected, you were due for your 3 monthly db vacuum.

    All is back and faster now.

    @boomzilla enjoy your g b shortcut.


  • Banned

    @pjh this is running still a bit too frequently for my liking

    INSERT INTO user_badges(badge_id, user_id, granted_at, granted_by_id, post_id)
    	            SELECT 169, q.user_id, q.granted_at, -1, q.post_id
    	            FROM ( WITH replies AS (
    		SELECT user_id, id, topic_id, post_number, reply_to_post_number
    		FROM posts
    		WHERE raw LIKE '%6ab865b7-1b48-403b-b126-25fe3e0a8fa6%' AND
    		reply_to_post_number IS NOT NULL AND
    		user_id IN  (
    			SELECT gu.user_id
    			FROM group_users gu
    			WHERE group_id IN(
    				SELECT g.id
    				FROM groups g
    				WHERE g.name IN ('staff')
    			)
    		)
    	)
    	SELECT p.user_id, p.id post_id, p.created_at granted_at, r.id
    	FROM posts p, replies r
    	WHERE r.reply_to_post_number = p.post_number AND
    	r.topic_id = p.topic_id and
    	('f' OR r.id IN (424195) ) ) q
    	            LEFT JOIN user_badges ub ON
    	              ub.badge_id = 169 AND ub.user_id = q.user_id
    	              AND (q.post_id = ub.post_id OR NOT 't')
    	            WHERE (ub.badge_id IS NULL AND q.user_id <> -1)
    	            RETURNING id, user_id, granted_at
    

    its a 600ms query and like 10 of them run every minute if people post.

    Its forcing a table scan, due to the structure.


  • BINNED

    Custom like number badges?

    We could move this to daily IMHO, don't think anyone is in that big of a rush to get their mediocre+ poster badges.


  • Discourse touched me in a no-no place

    @Onyx said:

    Custom like number badges?

    No - that's the whoosh one.

    [pjh@sofa ~]$ psql -d discourse -c 'select description from badges where id=169'
                            description
    ------------------------------------------------------------
     Badge of shame when someone seriously Didn't Get The Joke™
    (1 row)
    
    [pjh@sofa ~]$
    
    
    

  • Banned

    @sam said:

    WHERE raw LIKE '%6ab865b7-1b48-403b-b126-25fe3e0a8fa%' AND

    Caused the bleeding cause its in a WITH clause.

    I would either restructure the pattern or move to daily.


    Mod - PJH: Editing to munge the UUID - http://what.thedailywtf.com/t/heeeey/48794?u=pjh


  • BINNED

    Oh, I didn't know there is a UUID version for those.

    I'd still say daily. They can be awarded manually anyway either way, yes?


  • Banned

    I think its the general pattern ...seeing a bunch of those

    INSERT INTO user_badges(badge_id, user_id, granted_at, granted_by_id, post_id)
    	            SELECT 121, q.user_id, q.granted_at, -1, q.post_id
    	            FROM ( WITH replies AS (
    		SELECT user_id, id, topic_id, post_number, reply_to_post_number
    		FROM posts
    		WHERE raw LIKE '%c4c39f34-ba9b-47bb-b7b8-dd4fed8619b%' AND
    		reply_to_post_number IS NOT NULL AND
    		user_id IN  (
    			SELECT gu.user_id
    			FROM group_users gu
    			WHERE group_id IN(
    				SELECT g.id
    				FROM groups g
    				WHERE g.name IN ('staff')
    			)
    		)
    	)
    	SELECT p.user_id, p.id post_id, p.created_at granted_at, r.id
    	FROM posts p, replies r
    	WHERE r.reply_to_post_number = p.post_number AND
    	r.topic_id = p.topic_id and
    	('f' OR r.id IN (424198) ) ) q
    	            LEFT JOIN user_badges ub ON
    	              ub.badge_id = 121 AND ub.user_id = q.user_id
    	              AND (q.post_id = ub.post_id OR NOT 't')
    	            WHERE (ub.badge_id IS NULL AND q.user_id <> -1)
    	            RETURNING id, user_id, granted_at
    
    INSERT INTO user_badges(badge_id, user_id, granted_at, granted_by_id, post_id)
    	            SELECT 170, q.user_id, q.granted_at, -1, q.post_id
    	            FROM ( WITH replies AS (
    		SELECT user_id, id, topic_id, post_number, reply_to_post_number
    		FROM posts
    		WHERE raw LIKE '%b3fe22f0-a01d-11e4-bcd8-0800200c9a6%' AND
    		reply_to_post_number IS NOT NULL AND
    		user_id IN  (
    			SELECT gu.user_id
    			FROM group_users gu
    			WHERE group_id IN(
    				SELECT g.id
    				FROM groups g
    				WHERE g.name IN ('staff')
    			)
    		)
    	)
    	SELECT p.user_id, p.id post_id, p.created_at granted_at, r.id
    	FROM posts p, replies r
    	WHERE r.reply_to_post_number = p.post_number AND
    	r.topic_id = p.topic_id and
    	('f' OR r.id IN (424198) ) ) q
    	            LEFT JOIN user_badges ub ON
    	              ub.badge_id = 170 AND ub.user_id = q.user_id
    	              AND (q.post_id = ub.post_id OR NOT 't')
    	            WHERE (ub.badge_id IS NULL AND q.user_id <> -1)
    	            RETURNING id, user_id, granted_at
    

    all take 600ms to 1 second to run.


    mod - PJH, more munging of UUIDs...



  • @Onyx said:

    They can be awarded manually anyway either way, yes?

    Not how it is done. They fling the GUID in and let the query assign the :badger:


  • Discourse touched me in a no-no place

    @sam said:

    its a 600ms query and like 10 of them run every minute if people post.

    A job run/trigger more frequently than 'Daily' but less frequently than 'when people post' would mitigate this a lot.

    Any chance of at least one piece of granularity between the two? If restricted to just one I'd suggest hourly, if more than one, 1h, 4h, 12h?


  • Banned

    For the record, our slow query logs are WAY quieter post this change.

    Maybe just change that to use fulltext search, it should find the guid quick smart without a table scan.


  • Notification Spam Recipient

    @sam said:

    LIKE '%6ab865b7-1b48-403b-b126-25fe3e0a8fa6%'

    Why are you using a like operator on a GUID?


  • Discourse touched me in a no-no place

    @Vault_Dweller said:

    Why are you using a like operator on a GUID?

    Because I'm searching for the GUID in a post.


  • Banned

    Does it show up in cooked? cause cooked is indexed. searching it is ultra fast.


  • Discourse touched me in a no-no place

    @sam said:

    fulltext search

    It's normally deliberately hidden, usually in a <-- comment.


  • Notification Spam Recipient

    @sam said:

    raw

    Ah, I see. I was just jumping to the conclusion that the field was a guid type, instead of looking at the actual query


  • Banned

    I am open to adding hourly/6 hourly etc. but will not happen till post 1.3



  • @PJH said:

    A job run/trigger more frequently than 'Daily' but less frequently than 'when people post' would mitigate this a lot.

    Would be ideal but doesn't daily suffice? Everybody has to wait longer for the :badger: to be assigned but it will get there ...

    That :badger: is so funny I can't stop using it ...



  • Alternatively, join to the post in question in the CTE:

    WITH replies AS (
    	SELECT p.user_id, p.id, p.topic_id, p.post_number, p.reply_to_post_number
    	FROM posts p
    	join posts bp on bp.post_number = p.reply_to_post_number and bp.topic_id = p.topic_id
    	WHERE p.raw LIKE '%87c69ccc-3c6f-47d3-8c66-4f3fdb9xyz26%' AND
    	(:backfill OR bp.id IN (:post_ids) ) AND
    	p.user_id IN  (
    		SELECT gu.user_id
    		FROM group_users gu
    		WHERE group_id IN(
    			SELECT g.id
    			FROM groups g
    			WHERE g.name IN ('staff')
    		)
    	)
    

    Locally, the explain plan went from:

    Nested Loop (cost=51427.99..51436.79 rows=1 width=20)

    to

    Nested Loop (cost=32.27..40.32 rows=1 width=20)



  • Simpler?

    SELECT p.user_id, p.id post_id, p.created_at granted_at, ap.id
    FROM posts p
    join posts ap -- admin's post
      on ap.reply_to_post_number = p.post_number
      and ap.topic_id = p.topic_id
    WHERE ap.raw LIKE '%87c69ccc-3c6f-47d3-8c66-4f3fdb9ghi26%'
      AND (:backfill OR ap.id IN (:post_ids) )
      AND ap.user_id IN  (
    		SELECT gu.user_id
    		FROM group_users gu
    		WHERE group_id IN(
    			SELECT g.id
    			FROM groups g
    			WHERE g.name IN ('staff')
    		))
    ;
    

    Nested Loop (cost=1.27..26.58 rows=34 width=20)


  • SockDev

    @PJH said:

    It's normally deliberately hidden, usually in a <-- comment.

    There's an HTML comment below me

    <!-- html comments get rendered? -->

    There's an HTML comment above me

    let's see if it shows up in cooked.

    EDIT: bugger. i could have sworn that used to work.


  • SockDev

    @accalia said:

    i could have sworn that used to work.

    You might be thinking of the old quoting behaviour, where it HTML-encoded < and >


  • SockDev

    no, i'm thinking of back when i furst made sockbot sign posts with that HTML comment. i could have sworn that was showing up in cooked.


  • SockDev

    Can't say I've ever seen that happen…


  • SockDev

    i must be misremembering then.


  • Banned

    hmm why not use the new badge assign UI that allows you to pick a post, doesn't it cover the original use case of the guid trick?


  • SockDev

    @sam said:

    hmm why not use the new badge assign UI that allows you to pick a post

    AFAICT because it didn't exist when the guid trick was created.

    using the GUID also allows for automatic promotions (get three of this bronze badge earn one silver badge sort of thing)



  • @accalia said:

    using the GUID also allows for automatic promotions (get three of this bronze badge earn one silver badge sort of thing)

    We haven't done that, though.



  • @sam said:

    new badge assign UI that allows you to pick a post

    How does this work? Is there a meta topic somewhere explaining it? A quick search didn't show it up for me.

    EDIT: Hmm...if you mean the improved thing in a user / admin page, that's still more work than replying with the GUID.


  • SockDev

    @boomzilla said:

    We haven't done that, though.

    no, but we could with the GUID trick. ;-)


  • I survived the hour long Uno hand

    Couldn't we do that with a nightly badge query? Select users with three bronze badges and give them a silver one?



  • I'm sure we could.


  • SockDev

    @Yamikuronue said:

    Couldn't we do that with a nightly badge query?

    yes we could!

    but i'm not putting together the SQL for that. :-P


  • I survived the hour long Uno hand

    Wimp. It's less complicated than the spoon I'll bet :)


  • SockDev

    you do it then!

    :-P


  • I survived the hour long Uno hand

    ERR_POSTGRES_NOT_FOUND

    <err_body_not_invalid


  • SockDev

    :laughing: and you call me a wimp!


  • I survived the hour long Uno hand

    A craftsman cannot work without the proper tools :)


  • BINNED

    Yeah, what else are you going to blame when it goes wrong?


  • I survived the hour long Uno hand

    Discourse, obviously.

    BTW, Discourse totally broke the staging server I was fiddling with this morning. :)



  • Did Discourse just fall over?


    Filed under: yes, php.net, no, don't ask


  • SockDev

    @Maciejasjmj said:


    Leaning Tower of Discourse


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.