So... I am no longer blind about performance here ...


  • kills Dumbledore

    I suspect it was @Onyx's too


  • FoxDev

    @accalia said:

    all those bots are caught up now though right?

    Only @NetBot; @MathBot and @RPBot aren't yet, and then there's my own account to get caught up too.
    @accalia said:
    I should make version 2.0 have some persistent storage so it knows the posts it's already liked and skips asking for their JSON entirely.

    Assuming we can get reliable responses from the server, yes, that would be worth doing.


  • BINNED

    @Jaloopa said:

    I suspect it was @Onyx's too

    Yup. I'm allowing for T-1000 being a special snowflake and should be treated differently, for now.

    Let's be honest, aggregating any kind of data on that kind of topic in realtime will be an issue no matter what kind of software is managing it. Discourse is just better at showing it than most. But I don't think it's an unsolvable problem. Little things we're slowly uncovering and that are getting fixed are, IMHO, all steps in the right direction, that will benefit us as users and Discourse in the long run.


  • ♿ (Parody)



  • @sam said:

    I guess technically I could just add 1 and queue a proper refresh in 15 minutes, but I need to design a backend to support this (queue a job unless already queued)

    As in counting in redis?

    @sam said:

    750

    7.5 methinks




  • Discourse touched me in a no-no place

    @sam said:

    create inded idxTemp on posts(topic_id, like_count) where deleted_at is null

    does not help.

    I'll just bet it doesn't.



  • You mean... Only cause server load whenever we suspect that the server is having a hard time? Yeah, that makes sense.


  • :belt_onion:

    @Salamander said:

    when it comes to counting things

    they should add the likes instead :trollface:


  • :belt_onion:

    @sam said:

    what is Paula up to?

    Brillant things, clearly.


  • Banned

    @boomzilla FYI



  • @Salamander said:

    Is there a reason you don't seem to have a corresponding topics.like_count ?

    To remain accurate, that column would have to be updated via trigger or sproc every time someone issued a like, which is going to add a not insignificant amount of traffic on that table. Or, that column could be updated by a job, but then it's no longer instantaneously accurate.

    Neither of those options seems like a better alternative, unless there's a third one I'm missing...



  • @Groaner said:

    unless there's a third one

    Do it like YouTube does view counting, real time for low numbers, delayed updates for higher counts?


  • Banned

    That's the plan, just need to build infrastructure to support that.

    https://twitter.com/mperham/status/611170579463323649



  • @Groaner said:

    that column could be updated by a job, but then it's no longer instantaneously accurate.

    I don't have a big problem with that. Does anybody really care whether t/1000 displays 1540362 likes right now instead of 5 minutes from now? Unlike the post numbers, where (a few) people care very much that they be instantaneously accurate, AFAIK there's no easy way to determine who submitted the 1600000th like, so no competition to be the individual who does that, so no need for quick updating.



  • @ChaosTheEternal said:

    Do it like YouTube does view counting, real time for low numbers, delayed updates for higher counts?

    My understanding is that the reason behind that is more based on replication across a distributed network and anti-gamification of viewcount rather than performance considerations. That could work, but as we saw above, Postgres ain't great at counting, and it would have to count to decide the cutoff point....



  • @Groaner said:

    Postgres ain't great at counting

    That would make the argument for storing the count on the topic level, at which point you could refer to it when a post is liked to determine if you should go ahead and update the count in real-time or defer for a later job.



  • @HardwareGeek said:

    I don't have a big problem with that. Does anybody really care whether t/1000 displays 1540362 likes right now instead of 5 minutes from now?

    IT MUST BE AT ALL TIMES ACCURATE*!

    @HardwareGeek said:

    AFAIK there's no easy way to determine who submitted the 1600000th like, so no competition to be the individual who does that, so no need for quick updating.

    select *
    from
    (
    select l.*, row_number() over (order by like_id /* or like_date, if you prefer that to be the criterion */) as like_ordinal
    from likes l
    where l.topic_id = 1000
    ) t
    where t.like_ordinal = 1600000
    

    This works in SQL Server. No guarantees for other RDBMSes (though you could probably do something similar with TOP or LIMIT).

    *wherever possible


  • FoxDev

    Nice SQL. Useful for the half-dozen who have access to the DB 😛



  • Ok, you can, in theory, find out after the fact. I'm not surprised by this. However, it is not something that is available to a non-admin, much less in real time during the rush to hit a milestone number, so still no need for quick updating.



  • @ChaosTheEternal said:

    That would make the argument for storing the count on the topic level, at which point you could refer to it when a post is liked to determine if you should go ahead and update the count in real-time or defer for a later job.

    This is certainly possible, but that would make for some interesting code.

    All this is making me think that if I need to design a high-throughput (say, 20k+ TPS) banking system, I should definitely choose PostgreSQL!



  • @RaceProUK said:

    Nice SQL. Useful for the half-dozen who have access to the DB

    As a Software Developer Engineer Architect of Enterprisey Architecture Development Systems™, my role is to provide solutions to technical problems. Political problems, like actually getting access to the server, are outside my domain.


  • Banned

    Updated report for last day with analysis: (removed ip addresses and sensitive info from report)

    analyzed: /var/log/nginx/access.log
    ----------------------------------------------------------------------------------------------------
    17/Jun/2015:07:35:19 +0000 - 18/Jun/2015:02:33:23 +0000
    ----------------------------------------------------------------------------------------------------
    Total Requests: 1301220 ( MessageBus: 646536 )
    ----------------------------------------------------------------------------------------------------
    
    Top 30 users by Server Load
    
    Username             Duration Reqs   Routes                                                                          
    --------             -------- ----   ------                                                                          
    
    [Anonymous]          46487.53 144376 -(23600.11) topics/show(9351.47) user_avatars/show(3862.74)                      
                                         session/csrf(2360.15) user_avatars/show_letter(2116.66)                          
    
    
    PaulaBean             3544.90  12959 topics/show(3500.04) list/category_latest(39.51) posts/destroy(4.44)             
                                         topics/status(0.91)                                                              
    
    
    RaceProUK             2634.88   8636 topics/show(1413.84) topics/timings(365.74) posts/show(269.85)                   
                                         posts/create(145.17) post_actions/create(123.53)                                 
    
    
    obeselymorbid         2580.25   3440 topics/timings(1026.86) user_avatars/show(468.78) topics/show(225.52)            
                                         post_actions/create(177.49) clicks/track(133.43)                                 
    
    
    dcon                  1811.37   1177 post_actions/create(993.20) topics/timings(458.43) posts/show(234.54)            
                                         user_avatars/show(29.73) topics/show(28.68)                                      
    
    
    abarker               1633.54   2901 topics/timings(1164.73) posts/show(128.40) posts/create(109.95)                  
                                         post_actions/create(61.70) topics/show(59.99)                                    
    
    
    dse                   1500.33    410 user_avatars/show_letter(1188.91) user_avatars/show(268.61)                      
                                         topics/timings(29.20) topics/posts(3.83) post_actions/create(3.06)               
    
    
    HardwareGeek          1471.02   4842 topics/timings(373.76) user_actions/show(248.99) draft/update(208.33)            
                                         post_actions/create(150.81) posts/show(141.20)                                   
    
    
    coldandtired          1458.04   1235 user_avatars/show(624.82) topics/timings(466.58)                                 
                                         user_avatars/show_letter(244.86) topics/show(57.09) list/latest(44.50)           
    
    
    accalia               1420.06   5559 topics/timings(392.48) topics/posts(268.35) posts/show(201.48)                   
                                         topics/show(144.04) posts/create(138.82)                                         
    
    
    xaade                 1392.96   2308 topics/timings(522.22) user_avatars/show(116.15) draft/update(107.08)            
                                         topics/show(101.80) posts/create(96.34)                                          
    
    
    boomzilla             1235.77   3672 topics/timings(398.40) topics/show(195.38) post_actions/create(140.12)           
                                         posts/show(130.58) posts/create(94.37)                                           
    
    
    PJH                   1183.18   4749 topics/timings(565.85) posts/show(356.33) topics/show(120.65)                    
                                         topics/posts(90.57) user_avatars/show(11.87)                                     
    
    
    flabdablet            1024.49    855 topics/timings(558.37) draft/update(201.24) posts/create(68.46)                  
                                         composer_messages/index(58.42) post_actions/users(51.02)                         
    
    
    Zoidberg               990.96   2341 topics/posts(557.75) posts/show(157.74) post_actions/create(126.16)              
                                         notifications/index(82.16) exceptions/not_found(50.87)                           
    
    
    sockbot                954.29   2596 topics/posts(557.17) posts/show(168.78) post_actions/create(134.19)              
                                         notifications/index(47.42) exceptions/not_found(28.04)                           
    
    
    PleegWat               890.97   1990 topics/timings(752.26) topics/show(80.20) list/latest(17.31)                     
                                         user_avatars/show_letter(5.64) posts/create(5.62)                                
    
    
    NetBot                 888.31   1362 topics/posts(711.59) notifications/index(100.84) post_actions/create(37.58)      
                                         exceptions/not_found(21.17) topics/show(14.67)                                   
    
    
    Onyx                   885.58   3149 topics/timings(635.86) posts/create(49.65) topics/show(43.29)                    
                                         draft/update(38.44) users/show(22.11)                                            
    
    
    Placeholder            853.11    511 topics/timings(551.70) user_avatars/show(73.35) notifications/index(50.32)       
                                         user_avatars/show_letter(40.97) post_actions/create(31.49)                       
    
    
    cartman82              848.74   2151 user_avatars/show_letter(278.65) topics/timings(177.24) search/query(137.43)     
                                         posts/show(60.64) list/latest(50.31)                                             
    
    
    loose                  840.18   2811 topics/timings(361.31) list/latest(148.71) user_avatars/show_letter(74.33)       
                                         topics/show(59.16) user_avatars/show(45.87)                                      
    
    
    OffByOne               828.02   1972 posts/show(608.23) post_actions/create(87.12) topics/posts(69.76)                
                                         topics/timings(46.67) topics/show(6.47)                                          
    
    
    Zecc                   754.36   2022 topics/timings(577.93) topics/show(52.64) draft/update(44.19)                    
                                         posts/create(19.94) search/query(10.37)                                          
    
    
    MathBot                710.66    938 post_actions/create(528.34) notifications/index(97.77) topics/posts(64.86)       
                                         exceptions/not_found(19.38) topics/show(0.24)                                    
    
    
    ChaosTheEternal        695.34   3400 topics/timings(325.02) post_actions/create(109.53) topics/show(79.94)            
                                         posts/show(63.22) topics/posts(26.06)                                            
    
    
    Polygeekery            694.87   1595 topics/timings(296.28) posts/show(81.00) posts/create(73.64) draft/show(61.04)   
                                         post_actions/users(57.44)                                                        
    
    
    Bulb                   654.35    381 topics/timings(458.13) draft/update(168.32) list/latest(7.90) posts/create(6.65)  
                                         topics/show(4.49)                                                                
    
    
    dkf                    644.24   1525 topics/timings(327.70) draft/update(119.00) posts/create(74.43)                  
                                         list/unread(46.99) topics/show(39.17)                                            
    
    
    FullPointerException   642.23    192 topics/timings(419.62) draft/update(55.64) user_avatars/show(37.31)              
                                         user_avatars/show_letter(36.55) topics/show(29.46)                               
    
    ----------------------------------------------------------------------------------------------------
    
    
    Top 30 routes by Server Load
    
    Route                    Duration Reqs  
    -----                    -------- ----  
    -                        23600.11   1364 
    topics/timings           20924.97 101957 
    topics/show              18353.79  45061 
    user_avatars/show         6720.51  28936 
    user_avatars/show_letter  5381.20  10128 
    topics/posts              4930.55   9970 
    posts/show                4534.26  20092 
    post_actions/create       3807.55   3468 
    session/csrf              2360.28  40375 
    list/latest               2216.41  12565 
    posts/create              1990.64    948 
    draft/update              1747.32   6722 
    exceptions/not_found      1260.53  29458 
    users/show                1133.49    434 
    notifications/index        898.24   3045 
    search/query               340.32    164 
    clicks/track               322.23   1173 
    site_customizations/show   293.48   1684 
    composer_messages/index    279.81   1316 
    posts/update               274.24    187 
    user_actions/show          258.63    854 
    draft/show                 158.65    584 
    post_actions/users         148.01    545 
    posts/replies              147.84    652 
    list/unread                143.37    558 
    site/settings              134.87   4195 
    list/category_latest       110.26    432 
    user_actions/index         108.12    222 
    about/index                 85.32     37 
    topics/feed                 77.75    368 
    ----------------------------------------------------------------------------------------------------
    
    
    Top 30 not found urls (404s)
    
    Url                                                                              Count
    ---                                                                              -----
    GET /session HTTP/1.1                                                            27124 
    POST /notifications/reset-new HTTP/1.1                                            1905 
    
    Top 30 urls by Server Load
    
    Url                                                                              Duration Reqs  
    ---                                                                              -------- ----  
    POST /topics/timings HTTP/1.1                                                    30831.49 102111 
    POST /post_actions HTTP/1.1                                                       4199.74   3476 
    POST /draft.json HTTP/1.1                                                         3017.17   6730 
    POST /posts HTTP/1.1                                                              2271.47    954 
    POST /logs/report_js_error HTTP/1.1                                               1996.50    641 
    GET /session/csrf.json HTTP/1.1                                                   1453.87  13218 
    GET / HTTP/1.1                                                                    1174.92   5379 
    GET /session HTTP/1.1                                                             1021.15  27125 
    GET /notifications HTTP/1.1                                                        637.07   1904 
    GET /t/an-exciting-startup-job-offer/49473 HTTP/1.1                                566.95    472 
    GET /t/2/last HTTP/1.1                                                             527.00   4203 
    
    GET                                                                                375.97   1535 
    /site_customizations/7e202ef2-56d7-47d5-98d8-a9c8d15e57dd.css?target=desktop&v=1                 
    db28a558ce825962e9acdec1c839e07&__ws=what.thedailywtf.com HTTP/1.1                               
    
    GET /t/category-definition-for-meta/2/99999999 HTTP/1.1                            368.57   4203 
    GET /t/enlightened/8795 HTTP/1.1                                                   309.53    226 
    GET /latest.json HTTP/1.1                                                          293.26   4207 
    GET /t/1000/last.json?include_raw=1&track_visit=true HTTP/1.1                      284.48     97 
    POST /notifications/reset-new HTTP/1.1                                             213.60   1905 
    GET /t/bethesdas-game-announcements-another-blakeytweets-topic/49427 HTTP/1.1      204.32    172 
    GET /user_avatar/what.thedailywtf.com/accalia/64/23399.png HTTP/1.1                180.75     19 
    GET /t/random-rant-of-the-night/47727 HTTP/1.1                                     147.00     24 
    GET /t/defensive-programming/49469 HTTP/1.1                                        145.02    178 
    GET /t/1673/last.json?include_raw=1&track_visit=true HTTP/1.1                      137.78     84 
    GET /t/botcore-release-mott555s-bot-pack/4628 HTTP/1.1                             136.67      3 
    GET /site/settings.json HTTP/1.1                                                   134.87   4195 
    
    
    

    The report spans about 19 hours ... cc @apapadimoulis

    • In this interval @PaulaBean made almost 13K web requests to TDWTF, vast majority are to topic/show... costing us almost 1 hour of processing time. I would recommend halving her traffic or quartering it. Poll more aggressively on recent topics, tone it all the way down for old topics. In particular ... https://github.com/tdwtf/WtfWebApp/blob/26539752f732bbe311f6769c6d5e54fa8d302120/TheDailyWtf/Common/Discourse/DiscourseApi.cs#L100 needs to be called less frequently.

    • @RaceProUK and other highly active users has standard usage patterns, a lot of topics were read :)

    • topics/timing counts and durations are WAY too high, we need to tune this method heavily if possible its accounting for an enormous amount of background load

    • @dcon had a like binge, (will not be as bad once latest update is applied) like binge cost us about 20minutes of processing time

    • looks like there is lots of room for improving perf on letter avatar generation, 10k letters =~ 83 minutes of work, that is high. (note it is all cached, but first req slips through)

    • looks like site customizations is not cached and it should be.

    Anyway... plenty for me to look at, will run this on our customers as well and address common issues first. (topic/timings for example is very high on list)



  • @sam said:

    Updated report for last day with analysis

    Eighth place; I'm sure I can do better than that with a little effort. ;)


  • Banned

    well

    this was easy

    and this was not easy


  • Notification Spam Recipient

    @sam said:

    site customizations isare

    FTFY


  • FoxDev

    Based on those numbers, my three bots combined don't produce as much load as I do solo 😆

    It'll still be good to see if I can reduce their load, but I'm not all that worried, since there are 11 pure humans more active than my busiest bot 😌


  • Java Dev

    Yup, based on that list the only problematic bot is paula.


  • FoxDev

    She's also the only user busier than me :rofl:


  • Winner of the 2016 Presidential Election

    First of all: Thank you for not dismissing this forum as "those guys who break anything, anyways" and actually doing these tests and improvements.

    @sam said:

    will run this on our customers as well

    Can you give us some anonymized stats on other customers? Just to see how much better / worse we actually are speedwise?

    Filed Under: we like numbers



  • @Groaner said:

    Or, that column could be updated by a job, but then it's no longer instantaneously accurate.

    Neither of those options seems like a better alternative, unless there's a third one I'm missing...

    When the likes are rounded to the nearest hundred thousand, almost all likes are not going to trigger a change. Who cares if it's not instant on those topics? Even the x.yK topics, probably.

    Might it be possible to set a flag or criterion for whether to instantly update or update by a job? This is a case where the worse it is to update the less we care about the update.


  • ♿ (Parody)

    I optimized this topic by randomizing 9 posts to a new topic: Random numbers from Random posters


  • ♿ (Parody)

    I just changed it from 5 minutes cache to 10 minutes, so hopefully it will help.



  • @sam said:

    We count every god damn like on the topic so we can display "1.5M" on the front page

    Can you throttle this.

    Youtube does it.

    Just say, 100k+



  •  DELETE FROM likes WHERE l.topic_id = 1000
    

  • :belt_onion:

    i've abstained from t/1k-ing the last few days due to work, so i'm not doing my part to help sam find problems



  • Hahaha it's funny because that would work if there was an actual likes table



  • Same here. Busy week :barrier: breaking discourse.


  • Banned

    As a point of comparison with say "boing boing"

    Top 30 routes by Server Load
    
    Route                       Duration Reqs
    -----                       -------- ----
    topics/show                  3904.02 33897
    user_avatars/show_letter     2496.89 13970
    user_avatars/show            1714.12 14079
    topics/timings               1213.13 50220
    topics/wordpress             1101.98 40000
    forums/status                 463.54 68295
    posts/show                    356.56 19154
    users/show                    291.15  1375
    topics/posts                  271.23  6940
    list/latest                   257.17  3923
    posts/create                  252.01   634
    topics/feed                   163.95  3104
    post_actions/create           153.78  1363
    posts/expand_embed            133.19   550
    exceptions/not_found           94.59  1177
    draft/update                   92.69  7432
    user_actions/index             86.83   959
    posts/replies                  77.06  3317
    list/latest_feed               71.26   803
    notifications/index            39.07   608
    posts/update                   19.63   162
    composer_messages/index        17.86   974
    session/create                 17.25    61
    search/query                   15.61    73
    clicks/track                   15.44   775
    omniauth_callbacks/complete    13.66    27
    list/category_latest           10.66   128
    onebox/show                    10.02   111
    posts/reply_history             8.54   355
    post_actions/users              7.64   424
    
    

    So we are able to "show" 33K topics in 3804 secs ... that is about 115ms per topic show (avg)
    On TDWTF it was 407ms per topic show (avg)

    Keep in mind we have some front end caching, so this is only counting topics that slip into the rails stack and are not cached by middleware.

    Looking at another route say list/latest its

    BBS 65ms
    TDWTF 176ms

    So, our hardware runs about 2.7x faster than digital ocean. As expected long topics on TDWTF have some perf impact, but its not enormous when you look at avgs (topic show is 3.5x slower here - if it was the same type of topics as bbs it would probably be 2.7x as well)

    I would not say Digital Ocean has terrible performance, its just that our hosting has spectacular perf cause we have dedicated hardware that is finely tuned.


  • Banned


  • Banned

    @apapadimoulis said:

    I just changed it from 5 minutes cache to 10 minutes, so hopefully it will help.

    That is a good start but it needs a bit more sophistication, e.g. scale the interval with post age as well. On older TDWTF topics poll for comment topic changes even less often.



  • @sam said:

    Will see what happens when I run the report again in say 10 or so hours. Will be very interesting to see the results

    @codinghorror said:

    Clown computing!

    And now you see why we like @sam more.


  • FoxDev

    @Maciejasjmj said:

    @sam said:
    Will see what happens when I run the report again in say 10 or so hours. Will be very interesting to see the results

    @codinghorror said:

    Clown computing!

    And now you see why we like @sam more.

    That needs to be Quoted for Truth.

    oh look someone just did that.

    in all seriousness, @sam is much better at PR than @codinghorror, at least when the target audience is geeks. In this situation we care not for the jokes and the non sequitur (and as a master of the non sequitur saying that should mean something).

    we want to know that our issues are being heard and considered. We want to feel that we matter as users to the developers of the software.

    @sam makes us feel that way because he is professional and acknowledges our issues, he fixes our issues or explains why our issue is not going to be worked on now and when we can roughly expect it to be worked on (or why it's just not going to be worked on at all). @codinghorror posts image memes.

    so yes, there is a reason that we like @sam better.



  • TOXIC!TM clowns?



  • This doesn't happen anywhere else, because no where else has a topic likes this one. -- paraphrased.

    Can you guys for once just see that we're trying to make Discourse better than other forum software? When we're done here, everyone will want to use Discourse, because it can handle warehouse likes.


  • Winner of the 2016 Presidential Election

    @xaade said:

    Can you guys for once just see that we're trying to make Discourse better thanas good as other forum software?

    FTFY

    TBH, though: I could see myself actually liking Discourse more than other forum software if it was less opinionated and would actually work.


  • Banned

    @accalia said:

    we want to know that our issues are being heard and considered. We want to feel that we matter as users to the developers of the software.

    You guys definitely matter! 💕

    just not as much as the people that pay us actual money



  • Did you ever stop to condider that some of us might be contemplating setting up discussion fora in our day jobs? As such we are potential paying customers.

    Did you also consider what you would have to pay for the amount of qualified testing that you get for free here?


  • Banned

    I think that at this point its kind of unfair to expect @codinghorror not to be trolling you here, heck not a day goes by without a new topic about "yet another stupid decision the Discourse team made, again" ™ ... my method of dealing with it is simply ignoring, Jeff has a harder time ignoring constant abuse. Telling him off for being a poor rape victim is kind of rich.

    Trolling aside, we treat all the bug reports we get here very seriously and strongly appreciate the quality of bug reports and testing we get here. Jeff has communicated that multiple times.

    Similar to how many of you have TDWTF personas divorced from your real life personas, Jeff has been shell shocked into creating a custom one here.

    The clown computing thing, though trolly is relevant, there is a trend of seeking salvation in the cloud, people are giving up on the art of custom building hardware to match a performance problem. The cloud perf numbers reflect that.



  • I joined the forum after Jeff had essentialiy left it, so all I've ever seen are the horror stories about him. That's why I was a bit surprised to see that when he for the first time in a very long time make a post, it is to troll the uses here. I ignotred the first post as a troll, but the second is a bit ... disdainful... I guess I lack the history to put it into context, except that I agree that the discobashing gets old very quickly.

    As for cloud computing - it's currently the cheapest alternative. Custom building hardware will not be forgotten, and as soon as it is too expensive to use the cloud, people will revert back to this.


Log in to reply