You merely adopted the 500. I was born in it, molded by it. I didn't see a fully rendered Discourse page until I was already a man.


  • I survived the hour-long Uno hand

    @Kuro said:

    not a place where we only come to ask for help

    SE is much better at that than Discourse, even given all the ways in which SE is terrible.



  • @TwelveBaud said:

    users are by default limited to 5-10 likes per day

    I thought it was 50; at least, it was when WTDWTF was new. With a large enough user base (much larger than ours), you could hit the like count that caused issues with auto-closing within a few years of being on Discourse.



  • My mistake; users are limited to 50 likes at TL0-1, 75 at TL2, 100 at TL3, and 150 at TL4.

    I could have sworn it used to be less than that though, since I used to be stingy with likes and still hit the limit.


  • ♿ (Parody)

    @abarker said:

    @apapadimoulis said:
    So if you want modern software, it's gonna be built on Ruby and run on Linux.

    Really? I work on modern software (granted, it isn't forum software) and I use .NET.

    @arantor works on what I assume is relatively modern software in PHP.

    At work, we are adopting a new collaboration platform to help us work with an overseas team. It's written in Python.

    For fuck's sake, man, you may as well be using sharpened sticks and COBOL to make software.

    Though I think Python is OK... next time I'm in Mecca (you know, that one co-working space in the Bay Area?), I'll check with those mustachioed growth-hackers building the next generation of disruptive game changers. They got like $40M in funding, so clearly they know more than all of us.


  • FoxDev

    @apapadimoulis said:

    They got like $40M in funding, so clearly they know more than all of us.

    Or they're just really good at bullshitting :P


  • ♿ (Parody)

    @RaceProUK said:

    Or they're just really good at ~~bullshitting~~ embezzling

    FTFY


  • FoxDev

    And almost as if to prove my point, I made that post, then Discourse shat itself again. And we're polling the six-post-long t/2 now instead of the site-killing t/1000. And the bots are still off.

    OK, it was momentary, but still...



  • @apapadimoulis, I just had a nice little chat with James over at vNucleus (I think he's @Thalagyrt here), and he's up for hosting and providing (platform-level, not forum-level) support for Discourse if we're willing to pay -- and at this point I'm willing to pay. You cool with that?

    Hey Andrew,

    I remember when they switched over to Discourse. Personally, I think Discourse gets quite a few things wrong with the way they recommend deploying. The major issue is the usage of Docker. Docker simply is not ready for prime time. There are a ton of unsolved problems, such as how you update packages inside the container, how you do application-aware backups, etc. So first things first, I'm going to recommend not following Discourse's deployment guidelines, on the premise that Docker needs about another 5-10 years before it's ready to be used in a production environment. Deploying a Rails app in Puma is pretty straightforward - heck, our portal is Rails running in Puma behind Nginx.

    Getting the application and its dependencies running is definitely something we can do, as is tuning for the level of traffic you receive. If we're going to add any custom indexes, I'd recommend following open-source best practices: fork Discourse, add the indexes via migrations, and make pull requests back to Discourse with an explanation of why each index was added. Have you really run into situations where you need to add indexes? I'd think that should be pretty rare - the Discourse folks have done a pretty good job with their schema from what I can see. In any case, we can deploy with log_min_duration_statement set to 250 milliseconds or so, which'll make it easier to track down any queries that have problems. (A sketch of that setting follows this message.)

    I think your biggest issue right now is going to be the droplet size [I told him I thought we were on the $5 droplet]. If I'm understanding you correctly and you're using a 512MB instance running Rails (1 worker takes up about 250MB, and you probably have more than 1 worker), Postgres (will eat up as much RAM as it can get), and Redis (can run on a toaster when only used for Sidekiq), you're going to be swapping, and that's going to hurt performance. Are you able to look at the output of free -m? I'd be curious to see how bad it is right now.

    [...]

    Cheers,
    James
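
A minimal sketch of the log_min_duration_statement setting James mentions, assuming Postgres 9.4 or later (older versions need the line edited into postgresql.conf instead):

    -- Log every statement that takes longer than 250 ms to complete.
    ALTER SYSTEM SET log_min_duration_statement = '250ms';
    -- Reload the configuration without restarting the server.
    SELECT pg_reload_conf();

Matching queries then appear in the Postgres log together with their duration, which is what makes the slow ones easy to track down.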


  • BINNED

    @TwelveBaud said:

    Have you really run into situations where you need to add indexes? I'd think that should be pretty rare - the Discourse folks have done a pretty good job with their schema from what I can see.

    Poor bastard, he doesn't know the half of it...

    Does whatever we would pay cover all the Ibuprofen he'll need, or do we have to pay extra for that?


  • FoxDev

    512MB instance

    All this memory-hungry stuff running, and it's being given a sardine tin to run in.

    No wonder this site's fucked so often...



  • The AWS server I just set up to host the old forums has a gig. Hah. (And I'm purposefully doing an as-cheap-as-possible setup.)



  • We could actually be on a 1GB instance, but even that is only the minimum recommended memory, probably suited for things like "My Favorite Minecraft Server's Forum of Things".


  • FoxDev

    $20/mo would get 2GB on DigitalOcean, and would also get a two-core processor; it may still not be enough, but fuck me, it'd be an order of magnitude better than the C64 we're currently on...


  • Discourse touched me in a no-no place

    Liked for the very image of running this thing on the good old Commode 😃


  • ♿ (Parody)

    I definitely appreciate the offer. But I'm not sure it's Docker, TBH. The most likely thing IMO is those massive threads (16k, 40k posts) that the queries and the like probably weren't optimized for. I don't think it's a common thing (I've never seen a "real" thread that long), and it's easy to miss in testing...

    Anyway, it's a 4GB droplet, and I did the free -m thing; it says...

                 total       used       free     shared    buffers
    Mem:          3953       3783        170        788         22
    -/+ buffers/cache:       1945       2007
    Swap:            0          0          0


  • FoxDev

    @apapadimoulis said:

    4GB droplet

    Oh.

    Well, there goes the low-memory theory, I guess…

    Having said that, that free figure is very low…



  • Yeah, 4GB is plenty for what you're doing. The output of free looks extremely healthy; you're using less than 50% of your RAM.

    My comments about Docker weren't from a performance perspective, but from a systems management perspective. There are problems that are solved in traditional ops that Docker hasn't solved yet. The biggest one is backups, which IMO are critical.


  • Discourse touched me in a no-no place

    That looks very tight. Especially with the complete lack of swap; almost any memory spike will cause processes to be slaughtered (or perhaps mmap()ed files shrunk in their physical allocation).



  • Yeah, the lack of swap is rather concerning. That's easily fixable though.


  • FoxDev

    @dkf said:

    almost any memory spike will cause processes to be slaughtered

    Quick! Everyone to t/1000! 😈

    Yeah yeah, Evil Ideas thread is ↪ ↗ ⤴ :caughtwithmypantsdown:



  • @ben_lubar tells me we're under disk space pressure though, so there may not be enough room for swap.


  • FoxDev

    @TwelveBaud said:

    we're under disk space pressure

    I know we have a fair amount of content, but 60GB?


  • Discourse touched me in a no-no place

    @Thalagyrt said:

    There are problems that are solved in traditional ops that Docker hasn't solved yet.

    That depends on where the persistent state is stored. If that's out of the docker image, the other issues aren't too much of a worry, as you'll be able to switch to an updated image relatively simply: just point it at the persistent store and you're done.

    If it's inside the image though (which is common because it gives a “grab and go” setup for easy testing)…



  • Looking at them more in depth, it seems like Discourse's indexes are explicitly targeting MySQL's query optimizer. Lots of things are missing that would allow Postgres to come up with more efficient query plans. There's next to no need for multi-column indexes in Postgres, yet pretty much every index here is a multi-column index. Postgres can use a multi-column index for a single column, but only if that column is the first column in the index. More to the point, Postgres can combine multiple single-column indexes on the same table and gain the benefits of both (see the sketch after this post). The only time all these multi-column indexes make sense is when a query uses the entire index, i.e. with col_a and col_b indexed, the query is exactly WHERE col_a = ? AND col_b = ?.

    Why the hell is topic_id not indexed? Is it just not used in queries at all, or was that an oversight? I'm not incredibly familiar with Discourse.
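
A minimal sketch of the bitmap-combination point above, using a hypothetical table and index names (not Discourse's actual schema):

    -- Two independent single-column indexes on a demo table.
    CREATE TABLE posts_demo (topic_id int, post_number int, raw text);
    CREATE INDEX posts_demo_topic_idx  ON posts_demo (topic_id);
    CREATE INDEX posts_demo_number_idx ON posts_demo (post_number);

    -- With enough rows, the planner can AND the two indexes together
    -- via a BitmapAnd instead of requiring one composite index,
    -- producing a plan along these lines:
    EXPLAIN SELECT * FROM posts_demo
      WHERE topic_id = 1000 AND post_number = 42;
    --  Bitmap Heap Scan on posts_demo
    --    Recheck Cond: ((topic_id = 1000) AND (post_number = 42))
    --    ->  BitmapAnd
    --          ->  Bitmap Index Scan on posts_demo_topic_idx
    --          ->  Bitmap Index Scan on posts_demo_number_idx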


  • Discourse touched me in a no-no place

    @Thalagyrt said:

    Why the hell is topic_id not indexed? Is it just not used in queries at all, or was that an oversight?

    Probably wasn't necessary for the use-cases supported by the first few versions of the code and they've never revisited the indices as the system complexity has grown. I've seen lots of people make that particular blunder.


  • FoxDev

    @Thalagyrt said:

    Seems like Discourse's indexes are explicitly targeting MySQL's query optimizer

    Yet from what I can tell, people prefer to run it on Postgres…



  • @RaceProUK said:

    I know we have a fair amount of content, but 60GB?

    @ben_lubar said:

    I'm not at my computer with the correct SSH key to check right now, but I remember it having 30GB of disk space, so probably the $10/month one. I suggest going for a bigger one, since I had to delete all the backups to make space for a new one and, knowing TDWTF, we'll run out of space solely from the likes thread.

    @dkf said:

    That depends on where the persistent state is stored.

    I'm told Discourse keeps it outside the image, so we should be good there.

    @Thalagyrt said:

    Why the hell is topic_id not indexed? Is it just not used in queries at all, or was that an oversight?

    It is used in queries, and it's probably an oversight. It is in two multi-column indices, but the query experiencing trouble doesn't use any of the other columns from them, and does use columns outside the index, so it's probably getting hit with a full table scan anyway. I'll check the query planner though.



  • Actually, this index is usable for queries on just topic_id.

    posts | index_posts_on_topic_id_and_post_number | topic_id, post_number



  • From a virgin setup:

    EXPLAIN SELECT "posts"."id" FROM "posts"
      WHERE ("posts"."deleted_at" IS NULL)
      AND "posts"."topic_id" = 1000
      ORDER BY "posts"."sort_order" ASC
    
                            QUERY PLAN                         
    -----------------------------------------------------------
     Sort  (cost=3.18..3.19 rows=1 width=8)
       Sort Key: sort_order
       ->  Seq Scan on posts  (cost=0.00..3.17 rows=1 width=8)
             Filter: ((deleted_at IS NULL) AND (topic_id = 1000))
    (4 rows)
    

    Looks like the planner's not using indexes. Could be because there's not enough data though.

    See why I want to host with you guys? 😄



  • The planner won't use indexes unless the data is large enough that you'll see performance gains from using them, AFAIK.
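
One standard way to check that (a generic Postgres trick, not something from the thread's setup): disable sequential scans for the session and re-run the EXPLAIN. If the planner then switches to the index, the seq scan was simply the cheaper plan for a near-empty table.

    -- Session-local setting: makes seq scans maximally expensive,
    -- so the planner prefers an index if one is usable at all.
    SET enable_seqscan = off;

    EXPLAIN SELECT "posts"."id" FROM "posts"
      WHERE ("posts"."deleted_at" IS NULL)
      AND "posts"."topic_id" = 1000
      ORDER BY "posts"."sort_order" ASC;

    -- Restore the default afterwards.
    RESET enable_seqscan;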



  • @ben_lubar, still awake and got your SSH key? Wanna ./launcher enter app, su discourse, psql, \connect discourse, and run us that EXPLAIN?


  • BINNED

    I'm still mostly worried about that "grab all the post IDs on each query" stuff.

    I'm not sure if it was figurative "we use OFFSET, OFFSET is slow", or literal "we grab ALL THE THINGS and then filter in ORM".

    Surely, not even OFFSET can be that slow for 40k records?
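
For context on the OFFSET question, a hedged sketch against the posts table (assuming the index_posts_on_topic_id_and_post_number index shown earlier): with OFFSET, Postgres must fetch and discard every skipped row, so cost grows with the offset, while keyset pagination stays flat.

    -- OFFSET pagination: reads and throws away ~39,980 rows first.
    SELECT id FROM posts
      WHERE topic_id = 1000
      ORDER BY post_number
      LIMIT 20 OFFSET 39980;

    -- Keyset pagination: the (topic_id, post_number) index seeks straight
    -- to the last-seen post number, regardless of topic length.
    SELECT id FROM posts
      WHERE topic_id = 1000 AND post_number > 39980
      ORDER BY post_number
      LIMIT 20;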


  • area_pol

    @TwelveBaud said:

    and at this point I'm willing to pay. You cool with that?

    I have a question for the TDWTF forum community, as there is something I do not understand here.

    For a long time, many users have expressed disappointment with Discourse - its design decisions, bugs, and recent instability. Compared to other forum communities, TDWTF focuses a lot of attention on the technical shortcomings of its forum software. The site owner does not want to change it, and this is perfectly understandable, as this is a hobby project and the community is not entitled to his additional work for free.

    However, hosting a forum is neither very difficult nor very expensive - I host a forum for my game for about 2 euros per month. I have seen many users put a lot of effort into the TDWTF forum - making bots, a status monitor, etc. So, why has no one thought about creating a new TDWTF forum using popular and stable software, separately hosted, without additional work for the site owner? (It should be cheap anyway, and if needed it could receive donations from users.)
    (Even I could do that, and the community has many better developers.)

    I am not writing to encourage that change or demand something. I am just curious why people repeatedly complain about a problem (often quite emotionally), yet the potential solution is never even discussed.


  • FoxDev

    @Adynathos said:

    So, why has no one thought about creating a new TDWTF forum using popular and stable software, separately hosted, without additional work for the site owner?

    Because you'd either have to import all the content into the new site (which is really difficult and time-consuming), or you'd have to maintain both old and new sites side-by-side.



  • Plus now you've got two old sites to import from, one of which already has a partial import of the other.


  • area_pol

    Thank you for the explanation.
    A lazy solution could be to lock the old forum and keep it as an immutable archive, but I can imagine this would be inconvenient as well.



  • @TwelveBaud said:

    My mistake; users are limited to 50 likes at TL0-1, 75 at TL2, 100 at TL3, and 150 at TL4.

    I could have sworn it used to be less than that though, since I used to be stingy with likes and still hit the limit.

    Pretty sure that change isn't deployed yet.



  • Also, now that the old forums are back up (Thanks Blakey!), see this thread where Alex discusses his decision, and his rejection of exactly what you suggest.



  • If you run a Minecraft server with 1GB of RAM, you're going to need a player cap; otherwise, the server will OOME.


  • FoxDev

    @dromed said:

    I look forward to our little WTF group stress-testing their plaything beyond breaking point.

    And it will break.


    Understatement Of The Millennium.



  • QUERY PLAN
    -------------------------------------------------------------------------------------------------------------------
     Sort  (cost=70087.58..70188.80 rows=40486 width=8)
       Sort Key: sort_order
       ->  Bitmap Heap Scan on posts  (cost=1239.17..66989.37 rows=40486 width=8)
             Recheck Cond: (topic_id = 1000)
             Filter: (deleted_at IS NULL)
             ->  Bitmap Index Scan on index_posts_on_topic_id_and_post_number  (cost=0.00..1229.05 rows=40617 width=0)
                   Index Cond: (topic_id = 1000)
    (7 rows)



  • Absolutely, and probably 5 or 6 players at most. Which is juuuust about the right size for a 1GB Discourse install too. 🚎



  • @Onyx said:

    I'm not sure if it was figurative "we use OFFSET, OFFSET is slow", or literal "we grab ALL THE THINGS and then filter in ORM".

    Noo... it's "we grab all the things and filter in JS".



  • It looks like the check for deleted posts is what's driving that query up into the multiple-second range.
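
If the deleted_at check really is the expensive part, one commonly suggested Postgres remedy (an assumption on my part, not something deployed in the thread) is a partial index that only covers live posts, so the filter is satisfied by the index itself:

    -- Hypothetical partial index: indexes only rows with deleted_at IS NULL,
    -- matching the WHERE clause of the slow query shown earlier.
    CREATE INDEX index_posts_on_topic_id_live
        ON posts (topic_id, sort_order)
        WHERE deleted_at IS NULL;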


  • area_pol

    Even if a terribly inefficient query is done to display a big topic, isn't the result cached until the topic changes?


  • FoxDev

    @Adynathos said:

    isn't the result cached until the topic changes?

    Depends where it's cached…



  • I'm more than a bit surprised that the planner isn't using the index on topic_id before filtering on the unindexed deleted_at.

    Edit: Doh. Read the query plan backwards. Hello, Tassimo...



  • @TwelveBaud said:

    still awake

    It's Sunday at noon US Central Time, which means I just got done being licked by cats.

    This one was the best one today: http://www.wihumane.org/adopt/animal?id=25313069


  • BINNED

    @riking said:

    Noo... it's "we grab all the things and filter in JS".

    akjrg

    ldghkfdgk k fdkj sadfsdakjfds sdafjdsf asdf

    DSFGSD AFKLDSKJ SDKJ ESFSDL!

    #FSDSHKFLSDF?


    Fkds kdsaf: :wtf:


  • FoxDev

    I do believe that's the literal result of a :headdesk: 😆

