You merely adopted the 500. I was born in it, molded by it. I didn't see a fully rendered Discourse page until I was already a man
-
Before:
After:
Preserved for posterity.
And as expected, a hard refresh fixes the old images.
-
If you're reading this Jeff, we're just having a laugh man. Chill out.
Stewart Lee and Top Gear – 14:01
— richardc1975
-
I'd love to see a benchmark.
Either way, it'd save MORE time to have the server rebake on-demand, since 90% of those posts will never actually be viewed between one rebake and the next.
But again: too fucking obvious to Ruby programmers. Must... overcomplicate... everything...
-
Either way, it'd save MORE time to have the server rebake on-demand, since 90% of those posts will never actually be viewed between one rebake and the next.
For the case of avatars, I totally agree with you. The concept in general is not horrible, though, if say, the broken quoting markup gets fixed and now quotes render properly.
It's darkly funny that in a system that rate limits so many weird things, it doesn't bother to rate limit obvious DOS stuff that it does to itself.
-
So it turns out there does seem to be only one set of avatar pngs stored for me, whose canonical URL is currently /user_avatar/what.thedailywtf.com/flabdablet/{size}/20265.png, to which any URL of the form /user_avatar/what.thedailywtf.com/flabdablet/{size}/{number}.png gets 302 redirected.
Somebody is wearing complicator's gloves.
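The observed redirect behaviour can be sketched in a few lines; the regex and the hard-coded canonical upload id are illustrative assumptions based on the URLs above, not Discourse's actual routing code:

```ruby
# Sketch of the behaviour described above: any
# /user_avatar/{host}/{user}/{size}/{number}.png is 302'd to the single
# canonical upload id. CANONICAL_UPLOAD_ID is taken from the example URL;
# the real value lives in the database, not a constant.
CANONICAL_UPLOAD_ID = 20265

def avatar_redirect(path)
  m = path.match(%r{\A/user_avatar/(?<host>[^/]+)/(?<user>[^/]+)/(?<size>\d+)/(?<n>\d+)\.png\z})
  return nil unless m
  return nil if m[:n].to_i == CANONICAL_UPLOAD_ID # already canonical, serve directly
  "/user_avatar/#{m[:host]}/#{m[:user]}/#{m[:size]}/#{CANONICAL_UPLOAD_ID}.png"
end
```

Any non-canonical number gets rewritten to the one real image; the canonical URL itself is served without a redirect.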
-
Images are cached forever, so I would need to log in to all clients and nuke browser cache
We can serve redirects
-
We got it fixed in master; now it is rate limited
-
Why not set the Cache-Expiry header?
-
You're supposed to be on holiday. How do you expect us to complain about your work behind your back if you keep checking in?
-
I wanted all images to be served from cache indefinitely. I guess we could set a one-month expiry, but I still prefer to have the canonical URLs out there rather than ones that redirect
-
That does, though, raise the question of why you're changing the URL in the first place; the whole mess could be avoided by just reusing the same URL
-
But then it would take N days till you see brand-new avatars; we can't set the expiry to 0
-
True, but then no other forum software shows this issue, so there must be a way to fix it
-
https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching_FAQ#What_other_factors_influence_revalidation.3F
-
why do i get the feeling that user was me?
let's see.... i'm one of the highest post count users here... check... i change my avatar frequently.... check...
damnit! not again!
Now, now, it might also be me! We each have a lot of posts and long avatar histories.
Hedgehog post count: 9840
Fox post count: 18998
Hatter post count: 12664
We each change at least once a week. I'd say that there's a good chance that any one of us could be triggering the problem.
-
Why not? Super "easy" open sores frameworks give no access to the headers?
-
My browser, when it sees <img src=http://what.thedailywtf.com/user_avatar/what.thedailywtf.com/flabdablet/120/18181.png>, is going to check its own cache to see whether it already has that image regardless of whether or not you would 302 it on a subsequent GET request because it can't possibly know you're going to do that until it actually makes such a request.
My avatar-change post above does still have the old 18181.png links, even though you're now 302ing those to the now-canonical 20265.png links, and it works - I do see new images even though the links inside the post have not been altered.
In other words, the fact that you've got a change of avatar URLs and a 302 happening at your end has no bearing on the caching behavior of my browser. You could leave all that out and stuff would work exactly the same as it does right now. It's doing the right thing not because you're changing the URLs, but because browser caching actually works the way it's supposed to as long as your Date: headers are not wildly inaccurate.
-
But then it would take N days till you see brand-new avatars; we can't set the expiry to 0
Have you not heard of ETag?
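ETag-based revalidation sidesteps the expiry tradeoff: the client keeps the cached image but checks with the server, and an unchanged avatar costs a tiny 304 instead of a full re-download. A minimal sketch (illustrative only, not Discourse's actual code):

```ruby
require "digest"

# Sketch of ETag revalidation: the server derives a tag from the avatar
# bytes; a client that already has the image sends the tag back in
# If-None-Match, and a match gets an empty 304 Not Modified.
def serve_avatar(avatar_bytes, if_none_match = nil)
  etag = %("#{Digest::MD5.hexdigest(avatar_bytes)}")
  if if_none_match == etag
    { status: 304, headers: { "ETag" => etag }, body: "" }
  else
    { status: 200, headers: { "ETag" => etag }, body: avatar_bytes }
  end
end
```

First request returns 200 with the full image and an ETag; replaying that tag on the next request yields a body-less 304, so a changed avatar shows up on the very next page load without any N-day expiry window.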
-
Look, the real problem here is that all of those crappy old forums that Discourse devs make fun of have NO PROBLEM changing avatars.
This is just yet another embarrassing bug. Community Server even did this right, and it didn't do anything right. It doesn't matter how or why CS got this right; the point is: THEY DID.
Extremely embarrassing. (Not that Discourse developers have any sense of shame whatsoever. But to a normal person it'd be embarrassing.)
-
Look, the real problem here is that all of those crappy old forums that Discourse devs make fun of have NO PROBLEM changing avatars.
This is just yet another embarrassing bug. Community Server even did this right, and it didn't do anything right.
Extremely embarrassing.
BUT WE MUST REBAKE ALL THE THINGS!
-
But then it would take N days till you see brand-new avatars; we can't set the expiry to 0
There are a few things that you can do here:
- Set the expiration headers. Once the expiration hits, the client just asks the server whether the avatar image has been modified since it was last retrieved; if it has, you get the new headers. If you go this route, you probably want a short period, around a day or two. The problem with this approach is that it increases requests to the server as clients ask to validate the versions in their caches. With something like avatars on a busy forum, that could get expensive.
- Use the cache-control: must-revalidate header. This one gives you a bit more control, though it would be trickier to implement properly. You would somehow need to determine whether a user needs to receive that header for a specific avatar, which means somehow figuring out when they last got the avatar for a specific user.
- Use both. Maybe the best approach for Discourse is to set the expiration headers to something like a month, and then send the must-revalidate headers for a week or two after an avatar has been updated. Sure, users who visit less often might hit a gap in the update channel, but that isn't too likely.
Anyway, just spitballing some ideas that you might be able to use.
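The hybrid idea can be sketched in a few lines. The helper name and the two-week window are illustrative assumptions, not anything Discourse actually ships:

```ruby
# Hedged sketch of the "use both" approach above: serve a long max-age
# normally, but force revalidation for a window after the avatar changed
# so the new image propagates quickly.
MONTH = 30 * 24 * 3600
REVALIDATE_WINDOW = 14 * 24 * 3600 # assumed two-week propagation window

def avatar_cache_header(seconds_since_change)
  if seconds_since_change < REVALIDATE_WINDOW
    # Recently changed: every request revalidates (cheap 304s if unchanged).
    "public, max-age=0, must-revalidate"
  else
    # Stable avatar: let clients cache it for a month without asking.
    "public, max-age=#{MONTH}"
  end
end
```

The tradeoff is exactly as described: extra conditional requests for two weeks per change, near-zero traffic for stable avatars the rest of the time.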
-
- <- THAT IS A FUCKING 4 DISCOURSE FUCK YOU! Even if you're committed to the dumb concept of "rebaking" to fix avatars, and even if you have to use your inefficient broken memory hog 500-error generating dumb code to do it, do it "on-demand" instead of all at once. Because I guarantee 90% of those posts will never be loaded between one rebake and the next. So you're not just killing your server, but you're killing your server doing work that doesn't even need to be done.
Also if you implement my ideas, pay me money.
-
It seems you infected Discourse with the brainworms from your avatar.
That's what your avatar is, right? A wormy brain?
-
I believe it's actually a dolphin's stomach. But it looks more like brainworms than actual brainworms do. Actual brainworms look more like bad cauliflower:
http://discovermagazine.com/~/media/import/images/0/7/e/brainworms.jpg
If you have more Japanese than I do, perhaps you could translate this caption:
https://blog.insureandgo.com/wp-content/uploads/2012/07/the-worlds-weirdest-museums-parasites.jpg
-
GIS suggests it has something to do with pinworms.
-
The only word there I know is dolphin. No idea what 'anisakisu' is, or what its language of origin is.
-
I think that last picture will give me nightmares.
-
I'm waiting for @apapadimoulis to sort out the disk size, before we start getting backups sorted again.
We ready to move to vNucleus yet?
-
We ready to move to vNucleus yet?
If it gets us better performance and actually allows backups to work.
i think @sam wants us to stay where we are for now though as he wants to try to get us performant at our current size.
Maybe we can get him to figure out where all the disk space went and liberate some for us?
-
How about all these avatars? ;)
-
It's probably more accurate to blame all my Amy Rose images instead ;)
-
It's probably more accurate to blame ~~all my Amy Rose images instead~~ Discourse. FTFE.
-
Favor to ask anyone with shell (I guess this means just @apapadimoulis right now?):
du /var/discourse/shared | sort -nr | head -n25
That'll tell us the 25 biggest directory trees/files; I fully expect to see "/standalone" and "/standalone/uploads" at the top, but what comes after could be surprising.
-
I think the SSH key changed when the container's hard drive space quota was increased. Which is weird.
-
I suspect the IP changed instead...
what.thedailywtf.com. 447 IN A 162.243.208.23
Unless Alex did a Discourse restore, which I don't think we saw happen.
-
I have 162.243.208.23 in my ssh client and it hasn't changed.
-
Maybe we can get him to figure out where all the disk space went and liberate some for us?
- A 2G discrepancy between [Disk size] and [Used+available]
- 10G for the database
- 6G on images/avatars and other uploaded stuff
- 12G on Docker
- 2.5G on logs
- 20G because DO didn't provision the disk correctly.
du /var/discourse/shared | sort -nr | head -n25
root@what:~# du /var/discourse/shared | sort -nr | head -n25
23955024  /var/discourse/shared
23955020  /var/discourse/shared/standalone
10098632  /var/discourse/shared/standalone/postgres_data
9758408   /var/discourse/shared/standalone/postgres_data/base
9739528   /var/discourse/shared/standalone/postgres_data/base/16384
6064448   /var/discourse/shared/standalone/uploads
5933048   /var/discourse/shared/standalone/uploads/default
5527116   /var/discourse/shared/standalone/backups
5527112   /var/discourse/shared/standalone/backups/default
2053056   /var/discourse/shared/standalone/uploads/default/_optimized
1571296   /var/discourse/shared/standalone/log
1337252   /var/discourse/shared/standalone/log/var-log
1336504   /var/discourse/shared/standalone/log/var-log/nginx
445464    /var/discourse/shared/standalone/redis_data
247912    /var/discourse/shared/standalone/import_scripts
234040    /var/discourse/shared/standalone/log/rails
197052    /var/discourse/shared/standalone/postgres_data/pg_log
131080    /var/discourse/shared/standalone/postgres_data/pg_xlog
63912     /var/discourse/shared/standalone/uploads/letter_avatars
63908     /var/discourse/shared/standalone/uploads/letter_avatars/3_90a587a04512ff220ac26ec1465844c5
45312     /var/discourse/shared/standalone/uploads/stylesheet-cache
41084     /var/discourse/shared/standalone/uploads/default/17150
41084     /var/discourse/shared/standalone/uploads/default/17149
41084     /var/discourse/shared/standalone/uploads/default/17148
41084     /var/discourse/shared/standalone/uploads/default/17147
I think the SSH key changed when the container's hard drive space quota was increased. Which is weird.
@Mail from Alex said:
... so I just heard back from DO, and I basically need to take a snapshot, destroy the droplet, then make a new one in order to get the disk size right.
root@what:~# cat ~/.ssh/authorized_keys
root@what:~#
Well that worked well... :/
-
So that makes:
- All 23.9 GB in standalone
- 10.1 GB in postgres_data
- 6.1 GB in uploads
- 5.5 GB in backups
- 1.5 GB in logs
- 0.4 GB in redis_data
- 0.2 GB in import_scripts
-
20G because DO didn't provision the disk correctly.
have we corrected that? or do we now just have 20 GB of disc space that we can't use?
-
And who's uploading a bunch of 41MB files?
-
And who's uploading a bunch of 41MB files?
Didn't we find some kind of bug with SVGs getting inflated or something?
-
have we corrected that? or do we now just have 20 GB of disc space that we can't use?
That's what last night's downtime was to sort out - to increase the visible size from 40G to 60G.
And who's uploading a bunch of 41MB files?
Pass. The first one that's listed is actually ascii with a .gif extension:
/17150$ head 938f3371f66b4dc2.gif; echo; tail 938f3371f66b4dc2.gif; echo; wc -l 938f3371f66b4dc2.gif
" fill="white"/>
<circle cx="2" cy="0" r="1" fill="white"/>
<circle cx="3" cy="0" r="1" fill="white"/>
<circle cx="4" cy="0" r="1" fill="white"/>
<circle cx="5" cy="0" r="1" fill="white"/>
<circle cx="6" cy="0" r="1" fill="white"/>
<circle cx="7" cy="0" r="1" fill="white"/>
<circle cx="8" cy="0" r="1" fill="white"/>
<circle cx="9" cy="0" r="1" fill="white"/>
<circle cx="10" cy="0" r="1" fill="white"/>

<circle cx="891" cy="899" r="1" fill="white"/>
<circle cx="892" cy="899" r="1" fill="white"/>
<circle cx="893" cy="899" r="1" fill="white"/>
<circle cx="894" cy="899" r="1" fill="white"/>
<circle cx="895" cy="899" r="1" fill="white"/>
<circle cx="896" cy="899" r="1" fill="white"/>
<circle cx="897" cy="899" r="1" fill="white"/>
<circle cx="898" cy="899" r="1" fill="white"/>
<circle cx="899" cy="899" r="1" fill="white"/>
</svg>

810000 938f3371f66b4dc2.gif
Looks like some abortive attempt to copy/paste an SVG (the second is similar).
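A quick back-of-envelope check (assuming one <circle/> element per pixel, which is what the head/tail output suggests) shows why these files land in the tens of megabytes:

```ruby
# One SVG circle per pixel of a 900x900 image, one line per circle.
circles = 900 * 900
# Byte count of a representative line from the tail output above.
line_bytes = %( <circle cx="899" cy="899" r="1" fill="white"/>\n).bytesize
approx_mb = circles * line_bytes / 1_000_000

puts circles   # 810000 -- matches the wc -l above
puts approx_mb # roughly 40 MB, consistent with the ~41 MB files on disk
```

So the size isn't mysterious: 810,000 lines at roughly 50 bytes each is tens of megabytes of white pixels.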
-
That's one insane SVG.
-
That's what last night's downtime was to sort out - to increase the visible size from 40G to 60G.
ah... well then. ;-)
-
900x900 individually-specified white pixels? Which don't even show up, because they're missing a leading <svg><circle cx="0" cy="0" r="1? Do we even need to ~~retain~~ permit these?
-
I have been looking at logs and perf, and stuff seems a lot calmer here; not seeing 503 central in the logs anymore.
On meta we moved to ruby 2.2 a few weeks ago which is much more friendly to the CPU. I plan to change our official image to be Ruby 2.2 very soon, so that is another big perf gain.
-
When is the next "we should get managed hosting so @sam only has to worry about Discourse and not fighting all our fires" discussion?
-
grump grumble. mutter.
{{obligatory_managed_hosting_reminder}}
what is with these bad backups?!
rassum frassum grumble
-
Maybe that's the excess white space that's causing our performance problems.