Cooties
-
If I had a lot of free time, I'd start a blog and publish a post a day about something Discourse's developers didn't understand and therefore ruined.
The database is like fifteen gigabytes uncompressed. The uploads folder is 8.6GB.
In fact, let's look at the uploads directory.
5.8M /var/discourse/shared/standalone/uploads/default/avatars
2.6M /var/discourse/shared/standalone/uploads/default/_emoji
739M /var/discourse/shared/standalone/uploads/default/optimized
2.2G /var/discourse/shared/standalone/uploads/default/_optimized
1.9G /var/discourse/shared/standalone/uploads/default/original
du: cannot access ‘/var/discourse/shared/standalone/uploads/default/_original’: No such file or directory
And then I was all
root@what:~# du -sh /var/discourse/shared/standalone/uploads/default/<tab>
... wait, why isn't tab completion working?
[fifteen seconds later]
Display all 21492 possibilities? (y or n)
Fun fact: there are no files in the uploads directory. Of the 21492 directories inside, only 5 have multiple children.
50630 directories 70346 files
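For anyone who wants to reproduce the "only 5 have multiple children" count, here is a minimal Python sketch. The function names are made up for illustration, and it counts the root directory itself along with everything below it, which may differ slightly from how the numbers above were gathered.

```python
import os
from collections import Counter


def child_count_histogram(root):
    """Map 'number of direct children' to 'how many directories have that many'."""
    histogram = Counter()
    for _dirpath, dirnames, filenames in os.walk(root):
        histogram[len(dirnames) + len(filenames)] += 1
    return histogram


def dirs_with_multiple_children(root):
    """Count directories (root included) that hold more than one entry."""
    histogram = child_count_histogram(root)
    return sum(count for children, count in histogram.items() if children > 1)
```

Calling `dirs_with_multiple_children("/var/discourse/shared/standalone/uploads/default")` would walk the whole tree, so expect it to take a while on 50630 directories.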
So let's look at the data we've collected.
- 15 gigabyte database. Let's call it 16 gigabytes just to be nice to Discourse.
- 8.6 gigabyte uploads directory tree, but with 0.7 + 2.2 gigabytes of "optimized" images. So that's 5.7 gigabytes of uploads.
- Using my incredible math skills and the ability to type two digit numbers into a calculator, I have discovered that the entirety of this forum's data takes up 21.7 gigabytes of disk space.
- The server has a 60GB root filesystem.
- According to Ubuntu's system requirements, Ubuntu Server takes up about a gigabyte.
- Therefore, Discourse has managed to fill up 60 gigabytes of space with 22.7 gigabytes of data. If that isn't amazing, I don't know what is.
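The arithmetic in that list, spelled out as a small Python sketch. All figures are the rounded gigabyte values quoted above; the "unexplained" line assumes the 60GB filesystem really is full, which is the post's claim rather than a measurement.

```python
# All figures in gigabytes, copied from the list above.
database = 16.0                 # 15 GB, rounded up to be nice
uploads_tree = 8.6              # whole uploads directory tree
optimized = 0.7 + 2.2           # "optimized" plus "_optimized"
uploads = uploads_tree - optimized

forum_data = database + uploads
ubuntu = 1.0                    # Ubuntu Server, per its system requirements
accounted = forum_data + ubuntu

root_fs = 60.0
unexplained = root_fs - accounted   # assumes the disk is actually full

print(f"uploads without optimized copies: {uploads:.1f} GB")
print(f"forum data:                       {forum_data:.1f} GB")
print(f"forum data plus Ubuntu:           {accounted:.1f} GB")
print(f"not accounted for:                {unexplained:.1f} GB")
```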
-
-
If I had a lot of free time
Actually, maybe I should write up some abstracts and let the good people at http://thedailywtf.com/submit-wtf do the blogging.
-
sam.saffron@[redacted because fuck spam].com
Should be
sam.saffron@[redacted%20because%20fuck%20spam].com, surely?
-
```
root@what:~# du -s /var/lib/docker
34178240 /var/lib/docker
root@what:~# du -s /var/discourse/shared/standalone
37164992 /var/discourse/shared/standalone
```
@ben_lubar said:
```
root@what:/# du --max-depth=1 /var/www/discourse/tmp/backups/default/
3720040 /var/www/discourse/tmp/backups/default/2015-12-19-042932
3722508 /var/www/discourse/tmp/backups/default/2015-12-20-035253
7282576 /var/www/discourse/tmp/backups/default/2015-12-22-044854
```
Given that information, plus the additional information that:
- Discourse doesn't lose all of its information when you "rebuild the docker instance", which is a thing that only Discourse developers have ever told anyone to do.
- `/var/www/discourse` is inside `/var/lib/docker` because magic.
- The Ubuntu running Discourse is inside the Ubuntu running Docker and uses separate system files.
We can figure out:
- The contents of `/var/lib/docker` minus the three failed backups' temporary files is about 20GB. At least 1GB of that is Ubuntu. I'm not sure what the 19GB of Discourse is for, but that's what it is.
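The "about 20GB" figure can be checked from the `du` numbers quoted in this post. A quick Python sketch, assuming GNU `du` with its default 1 KiB block size (the variable names are just for illustration):

```python
KIB = 1024  # assumption: GNU du -s prints sizes in 1 KiB blocks by default

docker_dir_kib = 34178240                        # du -s /var/lib/docker
backup_dirs_kib = [3720040, 3722508, 7282576]    # the three failed backups

remaining_kib = docker_dir_kib - sum(backup_dirs_kib)
remaining_gb = remaining_kib * KIB / 1e9         # decimal gigabytes
remaining_gib = remaining_kib * KIB / 2**30      # binary gibibytes

print(f"{remaining_kib} KiB = {remaining_gb:.1f} GB = {remaining_gib:.2f} GiB")
```

Either way you slice the units, it lands in the 18.5 to 20 range, so "about 20GB" holds up.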
-
19GB of Discourse
Well, yeah, but shirley it resides something logical, right? Like, is it cache? What is the nature of the temp files?
-
root@what:/# du -h --max-depth=1 /var/www/discourse/tmp/
8.0K /var/www/discourse/tmp/backups
20K /var/www/discourse/tmp/pids
1.9M /var/www/discourse/tmp/ember-rails
46M /var/www/discourse/tmp/stylesheet-cache
65M /var/www/discourse/tmp/cache
4.0K /var/www/discourse/tmp/miniprofiler
4.0K /var/www/discourse/tmp/sockets
112M /var/www/discourse/tmp/
root@what:/# ls /var/www/discourse/vendor/bundle/ruby/2.0.0/gems | wc -l
247
root@what:/# du -sh /var/www/discourse/vendor/bundle/ruby/2.0.0/gems
363M /var/www/discourse/vendor/bundle/ruby/2.0.0/gems
The entire filesystem, apart from the mounted host directory, is about 2GB. Something's wrong with that taking up 20GB.
-
Something's wrong with that taking up 20GB.
I'll say!
Maybe it's all the sparse nginx logs I heard mentioned in another thread...
Edit: Nope, it was this thread. How do we detect if invisible files are taking up space?
-
Interesting ServerFault question about this:
What's lsof claim is being used?
-
How do we detect if invisible files are taking up space?
What's lsof claim is being used?
(nb: I replaced the outside-of-container username with the inside-of-container username for the same user ID)
root@what:~# lsof | head -n 1; lsof | grep deleted
COMMAND PID TID USER FD TYPE DEVICE SIZE/OFF NODE NAME
postmaste 1966 postgres 287u REG 253,1 16777216 3155775 /shared/postgres_data/pg_xlog/000000010000075200000073 (deleted)
postmaste 8500 postgres 175u REG 253,1 16777216 3155894 /shared/postgres_data/pg_xlog/00000001000007520000008B (deleted)
postmaste 12803 postgres 260u REG 253,1 16777216 3155816 /shared/postgres_data/pg_xlog/000000010000075200000082 (deleted)
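To put a number on how much space those deleted-but-still-open files are holding, here is a rough Python sketch that totals the SIZE/OFF column of the `(deleted)` lines. The function name is made up, and the parsing assumes paths without spaces and counts fields from the right so the usually-empty TID column does not matter.

```python
def deleted_file_bytes(lsof_lines):
    """Return {path: size_in_bytes} for lines that lsof marks '(deleted)'."""
    sizes = {}
    for line in lsof_lines:
        fields = line.split()
        if len(fields) < 4 or fields[-1] != "(deleted)":
            continue
        # From the right: ... SIZE/OFF NODE NAME (deleted)
        size, path = fields[-4], fields[-2]
        if size.isdigit():          # skips offsets such as 0t1234
            sizes[path] = int(size)  # keyed by path, so repeats are not double counted
    return sizes


sample = [
    "COMMAND PID TID USER FD TYPE DEVICE SIZE/OFF NODE NAME",
    "postmaste 1966 postgres 287u REG 253,1 16777216 3155775 /shared/postgres_data/pg_xlog/000000010000075200000073 (deleted)",
    "postmaste 8500 postgres 175u REG 253,1 16777216 3155894 /shared/postgres_data/pg_xlog/00000001000007520000008B (deleted)",
    "postmaste 12803 postgres 260u REG 253,1 16777216 3155816 /shared/postgres_data/pg_xlog/000000010000075200000082 (deleted)",
]
held = deleted_file_bytes(sample)
print(len(held), "files,", sum(held.values()) / 2**20, "MiB")
```

For the three lines above that comes to 48 MiB of deleted transaction log segments, which is nowhere near enough to explain the missing gigabytes.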
-
From that question: http://serverfault.com/questions/275206/disk-full-du-tells-different-how-to-further-investigate/581521#comment882057_275233
@PJH, how many times have you updated Discourse since the last "rebuild of the container"? Docker containers are supposed to be built, used with an external data storage location, and then discarded when there's an update. They're not VMs. Discourse is .
-
-
-
I'm not sure what the 19GB of Discourse is for
Don't worry, neither are the Discodevs I'd imagine.
-
-
@PJH, how many times have you updated Discourse since the last "rebuild of the container"?
None, the last rebuild was the last update. (Or rather the last update was a rebuild.)
-
Before today's backupcrash that I have to clean up after every day, can you disable the automated backup in the admin panel?
Better late than never...
Seems the main backup only tries once a week.
Will hobble @shadowmod later.
-
maybe I should write up some abstracts and let the good people at http://thedailywtf.com/submit-wtf do the blogging.
I'd be happy to help! :)
-
I'm not sure what the 19GB of Discourse is for, but that's what it is.
More temporary file leak?
-
Did anyone check the contents of the tar file?
-
Yeah, @tar, what're you made of?
-
This is definitely fascinating. Keep up the good work, @ben_lubar!
Also, why does it take several seconds for the name suggestion popup to show? If I can write an LDAP query which crawls through 10 AD domains across 6 continents in less than 5 seconds, then surely Discourse should be able to do a `SELECT TOP(5) username FROM users WHERE username LIKE '@query%'` in less than 1 second?
-
Your LDAP queries aren't going through 29174 layers of JS and Ruby hell presumably.
-
Your LDAP queries aren't going through ~~29174~~ 31415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679 layers of JS and Ruby hell presumably.
FTFMFY
-
Updates..
db_logging_collector: on
Now off
DISCOURSE_DEVELOPER_EMAILS: 'sam.saffron@[redacted because fuck spam].com'
Adjusted
RBTRACE: 1
Now 0.
Of course, changing all of those won't take effect until the next rebuild.
@PJH how much of these can we get rid of?
root@what:/var/discourse/shared/standalone# (df; df -h) | grep vda
/dev/vda1 61796348 43769624 14864612 75% /
/dev/vda1 59G 42G 15G 75% /
root@what:/var/discourse/shared/standalone/postgres_data/pg_log# find /var/discourse/shared/standalone/postgres_data/pg_log -mtime +5 -exec rm {} \;
root@what:/var/discourse/shared/standalone/postgres_data/pg_log# (df; df -h) | grep vda
/dev/vda1 61796348 37316488 21317748 64% /
/dev/vda1 59G 36G 21G 64% /
root@what:/var/discourse/shared/standalone# rm /var/discourse/shared/standalone/log/var-log/nginx/*.gz
root@what:/var/discourse/shared/standalone# (df; df -h) | grep vda
/dev/vda1 61796348 36581764 22052472 63% /
/dev/vda1 59G 35G 22G 63% /
root@what:/var/discourse/shared/standalone# find /var/discourse/shared/standalone/postgres_data/pg_log -mtime +1 -exec rm {} \;
root@what:/var/discourse/shared/standalone# (df; df -h) | grep vda
/dev/vda1 61796348 36345348 22288888 62% /
/dev/vda1 59G 35G 22G 62% /
root@what:/var/discourse/shared/standalone# du -h /var/discourse/shared/standalone/{log,postgres_data/pg_*log} | sort -rh
822M /var/discourse/shared/standalone/log
461M /var/discourse/shared/standalone/log/var-log/nginx
461M /var/discourse/shared/standalone/log/var-log
361M /var/discourse/shared/standalone/log/rails
273M /var/discourse/shared/standalone/postgres_data/pg_xlog
48M /var/discourse/shared/standalone/postgres_data/pg_clog
2.3M /var/discourse/shared/standalone/postgres_data/pg_log
12K /var/discourse/shared/standalone/log/var-log/apt
4.0K /var/discourse/shared/standalone/postgres_data/pg_xlog/archive_status
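The `find ... -mtime +5 -exec rm {} \;` cleanup deletes on the spot. A dry run that only lists what would match is a bit less nerve-wracking; here is a minimal Python sketch of the same age test. The function name is hypothetical, and unlike the `find` command as written it only looks at files, not directories.

```python
import os
import time


def files_older_than(directory, days, now=None):
    """List files under directory that `find -mtime +days` would match.

    find truncates the age to whole days before comparing, so `+5`
    means six or more full days old.
    """
    now = time.time() if now is None else now
    matches = []
    for dirpath, _dirnames, filenames in os.walk(directory):
        for name in filenames:
            path = os.path.join(dirpath, name)
            age_days = int((now - os.lstat(path).st_mtime) // 86400)
            if age_days > days:
                matches.append(path)
    return sorted(matches)
```

Something like `files_older_than("/var/discourse/shared/standalone/postgres_data/pg_log", 5)` would print the candidates without touching them. It should stay well away from pg_xlog and pg_clog.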
docker images -a
I don't know enough about docker to say which of those can be removed, or what the effect may be, though I agree, only one of them appears to be being used (some might be snapshots? Virtual Size isn't actual size etc.):
root@what:/var/discourse/shared/standalone# docker images -a
REPOSITORY             TAG      IMAGE ID       CREATED        VIRTUAL SIZE
local_discourse/app    latest   1f19af41e12e   10 weeks ago   1.932 GB
samsaffron/discourse   1.0.13   27f52292c186   3 months ago   1.238 GB
<none>                 <none>   157f6a775410   3 months ago   1.238 GB
<none>                 <none>   7ea991d02f7e   3 months ago   1.238 GB
<none>                 <none>   65746d67224e   3 months ago   1.238 GB
<none>                 <none>   dd46ba35af06   3 months ago   821.1 MB
<none>                 <none>   d3a1f33e8a5a   4 months ago   188.2 MB
root@what:/var/discourse/shared/standalone# docker ps -s
CONTAINER ID   IMAGE                        COMMAND        CREATED        STATUS        PORTS                                      NAMES   SIZE
899929e6af7a   local_discourse/app:latest   "/sbin/boot"   10 weeks ago   Up 14 hours   0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   app     537.9 MB
The database is like fifteen gigabytes uncompressed. The uploads folder is 8.6GB.
root@what:/var/discourse/shared/standalone/uploads# du -ah /var/discourse/shared/standalone/uploads | sort -rh | head -n50
8.7G /var/discourse/shared/standalone/uploads
8.6G /var/discourse/shared/standalone/uploads/default
2.2G /var/discourse/shared/standalone/uploads/default/_optimized
1.9G /var/discourse/shared/standalone/uploads/default/original
1.7G /var/discourse/shared/standalone/uploads/default/original/3X
740M /var/discourse/shared/standalone/uploads/default/optimized
603M /var/discourse/shared/standalone/uploads/default/optimized/3X
122M /var/discourse/shared/standalone/uploads/default/original/3X/e
122M /var/discourse/shared/standalone/uploads/default/original/3X/8
113M /var/discourse/shared/standalone/uploads/default/original/3X/1
111M /var/discourse/shared/standalone/uploads/default/original/3X/b
<snip>
Yeah. No.
DiscoOptimized maybe...
crash that I have to clean up after every day
What needs to be cleaned up?
Also, why does it take several seconds for the name suggestion popup to show?
postgres@what:~$ psql -d discourse -c "select count(username) from users"
 count
--------
 141157
(1 row)
postgres@what:~$
-
Don't touch the pg_clog or pg_xlog. Those are transaction logs, not log-logs.
-
-
I don't know enough about docker to say which of those can be removed, or what the effect may be, though I agree, only one of them appears to be being used (some might be snapshots? Virtual Size isn't actual size etc.):
I have used the following commands in my little instance, worked for me:
https://meta.discourse.org/t/low-on-disk-space-cleaning-up-old-docker-containers/15792/2
-
I have used the following commands in my little instance, worked for me:
Saw that earlier in my investigations. Just did a repeat to show my conclusions...
docker rm `docker ps -a | grep Exited | awk '{print $1 }'`
root@what:~# docker ps -a | grep Exited
root@what:~#
Ok - none of those around.
docker rmi `docker images -aq`
root@what:~# docker images -aq
1f19af41e12e
27f52292c186
7ea991d02f7e
157f6a775410
65746d67224e
dd46ba35af06
d3a1f33e8a5a
Wait - what? What's the first one again?....
root@what:~# docker images -a
REPOSITORY             TAG      IMAGE ID       CREATED        VIRTUAL SIZE
local_discourse/app    latest   1f19af41e12e   10 weeks ago   1.932 GB
samsaffron/discourse   1.0.13   27f52292c186   3 months ago   1.238 GB
<none>                 <none>   157f6a775410   3 months ago   1.238 GB
<none>                 <none>   7ea991d02f7e   3 months ago   1.238 GB
<none>                 <none>   65746d67224e   3 months ago   1.238 GB
<none>                 <none>   dd46ba35af06   3 months ago   821.1 MB
<none>                 <none>   d3a1f33e8a5a   4 months ago   188.2 MB
root@what:~#
Er - no. That looks like very bad advice. I think I'd rather like to keep `1f19af41e12e`.
Perhaps something other than our running instance needs to be deleted.
-
@AlexMedia said:
Also, why does it take several seconds for the name suggestion popup to show?
```
postgres@what:~$ psql -d discourse -c "select count(username) from users"
 count
--------
 141157
(1 row)
postgres@what:~$
```
That's quite a lot, but still... shouldn't it be faster than it is right now? It's just one column that you have to go through, and the query is a "starts with" query.
Also: yay discoursistency!
Without space after the 'b':smile:
<img src="/uploads/default/original/3X/6/e/6ee57ca01b2eaf6cdbf84b8d39301aad20e2d089.png" width="690" height="172">
With a space after the 'b':
<img src="/uploads/default/original/3X/9/d/9db0632968f5380ce4ebbf4f2f317170b8788caa.png" width="690" height="154">
-
That's quite a lot, but still... shouldn't it be faster than it is right now? It's just one column that you have to go through, and the query is a "starts with" query.
It's worse.
-
Aha, so they do a "contains". That explains why it's so slow...
But why do they do that? :/
-
But why do they do that?
Sanitizing user input. Or not in this case.
`_` is a valid username character (as can be seen from the image). Anyone deliberately using it will be (absent knowledge of this particular foible) searching for a literal `_` - they seem to pass it through unescaped to the SQL query, where it becomes a single character wildcard.
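To illustrate the wildcard problem, here is a minimal Python sketch using the standard library's `sqlite3` as a stand-in. Discourse runs on PostgreSQL and its real query is not shown in this thread, so the table, the helper names, and the prefix-style match are all assumptions made for the demo.

```python
import sqlite3


def escape_like(term, escape="\\"):
    """Escape LIKE metacharacters so the term only matches literally."""
    return (
        term.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


def find_users(conn, term, escaped=True):
    """Prefix search on username, with or without escaping the input."""
    pattern = (escape_like(term) if escaped else term) + "%"
    rows = conn.execute(
        "SELECT username FROM users WHERE username LIKE ? ESCAPE '\\' "
        "ORDER BY username",
        (pattern,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("a_b",), ("axb",)])

unescaped = find_users(conn, "a_", escaped=False)  # "_" acts as a wildcard
literal = find_users(conn, "a_", escaped=True)     # "_" matches only itself
print(unescaped, literal)
```

With the raw input, `a_` also drags in `axb`; with the escaped pattern only the username that really contains an underscore comes back.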
-
Is there ever going to be a front page article about our experiences with this shitty ass-forum?
-
To list images with no tag (for example, old versions of a tagged image when the image gets rebuilt):
docker images -f dangling=true
And to delete:
docker images -f dangling=true -q | xargs docker rmi
-
docker images -f dangling=true
root@what:~# docker images -f dangling=true
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
root@what:~#
-
That's because all of the images on there are part of the Discourse image. They're like a singly-linked list.
-
Is there ever going to be a front page article about our experiences with this shitty ass-forum?
Depending on the detail, it could probably keep the front page occupied for a few months.
Very unlikely to happen however, not least for the reason we're using it to begin with.
-
@fbmac said:
I have used the following commands in my little instance, worked for me:
Saw that earlier in my investigations. Just did a repeat to show my conclusions...
docker rm `docker ps -a | grep Exited | awk '{print $1 }'`
root@what:~# docker ps -a | grep Exited
root@what:~#
Ok - none of those around.
docker rmi `docker images -aq`
root@what:~# docker images -aq
1f19af41e12e
27f52292c186
7ea991d02f7e
157f6a775410
65746d67224e
dd46ba35af06
d3a1f33e8a5a
Wait - what? What's the first one again?....
root@what:~# docker images -a
REPOSITORY             TAG      IMAGE ID       CREATED        VIRTUAL SIZE
local_discourse/app    latest   1f19af41e12e   10 weeks ago   1.932 GB
samsaffron/discourse   1.0.13   27f52292c186   3 months ago   1.238 GB
<none>                 <none>   157f6a775410   3 months ago   1.238 GB
<none>                 <none>   7ea991d02f7e   3 months ago   1.238 GB
<none>                 <none>   65746d67224e   3 months ago   1.238 GB
<none>                 <none>   dd46ba35af06   3 months ago   821.1 MB
<none>                 <none>   d3a1f33e8a5a   4 months ago   188.2 MB
root@what:~#
Er - no. That looks like very bad advice. I think I'd rather like to keep `1f19af41e12e`.
Perhaps something other than our running instance needs to be deleted.
Apparently,
The errors are fine for this rough script, docker will not delete images that are in use, so its complaining (correctly) that you are using these images.
So........ They're relying on Docker being smart enough to not kill itself. Ohkay then
-
Lovely formatting in that quote btw - looks nothing like the original...
-
It will, however, untag the images, which means you'll have to redownload them if you ever start another instance.
-
Lovely formatting in that quote btw - looks nothing like the original...
See also the screenshots that I posted before. Looks like the buggy rendering isn't constrained to the preview pane.
Adding a space to the end of the line underneath the quote might help. Or not, because Discourse.
-
Or not, because Discourse.
It's apparently quoted the cooked, not the raw - I had to escape a few backticks in my OP to get it to render properly - they didn't make it through the wash...
-
Nope, just tried copy-pasting the raw and it did the same thing.
How do you get this so wrong?
-
How do you get this so wrong?*<a???????????????????????????????
FTFY
-
+ *twitch
-
-
Look at the difference between how much code Dell has to write to support phpBB (actually, it's more than they should have written, since after each RUN command, a filesystem snapshot is taken and cached, so running `apt-get update` alone is a terrible idea and running `apt-get clean` doesn't actually free up disk space if it's not in the same RUN that created the files) versus how much code the DiscoDevs wrote for Discourse's docker image.
Also keep in mind that Dell's base image is used by multiple child images, whereas Discourse's base image is used by multiple child images only if you count the images that are compiled on the client machine.
-
They also look at the long names.
-
They also look at the long names.
*cough* I have NO idea whom that could affect. Who would even misuse the long name? That person must be ~~the worst~~ a genius at abusing Discourse