Infiniscrollgate and the accompanying performance discussion reminded me of a WTF from a few years back:
A tool was responsible for displaying varying amounts of HTML in an embedded browser control. The source data was stored in a different format elsewhere but converted to HTML for editing purposes. The workflow was, essentially:
- load
- convert to HTML
- user edits
- convert to other format
- save
99% of the time, performance was great (as great as can be expected with the data conversion going on in both directions, at least). But that one percent of the time, it could take 10-30 seconds to load even a paragraph of text. It didn't seem to depend on the length of the content at all. So I looked into the database and there was the paragraph of text, all three sentences... and 10.8 MB.
Turns out that part of the non-HTML format involved storing metadata along with the text. In non-HTML, this used escape sequences (think \n but on steroids). In HTML mode, the metadata was maintained inside HTML comments. The parser had an obscure off-by-one when handling the escape sequences, so instead of going from `\stuff` to `<!--stuff-->`, it left in the `\` like so: `<!--\stuff-->`.
Which was then dutifully converted back to `\\stuff` for storage. Which re-expanded to `<!--\\stuff-->` during the next user session. And saved as `\\\\stuff`...
The bad 1% of workflows involved three sentences of text, with a trailing HTML comment containing several million escape characters. And an execution time that doubled whenever anyone looked at it.
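For anyone who wants to watch the blow-up happen, here is a toy model in Python. The function names (`to_html_buggy`, `to_storage`, `round_trip`) are made up for illustration, and the real parser was certainly more involved. The sketch only assumes two things: the load step leaves every backslash inside the HTML comment, and the save step escapes each backslash it finds there by doubling it.

```python
def to_html_buggy(stored: str) -> str:
    """Hypothetical load step (storage format -> HTML).

    The intended behavior would consume the escape backslash; this toy
    version models the bug by copying the sequence, backslashes and
    all, into an HTML comment.
    """
    return "<!--" + stored + "-->"


def to_storage(html: str) -> str:
    """Hypothetical save step (HTML -> storage format).

    Assumption: every literal backslash inside the comment is escaped
    by doubling it, since a backslash is special in the storage format.
    """
    body = html[len("<!--"):-len("-->")]
    return body.replace("\\", "\\\\")


def round_trip(stored: str, sessions: int) -> str:
    """Simulate `sessions` load/save cycles with no user edits."""
    for _ in range(sessions):
        stored = to_storage(to_html_buggy(stored))
    return stored


if __name__ == "__main__":
    # Show the stored form after 0..3 editing sessions.
    for n in range(4):
        print(n, round_trip("\\stuff", n))
    # Count the backslashes after 23 sessions.
    print(round_trip("\\stuff", 23).count("\\"))
```

Under those assumptions the backslash count doubles on every load/save cycle, so a single escape character turns into 2**23 (about 8.4 million) after only 23 sessions, which is the same order of magnitude as the record in the database.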
Filed under: [No wonder Discourse is so speedy][1], [I had to double up the \ characters to get them to display correctly, coincidence?][2]