Really dynamic website



  • About a year ago I was asked to fix a very serious performance issue on a tourism-related website which normally had a few thousand visits a day. When I first met the owner he ranted for half an hour about how he doesn't trust developers because they are all unreliable, etc. The website was written in PHP and used PostgreSQL as the database backend. The problem was that the server kept crashing every few hours and was generally very slow. The sysadmin of the server suggested that the owner buy an even more powerful server with more RAM to solve the problem. Now, I am not really a web developer and I don't have much experience with PHP and this kind of system, but I was the only programmer the owner could get.

    After digging through the problem and some testing, I realized that the website actually couldn't handle more than 4-5 visits per minute, and the server was easily overwhelmed by only a few hundred visits. The first thing I noticed was that the website pulled *every* *single* *line* of text on the website from the database. This sounded WTFish to me, but the sysadmin assured me that that's the way websites work, and since I'm not experienced with web development, I don't know if that's true or not. Nonetheless, PostgreSQL should be able to handle a few hundred queries a minute, so that shouldn't be a problem. The real WTF was that before every single page load the PHP scripts pulled every single line (>100,000 lines) from the database, put them into an array (and if I remember correctly, PHP arrays are hash tables as well), and then looked them up from the array at the appropriate locations.

    If it were up to me I would have just put all the text inside the PHP files, but since it was quite a large website and this "logic" was deeply rooted in every PHP file, I had to find another solution. Unfortunately, time and cost constraints didn't allow me to rewrite the whole mess. My solution was to create a class inheriting from PHP's built-in array class which only pulled a line of text from the database when (and if) it was required, and then cached it with memcached; a rough sketch follows below. There might be a better solution, I don't know, as I mentioned I am not a web developer, but my solution worked perfectly, and after I introduced my changes to the live system (there was no version control, of course), everything was fine. No more crashes and slowdowns.

    On a side note, the owner of the website started blaming me for some other disastrous aspects of the website right after I fixed the speed issue, and it started to get a bit personal, so I decided the whole headache wasn't worth it. Long story cut short, I never got paid for the fix. The last time I heard about the guy he was trying to hire a web development agency to do the job and clean up the website.
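
    A minimal sketch of that lazy-loading wrapper, assuming the newer Memcached extension and a keyed strings table; the class, table, and column names here are illustrative guesses, not the ones from the actual site:

    <?php
    // Drop-in replacement for the giant preloaded array: rows are fetched
    // one at a time, only when the page actually asks for them, and cached
    // in memcached so repeat lookups skip PostgreSQL entirely.
    class LazyStringArray extends ArrayObject
    {
        private $db;     // PostgreSQL connection resource
        private $cache;  // Memcached instance
        private $ttl;    // cache lifetime in seconds

        public function __construct($db, Memcached $cache, $ttl = 7200)
        {
            parent::__construct(array());
            $this->db    = $db;
            $this->cache = $cache;
            $this->ttl   = $ttl;
        }

        // Invoked whenever existing code does $strings[344].
        public function offsetGet($index)
        {
            $key   = 'site_text_' . $index;
            $value = $this->cache->get($key);
            if ($value !== false) {
                return $value;                       // cache hit
            }

            $result = pg_query_params(
                $this->db,
                'SELECT content FROM site_text WHERE string_id = $1',
                array($index)
            );
            $value = pg_fetch_result($result, 0, 'content');

            $this->cache->set($key, $value, $this->ttl);
            return $value;
        }
    }

    // The rest of the site keeps using $strings[...] exactly as before.
    $db = pg_connect('dbname=site user=web');
    $cache = new Memcached();
    $cache->addServer('127.0.0.1', 11211);
    $strings = new LazyStringArray($db, $cache);
    echo $strings[344];
    ?>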



  • @DrJokepu said:

    ...the website pulled every single line of text on the website from the database.

    For most dynamic sites I've dealt with, it makes sense to pick chunks of text or very similar items and throw them in a database.

    Every line of text, however, is a little extreme—I can see this being handy only for swapping out the language of the entire site easily. Otherwise, I would leave small bits of text that don't change in an include file.


    @DrJokepu said:
    ...before every single page load the PHP scripts pulled every single line from the database, put them into an array, and then looked them up from the array at the appropriate locations.

    You're right, THIS is the WTF. The website would be running just fine if each page queried only the content it actually needed from the database and displayed it (something like the sketch below). Loading everything up front effectively maxes out PHP's memory usage on every page load.
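
    A minimal sketch of that per-page query, assuming a keyed strings table; the table, column, and ID values are hypothetical:

    <?php
    // Fetch only the handful of strings this particular page displays,
    // instead of all 100,000+ rows on every request.
    $db = pg_connect('dbname=site user=web');

    $needed = array(12, 344, 871);   // string IDs used by this page
    $result = pg_query_params(
        $db,
        'SELECT string_id, content FROM site_text WHERE string_id = ANY($1::int[])',
        array('{' . implode(',', $needed) . '}')   // PostgreSQL array literal
    );

    $strings = array();
    while ($row = pg_fetch_assoc($result)) {
        $strings[$row['string_id']] = $row['content'];
    }

    echo $strings[344];
    ?>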



  •  There are a few situations where you would want all the text pulled from a backing store (e.g. i18n, message boards), but this is then usually complemented by a design built to accommodate it AND/OR appropriate cache/proxy front ends. From experience, you get into this sort of situation when the client specs one thing and then decides to cut costs at a later date - however, from your description it sounds as if the implementation was half-assed as well.

     Anyways, nope, in general websites aren't designed like this and, yup, you're better off out of it. The last time I allowed a client to get away without paying me it was 100% deliberate, to ensure they would NEVER have the bollocks to contact me again for fear of being asked for payment, so consider it money well spent :)

     

    P.S. Mister Paragraph has something he would like to talk to you about. 



  • For some reason the tourism industry loves to use Oracle when even SQLite would be overkill. Thank god I've moved on from that.



  • I hope you added some sort of sane cache invalidation to your memcached layer, otherwise you introduced your own wtf... 



  •  @merreborn said:

    I hope you added some sort of sane cache invalidation to your memcached layer, otherwise you introduced your own wtf... 

    If I remember correctly, I set the memcached entries to expire after a few hours and created an option in the admin pages to flush the cache manually as well, so if they change some content on a page they don't have to wait hours for it to go live. Roughly like the sketch below.
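
    A minimal sketch of that arrangement, assuming the Memcached extension; the key name, TTL, and the way the admin check is done are guesses:

    <?php
    $cache = new Memcached();
    $cache->addServer('127.0.0.1', 11211);

    // Entries expire on their own after a few hours...
    $ttl = 3 * 60 * 60;
    $cache->set('site_text_344', 'Welcome to our hotel!', $ttl);

    // ...and the admin pages get a manual "flush cache" button so content
    // edits can go live immediately instead of waiting out the TTL.
    $is_admin = true;   // stand-in for the site's real admin check
    if ($is_admin && isset($_POST['flush_cache'])) {
        $cache->flush();            // drops every cached entry at once
        echo 'Cache cleared.';
    }
    ?>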



  • New rule. Everyone who is expected to be able to write in computer languages well must also be able to write in English well.

     

    You fail. Epically.



  • @DrJokepu said:

    The first thing I noticed was that the website pulled *every* *single* *line* of text on the website from the database
     

    Quickest fix for this situation is to sprinkle in some of that newfangled "WHERE" clause magic. Of course, that presumes the database was properly designed in the first place and those lines are identifiable within the DB without having to rummage around inside fields to derive identification data.


  • @Arenzael said:

    @Someone. Dunno who. I'm assuming it's not Arenzael said:

    New rule. Everyone who is expected to be able to write in computer languages well must also be able to write in English well.
     

    You fail. Epically.

    You aren't doing too well at quoting. Epically.

    It's probably GMail's fault, but I cannot find the alleged quote you, erm, quote.



  •  @Arenzael said:

    New rule. Everyone who is expected to be able to write in computer languages well must also be able to write in English well.

     

    Yeah, well, as you might have guessed, English is not my first language. You must be American, because here in the UK, where I currently live, everybody is very supportive and my British friends help me a lot to improve my English. I am glad that you have an excellent command of the language yourself, even though you are probably a native speaker, so that's not much of an achievement. I wonder how many other languages you speak, btw. If that number is greater than zero, surely you understand the difficulties of writing in a foreign language. If it's not, why don't you try to learn another language yourself? It's worth the effort and is a really unique experience.



  • @MarcB said:

    @DrJokepu said:

    The first thing I noticed was that the website pulled *every* *single* *line* of text on the website from the database
     

    Quickest fix for this situation is to sprinkle in some of that newfangled "WHERE" clause magic. Of course, that presumes the database was properly designed in the first place and those lines are identifiable within the DB without having to rummage around inside fields to derive identification data.

     

    CREATE TABLE site
    (
        site_text text
    );


  • @MarcB said:

    @DrJokepu said:

    The first thing I noticed was that the website pulled *every* *single* *line* of text on the website from the database
     

    Quickest fix for this situation is to sprinkle in some of that newfangled "WHERE" clause magic. Of course, that presumes the database was properly designed in the first place and those lines are identifiable within the DB without having to rummage around inside fields to derive identification data.

     

    That's exactly what I did. Unfortunately, as I already mentioned, everything was put into an array at the beginning and was later accessed in the form "strings[344]". Since, last time I checked, PHP didn't have those fancy indexers like C#, my only choices were to subclass the existing built-in array class or replace each instance of strings[] with getString() or something similar using regular expressions. I was afraid that the latter would introduce complications, so I chose the former.



  • Hmm. So I'm thinking about this, and I think I see a potential win!

    See, you could take every line of every file on your website, and you could individually insert it into a database along with a hash value. Whenever a new file is added, you just look up all the lines individually and save a list of all the hash values. Whatever doesn't exist, you insert into the database. Eventually, you have all the lines of all the files saved, and if anybody edits one you just save a new line.

    And then? If they change it back, hey presto, it's already saved! Performance win!

    Unfortunately, I know people who could be sold with that argument.



  • @DrJokepu said:

     ...You must be American...

     Yeah because you can write off half of a continent based on some stereotype like that.

    @DrJokepu said:

    ...I am glad that you have an excellent command of the language yourself, even though you are probably a native speaker, so that's not much of an achievement...

    Actually, it is an achievement. Just because you speak a language doesn't mean you can write well in it. 

    @DrJokepu said:

    ...I wonder how many other languages you speak, btw. If that number is greater than zero, surely you understand the difficulties of writing in a foreign language.

    I'm not fluent in them (mainly because I don't speak them daily, which makes remaining fluent hard), but I have dabbled in 3 other languages, including Latin, which gives me a solid base for understanding a good chunk of the Romance languages except for French, which is just all messed up.

     

    For the record, I wasn't even referring to how you speak the language. In your native tongue, do they not have line-breaks? 

     

     



  • @Arenzael said:

    In your native tongue, do they not have line-breaks?

    Pre-emptive guess: Forum software. After the upgrade, I have to manually put in either <br> tags to cause a line break or surround paragraphs with <p> tags (I use the plain text editor -- not WYSIWYG). If people don't use preview, they won't realize that the breaks are no longer inserted.



    HOLY SHIT! WTF happened to the tags?! It's randomly inserting compound tags when I type one letter. This hacked version of Community Server SUCKS.



  •  DrJokepu,

     Not sure why everyone is busting your balls. Sounds like your solution made the best of a crappy situation. Good thing you didn't spend any more time on it than you did since he stiffed you anyway.



  • @Arenzael said:

    trolly trolly line break troll

    That point was already made. Thank you for wasting space.



  • @AbbydonKrafts said:

    HOLY SHIT! WTF happened to the tags?! It's randomly inserting compound tags when
    I type one letter.

    Actually, tag autocompletion works better than it did a few months ago, when you couldn't use (for example) "why", because it would always autocomplete to "why are a large number of my tags not appearing?".

    @AbbydonKrafts said:

    This hacked version of Community Server SUCKS.

    Despite the (somewhat) fixed autocompletion, I agree. I'm still waiting for my TT element.



  • @merreborn said:

    I hope you added some sort of sane cache invalidation to your memcached layer, otherwise you introduced your own wtf... 

     

    I hope he added some sort of insane cache validation instead.  It sounds like it ought to work exactly the same, but it's ever so much more fun! 

     



  • @Arenzael said:

    @DrJokepu said:

     ...You must be American...

     Yeah because you can write off half of a continent based on some stereotype like that.

     

    Since when is "America" only half a continent?

    Or have you written off the non-US half of it?  That would be a fairly good PKB.

     

     

     

    [ this space left blank for Arenzael's inevitable back-pedal. ]

     

     

     



  • @Spectre said:

    @AbbydonKrafts said:
    HOLY SHIT! WTF happened to the tags?! It's randomly inserting compound tags when
    I type one letter.

    Actually, tag autocompletion works better than it did a few months ago, when you couldn't use (for example) "why", because it would always autocomplete to "why are a large number of my tags not appearing?".

    What browser are you using? I'm using Safari, and for me, entering a tag works like this:

    1. Type a letter.
    2. Wait for autocomplete to fill in the tag field.
    3. Highlight all the letters I don't want.
    4. Repeat until the tag I want is typed in.


  • @Carnildo said:

    What browser are you using?

    I'm using Opera 9.24. Normally, I'd have to type a letter, wait for the autocomplete, type another letter, wait for the autocomplete, etc. But, when I tried to do it for the reply above, after typing just one letter, it went nuts and filled in parts of two tags, so I had a single compound mutant tag. I tried it again, and it's back to the type-one/wait/type-one/wait routine. It's still screwed up.



  • @CDarklock said:

    Hmm. So I'm thinking about this, and I think I see a potential win!

    See, you could take every line of every file on your website, and you could individually insert it into a database along with a hash value. Whenever a new file is added, you just look up all the lines individually and save a list of all the hash values. Whatever doesn't exist, you insert into the database. Eventually, you have all the lines of all the files saved, and if anybody edits one you just save a new line.

     

    Hey, that's brilliant, but I think I can see a way to make it even better.  Wouldn't it be great if you could avoid all the performance costs and bloat of having to run a full database server?  Well, if we adopt your scheme, we can!  Who needs all those big SQL servers and clusters and things?  Once you've got all the lines from all the files inserted into this database, along with their hashes, all you need to do is export the entire DB to a flat text file.


    That's right!  It's the ideal data format for use with SSDS! 



  • @AbbydonKrafts said:

    @Carnildo said:
    What browser are you using?
    I'm using Opera 9.24. Normally, I'd have to type a letter, wait for the autocomplete, type another letter, wait for the autocomplete, etc. But, when I tried to do it for the reply above, after typing just one letter, it went nuts and filled in parts of two tags, so I had a single compound mutant tag. I tried it again, and it's back to the type-one/wait/type-one/wait routine. It's still screwed up.

    I'm using Maxthon over IE7, and it's the same. I think that if the timeout were a bit lower, the autocompletion would be okay. Fortunately, my pet peeve was fixed and I have the patience to wait for the autocompletion.

    Now, if only someone would patch CS to support the TT element... (Hey, have I said it before?)



  • @Spectre said:

    I think that if the timeout were a bit lower, the autocompletion would be okay.

    It's probably the AJAX latency. The suggestions should appear in a list (like social bookmarking sites do) to prevent interference with the box itself. It's hard to let off the gas when I'm used to typing at ~80 WPM.

    @Spectre said:

    Now, if only someone would patch CS to support the TT element... (Hey, have I said it before?)

    I'll put my vote in on that one.



  • @DaveK said:

    Since when is "America" only half a continent?

    Or have you written off the non-US half of it?  That would be a fairly good PKB.

     

    Yes, because historically the Canadians have also been referred to as "Americans."

    You + Semantics = Fail. 



  • @Arenzael said:

    You + Semantics = Fail. 
     

    The only one who fails here is the person who decided to revive a month-old thread to blatantly troll.



  • And behold MasterAsshatPlan comes galloping in to save the thread!

     If this thread is so old, what the hell are you doing in it? Do you have some sort of sixth-sense flamedar? I could probably hide some flamebait behind Jupiter and you would find it somehow.



  • @Arenzael said:

     If this thread is so old, what the hell are you doing in it? Do you have some sort of sixth-sense flamedar? I could probably hide some flamebait behind Jupiter and you would find it somehow.
     

    Yeah. How intelligent. Don't you have some script kiddies to play with on ZDNet or something?

