How to handle error logs



  • This morning, one of our servers almost crashed. We have a process that monitors the PHP error log, and the log had grown to 10 GB (22 million lines), so the monitoring process used all the CPU trying to parse it.

    The options were:
    a) try to find what caused the 22 million errors in less than 3 hours
    b) send him an email when the log size is too big.

    I'll let you guess which one the boss suggested.

     I'm glad TRWTF exists, so I know I am not alone.
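
    For what it's worth, option (b) can at least keep the monitor from eating the CPU if it checks the file size before parsing. A minimal sketch in Python, not our actual monitor: the 100 MB threshold and the function names are made up for illustration, and sending the email is left to the caller.

```python
import os

# Hypothetical threshold: refuse to parse anything above 100 MB.
MAX_LOG_BYTES = 100 * 1024 * 1024


def log_is_too_big(path, max_bytes=MAX_LOG_BYTES):
    """Return True when the file at `path` is larger than `max_bytes`."""
    return os.path.getsize(path) > max_bytes


def check_log(path, max_bytes=MAX_LOG_BYTES):
    """Return an alert message if the log is oversized, else None.

    The caller decides what to do with the message (email it, page
    someone, ...) and should skip parsing when one is returned.
    """
    if log_is_too_big(path, max_bytes):
        size = os.path.getsize(path)
        return f"{path} is {size} bytes (limit {max_bytes}); skipping parse"
    return None
```

    Checking the size is a single stat call, so it stays cheap no matter how large the log has grown.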



  • Wow! 10GB. I've never seen one that large before. I've seen 2GB -- but that was over many months. The damn application didn't split them up. I've specifically designed my applications to do monthly logs so they can be purged, and NONE can generate 10GB. The only time I've seen a log swell up like that is when an application entered an unforeseen infinite logging loop.
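
    A related way to make sure nothing can reach 10GB is to cap logs by size rather than by month. A rough sketch using Python's standard `RotatingFileHandler`; this swaps size-based rotation in for the monthly scheme described above, and the function name, the 10 MB cap, and the backup count are illustrative assumptions.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_capped_logger(path, max_bytes=10 * 1024 * 1024, backups=5):
    """Build a logger whose files on disk stay near max_bytes * (backups + 1)."""
    logger = logging.getLogger(f"capped:{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = RotatingFileHandler(
            path, maxBytes=max_bytes, backupCount=backups
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

    With a cap like this, even an infinite logging loop only overwrites the oldest backups instead of filling the disk.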



  • I thought infinite logging loops were, by definition, unforeseen. Unless, of course, you have some real sadistic developers.



  • @KattMan said:

    Unless, of course, you have some real sadistic developers.

    That would describe the guy whose job I took over.



  • His log file probably had 20 million lines of:
    "TEST - Entered function foo_bar() - TODO: Delete me before production."

    I know I've been guilty of that many times before.

    What's fun is to run a load test against your application with log levels set to full, and logs being written to /tmp.  "OMFG! There's a memory leak!  Oh wait."



  • @vt_mruhlin said:

    His log file probably had 20 million lines of:
    "TEST - Entered function foo_bar() - TODO: Delete me before production."

    I know I've been guilty of that many times before.

    What's fun is to run a load test against your application with log levels set to full, and logs being written to /tmp.  "OMFG! There's a memory leak!  Oh wait."

    You need to put cuss words or comments about the users' moms.  That'll make sure you remember to take them out before the prod move.  Or it'll make some really interesting debugging. 



  • Try copying only the last several days into a new log file. The older entries are too stale to matter, so they can be deleted.
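
    Something along these lines would do the copy without loading 10GB into memory. It is only a sketch: it assumes each entry starts with a timestamp like `[12-Mar-2008 10:15:32]` (check your own log before trusting it), that month abbreviations are in English, and `keep_recent` is a made-up name. Lines without a timestamp, such as stack trace continuations, are kept or dropped along with the entry above them.

```python
import re
from datetime import datetime, timedelta

# Assumed line prefix, e.g. "[12-Mar-2008 10:15:32] PHP Warning: ..."
# Anything after the seconds (such as a timezone name) is ignored.
STAMP = re.compile(r"^\[(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2})")


def keep_recent(src_path, dst_path, days, now=None):
    """Stream src_path into dst_path, keeping entries from the last `days` days.

    Returns the number of lines written.
    """
    cutoff = (now or datetime.now()) - timedelta(days=days)
    keeping = False  # lines before the first timestamp are dropped
    kept = 0
    with open(src_path, "r", encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:  # one line at a time, so memory use stays flat
            match = STAMP.match(line)
            if match:
                stamp = datetime.strptime(match.group(1), "%d-%b-%Y %H:%M:%S")
                keeping = stamp >= cutoff
            if keeping:
                dst.write(line)
                kept += 1
    return kept
```

    Once the new file looks right, the original can be truncated or removed.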



  • @belgariontheking said:

    You need to put cuss words or comments about the users' moms.  That'll make sure you remember to take them out before the prod move.  Or it'll make some really interesting debugging.

    Not really.  I had a GUI Programming project in college where the grading came back with...

    "-10pts : Cryptic message 'this loop sucks' repeatedly being printed to stdout."

    It's a GUI Programming class.  Why are you looking at stdout?



  • @vt_mruhlin said:

    @belgariontheking said:

    You need to put cuss words or comments about the users' moms.  That'll make sure you remember to take them out before the prod move.  Or it'll make some really interesting debugging.

    Not really.  I had a GUI Programming project in college where the grading came back with...

    "-10pts : Cryptic message 'this loop sucks' repeatedly being printed to stdout."

    It's a GUI Programming class.  Why are you looking at stdout?

    For stacktrace printout residue indicating improperly handled errors.



  • @KattMan said:

    Unless, of course, you have some real sadistic developers.

    Sadistic. That's it. I thought they were incompetent. My bad.

    So hiring only kids that just got out of school is not the right thing to do? 

    My boss will be surprised.

