Unhappy New Year



  • Here's a RTFM WTF courtesy of me.

    In May last year I started work on a daily maintenance system for a large community driven website. To remain consistent with the rest of the code base, the maintenance system was written in php and invoked daily using cron and a local loopback address provided by our web host.

    The maintenance system performed a variety of operations, most of them involving data 'cleaning' and statistics recounting. One particular operation required the system to physically delete data which had been marked as deleted from the MySQL database after a three month data retention period had expired.

    The system was coded, tested briefly and was live a week later. It ran daily, and as far as we knew, it was performing it's operations without fault or failure.

    Now follow me to seven months later to one New Year's morning, where I woke to the sound of my phone ringing and a pounding headache from the night before. The call was from one of the other website administrators:

    Him: We've been hacked!
    Me: Er.. What happened?
    Him: Someone has hacked in and deleted a whole bunch of stuff!
    Me: Wait.. What's missing?
    Him: I went in to restore an article we deleted yesterday. They're all gone.

    I sat there for a few minutes wondering aloud - previous security issues has caused a major system update, so I was fairly sure we hadn't been hacked. Plus, what would a hacker gain from deleting some items which were effectively deleted anyway? It soon dawned on me that the daily maintenance system was almost certainly at fault, just mishandling dates. I could imagine the flawed logic immediately: December 31st 2007 is more than a year older than January 1st 2008, clearly outside of the three month retention period.

    I dug up the old maintenance code and found the MySQL query involved:

    DELETE FROM portal_items WHERE deleted = 1 AND deletedate < (NOW() - 00000300000000)

    I was confused for a moment before I remembered the method I had once used before learning of the INTERVAL keyword. The idea was to subtract three months from the current date by using the MySQL recognised YYYYMMDDHHMMSS syntax.

    It looks like MySQL had been converting the values and comparing integers, apparently unimpressed with my choice of syntax. The mistake seemed obvious to me now, but because we developed and deployed the daily maintenance system after the third month, it appeared to work as normal.

    I made a quick fix to the code, before heading back to bed.

    DELETE FROM portal_items WHERE deleted = 1 AND deletedate < (NOW() - INTERVAL 3 MONTH)


  • Good story, everyone should learn from it.

    I guess all of us more or less fear the date January 1.



  • Yeah.  Always spend some serious time making sure you know *exactly* what you're doing with date manipulations.

    My personal mysql date WTF:

     
    DELETE FROM table WHERE date > 2007-01-01;

    You'll notice I omitted the quotes (assuming that mysql recognized dates as constants, rather than expressions), causing the above to evaluate to:

    DELETE FROM table WHERE date > 2005;

     

    So, to reiterate: Always make sure you know *exactly* what you're doing with date manipulations.



  • @merreborn said:

     

    So, to reiterate: Always make sure you know exactly what you're doing with date manipulations.

    I think "always select before a non-trivial delete" probably comes higher. 



  • @Cattlyst said:

    I sat there for a few minutes wondering aloud - previous security issues has caused a major system update, so I was fairly sure we hadn't been hacked.


    That's always wrong assumption, because new bugs and security vulnerabilities is being found all the time.

    @Cattlyst said:


    Plus, what would a hacker gain from deleting some items which were effectively deleted anyway?



    That's much better reasoning.

     

    NB: good story
     



  • @asuffield said:

    @merreborn said:
     

    So, to reiterate: Always make sure you know exactly what you're doing with date manipulations.

    I think "always select before a non-trivial delete" probably comes higher. 

    How exactly would that have prevented this problem, given that it runs in a cron script?


Log in to reply