Overuse of RegEx in SQL



  • I just stumbled across this Code snippet:

    SELECT partner_id, SUM(price_eur) as sales
    FROM movieview
    WHERE user_id REGEXP ('^[0-9]')
    [...]
    

    Well, user_id is and was always defined as:

    user_id   INT(11)   NOT NULL
    

    The same RegEx for the same Field is used in every second SQL Query in this Script.

    Other parts of this script include 10-Dimensional Arrays and lots of lines like this:

    $result_a["producer"][$producer_a["row".$i]["producer_id"]]["sum"] += $result_a["matrix"][$producer_a["row".$i]["producer_id"]][$partner_a["row".$j]["partner_id"]];

    I really hate it when i can't easily see where one bracket ends and another starts.

    I sometimes wonder what - or better if - the original developer thought when he programmed this script.


  • @whoever said:

    I sometimes wonder what - or better if - the original developer thought when he programmed this script.

    "How can I make myself irreplaceable"

    I guess it didn't work though, since now you're the poor sap who has to maintain it...



  • @whoever said:

    I just stumbled across this Code snippet:

    SELECT partner_id, SUM(price_eur) as sales
    FROM movieview
    WHERE user_id REGEXP ('^[0-9]')
    [...]
    

    Well, user_id is and was always defined as:

    user_id   INT(11)   NOT NULL
    
    .

    Just to add to the fray, I have been looking at some MySQL code (from a published book) and was wondering what int(11) meant in a table definition. My SQL experience is mostly MSSQL so I had to google for this. This led me to Numeric data types and zerofill and made me start to question the quality of the authors work.



  • @whoever said:

    WHERE user_id REGEXP ('^[0-9]')

    I'm somewhat surprised that---what is that? MySQL?--doesn't complain when you try to apply a regex to a non-character data type. I have trouble imagining a real situation in which you'd have a good reason to check if the decimal representation of a number fits a certain string pattern.



  • @toth said:

    @whoever said:
    WHERE user_id REGEXP ('^[0-9]')

    I'm somewhat surprised that---what is that? MySQL?--doesn't complain when you try to apply a regex to a non-character data type. I have trouble imagining a real situation in which you'd have a good reason to check if the decimal representation of a number fits a certain string pattern.


    Well, i first thought it was a failed attempt at checking for negative numbers (on some tables in this database, negative IDs indicate a producer/partner, positive an admin) but after checking, every row was either 0 or a positive number.

    This application went through about 10 diffrent hands until it reached me and everyone else just worked around it's problems, adding more to the numerous problems of this code. Now i'm tasked with changing a core part of it to handle a new payout calculation system... this is not why i became a programmer :-/



  • @whoever said:

    This application went through about 10 diffrent hands until it reached me and everyone else just worked around it's problems, adding more to the numerous problems of this code. Now i'm tasked with changing a core part of it to handle a new payout calculation system... this is not why i became a programmer :-/

    That may be so, but its par for the course. No matter what position you end up in you will have to fix up other peoples screw ups. The only possibility is if you do everything from scratch, but as Carl Sagan once said "To bake an apple pie from scratch you must first create the universe."



  • @whoever said:

    Well, i first thought it was a failed attempt at checking for negative numbers (on some tables in this database, negative IDs indicate a producer/partner, positive an admin) but after checking, every row was either 0 or a positive number.

    I'd think that checking < 0 and > 0 would be better for that, though, wouldn't it?



  • @whoever said:

    This application went through about 10 diffrent hands until it reached me and everyone else just worked around it's problems, adding more to the numerous problems of this code. Now i'm tasked with changing a core part of it to handle a new payout calculation system... this is not why i became a programmer :-/

    What, you became a programmer so you could bang supermodels, drive a Ferrari and blow shit up?  Because I think you may have misread the job description.



  • @toth said:

    @whoever said:
    Well, i first thought it was a failed attempt at checking for negative numbers (on some tables in this database, negative IDs indicate a producer/partner, positive an admin) but after checking, every row was either 0 or a positive number.

    I'd think that checking 0 would be better for that, though, wouldn't it?

    Fucking CS.

    I'd think that checking < 0 or > 0 would be better for that, though, wouldn't it?



  • @morbiuswilters said:

    What, you became a programmer so you could bang supermodels, drive a Ferrari and blow shit up?  Because I think you may have misread the job description.
     

    Huh, I have those.



  • @dhromed said:

    @morbiuswilters said:

    What, you became a programmer so you could bang supermodels, drive a Ferrari and blow shit up?  Because I think you may have misread the job description.
     

    Huh, I have those.

    Yeah, that sounded like a pretty fair description of my career so far, and most of the other coders I know.  Poor ol' morbs must be just about the only one who isn't getting any.

     



  • @OzPeter said:

    @whoever said:
    I just stumbled across this Code snippet:
    SELECT partner_id, SUM(price_eur) as sales
    FROM movieview
    WHERE user_id REGEXP ('^[0-9]')
    [...]
    
    Well, user_id is and was always defined as:
    user_id   INT(11)   NOT NULL
    

    .

    Just to add to the fray, I have been looking at some MySQL code (from a published book) and was wondering what int(11) meant in a table definition. My SQL experience is mostly MSSQL so I had to google for this. This led me to Numeric data types and zerofill and made me start to question the quality of the authors work.

    I saw that and assumed it was like Oracle where it defined the number of digits.

    Now I can't decide which is worse.



  • @morbiuswilters said:

    What, you became a programmer so you could bang supermodels, drive a Ferrari and blow shit up?  Because I think you may have misread the job description.
     

    Instead, he bangs a Ferrari, blows up supermodels and drives shit, like the rest of us :D



  • @Anonymouse said:

    Instead, he bangs a Ferrari, blows up supermodels and drives shit, like the rest of us :D
     

    +10 opportunistic humor



  • Truth is:
    I bang shit, blow up Ferraris and have Supermodels on my Drive... close enough for me.



  • @Anonymouse said:

    @morbiuswilters said:

    What, you became a programmer so you could bang supermodels, drive a Ferrari and blow shit up?  Because I think you may have misread the job description.
     

    Instead, he bangs a Ferrari, blows up supermodels and drives shit, like the rest of us :D

    Sounds like a kickass GTA release.



  • @OzPeter said:

    Just to add to the fray, I have been looking at some MySQL code (from a published book) and was wondering what int(11) meant in a table definition. My SQL experience is mostly MSSQL so I had to google for this. This led me to Numeric data types and zerofill and made me start to question the quality of the authors work.
     

    MySQL adds that on its own, if you don't specify a width.  You say "int", and you get back "int(11)".  Hardly anyone i know of ever really uses that number.



  • @cHao said:

    @OzPeter said:

    Just to add to the fray, I have been looking at some MySQL code (from a published book) and was wondering what int(11) meant in a table definition. My SQL experience is mostly MSSQL so I had to google for this. This led me to Numeric data types and zerofill and made me start to question the quality of the authors work.
     

    MySQL adds that on its own, if you don't specify a width.  You say "int", and you get back "int(11)".  Hardly anyone i know of ever really uses that number.


    I'm scared now that that actually makes sense. The authors probably created the tables as expected but used some tool to pull the config out of MySQL for inclusion in the text. So you put in 'INT" and get out 'INT(11)'. That is another WTF IMHO



  • @OzPeter said:

    That is another WTF IMHO
    Boy, you sure don’t want to know what MySQL does to BOOLEAN columns, then…



  • @cHao said:

    MySQL adds that on its own, if you don't specify a width.  You say "int", and you get back "int(11)".  Hardly anyone i know of ever really uses that number.
    AFAIK, Rails, when it creates id columns, creates them as 11.  Maybe it's just because that's the default, as you point out, but I always thought that Rails was saying "well, if you have more than 10 billion items in your table, you'll have some refactoring to do, but we don't want to bother you before then."



  •  Could this code be leftover from a version of the app that used Sqlite, or some other DB that types data at the value level instead of the column level?



  • I’m surprised that int -> int(11) is confusing so many people. The longest string representation of a 32-bit signed integer is -2147483648, which is 11 characters long. Consider it a slightly weird extension of the NUMERIC data type.


Log in to reply