She didn't like 'like'



  • A junior developer I once worked with didn't like SQL like-clauses. The SQL she created was all over the code base and looked like this:


    select ...
      from ...
     where (someCol= 'AAA0001' or
                someCol= 'AAA0002' or
                someCol= 'AAA0003' or
                ...
                someCol= 'AAA9999')
         and rest-of-where-clause

     



  •  So, like, do you like her, or do you "like" like her? I've heard she's, like, one wild card, if you know what I mean! <nudge>



  • She was, like, paid by the line?



  • I had to read the title a few times before realizing it had nothing to do with Dwight Eisenhower's presidential campaign.



  • @cdosrun said:

    ...like her?

    I had just gotten married, and she was a little young for me even if I hadn't...

    @Vempele said:

    paid by the line
    Nope; permanent full time employee who just didn't understand the concept of regular expressions ("but there aren't any rows that look like 'AAA%'!")



  • @snoofle said:

    "but there aren't any rows that look like 'AAA%'!"
    *facepalm*

    Has anyone come up with a way to accurately convey to a colleague how retarded he/she is? Cause when you tell them "you're a complete moron" they think you're just doing it to insult them.



  • if the column was indexed she avoided a table scan probably :)

    That should be especially important with statements involving locking



  • @t-bone said:

    if the column was indexed she avoided a table scan probably :)

    That should be especially important with statements involving locking

     

    Since the wildcard is on the end, the index should be used instead of a full row scan.



  • And just to learn something. If you insist on enumerating all options, how about doing it with a in with

    where someCol in ('AAA001','AAA002','AAA003, .....);
    And write a loop to generate the strings, instead of doing it by hand for gods sake!

     



  • @tiller said:


    And write a loop to generate the strings, instead of doing it by hand for gods sake!

     

    No way! Then I would have so much left of the day that I might have to actually think.



  • @DOA said:

    @snoofle said:

    "but there aren't any rows that look like 'AAA%'!"
    *facepalm*

      does not appear in snoofle's original post.

     

    "Like 'AAA%'" would catch a lot of false positives where 'AAA' is followed by more or less than 4 digits, or the following characters include nonnumeric characters.

    The original solution performs well, is provably correct and can be generated automatically.

     Using a regular expression would save programmer time, but would perform more slowly. Older databases do not support regular expressions anyway.



  • @Alistair Wall said:

    "Like 'AAA%'" would catch a lot of false positives where 'AAA' is followed by more or less than 4 digits, or the following characters include nonnumeric characters.

    The original solution performs well, is provably correct and can be generated automatically.

     Using a regular expression would save programmer time, but would perform more slowly. Older databases do not support regular expressions anyway.

    I do not know how efficient this is.

     

     

    SELECT CONCAT(c1,c2) AS f FROM (SELECT LEFT('AAA0001', 3) AS c1, RIGHT('AAA0001', 4) AS c2 ) a WHERE c1 = 'AAA' AND (c2 >= 1 AND c2 <= 9999)

     

    SELECT .... 
    FROM ....
    WHERE (LEFT(someCol,3) = 'AAA' AND (RIGHT(someCol, 4) >= 1 AND RIGHT(someCol, 4) <= 9999))


  • @Alistair Wall said:

    @DOA said:

    @snoofle said:

    "but there aren't any rows that look like 'AAA%'!"
    facepalm

      does not appear in snoofle's original post.

     

    "Like 'AAA%'" would catch a lot of false positives where 'AAA' is followed by more or less than 4 digits, or the following characters include nonnumeric characters.

    The original solution performs well, is provably correct and can be generated automatically.

     Using a regular expression would save programmer time, but would perform more slowly. Older databases do not support regular expressions anyway.

    So just use underscores, which, in SQL Server and MySQL at least, match exactly one character:

    ...LIKE 'AAA____' (That's four of them.)

    Somehow, I think this will perform a bit more efficiently than jamming a quarter megabyte SQL query down the line and making the server compile it every time you need to run it. And don't pretend they're using this in a stored procedure - anybody dumb enough to "optimize" a query like this is using string concatenation to build and run queries, guaranteed.



  • @DOA said:

    @snoofle said:

    "but there aren't any rows that look like 'AAA%'!"
    *facepalm*

    Has anyone come up with a way to accurately convey to a colleague how retarded he/she is? Cause when you tell them "you're a complete moron" they think you're just doing it to insult them.

     

    "Your code is slightly worse than the mine" (said by a parrot in the office)



  • @Alistair Wall said:

    @DOA said:

    @snoofle said:

    "but there aren't any rows that look like 'AAA%'!"
    *facepalm*

      does not appear in snoofle's original post.

     

    "Like 'AAA%'" would catch a lot of false positives where 'AAA' is followed by more or less than 4 digits, or the following characters include nonnumeric characters.

    The original solution performs well, is provably correct and can be generated automatically.

     Using a regular expression would save programmer time, but would perform more slowly. Older databases do not support regular expressions anyway.

     

     

    WT(bloody)F?

     

    The performance difference between col = 'AAA' or col = 'AAAA' or col = 'AAAAA' vs col like 'AAA%' is absolutely minimal since no special index is needed. col = '%AAA%' needs a special index.

    If you have a list of values, you can auto generate that in the query string in the code:

     

    "select * from table where col in ('" + array.join("', '") + "')";

     

    To check the generated code in for nanosecond performance gains and justifying it is a major wtf.

     

     

    Opse I forgot oracle has regex filters.

    http://www.dba-oracle.com/t_regular_expressions.htm



  • @snoofle said:

    @cdosrun said:
    ...like her?
    I had just gotten married, and she was a little young for me even if I hadn't...
     

    "Young" as in, like, socially awkward, or as in, like, jailbait?

     


  • Garbage Person

     Why is everybody so certain the DBMS uses indexes for regexes at all. Lord knows I don't trust that MySQL does it without looking at the docs (I don't trust that MySQL supports ANYTHING without looking at the docs, I learned that one the hard way)



  • Regex isn't the right term.  The "like" clause is not a regex.

    If the "%" is at the end, but not at the beginning, and the column is indexed, the query will be an index seek.  If the "%" is at the beginning, it will be an index scan.  You'll only get a table scan if the column is not indexed, and in that case, you'd get a table scan regardless.  The corner case is when the column is part of the clustered index and you end up with a clustered index scan, which performance-wise is almost as bad as a table scan; if you're finding that you have to do that on a large table, then you probably need to reevaluate your design.

    I get the feeling that we're seeing some TopCod3r-esque posts filtering into this forum; either that or a lot of programmers are really, really horrible at data access.


  • Garbage Person

    @Aaron said:

    Regex isn't the right term.  The "like" clause is not a regex.

     If it looks like a duck, quacks like a duck and walks like a duck, I call it a duck. Just because it's not the same syntax or nearly as powerful as one traditionally expects doesn't mean that's what it is (and TRWTF is that it ISN'T).

     And yes, what you say makes sense- but I still don't trust MySQL to do anything properly (I also don't expect Postgresql to have a useful API or MSSQL (pre-2008) to have sane datatypes or Oracle to not be insane and convoluted)



  • @Weng said:

    @Aaron said:

    Regex isn't the right term.  The "like" clause is not a regex.

     If it looks like a duck, quacks like a duck and walks like a duck, I call it a duck. Just because it's not the same syntax or nearly as powerful as one traditionally expects doesn't mean that's what it is (and TRWTF is that it ISN'T).


    It's a "glob" or a "wildcard match", which is a subset of the pattern matching that regular expressions can do (which, in turn, is a subset of Perl regular expressions).



  • @Aaron said:

    I get the feeling that we're seeing some TopCod3r-esque posts filtering into this forum; either that or a lot of programmers are really, really horrible at data access.

    Thank God somebody said it.



  • @Aaron said:

    a lot of programmers are really, really horrible at data access.
    There's the answer for you



  • @Carnildo said:

    @Weng said:

    @Aaron said:

    Regex isn't the right term.  The "like" clause is not a regex.

     If it looks like a duck, quacks like a duck and walks like a duck, I call it a duck. Just because it's not the same syntax or nearly as powerful as one traditionally expects doesn't mean that's what it is (and TRWTF is that it ISN'T).

    It's a "glob" or a "wildcard match", which is a subset of the pattern matching that regular expressions can do (which, in turn, is a subset of Perl regular expressions).
    I'll add that regular expressions are a very clearly defined concept in computational theory.  They exactly correspond to finite state automata - any such automaton can be turned into a regex and vice versa.  Globs lack several important things, such as the ability to match a specific set of characters, or optional matches.  Perl regexes in turn are actually something more than regular expressions, since they allow matchings things like an arbitary number of balanced parentheses (this requires a pushdown automaton).


  • @morbiuswilters said:

    @Aaron said:

    I get the feeling that we're seeing some TopCod3r-esque posts filtering into this forum; either that or a lot of programmers are really, really horrible at data access.

    Thank God somebody said it.

     

    TopCod3r's or dumbasses, I hate you all for making me agree with morbiuswilters.



  • @cdosrun said:

    I hate you all for making me agree with morbiuswilters.
     

    But you are a bigger person because of it.


  • Garbage Person

    @tdb said:

    @Carnildo said:

    @Weng said:

    @Aaron said:

    Regex isn't the right term.  The "like" clause is not a regex.

     If it looks like a duck, quacks like a duck and walks like a duck, I call it a duck. Just because it's not the same syntax or nearly as powerful as one traditionally expects doesn't mean that's what it is (and TRWTF is that it ISN'T).

    It's a "glob" or a "wildcard match", which is a subset of the pattern matching that regular expressions can do (which, in turn, is a subset of Perl regular expressions).
    I'll add that regular expressions are a very clearly defined concept in computational theory.  They exactly correspond to finite state automata - any such automaton can be turned into a regex and vice versa.  Globs lack several important things, such as the ability to match a specific set of characters, or optional matches.  Perl regexes in turn are actually something more than regular expressions, since they allow matchings things like an arbitary number of balanced parentheses (this requires a pushdown automaton).
     

      Well, you learn something new every day.


Log in to reply