Apparently India has not learned about Regular Expressions yet



  • Doing some consulting work on a project that was developed overseas.

     I found this gem:

    //Validates a phone number string.
    //phone numbers can be comma separated values.
    //these comma separated values can be in turn be - separated.
    //the numbers can be included in ()

    function isValidPhoneNumber(phno) {
    var sphno= phno.split(",");
    var i=0;
    var j=0;
    for (i=0;i<sphno.length;i++) {
    temp=sphno[i];
    //alert("sphno length"+sphno.length);
    //alert("comma separated value is"+temp+"and i is"+i);
    if(!isBlank(temp)) {
    if (!hasBalancedParanthesis(temp)) {
    alert("Phone number has unbalanced paranthesis");
    return false;
    }//close of if
    else {
    eachNo= temp.split("-");
    for (j=0;j<eachNo.length;j++) {
    //alert("eachNumber is"+eachNo[j]);
    if (!isBlank(eachNo[j])) {
    var reachedEnd=false;
    var k=0;
    var firstClosePar=0;
    var lastClosePar=0;
    while(!reachedEnd) {

    lastOpenPar=firstClosePar;
    var found=false;
    while (!found && k<eachNo[j].length) {
    if (eachNo[j].charAt(k)== '(') lastOpenPar=k;
    if (eachNo[j].charAt(k)==')') {
    firstClosePar=k;
    found=true;
    }
    k++;
    }
    //alert("came out of 1 st while and k is"+k);
    if(firstClosePar==lastOpenPar) firstClosePar=eachNo[j].length;
    var sNum=eachNo[j].substring(firstClosePar,lastOpenPar+1);
    //alert("snum "+sNum);
    if ( found && isBlank(sNum)) {
    //alert("came here");
    alert("Phone number - Parenthesis must not be empty");
    return false;
    }
    else {
    if (!isSpaceOrNumber1(sNum)) {
    //alert("eachNo"+eachNo[j]);
    // alert("sNumis"+sNum);
    alert("Phone number - Parenthesis must have valid numbers only");
    return false;
    }
    }
    if(k>=eachNo[j].length-1) reachedEnd=true;
    }//end of while
    //alert("came out of 2 nd while loop");
    }//end of if
    else{
    alert("Invalid Phone Number");
    return false;
    }
    }//end of for
    }//end of else
    }//end of if
    else {
    alert("Invalid Phone Number");
    return false;
    }
    }//end of for
    //alert("was a valid phone number");
    return true;
    }//end of function

     



  • In Javascript no less



  • I know little about Java or C, but to me that looks like the spaghetti code I am forced to write on my graphing calculator, which uses BASIC. Maybe worse.



  • It gets better:

     

    //Validates Emailid
    //validates for "@" character and also atleast one "." character.

    function isValidEmail(var1) {
    if(var1.value.length!=0)
    if(var1.value.indexOf('@')<1){
    //alert("Invalid email-Id");
    return false;

    //var1.value="";
    //var1.focus();
    }
    else{
    if(var1.value.indexOf('.')>=(var1.value.length-1)) {
    return false;
    //alert("Invalid email-Id");
    //var1.value="";
    //var1.focus();
    }else{
    if(var1.value.lastIndexOf('.')<=(var1.value.indexOf('@'))){
    return false;
    //alert("Invalid email-Id");
    //var1.value="";
    //var1.focus();
    }else{
    if(var1.value.lastIndexOf('@')!=(var1.value.indexOf('@'))){
    //return false;
    //alert("Invalid email-Id");
    //var1.value="";
    //var1.focus();
    }
    }//end of else
    }//end of if
    }//end of else
    return true;
    }

     And:

     

    //To validate the date.
    //validates the date provided for being a valid date.

    function isValidDate (year, month, day) {
    // Explicitly change type to integer to make code work in both
    // JavaScript 1.1 and JavaScript 1.2.
    if(!(isNumber(year)&&isNumber(month)&&isNumber(day)))return false;
    if ((month.length == 2) && (month.charAt(0) == "0"))
    month = month.charAt(1);
    if ((day.length == 2) && (day.charAt(0) == "0"))
    day = day.charAt(1);

    var intYear = parseInt(year);
    var intMonth = parseInt(month);
    var intDay = parseInt(day);
    if(intDay<=0 || intMonth<=0 ||intYear<=0 ||intMonth>12) return false;
    // catch invalid days, except for February
    if (intDay > daysInMonth[intMonth]) return false;
    if ((intMonth == 2) && (intDay > daysInFebruary(intYear))) return false;
    return true;
    }

    // daysInFebruary (INTEGER year)
    //
    // Given integer argument year,
    // returns number of days in February of that year.

    function daysInFebruary (year){
    // February has 29 days in any year evenly divisible by four,
    // EXCEPT for centurial years which are not also divisible by 400.
    return ( ((year % 4 == 0) && ( (!(year % 100 == 0)) || (year % 400 == 0) ) ) ? 29 : 28 );
    }

    And finally, why use that pesky isNaN method, when you can write your own:

     

    // general purpose function to see if a suspected numeric input 
    // is a positive integer

    function isNumber(inputStr) {
    for (var i = 0; i < inputStr.length; i++) {
    var oneChar = inputStr.substring(i, i + 1)
    if (oneChar < "0" || oneChar > "9") {
    return false;
    }//end of if
    }//end of for
    return true;
    }
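
    Strictly speaking, isNaN would not be a drop-in replacement here: it converts its argument to a number first, so isNaN("1.5"), isNaN("-3") and even isNaN("") are all false. Note also that the loop above returns true for an empty string. A minimal sketch of a digits-only check, assuming "positive integer" means one or more ASCII digits (isDigitString is a made-up name, not something from the codebase):

```javascript
// Hypothetical helper: true only for a non-empty string of ASCII digits.
// Assumption: unlike the original isNumber, the empty string is rejected.
function isDigitString(inputStr) {
  return typeof inputStr === "string" && /^[0-9]+$/.test(inputStr);
}

console.log(isDigitString("2024")); // true
console.log(isDigitString("1.5"));  // false
console.log(isNaN("1.5"));          // false, so isNaN alone would let "1.5" through
```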


     



  • Hey now, you don't expect Habib Abdul Deshucuiemenk to know our stinky American ways for his $.50/hour salary do ya?  The injuns can write the codes just fine!



  • I just think they pay by the line. I mean, no self-respecting person would go through these pains when they can replace it with 1 or 2 lines of regex. I wonder if they code in Notepad.
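
    For the record, a rough sketch of what the regex version might look like. It only approximates the rules stated in the comments of the original (comma-separated numbers, each split by "-", digits and spaces with optional non-empty parentheses) and has not been checked against every edge case of the original. The names CHUNK and looksLikePhoneList are mine:

```javascript
// One dash-separated chunk: digits and spaces, optionally with parenthesised
// runs that contain at least one digit. Nested parentheses are not accepted.
var CHUNK = /^(?:[0-9 ]|\([0-9 ]*[0-9][0-9 ]*\))+$/;

// Hypothetical replacement for isValidPhoneNumber; it returns a boolean and
// leaves the alert() calls to the caller.
function looksLikePhoneList(input) {
  return input.split(",").every(function (number) {
    return number.split("-").every(function (chunk) {
      // Each chunk must contain a digit and consist only of allowed tokens.
      return /[0-9]/.test(chunk) && CHUNK.test(chunk);
    });
  });
}
```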



  • Plz sirs email teh codes, I like them a lot. And how use.



  • @trwww said:

    Plz sirs email teh codes, I like them a lot. And how use.

    ME TOO!



  • My understanding was that a lot of these overseas developers Google pretty much every task.  Guess they actually tried to work things out this time because the first search result that showed up for me was the regex one.



  • Nice indentation level, 64 spaces or so? Makes me really appreciate its logical, erm, something.



  • @Jonathan Holland said:

    And finally, why use that pesky isNaN method, when you can write your own:

    Or when you can use the more appropriate (for outsourced workers) isNaaN method:

      function isNaaN(obj) { return obj.isBread && obj.isCookedInTandoor && obj.isDelicious }



  • @djork said:

    Or when you can use the more appropriate (for outsourced workers) isNaaN method:

      function isNaaN(obj) { return obj.isBread && obj.isCookedInTandoor && obj.isDelicious }

    me.isNowDesiringAGarlicNaan = true;

     



  • Okay, but let each of you ask yourself this question:

    "If I didn't have access to regular expressions, would my solution be any less fucked up than this one?"

    It doesn't look so bad. You people are a bunch of wimps. There are plenty of shops where no regex library is available or allowed to be used. Regular expressions are great, because they prevent EXACTLY this kind of code from having to be written. But it sounds like if you people didn't have them you'd be downright fucked. This code isn't bad.
     



  • @smxlong said:

    Okay, but let each of you ask yourself this question:

    "If I didn't have access to regular expressions, would my solution be any less fucked up than this one?"

    It doesn't look so bad. You people are a bunch of wimps. There are plenty of shops where no regex library is available or allowed to be used. Regular expressions are great, because they prevent EXACTLY this kind of code from having to be written. But it sounds like if you people didn't have them you'd be downright fucked. This code isn't bad.
     

     Then the shops are The Real WTF
     



  • @smxlong said:

    Okay, but let each of you ask yourself this question:

    "If I didn't have access to regular expressions, would my solution be any less fucked up than this one?"

    It doesn't look so bad. You people are a bunch of wimps. There are plenty of shops where no regex library is available or allowed to be used. Regular expressions are great, because they prevent EXACTLY this kind of code from having to be written. But it sounds like if you people didn't have them you'd be downright fucked. This code isn't bad.

    Name a language without regexp, save /VB[^(.Net)]/i

    There are plenty of shops where no regex library is allowed to be used

    I severely doubt it.

    But it sounds like if you people didn't have [regex] you'd be downright fucked

    That might be true, though.



  • @smxlong said:

    Okay, but let each of you ask yourself this question:

    "If I didn't have access to regular expressions, would my solution be any less fucked up than this one?"

    DFAs FTW.



  • @smxlong said:

    Okay, but let each of you ask yourself this question:

    "If I didn't have access to regular expressions, would my solution be any less fucked up than this one?"

    It would absolutely be less fucked up.

    Scan through the input, extract each numeric digit.  If you have enough digits, then you're done.  Extension should be a separate field.  Phone numbers should be stored as a simple string of digits, and then formatting can be applied later.  I don't care if someone wants to input their snobbish "123.456.7890" style or a traditional "(123) 456-7890" or even "+11234567890."

    @smxlong said:

    It doesn't look so bad. You people are a bunch of wimps. There are plenty of shops where no regex library is available or allowed to be used. Regular expressions are great, because they prevent EXACTLY this kind of code from having to be written. But it sounds like if you people didn't have them you'd be downright fucked. This code isn't bad.

    I think it could at least be improved, with the original algorithm preserved, by reducing the "arrow code" going on here.  The formatting is just absurd.
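
    The "extract each digit" idea can be sketched without any regex at all. This assumes that ten digits counts as "enough"; MIN_DIGITS and both function names are made up for illustration. Note that it happily accepts junk between the digits, which may or may not be what you want:

```javascript
// Assumption: ten digits is "enough"; adjust for your own numbering plan.
var MIN_DIGITS = 10;

// Keep only the characters "0".."9", in their original order.
function extractDigits(input) {
  var digits = "";
  for (var i = 0; i < input.length; i++) {
    var c = input.charAt(i);
    if (c >= "0" && c <= "9") digits += c;
  }
  return digits;
}

// Valid means "contains at least MIN_DIGITS digits", nothing more.
function hasEnoughDigits(input) {
  return extractDigits(input).length >= MIN_DIGITS;
}

console.log(extractDigits("(123) 456-7890")); // "1234567890"
```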



  • Here's a JS phone number validator:

      function isValidPhone(phoneStr) { return phoneStr.select(function(char) { return char.isDigit(); }).length >= minPhoneLength; }

    String#select and String#isDigit are left as an exercise to the reader :)



  • Exactly, loop through string, count digits, check if correct number of digits. Done



  • Yeah, that's some pretty awful indentation.  I smell Notepad.
     



  • "a suspected numeric input"

     hilarious



  • @djork said:

    Scan through the input, extract each numeric digit.  If you have enough digits, then you're done.

    No. 

    @XIU said:

    Exactly, loop through string, count digits, check if correct number of digits. Done

    No. 



  • @dhromed said:

    @djork said:

    Scan through the input, extract each numeric digit.  If you have enough digits, then you're done.

    No. 

    Explain. 

     @dhromed said:

    @XIU said:

    Exactly, loop through string, count digits, check if correct number of digits. Done

    No. 

    Explain.

    Please? 



  • @R.Flowers said:

    @dhromed said:

    @djork said:

    Scan through the input, extract each numeric digit.  If you have enough digits, then you're done.

    No. 

    Explain. 

     @dhromed said:

    @XIU said:

    Exactly, loop through string, count digits, check if correct number of digits. Done

    No. 

    Explain.

    Please? 

    please pass your algorithm over the following "phone number"

    e4e4e4e4e4e4e4e4e4e4
     

     Even worse, when you try to internationalize this code you will basically be writing a whole new function for each culture.
     



  • @TehFreek said:

    @smxlong said:

    Okay, but let each of you ask yourself this question:

    "If I didn't have access to regular expressions, would my solution be any less fucked up than this one?"

    DFAs FTW.

     

    Quoted for Truth.  A DFA is trivial to implement in code.
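
    A sketch of what such a DFA might look like. The grammar is my own assumption, not anything from the original code: runs of digits separated by a single space, dash, or dot, with no leading or trailing separator. It checks format only; counting digits would be a separate step.

```javascript
// Map a character onto the (assumed) input alphabet of the automaton.
function classify(c) {
  if (c >= "0" && c <= "9") return "digit";
  if (c === " " || c === "-" || c === ".") return "sep";
  return "other";
}

// Transition table: state -> character class -> next state.
// A missing entry means the input is rejected.
var TRANSITIONS = {
  start:    { digit: "inDigits" },
  inDigits: { digit: "inDigits", sep: "afterSep" },
  afterSep: { digit: "inDigits" }
};

function dfaAccepts(input) {
  var state = "start";
  for (var i = 0; i < input.length; i++) {
    var next = TRANSITIONS[state][classify(input.charAt(i))];
    if (next === undefined) return false;
    state = next;
  }
  return state === "inDigits"; // the only accepting state
}
```

    Supporting parentheses or a leading "+" would mean adding states and table entries rather than more nested loops.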



  • @tster said:

    please pass your algorithm over the following "phone number"

    e4e4e4e4e4e4e4e4e4e4
     

     Even worse, when you try to internationalize this code you will basically be writing a whole new function for each culture.

    Yup.

    Our library's phone validator rejects anything that doesn't start with 0, for example. It spits on 1234567890. Like so: *ptooey*.



  • @tster said:

     

    please pass your algorithm over the following "phone number"

    e4e4e4e4e4e4e4e4e4e4

    Even worse, when you try to internationalize this code you will basically be writing a whole new function for each culture.

    4444444444?

    What's so hard about that?

    What about the fact that people enter phone numbers in a great variety of formats in the US alone?

    Out of these:

    5555555555, 555 5555555, 555 555 5555, 555-555-5555, 555 555-5555, (555) 555-5555, (555) 555 5555, +15555555555, +1 (555) 555 5555, +1 (555) 555-5555, +1 555 555-5555, or 555.555.5555

    what kind of algorithm would you write, and which ones would you accept?  Why call any of those "invalid"?  Unless your system is connected to a PBX and is actually using the POTS there is no reason to be a phone-format Nazi.



  • A common way to write UK numbers in the international form is:

    +44-(0)-20-7925-0918

    Any algorithm which simply discards punctuation will be wrong. International callers must drop the bit in parentheses; UK callers must drop the 44.
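
    To make the point concrete, here is a sketch under some loud assumptions: it only handles numbers written with a leading "+", and it treats a literal "(0)" as the trunk prefix that international callers drop. The function name is invented.

```javascript
// Hypothetical normaliser for numbers written in international form.
function toInternationalDigits(input) {
  var s = input.replace(/\s+/g, "");
  if (s.charAt(0) !== "+") return null; // assumption: no country code, no guess
  s = s.replace(/\(0\)/, "");           // drop the parenthesised trunk prefix
  return "+" + s.replace(/[^0-9]/g, "");
}

var written = "+44-(0)-20-7925-0918";
console.log(written.replace(/[^0-9]/g, "")); // "4402079250918" (stray 0 kept)
console.log(toInternationalDigits(written)); // "+442079250918"
```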



  • @dhromed said:

    Our library's phone validator rejects anything that doesn't start with 0, for example. It spits on 1234567890.
     

    I used to work at a pizza delivery place, where the computer system was a P120 running SCO OpenServer with 8 Wyse serial terminals.  It had some "cool" ideas when it came to phone numbers.

    Here in Australia, we have 0+one digit+eight digit phone numbers, but before 1995-1999 it was 0+one or two digits+seven or six digits. The conversion formulas were all fairly straightforward.

    In the area where I worked, all phone numbers were in the form 07 46xx xxxx. The system allowed a 6-digit number (à la pre-8-digits) to be typed in and it would add the 0746 on the beginning. But I suspect the "phone number" field was an integer field in the database, as typing in, say, 21 would cause the phone number to be 07 4600 0021. Reports would also print out the number without the 0 on the beginning, but delivery dockets always had it in the correct formatting. If one typed in a >6 digit number the initial digits would be stripped and replaced with 0746.

    One exception made was for mobile phone numbers, which all begin with 04. However, initially it only supported what was common: mobiles beginning with 040 and 041. If you tried to insert 0438123456 the system would "correct" it to 0746123456. This was eventually fixed.

    Finally, there are valid numbers beginning with a 1, as in 1800 and 13 numbers. These would also be corrected to the 0746 format. I remember one time the army ordered some pizzas and (correctly) gave their number as 131901, but the delivery docket showed it as 0746131901, which could be a real phone number! (In fact the exchange prefix 46131 was valid in that area.) I remember this because I tried to call and got no answer. I was about to give up on the delivery when a guy came out and took the order, so it was all good.

    This system was eventually "upgraded" to 14 computers each running Windows XP. So it takes 14 P4-class computers running Windows XP to do the work of one P120 running some sort of Unix! They were only upgraded because the USA head office decreed it: virtually all Australian stores had been running the Unix software from a local software maker.



  • Note to self:  never work in Australia doing anything remotely related to phones or phone numbers. 



  • Regular expressions are slow, so they were optimizing for speed.



  •  Haha ok, so this "developer" obviously doesn't know RegEx, that stuff's just too complicated anyways...

    What I can't understand is, if you were going for straight string manipulation, wouldn't you go through the string value of the input char by char and remove each char that's not a number? Then, given the length of the number, you can tell if it's valid and format it properly. I mean, this is like the Swamp Search of JavaScript validation.



  • @El_Heffe said:

    trwww:
    Plz sirs email teh codes, I like them a lot. And how use.

    ME TOO!

     

    Please stop teh maddness!



  •  Still digging through the codebase, I found this function in a page:

     

    function isDate(chkdate, dtformat)
    {
    var err = 0; // 0 - no error 1 - error
    var slash1 = 0; // the position of the first slash in the date
    var slash2 = 0 ; // the position of the second slash in the date
    var x;
    var xx;
    /*
    // do a check for the correct number of slashes and colons if the value hold time data as well
    for (var x=0; x<chkdate.length; x++)
    {
    var xx=chkdate.substring(x,x+1);
    if(xx=="/"||xx=="-" || xx==".")
    {
    if(x==0)return false;
    if (slash1==0)slash1=x;
    else
    {
    if (slash2==0)slash2 = x;
    else err=1;
    }
    }
    }
    if(slash1==0||slash2==0)return false;
    */


    // do a check for the correct number of slashes and colons if the value hold time data as well
    for (var x=0; x<chkdate.length; x++)
    {
    var xx=chkdate.substring(x,x+1);
    if(xx=="/")
    {
    if(x==0)return false;
    if (slash1==0)slash1=x;
    else
    {
    if (slash2==0)slash2 = x;
    else err=1;
    }
    }
    }
    if(slash1==0||slash2==0)return false;

    // assign the date values
    da=chkdate.substring(slash1+1,slash2); // day
    da=rtrim(ltrim(da));
    if(da.length==1)da=0+da;

    mo=chkdate.substring(0, slash1); //month
    mo=rtrim(ltrim(mo));
    if(mo.length==1)mo=0+mo;

    yr=chkdate.substring(slash2+1,chkdate.length); // year
    yr=rtrim(ltrim(yr));



    if(isNaN(mo)||isNaN(da)||isNaN(yr))return false;
    if(yr.length==2)
    {
    if(yr>20)yr="19"+yr; // add century if not specified
    else yr="20"+yr;
    }
    if(yr.length==1)yr="200"+yr; // add century and decade if not specified

    // basic date error checking
    if (parseInt(mo,10)<1||parseInt(mo,10)>12||parseInt(da,10)<1||parseInt(da,10)>31||parseInt(yr,10)<=0||parseInt(yr.length,10)==3)err=1;

    // months with 30 days
    if((parseInt(mo,10)==4||parseInt(mo,10)==6||parseInt(mo,10)==9||parseInt(mo,10)==11)&&parseInt(da,10)==31)err=1;

    // february, leap year

    if(parseInt(mo,10)==2)
    {

    // feb
    if(parseInt(da,10)>29)err=1;
    if(parseInt(da,10)==29)
    {
    if(parseInt(yr,10)%4==0)
    {
    if(parseInt(yr,10)%100==0)if(parseInt(yr,10)%400!=0)err=1;
    }
    else err=1;
    }

    }

    if (parseInt(yr,10) > 9999 || parseInt(yr,10) < 1753) err = 1;

    if (err==0)return true;
    if (err==1)return false;
    }

     For reference, here is how I validate a date:

     function checkDate(theDate)
        {
            var dateRegex =/^(?=\d)(?:(?:(?:(?:(?:0?[13578]|1[02])(\/|-|\.)31)\1|(?:(?:0?[1,3-9]|1[0-2])(\/|-|\.)(?:29|30)\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})|(?:0?2(\/|-|\.)29\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))|(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|\.)(?:0?[1-9]|1\d|2[0-8])\4(?:(?:1[6-9]|[2-9]\d)?\d{2}))($|\ (?=\d)))?(((0?[1-9]|1[012])(:[0-5]\d){0,2}(\ [AP]M))|([01]\d|2[0-3])(:[0-5]\d){1,2})?$/;      

        return (theDate.match(dateRegex));
        }
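
    An alternative sketch that lets the built-in Date object do the calendar arithmetic. The assumptions are mine: month-first m/d/yyyy input with a four-digit year, no time part, and made-up function names. It is not equivalent to the regex above, which also accepts other separators, two-digit years, and times.

```javascript
// True if year/month/day (month 1-12) name a real calendar date.
// setFullYear is used so that years 0-99 are not mapped to 1900-1999.
function isRealDate(year, month, day) {
  var d = new Date(2000, 0, 1, 12); // noon, to stay clear of DST edge cases
  d.setFullYear(year, month - 1, day);
  // Out-of-range values roll over (Feb 30 becomes Mar 1), so compare back.
  return d.getFullYear() === year &&
         d.getMonth() === month - 1 &&
         d.getDate() === day;
}

// Assumed input format: m/d/yyyy with "/" as the only separator.
function checkDateString(text) {
  var m = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(text);
  return m !== null && isRealDate(Number(m[3]), Number(m[1]), Number(m[2]));
}
```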


  • Discourse touched me in a no-no place

    @Jonathan Holland said:

    For reference, here is how I validate a date:

    [...]

    =/^(?=\d)(?:(?:(?:(?:(?:0?[13578]|1[02])(\/|-|\.)31)\1|(?:(?:0?[1,3-9]|1[0-2])(\/|-|\.)(?:29|30)\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})|(?:0?2(\/|-|\.)29\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))|(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|\.)(?:0?[1-9]|1\d|2[0-8])\4(?:(?:1[6-9]|[2-9]\d)?\d{2}))($|\ (?=\d)))?(((0?[1-9]|1[012])(:[0-5]\d){0,2}(\ [AP]M))|([01]\d|2[0-3])(:[0-5]\d){1,2})?$/;  

    That looks so much more readable and maintainable. Do you validate email addresses using regex as well?



  • Amazingly, with a good regex tool and a healthy sample of the different dates you accept, that shouldn't be too hard to figure out and maintain. 



  • @PJH said:

    @Jonathan Holland said:

    For reference, here is how I validate a date:

    [...]

    =/^(?=\d)(?:(?:(?:(?:(?:0?[13578]|1[02])(\/|-|\.)31)\1|(?:(?:0?[1,3-9]|1[0-2])(\/|-|\.)(?:29|30)\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})|(?:0?2(\/|-|\.)29\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))|(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|\.)(?:0?[1-9]|1\d|2[0-8])\4(?:(?:1[6-9]|[2-9]\d)?\d{2}))($|\ (?=\d)))?(((0?[1-9]|1[012])(:[0-5]\d){0,2}(\ [AP]M))|([01]\d|2[0-3])(:[0-5]\d){1,2})?$/;  

    That looks so much more readable and maintainable. Do you validate email addresses using regex as well?

     

    Fixed it for you. 


  • Discourse touched me in a no-no place

    @dhromed said:

    @PJH said:

    That looks so much more readable and maintainable. Do you validate email addresses using regex as well?

     

    Fixed it for you. 

    You appear to have run out of oranges.



  • @PJH said:

    Do you validate email addresses using regex as well?

    I like this approach pretty well... I mean, wtf? You need 83 messy lines of code to match strings that are about 20 characters at max?

    Perhaps they should have made e-mail specification simpler, like:

    • a bunch of letters or numbers
    • an "@"
    • some letters with single "."s or "-"s in between
    • a "."
    • two to four letters
    That would actually have resulted in a human-readable RegEx (I know they're rare).


  • @derula said:

    Perhaps they should have made e-mail specification simpler, like:

    • a bunch of letters or numbers
    • an "@"
    • some letters with single "."s or "-"s in between
    • a "."
    • two to four letters
    That would actually have resulted in a human-readable RegEx (I know they're rare).

    ...which would still be wrong (hint: there's a .museum TLD).


  • Discourse touched me in a no-no place

    @ender said:

    @derula said:
    Perhaps they should have made e-mail specification simpler, like:

    • a bunch of letters or numbers
    • an "@"
    • some letters with single "."s or "-"s in between
    • a "."
    • two to four letters
    That would actually have resulted in a human-readable RegEx (I know they're rare).
    ...which would still be wrong (hint: there's a .museum TLD).

    ...and it doesn't allow for +'s in the local part.


  • For fun. Another futile attempt.

    ^[^@\s]+@[a-z0-9-\.]+\.([a-z]{2,4}|museum)$
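
    A quick way to see what that pattern does and does not accept, with the regex copied verbatim and no flags added. The sample addresses are invented for illustration:

```javascript
// The pattern from the post above, unchanged.
var EMAIL_ATTEMPT = /^[^@\s]+@[a-z0-9-\.]+\.([a-z]{2,4}|museum)$/;

// Invented sample addresses, not taken from any real system.
var samples = [
  "user+tag@example.museum", // accepted: "+" and .museum both pass
  "someone@example.travel",  // rejected: six-letter TLD that is not "museum"
  "User@Example.COM",        // rejected: no "i" flag, so uppercase domains fail
  "a@b..com"                 // accepted: consecutive dots in the domain slip through
];

samples.forEach(function (address) {
  console.log(address + " -> " + EMAIL_ATTEMPT.test(address));
});
```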



  • @dhromed said:

    For fun. Another futile attempt.

    ^[^@\s]+@[a-z0-9-\.]+\.([a-z]{2,4}|museum)$

    Nice. I should consider applying that to my wtf-y Content Management System.

    ...

    No, f2k .museum. 

