BLOCKED SCRIPT instr, even with ascii conversion yields random results



  • I've been at work for hours now on the simplest thing.  Email validation.



    background: This portion of the app is ultimately importing a CSV file
    to SQL.  On the page in question I'm importing a CSV file and
    displaying accepted and rejected records for the user based on
    valid/invalid email addresses in the import file.



    1)  Of course the first thing I tried was a series of two instr
    statements.  The obvious and quick one, but lo and behold, even
    with an import file containing 8 records, 5 of which had valid email
    addresses, I could only get 4 to be recognized by instr(textstring).



    2)  Well, I thought, perhaps if I convert the string to ascii code
    and instr on my codes I'll get a better response.  Interestingly, I
    still only got back 4 records with valid emails, but the erroneously
    omitted valid email was a different one than before.



    3)  So I said, "if both methods manage to catch all of the valid
    emails I'll just use both"; and that's what I did.  It worked on
    my 8 record import over and over again and for days more of development
    in other places.


    4)  Then suddenly, as I began stepping up my import size toward 150
    records, the problem returned (even when using instr on both the text
    and ascii code strings).


    5)  Finally, since I had mastered array = split(string, delimiter) for the
    file.readline text import, I decided WTF, I'll try splitting the email
    addresses on "@" into an array, and then taking the 2nd element
    (index 1) of that array and splitting it on "."



      It worked beautifully and takes a lot less code than the ascii
      conversion although it takes more code than instr would IF INSTR Fn
      WORKED!



      Here's the snippet - yah yah: ugly "if"s, horrible variable names -but:
      it works EVERY time and there are NO loops and I'm not changing $#!T as
      I would waste even more time on what SHOULD have worked the first time
      I wrote it.



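          validemail = "false"

          ' reject anything of 6 characters or fewer outright
          if len(varEmailAddress) > 6 then

              ' split on "@": ubound > 0 means at least one "@" was present
              test = split(varEmailAddress, "@")

              if ubound(test) > 0 then

                  ' split the second piece (index 1) on ".": needs at least one "."
                  test2 = split(test(1), ".")

                  if ubound(test2) > 0 then
                      validemail = "true"
                  end if

              end if

          end if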



      peace,

      Joey


  • The funny thing is how many address formats RFC822 allows (such as comments within parentheses), which is surprising to the uninitiated, and how few people know about them. At least half the forms on the web will reject perfectly valid email addresses while accepting other invalid ones.



  • true true, but I'm not validating from a web form, so I'm basically
    trying to catch corrupted (completely nonsense) fields during the
    import.  It's an opt-in newsletter so if someone wants to put a
    bullshit email address in it will just throw a soft error during
    broadcast and alert the administrator to fix it or delete it



    All of the asp email validation techniques that I have found via google
    (I'm not THAT clueless) employ the instr function somehow, and many of
    them are too messy and long to be calling during a thousand record text
    file import.  And instr yields erratic results - maybe because I'm
    textfilehook.readline'ing through a file in the web directory during
    the import and that is too quick for instr to keep up?  It isn't
    completely random: it will choose some records to pick on with
    instr(var, text), others with instr(var, asciicode), and still others
    with both employed - is anyone getting the precedence of this here?
    And yes, I have gone back and recreated sample input files from scratch
    and gotten the same oddball behavior.



    The web form validation for the public subscribe page will be
    completely different, and likely I'll end up using a character by
    character loop for that, since I'll have all the time in the world at
    that point.



    I'll look around again this time focussing on the RFC standard and let you know what I end up doing.



    Any comments on "if instr(var, text) then"  randomness would be especially welcome.  Because THAT is the WTF here.
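
    For reference, a minimal sketch of an explicit instr check in VBScript. Nothing in the thread shows why the original checks misbehaved, so this is not a diagnosis. The helper name LooksLikeEmail is hypothetical, and the sketch assumes the field may arrive as Null or with stray spaces around it. InStr(string1, string2) returns a 1-based position, 0 when nothing is found, and Null when an argument is Null, so the result is compared against a number instead of being used as a bare condition.

```vbscript
' Hypothetical helper, not from the original post.
Function LooksLikeEmail(ByVal rawField)
    Dim s, atPos
    LooksLikeEmail = False

    ' InStr returns Null when an argument is Null, so bail out early.
    If IsNull(rawField) Then Exit Function

    ' Trim removes leading and trailing spaces left over from the import.
    s = Trim(CStr(rawField))

    ' InStr(string1, string2) searches string1 for string2 and returns
    ' a 1-based position, or 0 when string2 is not found.
    atPos = InStr(s, "@")

    ' Require at least one character before the "@".
    If atPos > 1 Then
        ' Optional first argument: start the "." search just past the "@".
        If InStr(atPos + 1, s, ".") > 0 Then
            LooksLikeEmail = True
        End If
    End If
End Function
```

    Like the split version, this only weeds out obvious junk such as phone numbers; it makes no attempt at RFC822.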




  • okay, how long did that take 3 minutes?



    3 minutes to find 12 or so to communicate... I haven't slept in 27 hours :)



    Coding the RFC specification is outside of the scope of this project...


    I searched google for your exact search phrase and turned up a
    bunch of instr crap like I thought (which might work great on a
    single record check forever), then I tripped over an email validating
    component, hex-something or other (which I had completely ignored
    before but this time



    (again for the imports the split will do fine - I'm just getting
    rid of phone numbers and crap from people importing things badly into
    outlook before they export them for the subscriber lists --as fast as possible)



    The web form validation: beyond the BLOCKED SCRIPT, since I'm
    going to be coding an essential listserv (that portion in
    .net), we got aspnetemail to do the dirty work.  I have aspnetemail
    available to me in asp too - had to get the hosting provider to
    register it though.  Perhaps I can confirm the address by either
    sending an email or setting up an smtp connection to the target mx
    (real quick) to validate that it is real.  In the web form I again
    have a good bit of time to play with, so a lot is possible.  I know,
    I know, it still doesn't validate that the email address belongs to
    the person signing up, but that is functionality far outside the
    scope of this site "recreate" project.  (somebody did a huge
    clusterWTF in the php version - I couldn't look at php code for a
    week without getting nauseous - kidding, not about the cluster though)



    It might be quicker too, for me to just set up a temp table and send
    out a uniqueid validation link that they have to click out of their
    email.



    I haven't started coding the broadcasting part of the app yet, so I'm
    sure I'll see the capabilities and limitations of the aspnetemail component as far as the web form
    goes in the next few days.



  • Call me a regexp fanatic, but wouldn't this be a prime example for using regular expressions?

    ^\S+@\S+.\S{2,4}$

    Would at least perform a basic sanity check, and could easily be expanded later if needed.



  • @phelyan said:

    Call me a regexp fanatic, but wouldn't this be a prime example for using regular expressions?

    ^\S+@\S+.\S{2,4}$

    Would at least perform a basic sanity check, and could easily be expanded later if needed.

     

    If the language supported it, yes.

    Drak



  • @Drak said:

    @phelyan said:

    Call me a regexp fanatic, but wouldn't this be a prime example for using regular expressions?

    ^\S+@\S+.\S{2,4}$

    Would at least perform a basic sanity check, and could easily be expanded later if needed.

     

    If the language supported it, yes.

    Drak

    *twitch* There are some that don't? *twitch* Urks...



  • @tufty said:

    > Coding the RFC specification is outside of the scope of this project

    Fair enough, but be aware that you are liable to get all manner of validly RFC-formatted email addresses if you're accepting data from users. You really want to be dealing with the majority of them. I've written listservs, email systems, custom spam filters, and various other stuff along those lines. You wouldn't believe the crap you get sent that is valid.

    As ASP.net should have regular expressions, why not use this (very basic) format validator? It's far from perfect but about a million times better than what you have.

    /^\s*(\w+([-+.]\w+)*)@([\w-]+\.)+[\w-]+\s*$/

    And, frankly, if you've got a library that does email type stuff, use its validation. I assume aspnetemail does have some basic capabilities?

    Simon

    He's not using ASP.Net as far as I can see... At least, ubound looks more like ASP to me. ASP.Net uses .Length to check the size of an array...

    Drak



  • @Drak said:

    He's not using ASP.Net as far as I can see... At least, ubound looks
    more like ASP to me. ASP.Net uses .Length to check the size of an
    array...

    Drak



    ... and can still use Ubound() or the .GetUpperBound method to find the upper bound of an array (which is not congruent with the .Length).


  • What language is this? Looks like VB.  Then can't you use a RegExp
    object, or does VB not have this?  If so, you can use a regexp
    like this one:



    "^([a-zA-Z0-9][a-zA-Z0-9\._\-!+#]*)@([a-zA-Z0-9][a-zA-Z0-9\-]*(\.[a-zA-Z0-9][a-zA-Z0-9\-]*)+)$"



    Which is what I use and works rather well.



        -dZ.



  • @DZ-Jay said:

    What language is this? Looks like VB.  Then can't you use a RegExp object, or does VB not have this?  If so, you can use a regexp like this one:

    "^([a-zA-Z0-9][a-zA-Z0-9\._\-!+#]*)@([a-zA-Z0-9][a-zA-Z0-9\-]*(\.[a-zA-Z0-9][a-zA-Z0-9\-]*)+)$"

    Which is what I use and works rather well.

        -dZ.

     

    Can't you use something like the \w in Python? It's equivalent to a-zA-Z0-9_  ...

    "^([\w][\w\.\-!+#]*)@([\w][\w\-]*(\.[\w][\w\-]*)+)$"

    Special sequences are the fashion in Paris I hear.



  • WTF does "blocked script" have to do with anything?



  • @tufty said:

    script injection attacks, of course. I should preview. Oh, yeah. Not. Oh, hang on. edit. Oh. No.

    What the fuck?

    Simon





    Couldn't have been much of a script to be in the title of the post. I just want to know what word got replaced with BLOCKED SCRIPT






  • I believe  j a v a s c r i p t  (1) and  v b s c r i p t (2)  get replaced. Maybe with < s c r i p t   l a n g u a g e = " " > (3 & 4) around them.

    (1) javascript

    (2) vbscript

    (3) <script language="javascript">

    (4) <script language="vbscript">

    Drak



  • Hmm, or maybe not.



  • javascript



  • Weird, I had this once. Maybe it's case sensitive.


    • javascript
    • Javascript
    • JavaScript


  • @Cyresse said:

    @DZ-Jay said:

    What language is this? Looks like VB.  Then can't you use a RegExp object, or does VB not have this?  If so, you can use a regexp like this one:

    "^([a-zA-Z0-9][a-zA-Z0-9\._\-!+#]*)@([a-zA-Z0-9][a-zA-Z0-9\-]*(\.[a-zA-Z0-9][a-zA-Z0-9\-]*)+)$"

    Which is what I use and works rather well.

        -dZ.

    Lots of you have suggested using regular expressions, but I didn't see anyone mention how to execute a regular expression from vbscript or vb, which doesn't support it natively.
    OK, so I just googled and found that Microsoft has apparently added it (?)

    I was going to suggest using com+ , createobject or server.createobject if working in asp.

    If this works...then you won't need to. Replace pattern with whatever above.

    <%

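    Function isValidEmail(myEmail)
      dim isValidE
      dim regEx

      isValidE = True
      ' RegExp is built into the VBScript engine, so no createobject is needed
      set regEx = New RegExp

      ' case-sensitive match; the pattern lists both cases explicitly
      regEx.IgnoreCase = False

      regEx.Pattern = "^[a-zA-Z][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$"
      ' Test returns True only when the whole address matches the anchored pattern
      isValidE = regEx.Test(myEmail)

      isValidEmail = isValidE
    End Function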

    %>

     

    Can't you use something like the \w in Python? It's equivalent to a-zA-Z0-9_  ...

    "^([\w][\w\.\-!+#]*)@([\w][\w\-]*(\.[\w][\w\-]*)+)$"

    Special sequences are the fashion in Paris I hear.
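
    As a rough sketch of how the \w pattern quoted above could be dropped into the same RegExp object: the wrapper name MatchesEmailPattern is made up for illustration, and appending an empty string to the argument is an assumption about how Null fields from the import should be treated (as empty, hence invalid).

```vbscript
' Hypothetical wrapper; the pattern is the \w version quoted above.
Function MatchesEmailPattern(ByVal candidate)
    Dim re
    Set re = New RegExp

    ' Backslashes need no doubling inside a VBScript string literal.
    re.Pattern = "^([\w][\w\.\-!+#]*)@([\w][\w\-]*(\.[\w][\w\-]*)+)$"
    re.IgnoreCase = False   ' \w already covers upper and lower case
    re.Global = False       ' one anchored match is all that is needed

    ' Appending "" turns a Null field into an empty string, which fails the test.
    MatchesEmailPattern = re.Test(candidate & "")

    Set re = Nothing
End Function

' MatchesEmailPattern("joey@example.com")   ' True
' MatchesEmailPattern("555-1234")           ' False, no "@"
' MatchesEmailPattern("joey@localhost")     ' False, no "." after the "@"
```

    The pattern is still a sanity check only; it rejects plenty of addresses that RFC822 allows.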



  • maybe it doesn't like P.H.P. tags?

    ?> test

     

    <?php test

