Security by ignoring RFCs



  • from    PJH
    to    customerservice@firstsave.co.uk
    date    17 Jan 2008 13:29:58 -0000
    subject    Error in online application process.

       
    I'd like to complain that your online application process refuses to accept an address with a + in the part before the @ sign, claiming the address to be 'invalid'

    from    customerservice@firstsave.co.uk
    to    PJH
    date    Jan 17, 2008 4:06 PM
    subject    FirstSave


    Dear PJH

    Thank you for your email of 17th January 2008 our online application process

    I can confirm that for the security of our accounts we are only able to accept standardised e-mail address that do not include any special characters. However, we do apologise for any inconvenience caused by this.

    from    PJH
    to    customerservice@firstsave.co.uk
    date    Jan 17, 2008 4:19 PM
    subject    Re: FirstSave

    May I enquire as to what 'security' is enabled by not allowing valid characters in email addresses?  I can conceive of no reasonable reason for this. For example, the + character is used by Google Mail to allow its users to filter email based on the local part of the email address used.

    Furthermore, may I suggest you refer whoever came up with this brain-dead idea to RFC 2822 (http://tools.ietf.org/html/rfc2822)?

    I don't expect a reply to this last one.

     
    Twats. 

     

     



  • Maybe you should have also said them that the RFC is the standard as nothing now implies so unless they already know.



  • I've been fighting this battle for years now... people writing "validation" regex for "email" addresses who haven't seen RFC[2]822, and ignoring the standard ways of doing this.  Of course, we have other people who copy those people, so it's really becoming this bad virus of what "email" addresses are.  Most of the time, these regex would reject my test address <fred&barney@stonehenge.com>, which has been in place for about a dozen years now. (Go ahead, try it... it's an autoresponder.)  There is no inherent insecurity in accepting '822.  It just means you coded bad somewhere else.  Never let an email address near an unescaped SQL parameter or shell command line!  It's not hard, people!
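
A minimal sketch of the "escape it properly and stop worrying" point, using Python's standard sqlite3 module. The table name, the helper name, and the second (hostile) address are made up for illustration; the only claim is that a bound parameter is stored as data no matter which characters the address contains.

```python
import sqlite3


def store_email(conn: sqlite3.Connection, email: str) -> None:
    # The "?" placeholder hands the value to the driver as data, so quotes,
    # semicolons and "--" in the address are never parsed as SQL.
    conn.execute("INSERT INTO subscribers (email) VALUES (?)", (email,))


# Hypothetical in-memory table, purely for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subscribers (email TEXT)")

addresses = [
    "fred&barney@stonehenge.com",
    "x'; DROP TABLE subscribers;--@example.com",  # hostile-looking, still just a string
]
for address in addresses:
    store_email(conn, address)

# Both rows come back unchanged and the table is still there.
rows = [row[0] for row in conn.execute("SELECT email FROM subscribers ORDER BY rowid")]
```

The same idea applies to shell commands: pass the address as a separate argument (for example a list passed to subprocess.run) rather than pasting it into a command string.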



  • @PJH said:

    May I enquire as to what 'security' is enabled by not allowing valid characters in email addresses?
    Their financial security since they are able to sell lists of email addresses to spammers.



  • Hey - try my new email address:

     me'; drop database--@gmail.com
     



  • When a feature is poorly implemented, one can say that it is crippled for "security reasons".



  • @m0ffx said:

    @PJH said:
    May I enquire as to what 'security' is enabled by not allowing valid characters in email addresses?
    Their financial security since they are able to sell lists of email addresses to spammers.
    Didn't work with me; I just used a sneakemail address instead.



  • @realmerlyn said:

    I've been fighting this battle for years now... people writing "validation" regex for "email" addresses who haven't seen RFC[2]822, and ignoring the standard ways of doing this.  Of course, we have other people who copy those people, so it's really becoming this bad virus of what "email" addresses are.  Most of the time, these regex would reject my test address <fred&barney@stonehenge.com>, which has been in place for about a dozen years now. (Go ahead, try it... it's an autoresponder.)  There is no inherent insecurity in accepting '822.  It just means you coded bad somewhere else.  Never let an email address near an unescaped SQL parameter or shell command line!  It's not hard, people!

    Oh, for grins, google for:  site:regexlib.com "wrong wrong wrong"

    You'll see all the times I've been fighting this in a place intending to exchange regular expressions for things. 



  • You really could have just replied saying that foo+bar@wtf.com is a "standardized" email address with no "special" characters, and referred them to the RFC.  Politeness (or sarcasm misinterpreted as politeness) generally has a better chance of success. 

     As further reference (since this is a UK site), you could point them to any of the multiple US-based sites that only accept a "standardized" email address with only one dot in the domain name (or even more fun, the ones that will only accept .com .net or .org).  They're out there, but I don't recall seeing one any time recently.



  • The real WTF is that the first google hit for "email regex" is this douche:

    http://www.regular-expressions.info/email.html

    His (non-compliant, terribly restrictive) regex works for him on his shitty little site, it must be good enough for everyone!

     



  • I'm happy when sites at least accept a dot in the pre-@ part of the email address.  It shouldn't really surprise anybody that these rare characters aren't accepted by a lot of sites.  Perhaps the real WTF is that the RFC allows them in the first place.



  • @merreborn said:

    The real WTF is that the first google hit for "email regex" is this douche:

    http://www.regular-expressions.info/email.html

     

    His (non-compliant, terribly restrictive) regex works for him on his shitty little site, it must be good enough for everyone!


    He even goes as far as to mention the correct way of doing it in a regexp and then immediately dismisses it. The cube is perfectly fine for checking an email which should only be done when the user signs up and when the user changes their email.



  • @Latexxx said:

    Maybe you should have also said them that the RFC is the standard as nothing now implies so unless they already know.

     

    Huh?



  • @bighusker said:

    @Latexxx said:

    Maybe you should have also said them that the RFC is the standard as nothing now implies so unless they already know.

     

    Huh?

    This is what happens when you don't speak English and use some random translation service to translate to a few languages in turn.



  • @Lingerance said:

    @bighusker said:

    @Latexxx said:

    Maybe you should have also said them that the RFC is the standard as nothing now implies so unless they already know.

     

    Huh?

    This is what happens when you don't speak English and use some random translation service to translate to a few languages in turn.

    I'm relieved that I'm not the only one who didn't understand Latexxx. I thought I was lacking English classes. 



  • @Lingerance said:

    @merreborn said:

    The real WTF is that the first google hit for "email regex" is this douche:

    http://www.regular-expressions.info/email.html

     

    His (non-compliant, terribly restrictive) regex works for him on his shitty little site, it must be good enough for everyone!


    He even goes as far as to mention the correct way of doing it in a regexp and then immediately dismisses it. The cube is perfectly fine for checking an email which should only be done when the user signs up and when the user changes their email.

    My favorite part is how he suggests expanding the regex to check for specific TLDs because the standard one will allow domains with invalid TLDs like "asdf.asdf" or "aol.com.nospam". Too bad it'll still accept stuff like "fuyasermnslkvjyilausrasdflkjatas.com" (assuming said domain does not actually exist).

    Seems he's never heard of the technique of taking the hostname from an email address and doing a [b]DNS lookup[/b] to see if it exists - yes, a plain DNS lookup; contrary to what you might think, MX records [b]are not required[/b] for mail delivery, though some non-compliant mail servers such as Exchange still require it (as an amusing aside, when I lost the MX record on my own domain, ALL spam to it immediately ceased, while most normal mail kept working; once I readded the MX record, the spam started up again). Sure, it's a little bit slower, but it's a hell of a lot simpler than trying to keep one's site up to date with every new TLD that gets added.
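
A rough sketch of that lookup with nothing but Python's standard library. The function name and the injectable `resolver` parameter are assumptions added for illustration (the parameter only exists so the check can be exercised without a network). It asks for address records only, so a domain that publishes nothing but MX records would fail it; an MX query needs a DNS library outside the standard library.

```python
import socket
from typing import Callable


def domain_resolves(address: str,
                    resolver: Callable[..., object] = socket.getaddrinfo) -> bool:
    """Return True if the host part of the address resolves in DNS."""
    # Split on the last "@" so quoted local parts containing "@" still work.
    _, sep, host = address.rpartition("@")
    if not sep or not host:
        return False
    try:
        resolver(host, None)  # address lookup only; no mail is sent
    except (socket.gaierror, UnicodeError):
        # gaierror: the name did not resolve; UnicodeError: not a valid hostname
        return False
    return True
```

Usage would be something like `domain_resolves("someone@example.com")`, at the cost of one DNS round trip per signup.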



  • @Quietust said:

    Seems he's never heard of the technique of taking the hostname from an email address and doing a [b]DNS lookup[/b] to see if it exists - yes, a plain DNS lookup;
    Is it legal to do that these days?

    http://www.channelregister.co.uk/2008/01/17/anti_spam_activist_lawsuit/

    http://yro.slashdot.org/article.pl?sid=08/01/17/0417209

    </hyperbole>



  • @merreborn said:

    The real WTF is that the first google hit for "email regex" is this douche:

    http://www.regular-expressions.info/email.html

    His (non-compliant, terribly restrictive) regex works for him on his shitty little site, it must be good enough for everyone!

     Oh yeah... I've written that guy to try to get him to change his site.  He laughs at me, I guess.  So I laugh in his general direction in return. 



  • @PJH said:

    @Quietust said:

    Seems he's never heard of the technique of taking the hostname from an email address and doing a [b]DNS lookup[/b] to see if it exists - yes, a plain DNS lookup;
    Is it legal to do that these days?

    http://www.channelregister.co.uk/2008/01/17/anti_spam_activist_lawsuit/

    http://yro.slashdot.org/article.pl?sid=08/01/17/0417209

    </hyperbole>

    Scary, but I believe that's a bit different - he was doing a zone transfer, requesting bulk amounts of information. All I suggested was doing a plain DNS 'A' lookup - anybody suggesting that those are illegal would have to be a complete idiot, as it is akin to looking up somebody's number in a public phone book.



  • @GettinSadda said:

    Hey - try my new email address:

     me'; drop database--@gmail.com
     

    Gmail's servers say you don't exist.



  • @realmerlyn said:

    I've been fighting this battle for years now... people writing "validation" regex for "email" addresses who haven't seen RFC[2]822, and ignoring the standard ways of doing this.  Of course, we have other people who copy those people, so it's really becoming this bad virus of what "email" addresses are.  Most of the time, these regex would reject my test address <fred&barney@stonehenge.com>, which has been in place for about a dozen years now. (Go ahead, try it... it's an autoresponder.)  There is no inherent insecurity in accepting '822.  It just means you coded bad somewhere else.  Never let an email address near an unescaped SQL parameter or shell command line!  It's not hard, people!

    Having written a C++ class for determining valid '822 addresses (and having had to make it configurable at runtime so that the company using it could eliminate compliant characters they couldn't handle!), I can attest to how weak a validation strategy regex is for this.  It reminds me of a sort of koan about regex:

     Suppose a programmer has a problem to solve.
    Suppose she chooses regex as a solution to the problem.
    Now she has two problems.

    The C++ code for it wasn't exactly a picnic, either.  I could tell because the ants didn't show up.



  • @bighusker said:

    @Latexxx said:

    Maybe you should have also said them that the RFC is the standard as nothing now implies so unless they already know.

     

    Huh?

    I never liked the term RFC.  It stands for "Request for Comment", which doesn't sound like the sort of thing that's been approved and become a standard.  Once it's been commented on, and the comments have been addressed, it should be renamed to something else.



  • The Real You Know What is that Google mail doesn't allow underscores in the mailbox name. Only alphanumerics, period and dash.



  • @GettinSadda said:

    Hey - try my new email address:

     me'; drop database--@gmail.com
     

     Hey look everyone, it's Bobby Tables!!

     



  • @mrprogguy said:

    It reminds me of a sort of koan about regex:

     Suppose a programmer has a problem to solve.
    Suppose she chooses regex as a solution to the problem.
    Now she has two problems.

    That's the infamous JWZ quote, misquoted.  See http://en.wikipedia.org/wiki/Jamie_Zawinski for the real quote.



  • @GettinSadda said:

    Hey - try my new email address:

     me'; drop database--@gmail.com
     

    [img]http://imgs.xkcd.com/comics/exploits_of_a_mom.png[/img]


  • @alegr said:

    The Real You Know What is that Google mail doesn't allow underscores in the mailbox name. Only alphanumerics, period and dash.

    I'm glad that I use an email provider that lets me use addresses of the form: anything@username.non-wtf-email.com. Lets me hand out a different address to different sites / companies, which might be good to have. So far, it's been useful exactly once, when I could see just from which site an address had been lifted by spammers - but at least it feels good if you want to do something interesting in the future.

    (It's FastMail.FM, in case anyone was looking for just that. Their free accounts won't let you do the random username thing, but there is a $15 "pay once" option. They're pretty good, IMHO. (And yes, that's a referral link - copy/paste if you don't like it.))



  • @merreborn said:

    The real WTF is that the first google hit for "email regex" is this douche:

    http://www.regular-expressions.info/email.html

    That website is, in general, somewhere between "poor" and "dangerous". There is nothing on it of any great value and a lot of things that are wrong.
     

    @Quietust said:

    Seems he's never heard of the technique of taking the hostname from an email address and doing a [b]DNS lookup[/b] to see if it exists - yes, a plain DNS lookup;

    The whole concept is braindamaged anyway.

    If you want to validate that an email address is correct, send it a mail with a validation link in it, and tell the user to go follow it.

    If you aren't going to bother, then why waste time on partial tests that still don't tell you whether it's the right email address? Either you care about having this person's address (in which case you need to validate it properly), or you don't (in which case you shouldn't be bothering). 
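
One way the validation-link approach could look, as a sketch only: the secret key, the base URL and all function names here are hypothetical, and actually sending the mail is left to whatever MTA the site already uses. The token is an HMAC of the address, so the site can check a returned link without storing anything per signup.

```python
import hashlib
import hmac
import secrets
from urllib.parse import urlencode

# Hypothetical per-deployment secret; a real site would load it from config.
SECRET_KEY = secrets.token_bytes(32)


def make_token(email: str, key: bytes = SECRET_KEY) -> str:
    # Keyed hash of the address: cannot be forged without the key.
    return hmac.new(key, email.encode("utf-8"), hashlib.sha256).hexdigest()


def confirmation_link(email: str,
                      base_url: str = "https://example.com/confirm") -> str:
    # urlencode escapes "+", "&" and "@" so unusual addresses survive the URL.
    query = urlencode({"email": email, "token": make_token(email)})
    return f"{base_url}?{query}"


def is_confirmed(email: str, token: str, key: bytes = SECRET_KEY) -> bool:
    # Constant-time comparison of the expected and the presented token.
    expected = make_token(email, key)
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))
```

The address is only treated as belonging to the user once the link comes back and `is_confirmed` returns True; no syntax check is involved at any point.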



  • @H|B said:

    @Lingerance said:
    @bighusker said:

    @Latexxx said:

    Maybe you should have also said them that the RFC is the standard as nothing now implies so unless they already know.

     

    Huh?

    This is what happens when you don't speak English and use some random translation service to translate to a few languages in turn.

    I'm relieved that I'm not the only one who didn't understand Latexxx. I thought I was lacking English classes. 

     

    I must have been pretty drunk when writing that. I can hardly even understand it myself. 



  • @H|B said:

    @Lingerance said:
    @bighusker said:

    @Latexxx said:

    Maybe you should have also said them that the RFC is the standard as nothing now implies so unless they already know.

     

    Huh?

    This is what happens when you don't speak English and use some random translation service to translate to a few languages in turn.

    I'm relieved that I'm not the only one who didn't understand Latexxx. I thought I was lacking English classes. 

    Come on, give the guy a break.  Either "them" is a typo for "then" or, more likely, the word "to" is missing: easy mistake when typing fast.  I would have put a comma after "standard", to set off the remainder of the phrase.  Everything from "unless" on is redundant, but doesn't obscure the meaning of the statement. 

    Now I'm wondering how long it will be before standardized tests start to include an "internet reading comprehension" section, LOL.
     



  • Update!!

    Pretty much what I expected...

    from customerservice@firstsave.co.uk
    to PJH
    date Jan 18, 2008 3:32 PM
    subject    FirstSave


    Dear PJH


    Thank you for your email of 17 January.

    Please accept my apologies for the difficulties you experienced when applying for our account.

    Our system is set up not to allow special characters in email addresses to prevent any invalid addresses being given. However, your comments have been put forward to the appropriate department who may choose to change this in the future.

     

     



  • Well, even Gmail doesn't accept all valid e-mail addresses: try using "Firstname Lastname"@domain.com and it won't accept it.



  • @PJH said:

    Pretty much what I expected...

    from customerservice@firstsave.co.uk
    to PJH
    date Jan 18, 2008 3:32 PM
    subject    FirstSave


    Dear PJH


    Thank you for your email of 17 January.

    Please accept my apologies for the difficulties you experienced when applying for our account.

    Our system is set up not to allow special characters in email addresses to prevent any invalid addresses being given. However, your comments have been put forward to the appropriate department who may choose to change this in the future.

     

    What else would anyone expect? The customer service person to answer "I updated the codebase, tested and released the new regex you provided. Works much better! Thx! - Meaningless Customer Service Employee"?

    Come on, the validation is stupid, and definitely a WTF, but you can hardly blame this poor bastard. Hopefully they really did pass it on, and maybe someone will incorporate it in at some point.



  • @asuffield said:

    The whole concept is braindamaged anyway.

    If you want to validate that an email address is correct, send it a mail with a validation link in it, and tell the user to go follow it.

    If you aren't going to bother, then why waste time on partial tests that still don't tell you whether it's the right email address? Either you care about having this person's address (in which case you need to validate it properly), or you don't (in which case you shouldn't be bothering).

    The whole point of using an email regex is so you [i]don't have to send an email[/i] to something which you could be otherwise 100% certain [b]will not work[/b] - it's not to tell you that the email address is valid, but to tell you that it isn't invalid in a particular way.

    In fact, some systems which require email addresses (such as forums) use [b]both[/b] methods - first they pass your email through a regex to make sure it isn't an obvious fake, and [i]then[/i] they send a mail with a validation link.

    Yes, it's true, mail servers do perform [i]all[/i] necessary validation (by definition), but if you rely on them entirely then you're going to be getting lots and lots of SMTP error messages from all of the crap you didn't filter out beforehand.
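
A sketch of what an "isn't invalid in a particular way" pre-check might look like if it is kept deliberately loose. The function name and the exact rules are assumptions: it only rejects input that cannot be routed at all (no "@", empty local part, empty domain, whitespace in the domain) and leaves everything else to the mail server and the confirmation mail.

```python
def obviously_undeliverable(address: str) -> bool:
    """Return True only for input that cannot possibly be a routable address."""
    # Split on the last "@": a quoted local part may itself contain "@".
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return True
    # Whitespace can appear in a quoted local part, but never in a domain.
    if any(ch.isspace() for ch in domain):
        return True
    return False
```

Addresses such as myname+unique@gmail.com, fred&barney@stonehenge.com and "Firstname Lastname"@domain.com all pass, which is the point: the check filters typos like a missing "@" without second-guessing the RFC.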



  • @Quietust said:

    Yes, it's true, mail servers do perform [i]all[/i] necessary validation (by definition), but if you rely on them entirely then you're going to be getting lots and lots of SMTP error messages from all of the crap you didn't filter out beforehand.

    So what? Why on earth does it matter that your send_mail() function returns an error? Why would you waste time AND BREAK THE DAMN APPLICATION by "filtering" the results first rather than just letting your MTA do its job? 



  • Well, attracting users to a site is always a compromise:

    If you want the maximum amount of users, but not much control over them, don't require registration at all.
    If you want the maximum amount of control at the cost of a few users, require them to confirm their email address.
    If you want something in the middle, do a quick check on their email but don't require them to confirm it.



  • @Cap'n Steve said:

    Well, attracting users to a site is always a compromise:

    If you want the maximum amount of users, but not much control over them, don't require registration at all.
    If you want the maximum amount of control at the cost of a few users, require them to confirm their email address.
    If you want something in the middle, do a quick check on their email but don't require them to confirm it.

    I am puzzled as to why you think knowing somebody's email address gives you control over them.

    The only possible reason to want an email address is to send somebody mail. Either you want to do this reliably for every user, or you do not. There is no "middle".



  • I mostly meant control over whether they can use your site or not. Banning an IP address is almost useless, banning a username is better, and banning a username and email is even better.



  • There is no limit to the number of email addresses that a person can generate with no effort or expense. That just doesn't work. 



  • @asuffield said:

    There is no limit to the number of email addresses that a person can generate with no effort or expense. That just doesn't work. 

    Which means that all attempts to ban a determined user are really just band-aids on a leaking dam. It doesn't do anything other than make you feel good.



  • How is signing up for a free email account "no effort"? If they have to get a new email address, sign up for your site, and then confirm that email every time they get banned, trolling suddenly doesn't seem so appealing any more.



  • @Cap'n Steve said:

    How is signing up for a free email account "no effort"? If they have to get a new email address, sign up for your site, and then confirm that email every time they get banned, trolling suddenly doesn't seem so appealing any more.
    So you're saying those steps can't be automated? The most effective way is to require payment upon registration, which is not suitable for most sites.



  • @Cap'n Steve said:

    @asuffield said:
    There is no limit to the number of email addresses that a person can generate with no effort or expense.
    How is signing up for a free email account "no effort"?

    Who said anything about signing up for a free email account? I've already mentioned two methods in this thread (myname+unique@gmail.com and sneakemail.com) that don't require you to sign up to an ESP to get a new email address. And neither even requires you to visit their websites to come up with the address.



  • @PJH said:

    @Cap'n Steve said:

    @asuffield said:
    There is no limit to the number of email addresses that a person can generate with no effort or expense.
    How is signing up for a free email account "no effort"?

    Who said anything about signing up for a free email account? I've already mentioned two methods in this thread (myname+unique@gmail.com and sneakemail.com) that don't require you to sign up to an ESP to get a new email address. And neither even requires you to visit their websites to come up with the address.

    And that's before we even consider anybody who owns their own domain names and mail servers. I routinely create new throwaway addresses by adding entries to /etc/aliases, for sites that insist on a real address but which I don't particularly trust. I could construct a whole subdomain just as easily (about ten lines to write, instead of one, in three directly accessible files).



  • steps:

    1.  Ban an account and the IP.
    2.  For troublesome trolls put a temporary ban on a range of IPs.  Current users will still be able to post because they already have an account.  The only problem is if someone new within that range attempts to create an account. Oh well.
    3.  For people using the address+unique@gmail.com trick... that's a pretty easy one to figure out and ban...
    4.  For people with their own domain...  just ban the whole domain.  It's pretty obvious that joescandystore.net is not an email provider.
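
Step 3 could be sketched roughly like this. The function name is hypothetical, and two assumptions are baked in: that the provider treats everything after "+" as a tag on the same mailbox (true for the Gmail trick mentioned in this thread, not guaranteed anywhere else), and that the provider ignores case in the local part.

```python
def ban_key(address: str) -> str:
    """Collapse plus-tagged variants of an address to one key for ban lists."""
    local, _, domain = address.rpartition("@")
    # Assumption: "+tag" is a subaddress of the same mailbox.
    base = local.split("+", 1)[0]
    # Assumption: case does not matter to this provider.
    return f"{base}@{domain}".lower()


banned = {ban_key("myname@gmail.com")}
is_banned = ban_key("MyName+unique@gmail.com") in banned
```

On a provider where "+" is just another character, this would wrongly merge two different mailboxes, so the normalisation only makes sense per provider.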



  • @tster said:

    . . .For troublesome trolls put a temporary ban on a range of IPs.  Current users will still be able to post because they already have an account. . . .
    . . .
    4.  For people with their own domain...  just ban the whole domain.  It's pretty obvious that joescandystore.net is not an email provider.

    Blanket bans are fun because no innocent person will ever be affected by a blanket ban.



  • @asuffield said:

    @PJH said:

    @Cap'n Steve said:

    @asuffield said:
    There is no limit to the number of email addresses that a person can generate with no effort or expense.
    How is signing up for a free email account "no effort"?

    Who said anything about signing up for a free email account? I've already mentioned two methods in this thread (myname+unique@gmail.com and sneakemail.com) that don't require you to sign up to an ESP to get a new email address. And neither even requires you to visit their websites to come up with the address.

    And that's before we even consider anybody who owns their own domain names and mail servers. I routinely create new throwaway addresses by adding entries to /etc/aliases, for sites that insist on a real address but which I don't particularly trust. I could construct a whole subdomain just as easily (about ten lines to write, instead of one, in three directly accessible files).

    I've heard they have the technology now to ban entire domains. Quantum computers or something.



    Yes, nothing's perfect. Theoretically, someone could write a bot to register domains and create fake emails all day long. They could automate Yahoo's registration process, my site's registration process, etc. If they want to do all that just to troll my site, then I guess I'm screwed.



  • @Lingerance said:

    @tster said:
    . . .For troublesome trolls put a temporary ban on a range of IPs.  Current users will still be able to post because they already have an account. . . .
    . . .
    4.  For people with their own domain...  just ban the whole domain.  It's pretty obvious that joescandystore.net is not an email provider.
    Blanket bans are fun because no innocent person will ever be affected by a blanket ban.

    I'd cry myself to sleep over all the joescandystore employees that couldn't access my forums using their work email. 



  • @realmerlyn said:

    I've been fighting this battle for years now... people writing "validation" regex for "email" addresses who haven't seen RFC[2]822, and ignoring the standard ways of doing this.  Of course, we have other people who copy those people, so it's really becoming this bad virus of what "email" addresses are.  Most of the time, these regex would reject my test address <fred&barney@stonehenge.com>, which has been in place for about a dozen years now. (Go ahead, try it... it's an autoresponder.)  There is no inherent insecurity in accepting '822.  It just means you coded bad somewhere else.  Never let an email address near an unescaped SQL parameter or shell command line!  It's not hard, people!

    Oh, for grins, google for:  site:regexlib.com "wrong wrong wrong"

    You'll see all the times I've been fighting this in a place intending to exchange regular expressions for things. 

    First off, you'd convince more people if you would provide an actual example of an email address that it fails to match. Second, there's no reason someone should have to allow comments, because there is no universe in which "Muhammed.(I am the greatest) Ali @(the)Vegas.WBA" is not the SAME address as "Muhammed.Ali@Vegas.WBA", and no reason, apart from misplaced pedantry, not to make the holder of that e-mail address type the latter. If I ever write an e-mail validation routine, it will specifically exclude that case, returning an error message of "try again without any fancy 'comments', smart-ass."



  • @Quietust said:

    Scary, but I believe that's a bit different - he was doing a zone transfer, requesting bulk amounts of information. All I suggested was doing a plain DNS 'A' lookup - anybody suggesting that those are illegal would have to be a complete idiot, as it is akin to looking up somebody's number in a public phone book.
    Don't do that - not all valid e-mail domains have an A record (like the address I had at my previous provider - do a lookup on guest.arnes.si). And yes, I had problems with sites that did that.

