Sometimes overanalysis hurts...



  • As a developer deep in the middle of building an e-commerce system (shopping cart, credit card fulfilment, shipping, taxes, lions oh my, etc...) from scratch (long story, don't ask...) sometimes it's the people looking over your shoulder that cause the most problems.

    Case in point: The shopping cart's requirements include the ability to ship parts of the order to multiple recipients (ie: order 5 left-handed hydrospanners, you can ship any combination of those 5 to 1-5 recipients). Part of the checkout process has a checkbox to select whether you want the multi-recipient capability turned on. It's bog-standard html, basically <input type="checkbox" name="multirec" value="y">.

     The fun came later on, in the code handling the form submission. As per my own personal standard security practices, I don't trust ANYTHING that's submitted from the forms and validate/filter ALL fields submitted. So... my standard handler for a simple checkbox is (in PHP:)

         $multirec = ($_POST['multirec'] == 'y') ? 'y' : 'n';

     On occasion, I might cast the value to boolean true/false, depending on what's going to be done with the info later on, but for now, a y/n cast is enough.

     I was asked exactly why I was doing such a redundant check. "The form only ever will submit a 'y', or a blank if the checkbox was cleared. Why re-assign
    the submitted value?"

    So here I am trying to explain TDWtF-style trinary logic, of the True/False/File-Not-Found variety, to the reviewer, not getting very far in beating my way through the 200 yards of bone between me and whatever brains are in there.

     

     .... some people.
     



  • @MarcB said:

    "The form only ever will submit a 'y'..."

     

    That right there is the source of a very large percentage of security issues on web sites. Anyone (other than a student or new programmer) that thinks that should probably not be anywhere near the server end of a web site.


  • @MarcB said:

    .... some people

    I thought you were beating yourself up for writing that code after overanalysing the task, and that the 200 yards of bone consisted of your skull. But the brains were in "there" not "here".... so....

    [code]$multirec = ($_POST['multirec'] == 'y') ? 'y' : 'n';[/code]
    If the submitted value is a 'y' (and a lower-case 'y' at that) it assigns the value 'y' to $multirec. If the submitted value is anything else it assigns the value 'n' to $multirec. If there is no submitted value (because, say, the checkbox was cleared) then it triggers an error message and assigns 'n' to $multirec. Real classy.


    [code]$multirec = isset($_POST['multirec']);[/code]
    If anything is submitted in this field it assigns the value true to $multirec (unless it's so large that the maximum-request-size limit is pinged). If it's not 'y' then it's someone being bogus and you can treat it however you want (you can even look at it later if you care [i]that[/i] much). If the field is absent (because the checkbox is cleared) it assigns the value false to $multirec. See, it even does the boolean cast for you.



  • I think you have entirely missed the point, the compiler error is intentional, this is because it is a million times easier fixing a compile time bug compared to a run time bug. This is especially so if the run time bug is caused by code that isnt even your own (in this instance, the isset function assuming that you dont care that the checkbox isnt even there for what ever reason)



  • um, how exactly do you define "compile time" in PHP?



  • i consider it the time at which the compiler generates errors :-)



  • @Watson said:

    [code]$multirec = isset($_POST['multirec']);[/code]

     Consider

     

    [code]$multirec = isset($_POST["multirec"]) && $_POST["multirec"] == "y";[/code]

    Yes, PHP uses lazy evaluation.



  • "The form only ever will submit a 'y', or a blank if the checkbox was cleared. Why re-assign
    the submitted value?"

    Not quite right. If you don't check a checkbox, the "proper" behaviour is for the useragent to not send anything to do with the input whatsoever. This is different from sending "multirec=" as that should never happen. If it does, it's either a WTF from the browser or someone spoofing the form (take a lucky guess which one it is likely to be)

    If you don't set a value (which as it is being used as a Boolean control, rather than a control list isn't such a strange concept) then a form sends "multirec=on" if it is checked and nothing if it isn't.

    I do wonder what is wrong with using a Boolean value for Boolean controls rather than strings: If the checkbox value is sent AND == 'y' (or possibly "on") then var is true else false. Simple. 

    All that notwithstanding, the simple rule is "never trust anything from the client", just as you state. This isn't a personal security practice, it is a paradigm. If the reviewer doesn't know this then you have bigger problems than just their misunderstanding of how a browser POSTs a form...



  • Whenever somebody says "the form will only ever submit [some value]" I just type something like this in the location bar:


    javascript void(document.theForm.multirec.value = "Oh really? No sir, O'Reilly!");

    and hit submit. Usual results include database errors caused by the apostrophe and string length.

    This also gets around any "maxlength" attribute specified in the HTML for an input of type "text", which is another thing many foolish develpers rely on - the browser doesn't allow you to type in a string longer than maxlength, but if you put it in there with JavaScript, it will send the whole thing, not a truncated version. About ten years ago I brought down a live ecommerce server that way - I thought I was on the development server, honestly :-)

    UPDATE: the forum software doesn't like my example; there should be a colon (":") between "javascript" and "void".



  • @Watson said:

    [code]$multirec = ($_POST['multirec'] == 'y') ? 'y' : 'n';[/code]

    If the submitted value is a 'y' (and a lower-case 'y' at that) it assigns the value 'y' to $multirec. If the submitted value is anything else it assigns the value 'n' to $multirec. If there is no submitted value (because, say, the checkbox was cleared) then it triggers an error message and assigns 'n' to $multirec. Real classy.



    It throws an E_NOTICE level error.  In PHP, that's a minor sin.  The last time I turned notice level errors on in an open source CMS, it spammed a good thousand notices on every page load.

    Better a notice, than the "The form will only ever submit 'y'" logic the OP was complaining about.
     



  • I don't see why you should check whether the form sends an "y". I think this isn't right, because later you may choose to change this to something more descriptive, and break the form. I think the rule is that there should be as few depend The isset thing is OK. Anyway, you're concerned only if there is SOME value, and what this value might be is completely irrelevant. Unless you're inserting this 'y' into a database.



  • @PSWorx said:

    um, how exactly do you define "compile time" in PHP?

    If I remember correctly, PHP is one of those compiled-into-bytecode style languages, so I guess technically you can use the term "compile time". Usually, "compile time" in languages like PHP (and Perl, Python, etc.) refers to parsing, which is when syntax errors are caught. Since C et al catch syntax errors at compile time, the term stuck, even with languages that aren't really compiled.

     

    That and "parse time" sounds kinda stupid. 



  • @ByteMe said:

    it is a million times easier fixing a compile time bug compared to a run time bug.
    You'll have to clarify this bit for me; what are these "compile time" bugs and "run time" bugs you're referring to? The only bug I see here is just poor coding practice. (That and the fact that the if(==) code is about three times slower than the isset() because of all the extra work that it has to do to achieve a result).
     



  • @merreborn said:

    It throws an E_NOTICE level error.  In PHP, that's a minor sin.
    And who cares about programming practices in this forum? :)

    @merreborn said:

    Better a notice, than the "The form will only ever submit 'y'" logic the OP was complaining about.
    Better code that functions correctly without triggering a notice, than ignoring the fact that you're being sloppy.

     
    Those notices are there for a reason.



  • (Fine, so don't let me edit, then.)

    I, too, am looking at an open source CMS; bug-hunting in fact. I've just tracked down the problem. A chunk of code was copy/pasted from the middle of one if branch into the other; some variables were used to build up a string in the copied code, but their values were only being assigned in the first branch - in the new copy they were all uninitialised. So they were being cast as empty strings that (eventually) borked the string format, instead of using syntactically sensible defaults, or just rewriting the string in question so that they weren't needed.

     



  • Bah, if I wanted warnings about comparing unset values, I'd be using Perl.

    You should probably be careful with isset(), though.  I'm not sure about checkboxes but I know some form elements will happily send a blank string.



  • The complete list of what form fields are supposed to be submitted under what circumstances is in the HTML 4.01 spec. Specifically:

    • Checked checkboxes
    • All text, password, textarea, and hidden fields
    • At most one radio button with each distinct name
    • Select controls with at least one option selected
    • File controls with at least one file selected
    • The selected submit control, if any.
    • The selected button control, if any.
    • Object elements without the declare attribute
    ...that aren't disabled may be submitted.  That spec is extremely vague on the treatment of empty text controls; it's not clear that a text control with no default value and nothing entered in the user agent has a current value.  The only implementations I have to test with, IE, lynx, links, and Firefox, all submit empty text fields.  Object elements determine their own submission format, if any... I'm not actually sure this sees live use.


  • You should consider using a helper function, something like (PHP)
    function PostBoolean($key)
    {
      switch (ord(@$_POST[$key])) {
      case  49: return true; /* Value starts with '1' */
      case  89: return true; /* Value starts with 'Y' */
      case 121: return true; /* Value starts with 'y' */
      }
      return false;
    }
    The function returns true if and only if POST variable $key exists and has a value beginning with 'y' or 'Y', or the number '1'.
    Note that @ hides the warning if $_POST[$key] does not exist, and that ord() only inspects the first character. Usage is, of course,
    $multirec = PostBoolean('multirec');


  • Don't forget that in a production PHP system, warnings should be disabled. The only time I ever use @ in my code is somewhere I truly don't care if the call succeeds or not, i.e: in cases where "it's nice if it works, but not necessary for life to go on".

    I will admit, though, that it is tempting during development to slap down useless foreach() warnings with @.

    Case in point... let's say you've got a set of checkboxes in a form, something like:

    <input type="checkbox" name="boxes[]" value="val1">
    <input type="checkbox" name="boxes[]" value="val2">
    etc...

    And process them as:

    $boxes = $_POST['boxes'];
    foreach ($boxes as $idx => $val) {
    ... yada yada yada...
    }

    If the form's submitted and none of those boxes were checked, $boxes will be a NULL value, causing PHP to spit out

    PHP Warning:  Invalid argument supplied for foreach() in <file><file> on line <x><x>.</x></file>

    The joys of a shared namespace for all variables, I suppose. Perl's much nicer in this regard, with different namespaces for scalars, arrays, and hashes, and NOT complaining when an array you're iterating over happens to be empty.

    p.s. For those who've managed to avoid PHP so far (lucky buggers...), naming a form field as name[] (with the brackets) will tell PHP to treat those fields as a multi-value field and store them into the $_GET/$_POST/$_REQUEST superglobals as an array, rather than a single value which contains only the last value as submitted by the useragent.


Log in to reply