The joys of PHP



  • PHP is both a nice, powerful language and a series of infuriating hacks. But this post isn't about debating the merits/drawbacks of the language... Sometimes the biggest WTFs regarding PHP come from the on-line PHP documentation... particularly the commentary on certain pages.

     

    In many cases, some "helpful" individual will paste in their multi-megabyte-worth-of-code solution to accomplish what a one-or-two line built-in function already does. Case in point: on the 'preg_split' manual page, there's a comment from Aug 21/05 on how this person was using preg_split to limit the length of an SHA1 hash. I won't paste the code here, you can see it via the link above, but suffice it to say, this person wrote their own function to do string length limiting by splitting a string into an array of characters, looping over the array and rebuilding a new string, stopping when they've reached whatever limit was passed into the function.

    Total function length: 15 lines


    A followup comment by someone who knows a bit more about string manipulation proposed the following:

     

    $str = sha1('string to hash');

    echo substr($str, 0, 32);

     

     

     



  • Wow, and all that because he wants a 32-character SHA-1 string.



  • @MarcB said:


    A followup comment by someone who knows a bit more about string manipulation proposed the following:

     

    $str = sha1('string to hash');

    echo substr($str, 0, 32);

    The real WTF is using echo. Build the danged document (section) as a string, and then return it somewhere. Good lord, y'all!!!

    But seriously, I've converted enough series of 'print'/'echo' statements to string concatenation that I really, really don't like the looks of it, like, ever. :P



  • @fennec said:

    The real WTF is using echo. Build the danged document (section) as a string, and then return it somewhere. Good lord, y'all!!!

    But seriously, I've converted enough series of 'print'/'echo' statements to string concatenation that I really, really don't like the looks of it, like, ever. :P

    In this case, it's example code.

    But in general, building your whole document as a single string via a bunch of concatenations isn't always possible.  It chews up all kinds of memory and CPU time.  For example, if you're building a 80 meg CSV file, you're going to be a lot better off if you echo the lines as you go, so they can be sent out, rather than buffering the whole thing in a string.  There are places where inline echoes are entirely appropriate.   Necessary, even.
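
    A minimal sketch of the "emit lines as you go" approach. The names generateRows and streamCsv are made up for illustration, and the rows are joined with plain commas, so no quoting or escaping is done. In a real request the stream would be php://output; php://memory is used here only so the output can be inspected afterwards.

```php
<?php
// Hypothetical row source: yields one row at a time instead of
// building the whole data set in memory first.
function generateRows(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield [$i, "name $i", $i * 2];
    }
}

// Write each row to the stream as soon as it is produced.
// Naive CSV: fields are joined with commas and not quoted.
function streamCsv(iterable $rows, $out): int
{
    $written = 0;
    foreach ($rows as $row) {
        fwrite($out, implode(',', $row) . "\n");
        $written++;
    }
    return $written;
}

// In a web request this would be fopen('php://output', 'w').
$out = fopen('php://memory', 'w+');
$count = streamCsv(generateRows(3), $out);

rewind($out);
$csv = stream_get_contents($out);
fclose($out);
```

    Only one row is held in a PHP variable at any moment, which is the point being made about the 80 meg file.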

     



  • @merreborn said:

    @fennec said:

    The real WTF is using echo. Build the danged document (section) as a string, and then return it somewhere. Good lord, y'all!!!

    But seriously, I've converted enough series of 'print'/'echo' statements to string concatenation that I really, really don't like the looks of it, like, ever. :P

    In this case, it's example code.

    But in general, building your whole document as a single string via a bunch of concatenations isn't always possible.  It chews up all kinds of memory and CPU time.  For example, if you're building a 80 meg CSV file, you're going to be a lot better off if you echo the lines as you go, so they can be sent out, rather than buffering the whole thing in a string.  There are places where inline echoes are entirely appropriate.   Necessary, even.

    Indeed.  At least php doesn't seem as bad as VBScript when it comes to concatenation.  In VBS, concatenation can result in exponential growth of memory required.  Do it too much and you're looking at minutes of execution time for something that should take less than a tenth of a second.



  • I found this one recently:

    http://pl2.php.net/manual/en/function.array-reduce.php

    Basically the function does the fold operation known from functional languages.

    However, the documentation does not specify which of the callback's two arguments (wisely named v and w) is the accumulator and which is the element. Moreover, the examples provided use multiplication and addition, which do not care about the order of the arguments, so you can't deduce the ordering from the examples either:

     

    mixed array_reduce ( array input, callback function [, int initial] )

    array_reduce() applies iteratively the function function to the elements of the array input, so as to reduce the array to a single value. If the optional initial is available, it will be used at the beginning of the process, or as a final result in case the array is empty. If the array is empty and initial is not passed, array_reduce() returns NULL.

    Example 1. array_reduce() example

    <?php
    function rsum($v, $w)
    {
        $v += $w;
        return $v;
    }

    function rmul($v, $w)
    {
        $v *= $w;
        return $v;
    }

    $a = array(1, 2, 3, 4, 5);
    $x = array();
    $b = array_reduce($a, "rsum");
    $c = array_reduce($a, "rmul", 10);
    $d = array_reduce($x, "rsum", 1);
    ?>



  • I was just about to make a PHP thread.  Here's my contribution, from the page for strpos() (http://us2.php.net/manual/en/function.strpos.php).

    This function is supposed to return the position of the first occurrence of a string within a string, but comes with this warning:

    This function may return Boolean FALSE, but may also return a non-Boolean value which evaluates to FALSE, such as 0 or "". Please read the section on Booleans for more information. Use the === operator for testing the return value of this function.


    So basically you can be pretty sure it won't return an object or an array, but anything else is fair game.



  • A comment from the very same page:

    A simple function to find the number of occurances in a string within a string

    <?php
    function StringCount($searchstring, $findstring)
    {
       return (strpos($searchstring, $findstring) === false ? 0 : count(split($findstring, $searchstring)) - 1);
    }
    ?>
     
    OK, it works. It probably even works quite fast, because it uses built-in functions. Somehow, I feel sad looking at count(split()), or similar constructs in SQL, like "SELECT *" + mysql_num_rows() combos.
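
    For comparison, a short sketch using the built-in substr_count(), which counts non-overlapping occurrences of one string in another. The sample strings are made up.

```php
<?php
$haystack = 'one fish, two fish, red fish';

// Built-in replacement for the StringCount() function quoted above.
$found = substr_count($haystack, 'fish');

// A needle that does not occur simply gives 0, no special case needed.
$missing = substr_count($haystack, 'shark');

// Occurrences are counted without overlap.
$overlap = substr_count('aaa', 'aa');
```

    Unlike split(), substr_count() treats the needle as a plain string rather than a regular expression, so the two are not interchangeable when the needle contains regex metacharacters.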


  • @Cap'n Steve said:

    I was just about to make a PHP thread. Here's my contribution, from the page for strpos() (http://us2.php.net/manual/en/function.strpos.php).

    This function is supposed to return the position of the first occurrence of a string within a string, but comes with this warning:
    This function may return Boolean FALSE, but may also return a non-Boolean value which evaluates to FALSE, such as 0 or "". Please read the section on Booleans for more information. Use the === operator for testing the return value of this function.


    So basically you can be pretty sure it won't return an object or an array, but anything else is fair game.

    This is perfectly sensible. The function returns FALSE to mean "doesn't occur", and 0 to mean "occurs at position 0". What's the big deal here? Just be careful with types: FALSE and 0 both evaluate as "false" in an IF, or with ==, but === compares them properly.
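
    A small sketch of the difference, with a made-up helper named contains():

```php
<?php
// Needle at the very start: strpos() returns the integer 0.
$atStart = strpos('abc', 'a');

// Needle absent: strpos() returns the boolean false.
$missing = strpos('abc', 'x');

// Loose comparison cannot tell the two apart.
$looksMissing = ($atStart == false);

// Hypothetical helper: strict comparison distinguishes
// "found at position 0" from "not found".
function contains(string $haystack, string $needle): bool
{
    return strpos($haystack, $needle) !== false;
}
```

    With ==, the match at position 0 is indistinguishable from a failed search, which is exactly the case the manual's warning is about.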



  • How do you explain the blank string then?  I also think null would make more sense than false in this case.  I thought I remembered some case where it will return a negative number, but I can't find an example right now.



  • The runkit extension is particularly interesting. For example, from the comments for runkit_import, you learn that you cannot use it to re-import a file that redefines a function that is in the current stack. Granted, it's not a standard extension, but it seems a bit hackish and doesn't exactly instill great confidence in the clarity of PHP's internal workings.
     

     
    The alternative solution that one person came up with for this (it was in some PHP IRC bot) was even more interesting though. To load/reload modules, he read the .php file into a string, used preg_replace to replace the class name with a random string, eval'd the code string, and then updated the class name in an array of classes.



  • @Cap'n Steve said:

    How do you explain the blank string then? I also think null would make more sense than false in this case. I thought I remembered some case where it will return a negative number, but I can't find an example right now.

    It's simply saying that objects such as 0 or "" evaluate as false in a normal conditional, but only the identity operator (===) will determine if it's really the FALSE value, and not just evaluated as false. It's actually quite useful once you get used to how it all works. It's much better, and much more elegant than having functions return a special value like -1, which can get in the way when you want a return value that's outside all other possible values, like for a math function, where -1 could be a perfectly reasonable answer...



  • @Cap'n Steve said:

    ...from the page for strpos() (http://us2.php.net/manual/en/function.strpos.php).

    There are better things in comments :)

    $myDivsContent = getStrsBetween("<div","</div>",$myHtmlSrc);

    Sample:
    or...get rows for html table
    ...
    ...
    //using TextBetween from old comment...
    $aTable = TextBetween("<table","</table>",$myHtmlSrc);
    $rows = getStrsBetween("<tr","</tr>",$aTable);
    ...
    ...

    function getStrsBetween($s1,$s2,$s,$offset=0){
      $result = array();
      $index= 0;
      $L1 = strlen($s1);
      $found = false;
      do{
       if($L1>0){
           $pos1 = strpos($s,$s1,$offset);
       }
       else {
           $pos1=$offset;
       }
       if($pos1 !== false){
           if($s2 == '')
               $result[$index++]= substr($s,$pos1+$L1);
           $pos2 = strpos(substr($s,$pos1+$L1),$s2,$L1);
           if($pos2!==false){
               $result[$index++]= substr($s,$pos1+$L1,$pos2);
               $offset += $pos2 + strlen($s2);
           }
           else{
               $pos1 = false;
           }   
       }
      }while($pos1 !== false);
      return $result;
    }

    YEAH! Because regexp-ing #<div[^>]*>([^<]*)</div># is just something for crazy people! :)
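
    A rough sketch of that regexp route with preg_match_all(), using the pattern above on a made-up HTML snippet. It only handles divs whose content contains no other tags, so it is no substitute for a real HTML parser.

```php
<?php
// Sample input, invented for this example.
$html = '<div class="a">one</div><p>skip</p><div>two</div>';

// Same pattern as in the post: capture the text between <div ...> and </div>.
preg_match_all('#<div[^>]*>([^<]*)</div>#', $html, $matches);

// With the default flags, $matches[1] holds every capture of group 1.
$contents = $matches[1];
```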



  • @Cap'n Steve said:

    I was just about to make a PHP thread. Here's my contribution, from the page for strpos() (http://us2.php.net/manual/en/function.strpos.php).

    This function is supposed to return the position of the first occurrence of a string within a string, but comes with this warning:
    This function may return Boolean FALSE, but may also return a non-Boolean value which evaluates to FALSE, such as 0 or "". Please read the section on Booleans for more information. Use the === operator for testing the return value of this function.


    So basically you can be pretty sure it won't return an object or an array, but anything else is fair game.

     

    That's an issue with the person that wrote the documentation.  As far as I know, strpos() always returns an integer if the needle is found, or false if the needle isn't found. I don't think I've ever seen strpos() return a string, and I'm not sure why the documentation writer thinks that it can.



  • That's a standard boilerplate comment for any function that returns FALSE for failure, but may also return a value that == will interpret as FALSE. It basically means "always use === to check for error returns, dumbass".



  • @qbolec said:

    I found this one recently:

    http://pl2.php.net/manual/en/function.array-reduce.php

    Basically the function does the fold operation known from functional languages.

    However, the documentation does not specify which of the callback's two arguments (wisely named v and w) is the accumulator and which is the element. Moreover, the examples provided use multiplication and addition, which do not care about the order of the arguments, so you can't deduce the ordering from the examples either:

     

    mixed array_reduce ( array input, callback function [, int initial] )

    array_reduce() applies iteratively the function function to the elements of the array input, so as to reduce the array to a single value. If the optional initial is available, it will be used at the beginning of the process, or as a final result in case the array is empty. If the array is empty and initial is not passed, array_reduce() returns NULL.

    Example 1. array_reduce() example

    <?php
    function rsum($v, $w)
    {
        $v += $w;
        return $v;
    }

    function rmul($v, $w)
    {
        $v *= $w;
        return $v;
    }

    $a = array(1, 2, 3, 4, 5);
    $x = array();
    $b = array_reduce($a, "rsum");
    $c = array_reduce($a, "rmul", 10);
    $d = array_reduce($x, "rsum", 1);
    ?>

    Doesn't tell you whether it folds from the left or from the right either.

    It's probably a foldl (few languages have a right fold) and the callback function's first arg is the accumulator.
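
    One way to check is a callback where order matters. This sketch assumes a PHP version that accepts a non-integer initial value (the signature quoted above says int, so older versions may behave differently); the callback just records how it was called.

```php
<?php
// Non-commutative callback: wraps its two arguments in parentheses,
// so the result shows both the argument order and the fold direction.
$trace = array_reduce(
    ['a', 'b', 'c'],
    function ($first, $second) {
        return "($first.$second)";
    },
    'init'
);
```

    The result is '(((init.a).b).c)': the first argument carries the running value and the elements are consumed from the left.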

    @SpComb said:

    The runkit extension is particularly interesting. For example, from the comments for runkit_import, you learn that you cannot use it to re-import a file that redefines a function that is in the current stack. Granted, it's not a standard extension, but it seems a bit hackish and doesn't exactly instill great confidence in the clarity of PHP's internal workings.


    Hah. The internals of PHP consist of hacks built upon crap, the language doesn't even have a grammar and the parser is completely ad hoc (which is why, if they one day implement namespaces, the separator may be "\" because -- and I shit you not -- they don't have any other character left...)



  • The documentation is a problem sometimes, yes.  For a while, any time I did a documentation search on php.net, it would return proper results -- in Polish, only Polish, and nothing but Polish.  I had to manually edit the URLs to see the English language version.



  • @masklinn said:

    Hah. The internals of PHP consist of hacks built upon crap, the language doesn't even have a grammar and the parser is completely ad hoc (which is why, if they one day implement namespaces, the separator may be "\" because -- and I shit you not -- they don't have any other character left...)


    The .NET PHP implementation (phalanger or something) uses ::: as namespace separator :P



  • @Corona688 said:

    The documentation is a problem sometimes, yes.  For a while, any time I did a documentation search on php.net, it would return proper results -- in Polish, only Polish, and nothing but Polish.

    This happens when you are surfing in the Polish internet. You should rather connect to the English internet.

    Anyway, if the Polish documentation says something along the lines of "PHP to bardzo brzydka języka" (roughly, "PHP is a very ugly language"), you don't have to look further - it tells you all you have to know.
     



  • If php.net guesses your language incorrectly (and be realistic, guessing that based only on what your browser sends isn't quite perfect), you can change it by clicking on the "my php.net" link in the top navigation list.



  • How could I forget that wonderful error message in Hebrew? T_PAAMAYIM_NEKUDOTAYIM (http://www.google.com/search?hl=en&safe=off&client=opera&rls=en&hs=pKu&q=T_PAAMAYIM_NEKUDOTAYIM&btnG=Search).



  • @Cap'n Steve said:

    How could I forget that wonderful error message in Hebrew? T_PAAMAYIM_NEKUDOTAYIM.

    Ah, the Jewish equivalent of Roman Catholic "hodie natus est radici frater"... :)

