PHP WTF-able code/suggestions for improvement?



  • So, I work at the computer lab of my community college as a student assistant. Part of my job is to write a MySQL-powered PHP app for our staff intranet. It does simply things; you know, fetch posts from a database, display them, etc. A good project for a newbie, and believe me, it's teaching me a *LOT* about planning first. And about using revision control--thank God for SVN. haven't managed to fit any of the programming classes into my schedule yet (RED ALERT! RED ALERT! This coming semsester, promise!) and have learned what I know of PHP and MySQL entirely through trial-by-fire over the past few years.

    I have this code. It prints a few links for this post display function that displays only posts for a specific year and month, which is provided in the URL; ie: index.php?do=displayMyPosts&month=5&year=2009. Not the most elegant, but it works. In fact, that's what I say about all of the code: it *works*. But I don't like it, and I'd love to improve any way I can. So tell me: what can I do to improve here? My personal "favorite" part of my own code is in lines 114 and 117. I think you'll see it.

    Minor tidbit: The last section of code is in a function that returns $content, which is later echoed by the "template". Also not the most brilliant thing, IMO, but it also works.

    http://pastebin.com/m77872f74

    (Whooops, just noticed how stupid my date variables are. Oh well...fixit soon.)

    (Whoops, also just noticed my stupid comments on nextMonth...*sob*



  • You might want to start escaping PHP and have as much of the HTML as you can in its own file. This way you can easily debug the HTML. IMO it just looks cleaner.

      <!-- Print the drop-down month selector -->
      <form action="index.php?do=displayArchivePosts&amp;year=<?php echo $year; ?>" method="post">
      <select name="month">
      <?php for ($i = 1; $i < 13; $i++) { ?>
        <option value="<?php echo $i; ?>" <?php if ($i == $month) { echo 'selected="selected"'; } ?>><?php echo monthName($i)?></option>
      <?php } ?>
      </select>
      <input type="submit" value="Goto"></form&gt;
    </pre>
    

    Also, you may want to avoid doing the request variable cleaning up-front and do it on demand. PEAR::MDB2 helps greatly with this (look at the prepare and execute methods).



  • That pastebin url doesn't seem to work. ok odd.

     

    Ling, is that your PHP? I personally have a severe dislike for that kind of overly verbose PHP+HTML intermingling. Focus on the PHP:

     <?
    for ($i = 1; $i < 13; $i++) {
       echo '<option value="'.$i.'" '. ($i == $month? 'selected="selected"':'').'>'. monthName($i).'</option>';
    }
    ?>

    In such a case, I've already written the HTML and I don't care for it anymore. The PHP is the most important; the HTML is secondary, so it's not productive to add all that syntax noise.



  • @dhromed said:

    Ling, is that your PHP?

    Not entirely. I generally try to have all my HTML in a few files, and all the heavy PHP stuff is done in files that don't deal with HTML. End result is there is very little mixing of the two, pretty much the bare minimum. I find the append syntax to be ugly because you end up with a mix of PHP quotes and HTML quotes. The example I gave was a direct conversion from his code.



  • @Lingerance said:

    I generally try to have all my HTML in a few files, and all the heavy PHP stuff is done in files that don't deal with HTML
     

    That's good, of course.

    @Lingerance said:

    I find the append syntax to be ugly because you end up with a mix of PHP quotes and HTML quotes.

    Syntax colouring takes care of that, so it's not really a problem. And as I said, the HTML at that point is a given and merely a distraction from the PHP code. You're not seriously writing HTML and seriously writing PHP at the same time, because then ur doin it rong. All that in-out-in-out of php tags looks way way rougher -- which is good for sex, but only leads to fucked code. Without syntax colouring here on the forum, the difference is not as apparent, but when all your HTML is a single colour, it does a good job of putting it in the background.



  • Unfortunately, having to place all the output in $content (instead of echoing it or outputting it directly as html) makes it impossible to do that. I forgot why I did that now--I think it was something to do with output buffering, or using the dreaded header("Location: blah.html"); die(); for redirects. Or something. Before that, I did have HTML mixed with PHP as little as possible.

    About Pear::MDB2: That looks fascinating, to say the least; however, I have an irrational fear of Pear. Not sure why; maybe it's just that I don't understand it and don't want to build an app that is dependent on a bunch of packages that some poor slob a maintainer down the road might get confused about, say, if they have to perform a reinstall. As if there isn't enough in my code to confuse them...



  • I feel little for reworking your code to something I would have used, but here are some general observations.

    - Start using a framework, I don't really care which one. PEAR, zend framework, code ignitor, cake php, whatever. just try a few and take the one you like. Focus on writing business logic not on framework code. The wheel has been invented, and it's round enough already.

    - Try to encapsulate functionality, even when writing procedural code create common functions and put them in their seperate "library" files.  You may or may not have already done this partly.

    - Don't use inline CSS unless your still designing, once the design is finished, structure and take out the inline CSS stuff. 

    - Filter input, escape output. 

    - When your month input should be a int, check if it's an int.  When you output to the database escape it for that database, when you output something for HTML escape it for HTML, (unless you of course want to output it as HTML)

    - Escaping input will result in form fields turning o'reily into o\'reiley,  when the validation fails and you return the form values, which after submitting again will result in o\\\'reiley etc.. Just avoid that kind of crap by escaping on output for that format, and only filter input.

     

    Lastly, and this is more of a comment on the discussion above. I don't mind mixing HTML and PHP, as long as the PHP is purely there to create dynamic elements or loops and such. I also prefer to use the following PHP syntax for it. But this is very personal and will mostly be determined by the coding guidelines of the project. (although in this case you obviously invent those yourself)

     <!-- Print the drop-down month selector -->
      <form action="index.php?do=displayArchivePosts&amp;year=<?= $year; ?>" method="post">
      <select name="month">
      <? FOR ($i = 1; $i <= 12; $i++): ?>
        <option value="<?= $i; ?>" <?= ($i == $month) ? 'selected="selected"' : ''; ?> ><?= monthName($i); ?></option>
      <? ENDFOR; ?>
      </select>
      <input type="submit" value="Goto">

     



  •  Ah, I think I see. So basically, I shouldn't do any database-specific filtering (such as mysql_real_escape_string) until I actually go to insert data into the database, correct? That makes sense.

    As for the rest:

    The html is still being tweaked, so I haven't gotten around to placing everything into my external stylesheets yet. Most of my functions are in seperate require()'d library files.

     For the framework...

    I've never used any sort of web framework before--or any framework, for that matter. I guess I'm a little paranoid out of ignorance. But I'll experiment with a few; any one in particular recommended? Keep in mind, this is a relatively small app; it's not some flagship product, although I shouldn't use that as an excuse!



  • mysql_real_escape_string would be escaping. Filtering is checking if the stuff coming from the user is what you expect.

    For instance, when the user has to supply a date via the form, you initially check if it is actually a date that got submitted. If not you return a error.

    Also the escaping is not limited to the database, you will also need to escape for HTML, XML or anything else you might ouput. It is a common made mistake for instance to not properly escape usernames. This comes into play when I for instance register as "stra<script type="text/javascript" src="127.0.0.1/doEvil.js" >tos"

    In short, filtering checks the input, and acts upon those findings. Escaping modifies the input to make it "safe" for the destination output. (Your DB is considered output in this respect)

     

    Now on to frameworks.

    Well zend framework is quickly becoming the dominant framework at the moment, but the DB abstraction is quite a hassle for small projects, so for something like this I would probebly use either code ignitor or kohana (kohana is a PHP5 remake of code iginitor)

    kohana is a light weight framework that implements MVC for you and gives you a bunch of other stuff as well. After installation you get a default structure in which you can put your application.

    /system
    /application
    /public/index.php

    System is where kohana itself lives and application is where your code lives. Then public is the place you will be pointing your webserver to and the index.php basically just initialises the framework and calls the router.
    A router is the code that decides what code needs to be executed based on the uri.

    Then in your /application dir you have the following dirs.

    /config
    /controllers
    /helpers
    /models
    /views

    Well you have a bit more, but these are the main ones.

    The config dir is where you obviously store config files, controllers dir is where controller files go. Controllers are where your business logic goes, A controler is more or less an object that is responsible for a certain part of your site functionality. For instance you could have a "msgboard" controller which would have methods called "viewPost", "editPost", "deletePost", etc..

    These controllers, after handling the request would then call a "view" from the views directory to render the needed html. This is your base seperation of logic and view, for instance for the "viewPost" action you would retrieve the requested msgboard post from the database, then call the "viewPost" view and pass along that post. In the meantime you of course have already checked if the post existed, if they where allowed to view that post and perhaps added some dynamic attributes to that post if needed.
    Dynamic attributes could for instance be the pre-generation of url's to edit the post or delete the post.

    Also the url for calling the above would be example.org/msgboard/viewPost/1 Which would get translated by the before mentioned router as controller=msgboard, action=viewPost param=1

    I also mentioned requesting a msgboard post from the database. You wouldn't just call the mysql_query function or whatever, but you would use a model, a model which of course is stored in the models directory.

    There are lots of different ways to do database abstraction, but kohana uses a pretty simple one, as with the controller, you juse have a object that reprisent a database table and actions that extract or modify the table. There is no actual rule that says it should reprisent a database table, but it's the most obvious way to do things.

    So you would perhaps have a msgboard table in the database, so you create a msgboard model with functions like "getPost", "updatePost", "getPostsByThreadId", etc..
    The method getPost would for instance look like this.

    function getPost($id)
    {
        if (!is_numerical($id)) throw new Kohana_Exception('$id not numerical, numerical input expected.');

        $result = $this->db->select()->from('msgboard')->where('id', $id)->get();

        if (!$result->count()==1) return false;
        
        return $result->current()
    }

    Above example is using the active record DB stuff from kohana, it also has ORM and just plain queries. But unless you either want to use ORM, or want something that can't be handled by the active record interface then it's fine. You for instance can't do subqueries in active record.

    Please note that while I checked that the $id was numerical, I did not escape it. This is because active record will escape it for me.  Also, this is probebly not the first place where i've checked if the id is numerical, I will have checked it in the controller as well, but the truth is that I don't trust myself or my co-workers either. The check in the controller will however will result in a user friendly result, the check in the model itself means that the application has failed since it should never get called with a non-numerical $id.

    This leaves us with the helpers dir. Now the helpers dir contains your helpers. Helpers are objects that contain small bits of reusable code. Kohana already gives you quite a few helpers. There is for instance a html helper that has methods for escaping html, among other stuff, and a url helper that helps you build urls and redirect among other stuff.

    But anyway, I'm no tutor and most frameworks, kohana not excluded have quite a bit of documentation and most of them also a quick start guide or initial tuturial to get your feet wet.



  •  Oh, that makes perfect sense!

    Right now, for example, I have all of my Post functions in a file called...Posts.php (Hah.) It would probably be a lot more easier and sane to have it represented by an object instead of a bunch of functions scattered all over the place!

    I'll look into Kohana right now; thank you!



  • @Supertanker said:

    So basically, I shouldn't do any database-specific filtering (such as mysql_real_escape_string) until I actually go to insert data into the database, correct?
    Do not concatenate data into strings to build SQL queries! Use prepared statements instead. Most database abstraction libraries make it easier to do this than concatenate strings anyway.



  • @Supertanker said:

    Ah, I think I see. So basically, I shouldn't do any database-specific filtering (such as mysql_real_escape_string) until I actually go to insert data into the database, correct? That makes sense.

    You shouldn't do any escaping at all unless and until you have to.  Invariably you'll need the output unescaped, or escaped differently than you did it the first time.  And escaping something that's already escaped can quickly turn 'this' into \\\\\\\\\'this\\\\\\\\\\'.

    <tangent>"Magic quotes" are evil for just this reason.  They were a misguided attempt to free weenies from having to write code to escape a string.  If you actually write decent code, though, magic quotes will mangle it -- thus forcing you to either write shit code, or do some workaround like "if (ini_get('magic_quotes_gpc')) { /* unescape it */ }".  Not to mention there are two different options that can enable magic quotes, which will mangle your data at different times.  Fortunately Zend figured this out and are apparently removing this misfeature altogether from PHP 6.  Now all they have to do is actually make it.</tangent>



  • Magic quotes has been disabled by default since 5.3 and is now only there for backwards compatability. Also I wouldn't get your hopes up for PHP6 anytime soon. Also, Zend is just a company and as a company has about as much say in PHP as any other webshop. Which is to say, nothing at all.


  • Discourse touched me in a no-no place

    @stratos said:

    Also I wouldn't get your hopes up for PHP6 anytime soon.
    They've been removed.



  •  Yes I am wel aware, I was refereing to when PHP6 would be released.



  • @stratos said:

    - Escaping input will result in form fields turning o'reily into o'reiley,  when the validation fails and you return the form values, which after submitting again will result in o\'reiley etc.. Just avoid that kind of crap by escaping on output for that format, and only filter input.

    A few comments on this, one if you have magic quotes turned on all $_REQUEST variables ($_GET, $_POST, etc...) will be run through addslashses, in some cases this isn't what you want, so in many cases you have to strip those first then run it through the correct escaping function, or just strip them as they get displayed directly. Secondly, as I mentioned before Pear::MDB2 will actually take care of all the escaping needs if used correctly.
    @stratos said:
     <!-- Print the drop-down month selector -->
      <form action="index.php?do=displayArchivePosts&amp;year=<?= $year; ?>" method="post">
      <select name="month">
      <? FOR ($i = 1; $i <= 12; $i++): ?>
        <option value="<?= $i; ?>" <?= ($i == $month) ? 'selected="selected"' : ''; ?> ><?= monthName($i); ?></option>
      <? ENDFOR; ?>
      </select>
      <input type="submit" value="Goto">

    The "<?=" syntax requires short tags is on, some people (like myself) will disable that. If you want your code to be used on multiple systems don't use short-tags.



  • Wat possible reason could you have for turning them off beyond disliking the eastethic combination of "<?=".

    But whatever, you are allowed to do as you please of course. And indeed, most of your server as such will be need to reconfigured to run my software.



  • @stratos said:

    But whatever, you are allowed to do as you please of course. And indeed, most of your server as such will be need to reconfigured to run my software.

    Will I also need to download your ego as a software update? What version of Ubuntu will I need?



  • @stratos said:

    What possible reason could you have for turning them off beyond disliking the eastethic aesthetic combination of "<?=".

    FTFY.

    And in response to your question, short tags also include the "<?" shorthand to signal PHP code.  If you're working with XML, as an example, short tags can cause some minor gotchas.



  •  My personal preference is to use <?php and ?>. I prefer my code to work on as many systems as possible. I will examine MDB2 more closely later today; it looks to be exactly what I need.

     



  • @cHao said:

    And in response to your question, short tags also include the "<?" shorthand to signal PHP code.  If you're working with XML, as an example, short tags can cause some minor gotchas.
    Exactly.



  • @Lingerance said:

    @cHao said:
    And in response to your question, short tags also include the "<?" shorthand to signal PHP code.  If you're working with XML, as an example, short tags can cause some minor gotchas.
    Exactly.
     

    And here i thought that portability was your thing, meaning that you will still have to echo the xml header to avoid your code to break of default php installations. 



  • @stratos said:

    @Lingerance said:

    @cHao said:
    And in response to your question, short tags also include the "<?" shorthand to signal PHP code.  If you're working with XML, as an example, short tags can cause some minor gotchas.
    Exactly.
     

    And here i thought that portability was your thing, meaning that you will still have to echo the xml header to avoid your code to break of default php installations. 

    Difference being, someone who's writing code for their own server doesn't need to worry about how other people have set up PHP.   People who intend for their code to run on other people's machines, though, do.



  • @stratos said:

    And here i thought that portability was your thing, meaning that you will still have to echo the xml header to avoid your code to break of default php installations. 

    Mostly, but as you'll note in my previous comments my coding style with PHP is to have a file act as the template, as such I won't echo/print our the HTML/XML unless I absolutely have to. At the same time when I write code that I want to be portable, and do do the echo '<?xml...'; bit, but that doesn't change the fact that it's ugly. Another reason to have short-tags disabled is if you want to enforce that as a coding-style. Personally I'd prefer if short-tags would only trigger if the opening is "<?php" "<? " (note the space) or "<?=", leaving the "<?xml" tags and any other tag like it alone.



  • I agree. The echoing of the xml header is ugly and they should have made the php short tag matching more precise. And also you make a good point about coding style. It goes without saying that in pre-existing projects or projects for which a coding style describes the rules about short tags that you follow what is written. In a perfect world you shouldn't even notice that multiple people worked on the same code base because they are all following the same style.

    Albeit I would prefer using svn hooks (or the equivelent in whatever source control software that is used) to check for coding style and forbidden syntax, instead of turning off short tags. That however takes a bit more time of course.


Log in to reply