Database design for php coders...;)



  • okay this is just from some guy trying to get help with his php website on a forum, he probably isn't a professional programmer...but I just had to laugh when i saw this.
    http://img-a1.ak.imagevz.net/forum1/23/52/caca705559de33c12c8d6e7a6b82/1-eab0ced1334d1e62.jpg



  • So we're just using any excuse now to take a shot at PHP coders? All I see is a whiteboard with some schema design on it. I can't read what it says to understand the schema.



  • @Pap said:

    So we're just using any excuse now to take a shot at PHP coders?

    I was not aware that an excuse was necessary. PHP itself is more than enough reason for the use of automatic weaponry. 



  • @wittgenstein said:

    okay this is just from some guy trying to get help with his php website on a forum, he probably isn't a professional programmer...but I just had to laugh when i saw this.
    http://img-a1.ak.imagevz.net/forum1/23/52/caca705559de33c12c8d6e7a6b82/1-eab0ced1334d1e62.jpg

     What's the problem? I can't read the text, but it looks like he's got a table where each row stores the information for a photo.
     



  • Well, notice that the right parts is a depiction of his website. It looks pretty much like the drawn equivalent of "so if I type this there into phpMyAdmin, it will make my page look pretty..."



  • Obviously not 3-tier...



  • You wouldn't know without knowing how he's binding that data to the UI. At least give the guy the benefit of doubt.

    And maybe this drawing was used to explain how the site works to someone who's not a programmer? I could so picture myself having to resort to things like this to get my boss to understand some data constraints here.



  • @obediah said:

    @wittgenstein said:

    okay this is just from some guy trying to get help with his php website on a forum, he probably isn't a professional programmer...but I just had to laugh when i saw this.
    http://img-a1.ak.imagevz.net/forum1/23/52/caca705559de33c12c8d6e7a6b82/1-eab0ced1334d1e62.jpg

     What's the problem? I can't read the text, but it looks like he's got a table where each row stores the information for a photo.
     

    The text is by no means funny. ("Description", "Amount",..)

    It's just a drawing which column goes where in the web page. IMO absolutely nothing wrong with that.



  • Php is not bad, it is just too easy for idiots to get working.



  • Sadly, the reality is that PHP is [b]not[/b] easy for idiots to get working. They just think it will be, and maybe they manage to get it kind of half working most of the time...



  • @wittgenstein said:

    okay this is just from some guy trying to get help with his php website on a forum, he probably isn't a professional programmer...but I just had to laugh when i saw this.
    http://img-a1.ak.imagevz.net/forum1/23/52/caca705559de33c12c8d6e7a6b82/1-eab0ced1334d1e62.jpg

    This is not a database design. Just arrows going from columns to layout parts of a page sketch. It doesn't even have anything specifically to do with PHP.

    Where's the funny? 



  • Maybe the WTF is about storing binary data in the database and not in external files? It's going to hurt efficiency a bit. OR is it a picture-of-a-scan-on-a-wooden-table?

    Nevertheless, I think the top-down approach here is generally OK, and the picture is easy to understand - it's actually worth a praise.



  • @Tweenk said:

    Maybe the WTF is about storing binary data in the database and not in external files? It's going to hurt efficiency a bit. OR is it a picture-of-a-scan-on-a-wooden-table?

    Nevertheless, I think the top-down approach here is generally OK, and the picture is easy to understand - it's actually worth a praise.

    Storing binary data is IMO not a WTF. I don't think that the performance penalty (if any) really hurts, and you gain some of advantages regarding management. For example, deleting a row in the table automatically deletes the binary data (i.e. picture) as well. If rollback the transaction, you also get the picture back :-))



  • @ammoQ said:

    @Tweenk said:

    Maybe the WTF is about storing binary data in the database and not in external files? It's going to hurt efficiency a bit. OR is it a picture-of-a-scan-on-a-wooden-table?

    Nevertheless, I think the top-down approach here is generally OK, and the picture is easy to understand - it's actually worth a praise.

    Storing binary data is IMO not a WTF. I don't think that the performance penalty (if any) really hurts, and you gain some of advantages regarding management. For example, deleting a row in the table automatically deletes the binary data (i.e. picture) as well. If rollback the transaction, you also get the picture back :-))



    Another perhaps to some less important advantage is that  you don't have to mess about with the FS.  Like max files per directory problems, or 2 files with the same name. Of course there are solutions to those problems, but when you put the data in the database, you don't have to solve them.



  • @stratos said:

    @ammoQ said:
    @Tweenk said:

    Maybe the WTF is about storing binary data in the database and not in external files? It's going to hurt efficiency a bit. OR is it a picture-of-a-scan-on-a-wooden-table?

    Nevertheless, I think the top-down approach here is generally OK, and the picture is easy to understand - it's actually worth a praise.

    Storing binary data is IMO not a WTF. I don't think that the performance penalty (if any) really hurts, and you gain some of advantages regarding management. For example, deleting a row in the table automatically deletes the binary data (i.e. picture) as well. If rollback the transaction, you also get the picture back :-))



    Another perhaps to some less important advantage is that  you don't have to mess about with the FS.  Like max files per directory problems, or 2 files with the same name. Of course there are solutions to those problems, but when you put the data in the database, you don't have to solve them.

    Most RDBMSes have their own stack of similar problems instead which you then have to solve or ignore (for example, the subtleties of vacuuming in postgresql, which are ignored by developers at least as often as directory size limits). Doesn't really buy you anything.

    While there's nothing fundamentally wrong with storing binary data in an RDBMS, if your binary blobs are large enough then it's going to be roughly two to three times slower (in the hot-cache case) than using the filesystem directly, because the data has to be copied around more before it reaches your application; this is slower than you expect, because it crosses process boundaries. While there may be workloads in which that is acceptable, you should avoid building a system that fundamentally relies on it, because it'll come back to bite you later. Usual warnings about premature optimisation apply, but the inverse can be just as bad: eliminating possible optimisations at the design stage is another root of all evil.



  • @asuffield said:

    @stratos said:
    @ammoQ said:
    @Tweenk said:

    Maybe the WTF is about storing binary data in the database and not in external files? It's going to hurt efficiency a bit. OR is it a picture-of-a-scan-on-a-wooden-table?

    Nevertheless, I think the top-down approach here is generally OK, and the picture is easy to understand - it's actually worth a praise.

    Storing binary data is IMO not a WTF. I don't think that the performance penalty (if any) really hurts, and you gain some of advantages regarding management. For example, deleting a row in the table automatically deletes the binary data (i.e. picture) as well. If rollback the transaction, you also get the picture back :-))



    Another perhaps to some less important advantage is that  you don't have to mess about with the FS.  Like max files per directory problems, or 2 files with the same name. Of course there are solutions to those problems, but when you put the data in the database, you don't have to solve them.

    Most RDBMSes have their own stack of similar problems instead which you then have to solve or ignore (for example, the subtleties of vacuuming in postgresql, which are ignored by developers at least as often as directory size limits). Doesn't really buy you anything.

    While there's nothing fundamentally wrong with storing binary data in an RDBMS, if your binary blobs are large enough then it's going to be roughly two to three times slower (in the hot-cache case) than using the filesystem directly, because the data has to be copied around more before it reaches your application; this is slower than you expect, because it crosses process boundaries. While there may be workloads in which that is acceptable, you should avoid building a system that fundamentally relies on it, because it'll come back to bite you later. Usual warnings about premature optimisation apply, but the inverse can be just as bad: eliminating possible optimisations at the design stage is another root of all evil.

    There's definitely something to be said for storing your images somewhere where they can be served via your HTTP server without a database hit... 



  • @merreborn said:

    There's definitely something to be said for storing your images somewhere where they can be served via your HTTP server without a database hit... 

    i wouldn't put static pictures (logo, navigation icons, borders, backgrounds etc.) in the database, but for those pictures that directly relate to database rows, it makes sense IMO. 



  • It simply depends, if your making a image bank website you don't want you images in a database.
    But if you want one of those cheap host your images here and get tons of ads thrown in your face sites, putting the images in a DB might be a reasonable option. 

    It also depends on how much space you have, lot's of hosters will put a limitation on the amount of HDD space you get, but won't look at your database size. (In the end you get what you pay for though, but for home-brew stuff it could be a cost saver)

    If however you have your own server and a good sysadmin, you could opt to make a specialised partition for the images.

    etc.. etc..

    It depends on purpose and perhaps also quite a bit on knowledge domain. I don't know all that much about databases, but i do have experience with working with files and making sure my programs play nice.



  • @merreborn said:

    @asuffield said:
    @stratos said:
    @ammoQ said:
    @Tweenk said:

    Maybe the WTF is about storing binary data in the database and not in external files? It's going to hurt efficiency a bit. OR is it a picture-of-a-scan-on-a-wooden-table?

    Nevertheless, I think the top-down approach here is generally OK, and the picture is easy to understand - it's actually worth a praise.

    Storing binary data is IMO not a WTF. I don't think that the performance penalty (if any) really hurts, and you gain some of advantages regarding management. For example, deleting a row in the table automatically deletes the binary data (i.e. picture) as well. If rollback the transaction, you also get the picture back :-))



    Another perhaps to some less important advantage is that  you don't have to mess about with the FS.  Like max files per directory problems, or 2 files with the same name. Of course there are solutions to those problems, but when you put the data in the database, you don't have to solve them.

    Most RDBMSes have their own stack of similar problems instead which you then have to solve or ignore (for example, the subtleties of vacuuming in postgresql, which are ignored by developers at least as often as directory size limits). Doesn't really buy you anything.

    While there's nothing fundamentally wrong with storing binary data in an RDBMS, if your binary blobs are large enough then it's going to be roughly two to three times slower (in the hot-cache case) than using the filesystem directly, because the data has to be copied around more before it reaches your application; this is slower than you expect, because it crosses process boundaries. While there may be workloads in which that is acceptable, you should avoid building a system that fundamentally relies on it, because it'll come back to bite you later. Usual warnings about premature optimisation apply, but the inverse can be just as bad: eliminating possible optimisations at the design stage is another root of all evil.

    There's definitely something to be said for storing your images somewhere where they can be served via your HTTP server without a database hit... 

    One downside of static data in RDBMSes would be HTTP caching. If you store the data in static files, everything is already done for you by the HTTP server. And the server has easy busyness as well, because he can get most necessary information directly from the file system.
    However, if you have the data in a database, the server will just see a generic dynamic script and you're on your own with serving the actual data.
    Concerning caching, you have two choices then: Dismiss caching completely and take a lot of unnecessary traffic into account; Or implement it by yourself. But then you most likely will have to query the database for freshness information and if you have bad luck, "304 Not Modified" responses end up almost as expensive as responses where you actually serve the data. In any case, much more processing will have to be done than if you just served a static file.



  • @stratos said:

    It simply depends, if your making a image bank website you don't want you images in a database.

    One of the products of my current employer is a web-based picture managent application, used by stock photography agencies, picture libraries and the like, and this application keeps all the pictures (somewhere between 10000 and 100000) in the database.

    I think it mostly depends on the expected traffic - image hosters like imageshack have to care much more about that than the mentioned agencies with a much lower number of hits. 



  • @PSWorx said:

    Concerning caching, you have two choices then: Dismiss caching completely and take a lot of unnecessary traffic into account; Or implement it by yourself. But then you most likely will have to query the database for freshness information and if you have bad luck, "304 Not Modified" responses end up almost as expensive as responses where you actually serve the data. In any case, much more processing will have to be done than if you just served a static file.


    You can store that kind of metadata in RAM/on disk with a good mmap plugin.  Catalyst, for example, uses
    the Catalyst::Plugin::Cache::FastMmap module.  This sort of thing is really easy.  Essentially, you'll want your cache to maintain an "instance revision" and "current revision" field for each cachable object.  Have changes to the database trigger increments to the relevant objects' current revision fields.  Finally, when it comes time to serve a page/object, pull the current and instance revision fields from the cache, compare, and either serve or rebuild the object if they don't match (obviously making sure to set the instance revision field equal to the current revision.

    Not much more processing has to be done.  You pull data from a disk/RAM (Apache would have to do this for static content too) and run a simple comparison test.  If you don't have to rebuild the object, the test is the <i>only</i> extra overhead.

Log in to reply