Overheard in the cubicles



  • While I was programming away, I overheard some developers from another team talking about databases, which they will be incorporating in their software in a future release. One guy was explaining to the others the distinction between a "select" and a "join", the main one being that "a join returns all the columns, so if you only want some of them, you have to use a select instead of a join". It was like listening to a 7 year-old explain to a 6 year-old where babies come from, except it was scary instead of cute.

    I walked over and spent some time trying to explain about joins and select, but I'm not sure they believed me.



  • @rstinejr said:

    I walked over and spent some time trying to explain that they should stick to coding or get some formal training if they're going to be doing database stuff.

    FTFY.

    The only thing worse than a code programmer that knows nothing about databases is a code programmer that believes their limited knowledge is sufficient.



  • That's because you shouldn't have explained the hard facts to them. They don't understand them. There are books to help you. You can start with "My mummy has a database in her tummy".



  • @TGV said:

    That's because you shouldn't have explained the hard facts to them. They don't understand them. There are books to help you. You can start with "My mummy has a database in her tummy".

     

    Select * from MummysTummy where beautiful = true and genius = true and rich = true

    No records returned.... damnit!

    Maybe there shoudl have been a join on Daddy in there somewhere.



  • @KattMan said:

    Select * from MummysTummy where beautiful = true and genius = true and rich = true

    No records returned.... damnit!

    Maybe there shoudl have been a join on Daddy in there somewhere.

     

    No, the join would have been done in the original view. Not sure where I'd stand on rolling back on any transactions though...

     



  • @Cassidy said:

    @rstinejr said:

    I walked over and spent some time trying to explain that they should stick to coding or get some formal training if they're going to be doing database stuff.

    FTFY.

    The only thing worse than a code programmer that knows nothing about databases is a code programmer that believes their limited knowledge is sufficient.

    Note to self: do not feed trolls.



  • @ASheridan said:

    @KattMan said:

    Select * from MummysTummy where beautiful = true and genius = true and rich = true

    No records returned.... damnit!

    Maybe there shoudl have been a join on Daddy in there somewhere.

     

    No, the join would have been done in the original view. Not sure where I'd stand on rolling back on any transactions though...

    (edit: Juvenile observation: that was my 69th post!)

     

    There was no view (the lights were off).

    Plus, a union seems more appropriate than a join here, unless you want to do multiple joins.... if you're into that sort of thing.

     

     



  • @Cassidy said:

    @rstinejr said:

    I walked over and spent some time trying to explain that they should stick to coding or get some formal training if they're going to be doing database stuff.

    FTFY.

    The only thing worse than a code programmer that knows nothing about databases is a code programmer that believes their limited knowledge is sufficient.

    I find it amusing that database "programmers" think of themselves as programmers. lol.



  • @SilentRunner said:

    I find it amusing that database "programmers" think of themselves as programmers. lol.

    I've never heard of a "database programmer".

    I have heard of SQL Developers, mind. They tend to know lots about schema, joins, tables, PL/SQL, stored procs, functions.... all the stuff that people mistakenly expect DBAs to know about.



  • @Cassidy said:

    @rstinejr said:

    I walked over and spent some time trying to explain that they should stick to coding or get some formal training if they're going to be doing database stuff.

    FTFY.

    The only thing worse than a code programmer that knows nothing about databases is a code programmer that believes their limited knowledge is sufficient.

    Agreed. It's not just databases, either, but filesystems, daemons, networking, caching.. The problem is that to a lot of programmers it's just another API; they start out by saying "How do I do X.." and then search the documentation until they find a method.

    Of course, not every "code programmer" is ignorant (by "code programmer" I'm assuming you mean someone who mostly writes logic that manipulates in-memory objects). But far too often I've seen this play out:

    Programmer: "I don't know much about databases, but I have to do database work for this next project. I better read up on that.."

    2 days later

    Programmer: "Wow, this O'Reilly book is great! This database stuff really isn't all that complicated!"

    1 week later

    Programmer: "That normalization stuff is overcomplicated. I came up with a much better way by having a table with only 4 columns: object ID, key, value and type! I can store anything I want and I don't have to mess around with schemas! No wonder SQL is dying!"



  • @morbiuswilters said:

    Of course, not every "code programmer" is ignorant (by "code programmer" I'm assuming you mean someone who mostly writes logic that manipulates in-memory objects). But far too often I've seen this play out:

    Programmer: "I don't know much about databases, but I have to do database work for this next project. I better read up on that.."

    2 days later

    Programmer: "Wow, this O'Reilly book is great! This database stuff really isn't all that complicated!"

    1 week later

    Programmer: "That normalization stuff is overcomplicated. I came up with a much better way by having a table with only 4 columns: object ID, key, value and type! I can store anything I want and I don't have to mess around with schemas! No wonder SQL is dying!"

     

    I've never really understood what people find so "overcomplicated" about normalization.  It's not that different (conceptually at least) from a standard object model, except that all the references point backwards.  How complicated is that?

     


  • BINNED

    @Cassidy said:

    @SilentRunner said:

    I find it amusing that database "programmers" think of themselves as programmers. lol.

    I've never heard of a "database programmer".

    I have heard of SQL Developers, mind. They tend to know lots about schema, joins, tables, PL/SQL, stored procs, functions.... all the stuff that people mistakenly expect DBAs to know about.

    You are correct, but at least in theory DBAs need to know those things to adequately do their jobs.



  • @Mason Wheeler said:

    I've never really understood what people find so "overcomplicated" about normalization.  It's not that different (conceptually at least) from a standard object model, except that all the references point backwards.  How complicated is that?

    The same people that have problems with linked lists perhaps?

     



  • @morbiuswilters said:

    (by "code programmer" I'm assuming you mean someone who mostly writes logic that manipulates in-memory objects).

    You assume correct: coders of .net, Java, PHP especially (much volunteer-driven stuff I've seen contains horrendous SQL), and perl.

    When learning about perl and databases, I read something (article? The Camel?) that said "if you're a perl programmer, WRITE PERL. Leave the DB stuff to someone else, just call their routines." It was my first insight into separation of code from SQL.

    @morbiuswilters said:


    Programmer: "I came up with a much better way..."

    *stab*thock*hurt*maim*

    Quickest way for any developer to gain "red jersey" status in my book, that sentence...



  • @morbiuswilters said:

    But far too often I've seen this play out:

    Programmer: "I don't know much about databases, but I have to do database work for this next project. I better read up on that.."

    2 days later

    Programmer: "Wow, this O'Reilly book is great! This database stuff really isn't all that complicated!"

    1 week later

    Programmer: "That normalization stuff is overcomplicated. I came up with a much better way by having a table with only 4 columns: object ID, key, value and type! I can store anything I want and I don't have to mess around with schemas! No wonder SQL is dying!"

     

    The pain, the pain. I've worked with such people. No, that's not completely true. I inherited such a design once. The guy had designed a new database, to solve all problems (yeah, sure), and left before finishing. However, he had entered so deeply into the minds of the even less informed people working around and above him, that the company stuck to that design for 3 more years. And still uses it for "legacy" systems.

    The worst table, the one where measurements were stored, had a typeid. If it was 1, the field labelled "value1" would mean one thing, if it was 2, value1 would mean something else, etc. There were tables with value1 through value4, but also tables with 9 value columns. Not all of the value columns would be in use. Most of the time only the first 2 or 3 values would be in use. And the typeids were different on different systems. And some "types" shared the same measurement, but in different columns. On top of that, there were category labels, and there could be measurements for different categories. To find a whole set of measurements, you had to group on several ids, timestamp, and typeid. And then you had to determine whether or not it was a set with missing zeroes (which, admittedly, was my idea, since 90% of the data tended to be zero, and reports were extremely slow). And there were of course descriptions of the types, but generally they were confusing or out of date.

    A key-value store is so much fun!



  • @PedanticCurmudgeon said:

    You are correct, but at least in theory DBAs need to know those things to adequately do their jobs.

    Yeah, I know. I wanted to make the distinction between "I'm aware of those things" and "this is my daily bread-and-butter".

    (I probably didn't make that too clear.)



  • @TGV said:

    The worst table, the one where measurements were stored, had a typeid. If it was 1, the field labelled "value1" would mean one thing, if it was 2, value1 would mean something else, etc. There were tables with value1 through value4, but also tables with 9 value columns. Not all of the value columns would be in use. Most of the time only the first 2 or 3 values would be in use. And the typeids were different on different systems. And some "types" shared the same measurement, but in different columns.

    This sounds quite a bit like a system I worked on. One of the beautiful things was that a lot of "constants" were store in the database rather than in code. And we had 2000+ customer databases. Due to a variety of factors (such as some DBs not getting a patch, some getting the same patch over and over which inserted the "constants" multiple times) the values of the constants differed between databases. So if you wanted to find an object of type "Foo", you first had to query for the ID of Foo types. Writing patches was a real joy, because if you needed to add a new type or modify an existing one, the patch had to do all kinds of complex operations.

    The guy who designed this system was also really, really fond of text columns of comma-separated values. So querying for something with the value of 7 would mean a where clause that looked like: WHERE (col LIKE '7,%' OR col LIKE '%,7,%' OR col LIKE '%,7')

    This was done because "a separate table would be too slow". In reality, the software had hilarious, almost-unimaginable performance issues. I swear to God some of the functions ran in factorial time..



  • @morbiuswilters said:

    The guy who designed this system was also really, really fond of text columns of comma-separated values. So querying for something with the value of 7 would mean a where clause that looked like: WHERE (col LIKE '7,%' OR col LIKE '%,7,%' OR col LIKE '%,7')

    Oh dear, I hope they never got any values such as 17 or 70...

    edit: oops, that wouldn't be a problem, since there's no % before the first regex, nor one after the third... (dunno if they're really considered regexes, but what else to call them?)



  • @ekolis said:

    @morbiuswilters said:
    The guy who designed this system was also really, really fond of text columns of comma-separated values. So querying for something with the value of 7 would mean a where clause that looked like: WHERE (col LIKE '7,%' OR col LIKE '%,7,%' OR col LIKE '%,7')

    Oh dear, I hope they never got any values such as 17 or 70...

    edit: oops, that wouldn't be a problem, since there's no % before the first regex, nor one after the third... (dunno if they're really considered regexes, but what else to call them?)

    Wildcard matches on a string?



  • @morbiuswilters said:

    One of the beautiful things was that a lot of "constants" were store in the database rather than in code....

    I'm sure a few open-source CRMs do this also: WordPress, Xoops, etc...?

    At least they have the foresight of making the column unique to prevent duplicates.



  • @Cassidy said:

    @morbiuswilters said:

    One of the beautiful things was that a lot of "constants" were store in the database rather than in code....

    I'm sure a few open-source CRMs do this also: WordPress, Xoops, etc...?

    At least they have the foresight of making the column unique to prevent duplicates.

    Yeah, I've seen it other times, too, although it's usually competently implemented. The two problems I have with it are: 1) each DB has its own copy of the constants, so it's possible things can get out of sync; and 2) you have to do several additional joins or subqueries just to find what you're looking for.



  • @morbiuswilters said:

    The two problems I have with it are: 1) each DB has its own copy of the constants, so it's possible things can get out of sync; and 2) you have to do several additional joins or subqueries just to find what you're looking for.

    Agreed daftness - (2) is crying out for a few views, I guess.

    I often wonder about the benefits of storing constants in the DB, especially when they're config settings for CRMs.

    I can see the point that they'll survive code upgrades and the webserver doesn't need to change privs on any files, but even the DB creds themselves can't be stored inside a table.



  • @Cassidy said:

    Agreed daftness - (2) is crying out for a few views, I guess.

    Views help, but don't change the fact that you've to write them, update them when tables are updated, and the fact that they're still extra joins/subqueries. They're also more of a PITA to use; instead of having a nice enum with user types, for example, you end up with strings (or enum-to-string mappings) all over the place, because that's what the DB expects.

    @Cassidy said:

    I often wonder about the benefits of storing constants in the DB, especially when they're config settings for CRMs.

    Some stuff can go into config files (or other suitable in-memory structures). Some stuff can go into memcache with a high expiry. And some stuff needs to live solely in the DB. (Clarification: the DB is the SPOT for all of that data, but it can be cached depending on how frequently it changes.. DB connection info rarely changes, so that goes into a config file, for example. A global site-specific setting that is used on every page load but almost never changes can be cached aggressively in memcache..)



  • Clarification: the DB is the SPOT for all of that data, but it can be cached depending on how frequently it changes.. DB connection info rarely changes, so that goes into a config file, for example. A global site-specific setting that is used on every page load but almost never changes can be cached aggressively in memcache..

    I think it could be argued that the DB is not the right spot for storing DB connection info... unless maybe you mean information about connecting to OTHER databases. But yeah, even then, config file.



  • I think we're basing our arguments to/for the storing of config info in a DB table based upon the volatility of the data. Static stuff can be heavily cached, and dynamic stuff has more efficient accessibility over file-based storage.

    I suppose it largely boils down to the schema design: were the data structures designed by a DB bod that understood the problem and presented an efficient solution, or were they hastily cobbled together by a code programmer that had a poor grasp of DB concepts?

    I tend to find the latter in many cases, sadly.


Log in to reply