Guess the table layout



  • The result of this went into a 'special' report today:

    SELECT region, type, COUNT(*)
    FROM ((SELECT id, SUBSTRING_INDEX(region, ',', 1) region, type FROM venues WHERE LENGTH(SUBSTRING_INDEX(region, ',', 1)) > 0)
    UNION (SELECT id, SUBSTRING_INDEX(region, ',', -1) region, type FROM venues WHERE LENGTH(SUBSTRING_INDEX(region, ',', -1)) > 0)
    ) regionsplit
    GROUP BY region, type
    ORDER BY region, type
    

    Beautiful isn't it?



  • So there are fields that contain multiple, comma separated values? Ingenious.


  • BINNED

    Please tell me it was a database that supports proper arrays too?



  • @RevCurtisP said:

    So there are fields that contain multiple, comma separated values?

    One-or-two actually :smirk: the above would ignore the middle values.

    @RevCurtisP said:

    Ingenious.

    Thank you. Was it my personal lazyness or budgetary concerns that led to this? Choose your reality.

    @Onyx said:

    Please tell me it was a database that supports proper arrays too?

    I don't know MySQL well enough. Probably not.


  • BINNED

    @gleemonk said:

    MySQL

    I think that's a no, then. But since I don't intend on touching the damned thing unless under threat of extreme physical violence, I don't feel like checking.



  • I worked with an ecommerce system that had a CSV import for shoppers (with optional logins). It was extended to allow automatic order creation, by adding an item number field. Then our clients wanted to be able to have multiple items per order, so my boss modified the code to parse the item number field for multiple item numbers, separated by semicolons.



  • @Onyx said:

    Please tell me it was a database that supports proper arrays too?

    I didn't even know this was a thing. But I haven't done enough SQL to have needed this functionality. Yet another reason to prefer PostgreSQL over MySQL, for any kind of serious work.


  • BINNED

    @RevCurtisP said:

    I didn't even know this was a thing.

    Yup. I actually have a single table where I use an array. Slap a GIN index on it and you can even get decent performance out of it.



  • @Onyx said:

    Slap a GIN index on it and you can even get decent performance out of it.

    Better performance than, say, more traditional ways?


  • BINNED

    I haven't tested it that much, but I doubt it, honestly. The only reason I'm even using an array is that it's a denormalized table we use for better performance reasons, and it's rarely filtered on the column that's an array. The alternative would be two queries and then process the results in PHP. I think the array wins in this particular case.



  • @Onyx said:

    process the results in PHP

    Sure, avoiding that is always a winner :fish_cake:



  • @loopback0 said:

    Better performance than, say, more traditional ways?

    Fetching a single row with X number of fields is almost always faster than fetching X number of rows.



  • @RevCurtisP said:

    @loopback0 said:
    Better performance than, say, more traditional ways?

    Fetching a single row with X number of fields is almost always faster than fetching X number of rows.

    At the expense of having to do additional processing somewhere to convert what you get into something usable. It's also considerably harder to filter in, say, a where clause.



  • @Onyx said:

    Please tell me it was a database that supports proper arrays too?

    ... like all SQL databases? They're just called one-to-many relationships instead of "proper arrays".



  • @blakeyrat said:

    They're just called one-to-many relationships instead of "proper arrays".

    There are ones that will let you say that a column contains an array of things rather than setting up the values you are storing there in a different table and mapping it to the key of the current one. Why you use this instead of building a one-to-many relation between tables I'm not sure on (I don't use them) but they do exist.



  • TIL

    <!-- Expressive my ... -->


  • @blakeyrat said:

    ... like all SQL databases? They're just called one-to-many relationships instead of "proper arrays".

    Yeah, that's not the same thing at all.



  • Right; but in car terms, you're saying:

    "This car is great, but wouldn't it be nice if there were a place on it I could store a couple suitcases" and your solution to the problem is buying a plastic tub from Target and duct-taping it on the roof.

    The car already has a trunk. Use the fucking trunk.

    What they want to do? It's a solved problem in SQL. It's been solved for decades. There's no need to add shit to SQL to solve the solved problem.



  • I tried to find out from the PostgreSQL documentation why it has an array type, but the only thing I could find was "for conformance with the SQL standard". Which just pushes the question back one step: Why is the array type specified in the SQL standard?

    Oh, but it does have a reason why you wouldn't use an array:

    *Tip*: Arrays are not sets; searching for specific array elements can be a sign of database misdesign. Consider using a separate table with a row for each item that would be an array element. This will be easier to search, and is likely to scale better for a large number of elements.

    I suppose if I had some type of record that included a matrix I'd find it useful, but I haven't had one of those yet.


  • BINNED

    @Watson said:

    I tried to find out from the PostgreSQL documentation why it has an array type

    Why would you have JSON types? Conventional wisdom says - either parse it and store it in a proper table structure, or just store it as a string. Yet there are such things like NoSQL databases which deal with JSON almost exclusively. Now, whether they are any good or not is a separate question, but they exist. And people use them. So there is a market niche where it's useful.

    Same goes for XML. Or geometric types. Or network address types. They may seem silly on the surface, but sometimes they do the job quite well and help you offload some of the stuff you'd have to do on your programmatic data later, often by iterating over results. Which is what RDBMSs are good at already, so why not let your RDBMS do it instead, on raw data, instead of pulling it into your application and messing with it there?

    As any tool, it's not something that should be used all the time. As I said, I only used an array once so far, and it was to solve a specific issue. And, in this specific case, it's faster than doing the joins and then either aggregating it all into a string (which I'd then have to explode in PHP anyway for my templating engine to display it the way I want), or pulling in extra rows and then joining them back together in PHP.



  • Certainly, assisting data transparency at the DBMS level provides an additional layer of defence against garbage data.

    I agree that it's more a nice-to-have-if-I-need-it. I was just wondering if there was any more considered justification that saw it being added as a standard feature when others weren't (For me, native complex-number support would be nice....). For example, the reasoning behind record types is clear enough on orthogonality grounds.

    I thought I'd see what SQL Server said about the things, and it turns out that it doesn't have them and you have to sorta emulate them with its approach to record types (aka TVPs).


  • BINNED

    @Watson said:

    I was just wondering if there was any more considered justification that saw it being added as a standard feature when others weren't

    That might warrant extra research. It does seem kinda weird for it to be defined in ANSI SQL. I wonder if it was the case of popular engines implementing them and then they got added into the standard, like W3C tends to do with web standards at times.



  • @Onyx said:

    Why would you have JSON types?

    In an RDBMS? Likes arrays, fuck knows.

    Maybe the kids need to get off my databases lawn.



  • @Onyx said:

    I think that's a no, then. But since I don't intend on touching the damned thing unless under threat of extreme physical violence, I don't feel like checking.

    My incentives are more civil but I don't feel like checking either.



  • @blakeyrat said:

    Right; but in car terms, you're saying:

    "This car is great, but wouldn't it be nice if there were a place on it I could store a couple suitcases" and your solution to the problem is buying a plastic tub from Target and duct-taping it on the roof.



  • @Onyx said:

    Same goes for XML. Or geometric types. Or network address types.

    Well, not XML. But the other two allow you to better specify the domains of the stored data without a bunch of programmatic constraints. Domains are what RDBMS' ought to support but don't do well at all. Our favorite database crumudgeon talks about them a lot.

    The best way would be for real support of domains - the "user defined data types" that exist in oracle and sql server today fail at this miserably.



  • Okay, so it looks like they were added, along with the whole idea of user-defined types, as part of the object-relational idea, to make object mapping more natural. "Ordered sequence of instances of a certain type" is a common structure in OO programs, so the notion was added into SQL. (And things like LINQ are approaching from the other side.)

    Actually implementing it is still very much a work in progress (you can't have a class hierarchy without subclasses, and PostgreSQL doesn't have type inheritance yet). The array type is probably a bit of an oddity because a lot of the infrastructure that it would fit in with it isn't there yet.


  • BINNED

    @Watson said:

    PostgreSQL doesn't have type inheritance yet

    At least it has table inheritance and it's pretty freaking awesome, IMHO :smile:



  • Table Inheritance? Is that just a fancy name for a one-to-one relationship between two tables?



  • I know you added this image to illustrate "array types are like using plastic boxes" but to me it looks more like "MySQL explained with two cars" :laughing:



  • @Watson said:

    I tried to find out from the PostgreSQL documentation why it has an array type, but the only thing I could find was "for conformance with the SQL standard". Which just pushes the question back one step: Why is the array type specified in the SQL standard?

    I use pg arrays

    • to get data into stored functions

    • to get data out of joins (eg get all Users, each with an array of Address records in one result set)

    I also dabbled with declaring table columns as arrays. It does make thing easier, in that you can store a complex record in one query, instead of having to set up a transaction and then execute several queries to get the lookup tables synchronized. But in the end, it felt kind of icky, so I reverted to just using foreign keys most of the time.



  • @blakeyrat said:

    "This car is great, but wouldn't it be nice if there were a place on it I could store a couple suitcases" and your solution to the problem is buying a plastic tub from Target and duct-taping it on the roof.

    Worse. You have a couple of suitcases. Rather than storing them in a closet, you toss them into a plastic tub you bought from Target and duct-tape the lid on because it doesn't really close with that shit jammed in there.

    Then you say "wow, I wish I could store these suitcases in my car". And then, rather than unpacking the tub and putting the suitcases in the trunk, you duct-tape the entire bin to the roof.

    And every time you want to get something from one of the suitcases-- you slice open the duct-tape and pull out all the suitcases, unpack EVERYTHING in them, lay them out on the ground, go through them one at a time until you find the right one. And then you repack everything in the exact same way, stuff the suitcases back into the tub, duct-tape it shut-- mount it on the roof and duct-tape it to the car.

    Then one day you decide to buy a third suitcase...


  • BINNED

    Why not use a time-series database for array needs (InfluxDB for example)? :stuck_out_tongue_winking_eye:



  • @dse said:

    Why not use a time-series database for array needs (InfluxDB for example)? :stuck_out_tongue_winking_eye:

    Why not use your mom for entertainment needs?



  • I suppose you could emulate it that way: a foreign key column in the "class" table for each "subclass" table, a check constraint to ensure that at most one of those foreign keys is non-null in any given row, a join every time you want to retrieve a row from the "subclass" table, and an outer join every time you want to retrieve rows that are instances of the class so that you get the subclasses as well. And repeat the process if any of those subclasses are themselves subclassed.

    But the idea of ORDBMS is to reduce the effort required in object-relational mapping, ideally to the point of it being a no-op.



  • @cartman82 said:

    @dse said:
    Why not use a time-series database for array needs (InfluxDB for example)? :stuck_out_tongue_winking_eye:

    Why not use your mom for entertainment needs?

    Just like the OP, she also enjoys having several values tightly packed into a single space.


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.