One field table

stratos

I have an interesting DB design where all my spider senses are telling me it is probably wrong, but I'm not sure.

The basic structure is like this:

sumthing:
id: int,PK

sumthing_attribute:
id: int, PK
sumthing_id: int, FK, PK
fieldA: varchar
fieldB: varchar

Without any common attribute for sumthing. Now removing 'sumthing' all together doesn't feel right, because the entity is 'sumthing'. But having a table with only a id, just feels wrong.

Any input on it would be appriciated.

dhromed

What sort of application describes items that have 0 properties in common, not even a name?

Theoretically possible, but reeks of WTF in a real-world application.

PJH

@stratos said:

The basic structure is like this:
sumthing:
id: int,PK
sumthing_attribute:
id: int, PK
sumthing_id: int, FK, PK
fieldA: varchar
fieldB: varchar
Without any common attribute for sumthing. Now removing 'sumthing' all together doesn't feel right, because the entity is 'sumthing'. But having a table with only a id, just feels wrong.

Is there a one to many relationship between sumthing and sumthing_attribute?

stratos

Yeah, it's a one to many.

and this would be that theoretical possibility dhromed. Annoying the ORM functionality of the framework I use (kohana v2) isn't eating it up, won't save the record. I can work around that by overwriting the functionality, but the whole thing just bugs me. Something just has to be wrong with it, but I can't figure it out.

Data wise I could drop the <sumthing> table and use a unique on the <sumthing>_id in the <sumthing>_attribute table, but that would bug me as well really.

PJH

@stratos said:

Something just has to be wrong with it, but I can't figure it out.

It's the fact it's a single column. If there were another column of, say, "name" then it would seem less strange.

Scarlet_Manuka

I'm with dhromed on this one. If your entities are so varied that they do not have even one single attribute in common, you almost certainly should not be stuffing them all into the same table. If you believe you have a good reason for doing this, and are allowed to give us some details, I'd certainly be interested to hear it.

stratos

Well allright.

This particular part of the database is about product attributes. So a table for attributes of a product. Think "Natural wooden finish", "sterling silver", etc.. These attributes are accompanied with a title and a bit of summary text.
However, this is a multiligual setup with the texts saved in the database as well. Also meta information like Who edited what and when is saved on language level.
So you more of less end up with a structure like this.

attribute
id: PK

attribute_lang:
prod_attr_id: PK, FK
lang_id: PK, FK
name:
desc:
user_id: FK
last_update:

Now I "could" add something like a date creation field to prod_attr, but I don't really need it, nor would I actually use it.

ammoQ

"Flexible" database designs lead to a lot of problems. Queries like 'List all Nike shirts in red or blue, size XL' require a lot of joining, making the program slow and the development process tedious. You cannot create an index on more than one attribute, so queries become even slower. It's almost impossible to enforce any kind of constraint, so together with the tedious development process, you have to expect a lot of inconsistent data caused by bugs.

b_redeker

If these are products, and you don't just want multi-language but also multi-country, I can make this more difficult for you; attributes are not necessarily the same across countries.

Take for instance shoe or clothing sizes; a 36 US is not 36 UK is not 36 FR. There's liters, gallons (UK) and gallons (US). In the past, tv screen sizes used to be measured across the outside in the US and along the inside in Europe (or vice versa). Local variations for NTSC/PAL or 220/110 V should be different SKUs, but this may not always be the case. Or remember the pizza box featured on this forum some time ago where the Oz description was different from the UK description.

The most sane way to tackle this, I found in eCommerce framework Magento, where a product (SKU) has a number of base attributes that can be extended with attribute groups; so you can define your own categories and your own attribute sets. So one store could have an attribute set for shoes and another for shirts; but a specialized shoe store might have separate attribute sets for women shoes (heel size) or athletic shoes (type of sport). This way, you can actually compare products within a certain category (attribute set).

If however what you're doing is making a store/app for a reasonably simply range of products, it pays to keep it simple and stick to just one table with columns for products with attributes. Or take a standard solution ;)

Oh, and to take this one step further: multi-country also most likely means multi-image. The image that works fine in the Netherlands may not be so well received in Dubai or Kenya (or Curacao).

Scarlet_Manuka

Oracle Financials does something similar to this. For instance, the vendors table has attribute_category with 15 associated attribute columns and global_attribute_category with 20 associated global attribute columns. When you define an attribute category, you give meaningful identifiers for the attributes you're using, and those are displayed in the UI when the user views or enters data. (Oracle Financials calls these flex-fields.) I'm not sure about how the localisation aspect works at the database level, but Oracle normally does that pretty well.

RTapeLoadingError

@stratos said:

This particular part of the database is about product attributes. So a table for attributes of a product. Think "Natural wooden finish", "sterling silver", etc.. These attributes are accompanied with a title and a bit of summary text.
However, this is a multiligual setup with the texts saved in the database as well. Also meta information like Who edited what and when is saved on language level.
So you more of less end up with a structure like this.
attribute
id: PK
attribute_lang:
prod_attr_id: PK, FK
lang_id: PK, FK
name:
desc:
user_id: FK
last_update:

It's late in the day so I could be totally wrong... but isn't joining on...

attribute_lang.prod_attr_id = attribute.id

...where the attribute table has only one column logically equivalent to not joining the tables at all? All the information in the row in the "attribute" table is already included in "attribute_lang" in the FK.

The only thing it gives you I guess is the ability to have products with zero attributes but that sounds a bit philosophical to me.

drobnox

@stratos said:

Well allright.
This particular part of the database is about product attributes. So a table for attributes of a product. Think "Natural wooden finish", "sterling silver", etc.. These attributes are accompanied with a title and a bit of summary text.
However, this is a multiligual setup with the texts saved in the database as well. Also meta information like Who edited what and when is saved on language level.
So you more of less end up with a structure like this.
attribute
id: PK
attribute_lang:
prod_attr_id: PK, FK
lang_id: PK, FK
name:
desc:
user_id: FK
last_update:
Now I "could" add something like a date creation field to prod_attr, but I don't really need it, nor would I actually use it.

Looks like you're reinventing RDF

PSWorx

@RTapeLoadingError said:

@stratos said:
This particular part of the database is about product attributes. So a table for attributes of a product. Think "Natural wooden finish", "sterling silver", etc.. These attributes are accompanied with a title and a bit of summary text.
However, this is a multiligual setup with the texts saved in the database as well. Also meta information like Who edited what and when is saved on language level.
So you more of less end up with a structure like this.
attribute
id: PK
attribute_lang:
prod_attr_id: PK, FK
lang_id: PK, FK
name:
desc:
user_id: FK
last_update:

It's late in the day so I could be totally wrong... but isn't joining on...
attribute_lang.prod_attr_id = attribute.id
...where the attribute table has only one column logically equivalent to not joining the tables at all? All the information in the row in the "attribute" table is already included in "attribute_lang" in the FK.
The only thing it gives you I guess is the ability to have products with zero attributes but that sounds a bit philosophical to me.

Well, the one information the "attributes" column (and also the join above) would give you is if an attribute exists, irrespective of the languages it's formulated in. I could see this as somewhat useful if you want to have your DB validate PK/FK constraits:

If you, for some reason, ever need to clear and re-import your "languages" table (e.g., because some other department changed the captions or descriptions), you'd have to keep one "alias" language for each attribute, so the primary key doesn't vanish. This could potentially make reimport more complicated and messy. But I agree, I'm not sure how much impact that would actually have.
I'd also say it will make it easier to add language-independant properties later, when you discover you need some of those. Depends how likely it is that you WILL need them later.

PSWorx

(addendum)

I find it interesting that there are is apparently a well-known somewhat similar antipattern that actually has a single-column table as its fix: link (slides 80+). This isn't exactly the same problem as ours, but the pros/cons discussion seems similar. (Basically, replace "enum" with "arbitrary unique ID").

RTapeLoadingError

@PSWorx said:

@RTapeLoadingError said:
Isn't joining on...
attribute_lang.prod_attr_id = attribute.id
...where the attribute table has only one column logically equivalent to not joining the tables at all? All the information in the row in the "attribute" table is already included in "attribute_lang" in the FK.

Well, the one information the "attributes" column (and also the join above) would give you is if an attribute exists, irrespective of the languages it's formulated in. I could see this as somewhat useful if you want to have your DB validate PK/FK constraits:

If you, for some reason, ever need to clear and re-import your "languages" table (e.g., because some other department changed the captions or descriptions), you'd have to keep one "alias" language for each attribute, so the primary key doesn't vanish. This could potentially make reimport more complicated and messy. But I agree, I'm not sure how much impact that would actually have.
I'd also say it will make it easier to add language-independant properties later, when you discover you need some of those. Depends how likely it is that you WILL need them later.

I still don't see it. If an attribute exists, irrespective of language, you just need to search on attribute_lang.prod_attr_id without specifying a lang_id. Joining or using attribute_lang.prod_attr_id IN (SELECT id from attribute) doesn't tell you anything extra.

If you need to refresh the captions/descriptions then you can update the records with the correct attribute_lang.prod_attr_id/attribute_lang.lang_id. It would seem that something somewhere cares what a specific attribute ID is (e.g. "Wooden" is attribute ID #1) but this is not reflected in this part of the schema.

If the attribute table contained information about the attribute in some default language, and attribute_lang was a specific translation then it would make sense.