How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema
-
We have a lot of class hierarchies in our system and I figured that the most normalized way to represent these class hierarchies in our database would be to use a TPT strategy. The ORM capabilities of our framework of choice, Laravel (yes I know PHP is the Antichrist around these parts), doesn't like representing class hierarchies in the database in a fully normalized fashion and therefore I'm left to figure this one out on my own. Working up the hierarchy, from child to parent for all CRUD operations is quite easy but moving in the opposite direction for reads is quite difficult.
Given the example below, when instantiating a car object, I want to have the ability to figure out if it's an Accord, Civic, Pathfinder or Altima. Alternately if this instance's class hierarchy is only 1 level deep, find out if it's a Honda or Nissan. In addition to that, when instantiating a Nissan object, I want to have the ability to figure out if it's a Altima or Pathfinder.
Everything I've thought of so far is neither scalable nor performant. For example, redefining the hierarchy as a nested array and running a series of SQL queries based off of the array's structure.
I am aware that parent classes not needing to know about child class details is one of the cornerstones of object oriented programming but there are several cases where given a parent instance, we need to know what it's deepest subclass is.
-
@captainpants said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
performant
Have you measured this? SQL is way better than you think at joining tables.
I'm worried that the entire genesis of this question is premature optimization.
What's the current data storage solution? Is it performant? Is there a reason to think you'll improve things by rewriting it?
-
@blakeyrat said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
SQL is way better than you think at joining tables.
Provided it has the right indices set up, of course. I'd expect them to be there by default for integer primary keys (as part of being able to enforce a uniqueness constraint), so they probably work in that schema.
-
@blakeyrat said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
Have you measured this?
I have not yet measured this because programming hasn't yet started for the project. I do wish I had the time to run some POCs but when working for a non-profit, you rarely have this luxury. This optimization may be premature and I know that we can always go back and refactor but it doesn't hurt to ponder about this right now. Also for the following reasons:
-
Some of these classes will be instantiated on almost every page load. For example a user object.
-
We have tons of data saved in these hierarchical tables (> 1GB).
SQL is way better than you think at joining tables.
Is there a way to find the deepest child record given a "car" record purely through a single SQL query? The only way I found this to be achievable is using multiple queries. Perhaps multiple queries will be "performant enough" but I'm not sure at this point.
What's the current data storage solution?
A MariaDB database.
Is there a reason to think you'll improve things by rewriting it?
Rolling our own implementation of Table-per-Type DB tables and model classes is needed considering the chosen framework does not represent class hierarchies in this manner. This is all for the sake of at least achieving the second normal form when representing class hierarchies.
@dkf said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
@blakeyrat said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
SQL is way better than you think at joining tables.
Provided it has the right indices set up, of course. I'd expect them to be there by default for integer primary keys (as part of being able to enforce a uniqueness constraint), so they probably work in that schema.
There will be correct indices set up. MariaDB does this for integer PKs and FKs as well.
-
-
@captainpants said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
Is there a way to find the deepest child record given a "car" record purely through a single SQL query? The only way I found this to be achievable is using multiple queries. Perhaps multiple queries will be "performant enough" but I'm not sure at this point.
Possibly a recursive CTE, although off the top of my head I'm not sure how that would work with the schema above
-
@captainpants said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
A MariaDB database.
I meant what's the current schema, and what drawbacks are prompting this move to a new schema?
Basically, I'm trying to understand the problem before I offer any advice, not trying to be annoying. (The DBMS doesn't matter; we already knew it was SQL.)Especially since you added the detail that labor's in short supply-- well then why are you so adamant you have to burn-down what you already have? Is that a good use of your manpower?
-
@captainpants said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
Is there a way to find the deepest child record given a "car" record purely through a single SQL query? The only way I found this to be achievable is using multiple queries. Perhaps multiple queries will be "performant enough" but I'm not sure at this point.
To answer this specifically, each subclass needs to retain an ID to its own table.
Your schema needs to be something like:
Car CarId Nissan NissanId CarId Juke JukeId NissanId
When yanking out a Juke, you do it with the JukeId not the CarId. You JOIN up the hierarchy, not down.
-
@jaloopa said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
@captainpants said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
Is there a way to find the deepest child record given a "car" record purely through a single SQL query? The only way I found this to be achievable is using multiple queries. Perhaps multiple queries will be "performant enough" but I'm not sure at this point.
Possibly a recursive CTE, although off the top of my head I'm not sure how that would work with the schema above
Thanks for throwing the term CTE. This looks promising http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/
@blakeyrat said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
@captainpants said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
A MariaDB database.
I meant what's the current schema, and what drawbacks are prompting this move to a new schema?
Basically, I'm trying to understand the problem before I offer any advice, not trying to be annoying. (The DBMS doesn't matter; we already knew it was SQL.)The example provided above replicates what a typical class hierarchy would look like, except with user types. For example a "User" table being the highest level. Student, Educator, Administrator being at level 2, inheriting from user. And tutor at level 3, which inherits from educator. There may be other deeper hierarchies but the one mentioned is the simplest one.
Especially since you added the detail that labor's in short supply-- well then why are you so adamant you have to burn-down what you already have? Is that a good use of your manpower?
To be honest I'm not 100% married to the idea of Table-per-Type. If it causes too many headaches we'll just do Table-per-Hierarchy and live with most columns of a single table being nullable.
@blakeyrat said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
@captainpants said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
Is there a way to find the deepest child record given a "car" record purely through a single SQL query? The only way I found this to be achievable is using multiple queries. Perhaps multiple queries will be "performant enough" but I'm not sure at this point.
To answer this specifically, each subclass needs to retain an ID to its own table.
Your schema needs to be something like:
Car CarId Nissan NissanId CarId Juke JukeId NissanId
When yanking out a Juke, you do it with the JukeId not the CarId. You JOIN up the hierarchy, not down.
Joining down is what I'm interested in. It might be achievable through the method described here http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/.
-
@captainpants said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
To be honest I'm not 100% married to the idea of Table-per-Type. If it causes too many headaches we'll just do Table-per-Hierarchy and live with most columns of a single table being nullable.
Ok, hang on. Maybe I'm asking the wrong question.
Is this greenfield development?
In your original question, you said it was an already-existing system, and I was trying to work out how you were storing this data on disk now. After this reply, I'm starting to think this is greenfield. Now I'm just confused.
-
@captainpants said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
Joining down is what I'm interested in.
Why not just... not do that?
If you have a Juke, you can find that it's a Nissan and a Car. Trivially.
Sure there might be ways to go the other direction, but why would you bother? Just send/store the most specific ID whenever you need to send the ID somewhere. Your subclass can just override the
.getDatabaseId()
or whatever method to return the most specific one.
-
@blakeyrat said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
@captainpants said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
To be honest I'm not 100% married to the idea of Table-per-Type. If it causes too many headaches we'll just do Table-per-Hierarchy and live with most columns of a single table being nullable.
Ok, hang on. Maybe I'm asking the wrong question.
Is this greenfield development?
In your original question, you said it was an already-existing system, and I was trying to work out how you were storing this data on disk now. After this reply, I'm starting to think this is greenfield. Now I'm just confused.
It's greenfield development. All we're doing with the existing system is importing its data into the new system's DB schema.
@blakeyrat said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
@captainpants said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
Joining down is what I'm interested in.
Sure there might be ways to go the other direction, but why would you bother?We would need to bother to do that in the case of Users. Laravel can load a user instance of the current user that's browsing our system. However, depending on if that user is a Student or Educator, they would have the ability to do different things.
-
@captainpants said in How do I Simplify Parent to Child Reads When Representing Inheritance Using a Table-Per-Type Schema:
We would need to bother to do that in the case of Users. Laravel can load a user instance of the current user that's browsing our system. However, depending on if that user is a Student or Educator, they would have the ability to do different things.
I'm wondering if there's some way to better represent the data as relations rather than as classes. Relations fit databases far more effectively…