Best way to get data from database into objects

Sunstorm

I'm looking for advice on best-practices for loading data from the database into model objects.

As far as I've seen, there's two main ways to do this. The first is to fill them externally:

object Foo
{
  public title;
  public data;
}

[...]
queryResult = db.query(...);
Foo f = new Foo();
f.title = queryResult['title'];
f.data  = queryResult['data'];

This, on a higher scale, can be done either through getting or filling, through some kind of manager or adapter object. I'm not sure how they're formally called.

object FooAdapter
{
  public function fillFoo( &f )
  {
    queryResult = this.db.query(...);
    f.title = queryResult['title'];
    f.data  = queryResult['data'];
  }

  public function saveFoo( &f ) { ... }
}

object FooManager
{
  public function getFoo()
  {
    queryResult = this.db.query(...);
    Foo f = new Foo()
    foo.title = queryResult['title'];
    foo.data  = queryResult['data'];
    return foo;
  }

  public function saveFoo( &f ) { ... }
}

The first one would be the way .NET seems to prefer doing it, with it's strict-typed dataset generator. The downside I see is that the data object is reduced to a slightly smarter array, which is really what a DataSet is. Every element needs to be public to be assigned externally. It's not really abstracting the data, just repackaging it.

The alternative is to load the data from within the object itself:

object Foo
{
  public title;
  public data;

  public function load()
  {
    queryResult = this.db.query(...);
    this.title = queryResult['title'];
    this.data  = queryResult['data'];
  }

  public function save() { ... }
}

The object treats the database as state storage. This is fine for retrieving individual objects, but what if you need to get it in some other way, like from a query that returns a whole list of these? Plus you'll need to pass the database connection into it every time. One could include an external access method:

object Foo
{
  public title;
  public data;

  public function load()
  {
    queryResult = this.db.query(...);
    this.loadFromResult( queryResult );
  }

  public function loadFromResult( queryResult )
  {
    this.title = queryResult['title'];
    this.data  = queryResult['data'];
  }

  public function save() { ... }
}

But this seems hackish, and relies on the queryResult always having the same fields. Plus, there are situations where you don't need certain fields, like if I'm simply building a list of titles, and data is a large BLOB object. Creating another object, like FooListItem, with just title, for those specific cases, and duplicating all of the respective behavior doesn't seem very flexible either. And there's something odd about Foo inheriting from FooListItem.

What's your preferred method? Are there any articles or publications that cover this? I'm sure there is an elegant solution somewhere that involves some kind of pattern, but it's illuded me so far. There's also the chance that I'm doing everything completely wrong, in which case feel free to beat me with a programming handbook.

ammoQ

I propose you take a look at NHibernate.

Sunstorm

@ammoQ said:

I propose you take a look at NHibernate.

As far as I can tell from the documentation, it's an OR mapper that blurs the line between objects and database tables. I'm not sure how comfortable I'd be with working with a framework like that. I tend to think of the database and the objects as two completely different domains, and I'd still rather define the mappings between them manually.

I'm more interested in a general methodology for a solution, especially one that I could apply to any OOP language.

ammoQ

Hibernate is generally relatively flexible in the mapping, but it requires to write or generate a lot of XML to define the mappings.

Anyway, maybe you prefer the active record pattern?

Sunstorm

That's more like the validation I'm looking for, thanks.

Still, one has to wonder, why doesn't MS prefer this way, with their DataAdapters and typed datasets? I like to think they've put in their required ammount of thought into it...

ammoQ

@Sunstorm said:

Still, one has to wonder, why doesn't MS prefer this way, with their DataAdapters and typed datasets? I like to think they've put in their required ammount of thought into it...

They probably don't prefer it because it's not exactly enterprisey. (Despite being mentioned in Martin Fowlers "Patterns of Enterprisey Application Architecture")

Honestly, I think there are better ways to do it than having database access logic and business logic in the same class.

Sunstorm

@ammoQ said:

@Sunstorm said:
Still, one has to wonder, why doesn't MS prefer this way, with their DataAdapters and typed datasets? I like to think they've put in their required ammount of thought into it...

They probably don't prefer it because it's not exactly enterprisey. (Despite being mentioned in Martin Fowlers "Patterns of Enterprisey Application Architecture")

Honestly, I think there are better ways to do it than having database access logic and business logic in the same class.

Better ways is better. Care to ellaborate?

ammoQ

@Sunstorm said:

Better ways is better. Care to ellaborate?

IMO Hibernate does a good job. Persistance is taken out of the classes responsible for business logic. Fine, but it comes at a price: The business class object must expose all fields (as properties or getter/setter pairs) that need to be written to the database and read back to restore the objects state. ActiveRecord avoids that problem; the only requirement is that the database has the necessary columns (or offers alternative means to save/restore the state, e.g. key-value pairs), but the public interface of the class (and therefore the object model) is not affected.

PhillS

An alternative to Hibernate is iBATIS. I haven't had to use it in any projects as yet but I had to take a look at it a little while ago and it seems pretty flexible. Looks like you'd have to do more configuration on it compared with Hibernate, though.

Sunstorm

Thought I should mention it, I've since followed the link trail from ammoQ's link to find the book "Patterns Of Enterprise Application Architecture" from Addison Wesley, which has a chapter dedicated to this exact question that addresses everything I was looking for.

I've since settled for the Data Mapper pattern. Thanks for the pointers.

phx

NHibernate is great, and very flexible. To stop running into mapping impedance issues (shit you can do in a database that your language doesnt support) you should look at the ActiveRecord pattern.

Forget about handling the mapping yourself, you'll just introduce bugs. Get a copy of Diamond Binding from http://dunnchurchill.com and then you get all the joy of synchronised definitons and mapping files. Its basically what Microsoft should have done. Theres some examples up there on the site.... theres no way you could implement (and maintain) an API that cool in house. Plus it seems to use NHibernate a bit in the backend - which is probably a good thing.

Sunstorm

Seems like a lot of overhead of something that's as simple as an SQL query. Plus I'm afraid of using a system like that because maybe I don't necessarily want to map directly from a table to an object. Data can have many meanings, and I might need it sometimes, and I might not. Doing the mapping by hand doesn't seem to be giving me any trouble as long as I keep it well encapsulated and structured.

I suppose I should have mentioned that (right now) my primary use cases for this are in PHP, not .NET. However, I'll keep these products in mind next time I have a large scale .NET project.

djork

@Sunstorm said:

Seems like a lot of overhead of something that's as simple as an SQL query. Plus I'm afraid of using a system like that because maybe I don't necessarily want to map directly from a table to an object. Data can have many meanings, and I might need it sometimes, and I might not. Doing the mapping by hand doesn't seem to be giving me any trouble as long as I keep it well encapsulated and structured.

I suppose I should have mentioned that (right now) my primary use cases for this are in PHP, not .NET. However, I'll keep these products in mind next time I have a large scale .NET project.

Doing it in PHP is going to be easier than in .NET, since PHP objects are dynamic.

Just iterate over each column in the data row and stuff it in a corresponding field, and you're done. If you write a function that takes an object and an associative array (a data row from mysql_fetch_assoc, for instance), then you can essentially "merge" things defined by your class and the data in the row.

$obj->$columnName = $columnValue;

Nether

This is an issue I find myself compemplating a lot within my current (Java) application's architecture, with a line of thought that closely matches Sunstorm's. Without realizing it, in our last architecture redesign I proposed the Active Record pattern, not because I had ever heard of the thing, but because it seemed like a sane method that was a vast improvement over the previous way of doing it. Object fields map closely to database fields, and there are load() and write() methods for each object. I contemplated implementing these methods as an interface, but the arguments to our load methods are completely irregular due to a strange identifier system that requires multiple values to uniquely identify an entity in the database.

As the application has grown more complex, I've found that we've moved farther and farther away from a 1-to-1 relationship of tables to objects. For instance, table A may have a row that represents a certain entity, but table B is essentially a list of other pieces of data associated with A (i.e. normalized), so that B has the form (A_ID, datum) without a key. In the object model, data in B is thrown into a List contained in object of type A, usually with a standard java data type. So even with this, the data in one object now gets mapped out to more than one table when that object is save()d.

The result is a sort of inconsistent "bubbling-down" of a save() command from the top object in the model to subobjects and structures. That is, if A contains objects B and C, it will call B.save() and C.save(), and also manually save the objects in standard java structures contained in A. This works ok, but because not every table is represented by an object, the save commands are growing highly specialized and complicated. Certain tables should be written before others, some data does not need to be written at all, some needs to have its values mapped to suitable database representations, and the code that does so is growing rather complicated. This seems to indicate that our pattern is growing closer to that of the Data Mapper that Sunstorm linked, but with the mapper still an instrinsic part of the object being mapped when perhaps it should not be.

There's probably no wonderful, perfect solution that works in every case. I've contemplated moving the mapping logic out to its own part but am hesitant to add more complexity to the codebase at this point. I've even mulled over some ideas like creating thin wrappers around standard java library types (like ArrayList) by extending them and adding only a suitable app-specific name and some mapping methods, but nothing seems to suggest itself as as silver bullet to the issue.

jasmine2501

There is no "best" way to do this. If you're loading an unknown number of name/value pairs from a stored procedure, then obviously you should use a different method than if you're loading a single, but very complex object from your stored procedure. In my shop, as long as you're loading it from a stored procedure, I don't care what else you do. However, I am a fan of the KISS principle. I have no use for 5 layers of crap in my data access... just load the thing the simplest way possible, as long as it makes sense for the situation.

brazzy

@Sunstorm said:

Seems like a lot of overhead of something that's as simple as an SQL query. Plus I'm afraid of using a system like that because maybe I don't necessarily want to map directly from a table to an object. Data can have many meanings, and I might need it sometimes, and I might not. Doing the mapping by hand doesn't seem to be giving me any trouble as long as I keep it well encapsulated and structured.

You only end up doing the same thing again and again, manually. Logic (if not code) duplication galore, with all that that entails (low maintainability, more potential for bugs, etc.). The O/R mapping frameworks are mature enough to cover far more than just the simple cases (like directly mapping tables to objects).

Sunstorm

@djork said:

Doing it in PHP is going to be easier than in .NET, since PHP objects are dynamic.
Just iterate over each column in the data row and stuff it in a corresponding field, and you're done. If you write a function that takes an object and an associative array (a data row from mysql_fetch_assoc, for instance), then you can essentially "merge" things defined by your class and the data in the row.

$obj->$columnName = $columnValue;

This is pretty much what I do, but since I make a point of ~~cluttering my code with~~ using only public getters and setters, I do this inside a _restore( array $data ) method inside whatever class I need to load from an external source. This is part of a IRestorable interface I have, that's marked "internal" in the documentation. PHP doesn't actually have that kind of protection, but it's enough for me. It has worked quite well for high scale projects so far.

GeneWitch

Re: OP

You could use a database that was meant to store objects. There's a handful of them that are good at it... in the sense that even if they're just regular relational databases at heart, you input objects and it outputs objects. I don't remember who on this board mentioned it to me, but i did some reading and basically it works thusly:

rather than rows and columns in a database, you set the databas's input to the same properties as whatever your software's objects are. brian fried can't think, but basically the object without methods gets put into the database. i'd assume like when you were using an object at runtime all references to it get stored in memory of one sort or another, but with this database they're persistant :-)

IMO using a regular database to fill up objects makes as much sense as using a oldschool qbasic sequential fil. or somesuch.

blah. it all depends on how often you have to do it, i'd imagine.

witchdoctor

@GeneWitch said:

You could use a database that was meant to store objects. There's a handful of them that are good at it... in the sense that even if they're just regular relational databases at heart, you input objects and it outputs objects.

Isn't that pretty much what an O/R mapper does with any relational database anyway?