ORMs

xaade

@Weng said in MongoDB? What's that, some kind of fruit? Give me a real database!:

I don't trust ORMs with performance-critical database code.

Most ORM is just flesh over reflection, or it reads the database and compiles some classes for you.

You can do it simply by spinning up your own classes that take values from the results and put them into a class. Not hard at all, no performance problems, etc.
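A minimal sketch of that hand-rolled approach, in Python with the standard sqlite3 module. The customers table, the Customer class, and the from_row and load_customers helpers are made up for illustration; nothing here comes from a real project.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    id: int
    first_name: str
    last_name: str

    @classmethod
    def from_row(cls, row):
        # Column order must match the SELECT list in load_customers.
        return cls(id=row[0], first_name=row[1], last_name=row[2])


def load_customers(conn):
    cur = conn.execute(
        "SELECT id, first_name, last_name FROM customers ORDER BY id"
    )
    return [Customer.from_row(row) for row in cur]


# Throwaway in-memory database (hypothetical schema) so the sketch runs alone.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customers (first_name, last_name) VALUES (?, ?)",
    [("John", "Doe"), ("Mary", "Sue")],
)
customers = load_customers(conn)
```

There is no reflection and no generated code here; the cost is that every new column means touching the class and the query by hand.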

ORM is stupidly simple concept, where it gets fat is just how much convenience you want.

flabdablet

@xaade said in MongoDB? What's that, some kind of fruit? Give me a real database!:

ORM is stupidly simple concept

Every complex question has a simple answer that's wrong.

xaade

@flabdablet The whole point is to take relational data, and make it into classes. That's it. Why people invent massive frameworks to do such a simple task is beyond me.

ScholRLEA

@xaade Unfortunately, most of the cases where you want (or need) an ORM are in languages where objects are the base means of abstraction, and often the only one. The alternative would be to have a catch-all Relation class that takes an arbitrary table with no structuring at all, basically just hiding the relational aspects behind a conceptual wall. This was what they tried originally in some ORBs back in the 1990s, and it was not a conspicuous success.

The weird thing is, a lot of the mismatch is in the understanding of the two sets of ideas, not in the paradigms themselves. Since SQL is the dominant language for RDBMS, despite it being an utterly half-assed and shitty misimplementation of Codd's rules, and since SQL handles user-defined domains about as well as I would handle riding a unicycle on the lip of a canyon, most OOP people understandably assume that relational domains are static by their very nature and can't imagine why anything would be so stupid. However, both groups tend to equate relations with objects, rather than domains (an attitude reinforced by the previous point), which is itself a ridiculous misinterpretation of what RC and RA are about.

xaade

@ScholRLEA said in MongoDB? What's that, some kind of fruit? Give me a real database!:

user-defined domains

Need to read on that....

@ScholRLEA said in MongoDB? What's that, some kind of fruit? Give me a real database!:

that relational domains are static

That too

@ScholRLEA said in MongoDB? What's that, some kind of fruit? Give me a real database!:

relations with objects, rather than domains

Isn't a foreign key just a reference...

I need to read up on RDB buzzwords to understand this...

@Weng said in MongoDB? What's that, some kind of fruit? Give me a real database!:

Variable domains really aren't all that hard in SQL

Oh.... I get it now, I think.

What you're saying is that OOP is usually static in that the classes are defined up front, however RDB can be dynamic?

Well not really, the tables are defined up front, and foreign keys are just references.

Maybe the reason there is confusion is BECAUSE sql defines tables up front.

Before I continue, I'm going to go read up on that terminology.

xaade

@ScholRLEA said in MongoDB? What's that, some kind of fruit? Give me a real database!:

user-defined domains

I think what we're looking at is a different beast altogether.

SQL doesn't really begin to support that concept.

It even returns joins as flat rows rather than a relational dataset.

It's built around static.

What you're thinking of is a document-based DB that uses JSON that can query the relations behind the scenes.... or maybe just stores the entire DB in one JSON object?

flabdablet

@xaade said in MongoDB? What's that, some kind of fruit? Give me a real database!:

The whole point is to take relational data, and make it into classes. That's it. Why people invent massive frameworks to do such a simple task is beyond me.

The whole point is to take marriage, and make it into food. That's it. Why people invent massive frameworks to do such a simple task is beyond me.

ScholRLEA

@xaade You've misunderstood what I meant. In RT, a domain is a category, equivalent to a type or class; so the standard domains in SQL are things like INT, CHAR, VARCHAR, BIT, etc. It defines what kinds of data can be put into an attribute (i.e., a column) and how it is to be interpreted. A user-defined domain would allow you to define a relation with attributes in some domain like, for example, IpAddress which specializes the underlying domain (presumably CHAR(29) or something similar) with a set of filters to restrict the data that can go into it, or define a compound domain such as User with the combination of FirstName, MiddleInitial, and LastName.
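As a rough illustration of the two kinds of user-defined domain described above, here is a sketch in Python rather than SQL. IpAddress specializes a string with a filter, and User is a compound domain. Both class names follow the post; the validation rules are assumptions chosen to keep the example short.

```python
import ipaddress
from dataclasses import dataclass


class IpAddress(str):
    """A string restricted to valid IPv4/IPv6 addresses (the 'filter')."""

    def __new__(cls, value):
        # ipaddress.ip_address raises ValueError for anything malformed.
        ipaddress.ip_address(value)
        return super().__new__(cls, value)


@dataclass(frozen=True)
class User:
    """A compound domain built from three simpler attributes."""
    first_name: str
    middle_initial: str
    last_name: str

    def __post_init__(self):
        # Assumed rule for illustration: an initial is at most one character.
        if len(self.middle_initial) > 1:
            raise ValueError("middle_initial must be at most one character")


addr = IpAddress("192.168.0.1")
user = User("John", "Q", "Doe")

try:
    IpAddress("999.1.1.1")
    rejected = False
except ValueError:
    rejected = True  # the filter refused a value outside the domain
```

The underlying representation is still a plain string, but code that receives an IpAddress can rely on the filter having already run.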

While it is possible - and common practice - to define a relation for compound data, and simply use a FK to reference it, this leads to a loss of abstraction. Codd saw this from the start, and since his theoretical system doesn't depend on the specific domains of attributes - only that domains are disjoint - he assumed that defining new domains (either as its own thing or as a syntactic sugar over an FK to a relation) would be a basic practice.

And here is where the difference between theory and practice arises. The original SQL standard had no provisions for user-defined domains, and IIRC only defined INT, FLOAT, CHAR, and VARCHAR. While the SQL standard from some point on (1992 I think) includes a CREATE DOMAIN clause, the way the standard defines it is a broken mess, and the few RDBMSes that support it at all do such a half-assed job of it that it is almost never used. Most SQL coders have no idea it is even there.

C. J. Date has long argued about how ridiculous this state of affairs is, and one of the primary goals of the Third Manifesto was to get RDBMS vendors to get their shit together about it. Unfortunately, this was so badly misunderstood and misrepresented as being a case of him advocating for OODBMS - which was exactly what he was arguing against, on the grounds that RDBMS already has that area covered - that it never got taken seriously.

Among the few people who have even tried to grasp this issue, many argued that it would be a violation of normalization to bundle data into fixed elements like this, but this argument is so pants-on-head that it isn't even funny - the whole point is to give the system a better mechanism for enforcing normalization by giving it the information it needs about the abstractions being used.

Anyway, it's something of a pet peeve of mine, and is one of the reasons I try to avoid that nutbar TopMind whenever he shows up in some forum or wiki.

xaade

@flabdablet

Look, if you want another way to interact with a relational database, that's fine.

But ORM = Object Relational Mapping.

I have no idea what you are getting on about marriage and food, because there's no MRF concept.

If ORM meant something more than that, I'd entertain your thoughts, otherwise.... TDEMS.

xaade

@ScholRLEA Ah....

I suspected that's what you meant, but wasn't sure.

Yes, if you got Relational databases right, you wouldn't need ORM. You could simply pass the data along with a schema and have a way for the languages to build the result in a way that it could be interacted with.

Ultimately the data has to be interacted with, and referencing datum with ["{fieldname}"] is cumbersome. So we're really just looking at an ORM that is much lighter weight.

This is why I like the concept of JSON driving data storage, because it plops out in a way that multiple languages can interact with it, immediately.

At the time, .NET didn't have a baseline feature to parse JSON into classes, but it did have a way to parse XML. I don't know if that's changed, I've moved on to other things.

As far as no OODBMS, can you give me a concept that allows you to query a relational database without resulting in classes that is somehow superior, given that people implemented C.J.Date's ideas correctly, because I have a little bit of tunnel vision here and can't think of a better way to interact with the data (again assuming that SQL got domains right, not talking about the current massive beasts of ORM).

For example, assume that you're querying an order, and the table has

Customer /*(defined as Firstname, lastname)*/, Item /*(defined as a pizza with ingredients in an array)*/

You query

SELECT Customer from Orders;

Results

[{ Firstname: "John", Lastname: "Doe" },
 { Firstname: "Mary", Lastname: "Sue" }]

How would you interact with that in C#?
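The question is about C#, but the shape of one possible answer is language-neutral. A sketch in Python, assuming the hypothetical database hands the result back as strict JSON (keys quoted, unlike the listing above) and that a Customer class with matching field names already exists on the client side:

```python
import json
from dataclasses import dataclass


@dataclass
class Customer:
    # Field names deliberately mirror the keys in the result set.
    Firstname: str
    Lastname: str


# The result set from the post, rewritten as strict JSON.
payload = (
    '[{"Firstname": "John", "Lastname": "Doe"},'
    ' {"Firstname": "Mary", "Lastname": "Sue"}]'
)

# One object per record; a missing or extra key raises TypeError.
customers = [Customer(**record) for record in json.loads(payload)]
last_names = [c.Lastname for c in customers]
```

The mapping is one line because the class and the domain line up exactly; the interesting problems start when they don't.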

ScholRLEA

@xaade said in MongoDB? What's that, some kind of fruit? Give me a real database!:

As far as no OODBMS, can you give me a concept that allows you to query a relational database without resulting in classes that is somehow superior, given that people implemented C.J.Date's ideas correctly, because I have a little bit of tunnel vision here and can't think of a better way to interact with the data (again assuming that SQL got domains right, not talking about the current massive beasts of ORM).

Well, first off, the idea isn't that you wouldn't use classes, just that the general formula would be "class == domain", not "class == any arbitrary relation". The approach of turning a general relation into an object is the sticking point here.
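One way to picture "class == domain" with tools that exist today is Python's sqlite3 adapter/converter hooks, which bind a class to a declared column type rather than to a whole table. This is only an approximation of the idea: PersonName, the PERSONNAME column type, and the semicolon-separated storage format are all invented for the sketch.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonName:
    first: str
    last: str


# Teach sqlite3 how to store a PersonName and how to rebuild one.
# The "first;last" encoding is naive and only for illustration.
sqlite3.register_adapter(PersonName, lambda n: f"{n.first};{n.last}")
sqlite3.register_converter(
    "PERSONNAME", lambda raw: PersonName(*raw.decode("utf-8").split(";"))
)

# PARSE_DECLTYPES makes sqlite3 look up converters by declared column type.
conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer PERSONNAME)")
conn.execute(
    "INSERT INTO orders (customer) VALUES (?)", (PersonName("John", "Doe"),)
)

# The column comes back as a PersonName, not as a row-shaped object.
(customer,) = conn.execute("SELECT customer FROM orders").fetchone()
```

Note that the orders relation itself never becomes a class here; only the attribute's domain does.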

I can understand the confusion, though; it is hard enough to get people to see that "attribute != table", never mind "table != object". It may sound like an academic argument, but it isn't; confusing the concept with the implementation is 90% of the problems we get in these matters.

My thought is that, first off, we need to start looking at relational calculus in terms of some notation other than SQL, something that lets you deal with the schemas, domains, and relations as first-class objects. Such systems do exist, and a few companies have started using them as they try to find meaningful ways of working with Big Data, but for the most part this is still esoterica.

Let's start by generalizing domains themselves as a meta-domain (or class, if you will) consisting of a relation, a set of filters, and a set of CRUD operations wrapping around the standard ones. Once you do that, a lot of possibilities open up, and you can apply the obvious approach of building compound domains from relations without the abstractions leaking all over the place. As you can see, we aren't avoiding OODBMS so much as co-opting it, which is where a lot of people misunderstood Date. RDBMS and OODBMS are compatible ideas, we've just been approaching that compatibility from an ineffective direction.

I'll get back to this later, I've got some things to do and I'll need time to frame the rest of this anyway. I've got to look carefully at the really hard parts now, which have to do with things like using keys without breaking encapsulation, constraint propagation, data granularity, and a host of problems where the complexity has a high fractal dimension.

boomzilla

@xaade said in MongoDB? What's that, some kind of fruit? Give me a real database!:

@flabdablet The whole point is to take relational data, and make it into classes. That's it. Why people invent massive frameworks to do such a simple task is beyond me.

"That's it."

The ORM I use (Hibernate) lazily loads objects from the database as I access them using normal accessors. It allows me to run a query using a SQL like query language and get my business objects back as results. When it's too slow I can drop to native SQL to do things.
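For readers who haven't seen lazy loading, the core trick can be sketched in a few lines of Python. This is not how Hibernate is implemented; the Order class and items table are hypothetical and only show the general pattern of deferring a query until an accessor is first used.

```python
import sqlite3


class Order:
    def __init__(self, conn, order_id):
        self._conn = conn
        self.id = order_id
        self._items = None  # not loaded yet

    @property
    def items(self):
        # Query only on first access, then reuse the cached list.
        if self._items is None:
            rows = self._conn.execute(
                "SELECT name FROM items WHERE order_id = ? ORDER BY id",
                (self.id,),
            )
            self._items = [name for (name,) in rows]
        return self._items


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO items (order_id, name) VALUES (?, ?)",
    [(1, "pizza"), (1, "soda"), (2, "salad")],
)

order = Order(conn, 1)
loaded_before = order._items is not None  # nothing fetched at construction
first_access = order.items                # this access runs the SELECT
```

A real framework adds caching across objects, dirty tracking, and transaction handling on top of this, which is where much of the size comes from.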

You can take my "massive framework" from my cold, dead hands.

boomzilla

@ScholRLEA said in MongoDB? What's that, some kind of fruit? Give me a real database!:

I'll get back to this later,

Sounds like an interesting discussion. Please put it in its own topic!

ScholRLEA

@boomzilla Quite so. If one of the mods could move this part of the thread, it would help, though if needed I'll just add a link back here in my future posts.

Yamikuronue

@ScholRLEA Gotcha

xaade

@boomzilla said in ORMs:

The ORM I use (Hibernate)

Hibernate isn't that bad.

It requires a little too much custom config every time you use it. I don't know, maybe someone created a parser that could plop the config down for you. But, when I started to use it, everything was manual.

It's not a bad one.

I'm looking more at the Microsoft ORM that they had in Visual Studio, that built a massive amount of code just for a basic three column table.

xaade

@ScholRLEA @boomzilla Ditto

Would like to learn more of your thoughts.

I'd also like to know how you feel about document based using JSON, because I feel like that level of verbosity is what you're asking relational databases to support.

asdf

@boomzilla said in ORMs:

The ORM I use (Hibernate) lazily loads objects from the database as I access them using normal accessors. It allows me to run a query using a SQL like query language and get my business objects back as results. When it's too slow I can drop to native SQL to do things.

Personally, I've never used Hibernate, but I've used Doctrine (PHP) before, which claims to be similar to Hibernate, and it was a really nice experience. SQLAlchemy (Python) is great and extremely powerful (as in: you probably won't need to write any raw SQL) as well.

You can take my "massive framework" from my cold, dead hands.

+1

ScholRLEA

@xaade said in ORMs:

@ScholRLEA @boomzilla Ditto

Would like to learn more of your thoughts.

I'd also like to know how you feel about document based using JSON, because I feel like that level of verbosity is what you're asking relational databases to support.

What are you... where does serialized data representation even enter into this discussion?

/me wanders off mumbling about orthogonal issues and confusing concept and representation

EDIT: Sorry about that.... after thinking about it a few minutes, I kind of see where this would be relevant in this... eventually... but not really. How the program talks to the RDBMS is pretty much irrelevant at this level of abstraction, the data could be passed as JSON, XML schemas, Python pickles, smoke signals, it really doesn't matter because it should be general enough to work with several formats.

Indeed, ideally the program and RDBMS should be able to negotiate the format as needed, or even handshake their way through creating a unique one specific to the application at runtime, though presumably not every bloody time something is sent back and forth. That's really unlikely to happen any time soon, but we can dream.

xaade

@ScholRLEA said in ORMs:

What are you...

You're taking it too literally.... and too seriously.

@ScholRLEA said in ORMs:

eventually... but not really.

It's very relevant.

Ok, I think it would be helpful to look at this top-down. Because I think you're getting lost in definition and semantics, and that's a little bit confusing to me right now. But on the other hand, I think it's clouding your ability to think big picture.

Let's say that you want to make a report that shows the list of all the purchases from a single customer. You decide to normalize the data such that customer is a foreign key to the purchase entries, because why repeat that information in the database over and over? There are two ways you can get the data that represent two different kinds of queries.

Either you can go look for the customer you want, select that data, fill that in at the top of the report,
or you can do a join on customer and purchase entry data.

With the results

If you do a join, when the data comes back, every line in the select result has that customer data repeated.
If you don't do a join, you have to perform two selects.

Now, if you didn't care about the customer data being in a separate box at the top of the page, and you were ok with the data being in every line on the report, you could just do the join, take the flattened results, and shove them into a grid as is. This distinction doesn't change whether the custom domains work like you are saying they should, or not. The same decision would exist.
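The two options can be put side by side in a short Python/sqlite3 sketch. The customers and purchases tables are a made-up minimal schema; the point is only the shape of what comes back in each case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchases (id INTEGER PRIMARY KEY,
                            customer_id INTEGER REFERENCES customers(id),
                            item TEXT);
    INSERT INTO customers (id, name) VALUES (1, 'John Doe');
    INSERT INTO purchases (customer_id, item) VALUES (1, 'pizza'), (1, 'soda');
""")

# Option 1: two selects, customer data fetched once.
customer = conn.execute(
    "SELECT name FROM customers WHERE id = ?", (1,)
).fetchone()
items = conn.execute(
    "SELECT item FROM purchases WHERE customer_id = ? ORDER BY id", (1,)
).fetchall()

# Option 2: one join, customer data repeated on every row.
joined = conn.execute(
    """SELECT c.name, p.item
       FROM purchases AS p JOIN customers AS c ON c.id = p.customer_id
       WHERE c.id = ? ORDER BY p.id""",
    (1,),
).fetchall()
```

The first option gives a header value plus a list; the second gives flat rows that can go straight into a grid, at the cost of the repeated name column.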

The point of this is to demonstrate how joins could be useful even if database custom domains were verbose.

Now, consider document style databases.

You might have a "table" of customer purchases, yet it is represented as a block of data at the top for the customer, and a series of entries below.

You don't even have to think about the design of your report, because you just simply throw your elements on the page in the same pattern, and bind the data. Done.

I can imagine that relational databases could return you the data in the same format, if you provide the query a schema for the returned data format.

JSON isn't so much important, and whether it is serialized isn't important either.

The importance was the verbosity of JSON in the data it could return, and the fact that there's nothing keeping us from making query results give us data in a verbose format such as the above from a hypothetical relational database. The database could be stored normalized, and it could be stored without serialization, but it could also return verbose results as mentioned.

Another way of saying it.

I see document style databases as a static merging of queried data schema and relational structure.

But with a good schema expression, you could fabricate the same results from a normalized relational database.
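A sketch of what "fabricating" a document from normalized data might look like, done on the client side in Python since no database mentioned here does it natively. The rows stand in for the flat output of a join, and the nest function and the document layout are assumptions for illustration.

```python
import json
from itertools import groupby
from operator import itemgetter

# Flat rows as a join would return them: (customer_id, customer_name, item).
# They must already be sorted by customer for groupby to work.
rows = [
    (1, "John Doe", "pizza"),
    (1, "John Doe", "soda"),
    (2, "Mary Sue", "salad"),
]


def nest(rows):
    """Group sorted join rows into one document per customer."""
    documents = []
    for (cid, name), group in groupby(rows, key=itemgetter(0, 1)):
        documents.append(
            {
                "customer": {"id": cid, "name": name},
                "purchases": [item for _, _, item in group],
            }
        )
    return documents


documents = nest(rows)
as_json = json.dumps(documents)
```

The storage stays normalized; only the result is shaped like a document, with the customer block on top and the entries below it.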

That's why I asked you about JSON-based document databases. The verbosity of the domains is very flexible, but at the loss of normalization and quick formation of complex queries.

ScholRLEA

@xaade Ah, what tripped me up was that I haven't heard of 'document style databases' before, and missed the key point as a result.

I need to catch up on that, but if I understand the idea, then such verbose communication would be tremendously useful in avoiding having the applications do a ton of unnecessary work - if the database engine can add more contextual information beyond the query result set, then it should, or at least it should be able to do so when asked to. Being able to pass along additional information about the schema, the join constraints, candidate keys (or at least the primary key actually used), foreign keys, and similar meta-data would all be important to what I have in mind.
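Some of that metadata is already reachable today, just not bundled with the result set. As a sketch, SQLite exposes column, primary-key, and foreign-key information through PRAGMA statements; the describe helper and the two-table schema below are hypothetical, and a client would have to call something like this separately from its query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE purchases (id INTEGER PRIMARY KEY,
                            customer_id INTEGER REFERENCES customers(id),
                            item TEXT);
""")


def describe(conn, table):
    """Collect column, primary-key and foreign-key metadata for one table."""
    # The table name is interpolated directly; only pass trusted names.
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    fks = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    return {
        "columns": [c[1] for c in columns],              # c[1] is the name
        "primary_key": [c[1] for c in columns if c[5]],  # c[5] is the pk flag
        # (local column, referenced table, referenced column)
        "foreign_keys": [(fk[3], fk[2], fk[4]) for fk in fks],
    }


meta = describe(conn, "purchases")
```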

ScholRLEA

OK, I've taken a quick look at what Wicked-pedo has to say about this, and it isn't what I thought it was; I have not really touched NoSQL or other modern non-relational approaches much yet, and this might be the kick in the ass I need to get me to do so.

Though now I am thinking that this dovetails nicely with some of what I have been thinking in regards to bridging the gap between relational methods and xanalogical storage; since Xanadu is all about flexibly tracking the relationships between irregular data, this sounds like a promising direction. Since a 'database' in Xanadu would (like just about anything else) merely be a set of internal links connecting the fragments and one or more application level views over those links,

Just to make one part clearer (I hope), xanalogical links are OOB, and while they are themselves data, unlike most other data in a docuverse they can be set to be mutable. This means that an application doesn't change the data at all, it just manipulates sets of links and draws the data views through them. This means that the sensible approach would be to have the foreign keys, key constraints, and indices as links or link sets rather than in-band in the relations.

Gotta think about this. Anyway, this is going a bit afield, as there are no working xanalogical systems right now, at least not ones which do what they were meant to do. I need to get back to connecting relations to objects, and while the ideas I have for this are related to what I just said, the relationship isn't too important right now.

anonymous234

@flabdablet So can you explain in simple terms what ORMs have to do, other than translate objects into SQL tables, for people who don't have experience with them?

flabdablet

@anonymous234 I could, but the explanation would be wrong.

anonymous234

@flabdablet Well that's not very helpful.

xaade

@anonymous234 I think he's mixing up two things here.

What would be a better pattern for communication between a hypothetical database and OO languages, which cannot exist because of the limitations of SQL.
ORM.

My comment was that, as ORM exists and as SQL exists, the current ORM's go through a lot of effort to do a rather simple thing.

I was not implying that a simpler ORM would be superior to the ORM pattern. I was only implying that ORM as a gap in relational database communication, is usually overly complex.

Have you looked at some of the setup required to get some of these ORM's off the ground?

Now consider that you can hand-write a class that takes a query result and reads it into properties, and then writes out an update or insert query when commanded with very little effort, and ORM can be much simpler and faster.

Whereas, the real problem, the dynamic nature of query results and joins, is often left UNSOLVED by these massive ORMs.

Jaime

@xaade said in ORMs:

Now consider that you can hand-write a class that takes a query result and reads it into properties, and then writes out an update or insert query when commanded with very little effort, and ORM can be much simpler and faster.

Whereas, the real problem, the dynamic nature of query results and joins, is often left UNSOLVED by these massive ORMs.

These problems run deeper than you let on. Once you do a join, you now have a result set where it isn't entirely obvious how updates should be handled, or even if they make sense at all. For example, how do you implement an update for a value that is a sum of many records in the database?
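The sum case is easy to reproduce. In this Python/sqlite3 sketch (hypothetical purchases table and customer_totals view), reading the aggregate works fine, but writing to it has no defined meaning, and SQLite simply refuses because plain views are read-only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (id INTEGER PRIMARY KEY,
                            customer_id INTEGER,
                            amount INTEGER);
    INSERT INTO purchases (customer_id, amount) VALUES (1, 10), (1, 15), (2, 7);
    CREATE VIEW customer_totals AS
        SELECT customer_id, SUM(amount) AS total
        FROM purchases GROUP BY customer_id;
""")

totals = conn.execute(
    "SELECT customer_id, total FROM customer_totals ORDER BY customer_id"
).fetchall()

# Which underlying rows should change so that the sum becomes 100?
# There is no single answer, and SQLite does not guess.
try:
    conn.execute("UPDATE customer_totals SET total = 100 WHERE customer_id = 1")
    update_error = None
except sqlite3.Error as exc:
    update_error = str(exc)
```

An ORM that maps a class onto such a result faces the same question and has to either forbid the write or make up a policy.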

xaade

@Jaime Actually, I "let on" in the quote you just made.

Any advantage ORM would have, would be best solved by not tackling reverse engineering joins to determine how to do updates, by designing your database without using joins for anything other than display or to inform your updates. I strongly advocate against calculations and complex logic in the database, especially if you use ORM.

If you plan to use ORM, your database should be simple and normalized, and all of the heavy work should be done outside the database.

If you want to divide the work between your database and your middle-ware, you shouldn't use ORM.

That's just my opinion.

Eldelshell

I hate ORMs, I despise SQL and I loathe NoSQL.

You would think that by 2016 there would be an easier way to store data which integrates with the code you write, but no, you have to keep your tables and your models synchronized in the most annoying way possible. For fucks sake, not even the data types are 100% compatible.

Hell, you'd expect at least SQLServer + .Net and Oracle + JEE to be fully integrated and that ain't happening. And no, no code you provide, no example you give of how cool LINQ or Hibernate are, while I, manually, have to do an ALTER TABLE every time I change a class model, I'll keep thinking that the system is broken in the same way it has been for the past 40 years.

At the time I thought Informix4GL was dumb, but after having to maintain stupid enterprise SQL backed monsters for too long, I find that they were right and everyone else was wrong.

asdf

@Eldelshell said in ORMs:

while I, manually, have to do an ALTER TABLE every time I change a class model

I'm sure there's something like alembic for Hibernate and whatever .Net provides. But yeah, you have a point.

anonymous234

@Eldelshell I agree. The entire point of DBMSs is to provide a nice abstraction over serial storage, to let users (programmers) store their data more easily.

It seems silly to me that the model they provide (SQL tables) is so different from the model that 99% of programming languages use (objects). Then you need to write glue code yourself or use an ORM to abstract the abstraction. Bad.

M_Adams

@ScholRLEA said in ORMs:

Among the few people who have even tried to grasp this issue, many argued that it would be a violation of normalization to bundle data into fixed elements like this, but this argument is so pants-on-head that it isn't even funny - the whole point is to give the system a better mechanism for enforcing normalization by giving it the information it needs about the abstractions being used.

This. Most of these arguments are around the very stupid dogma that "atomic means non-decomposable" which is

pants-on-head

retarded.

Atomic needs to be understood in relation to what you are trying to model. A call-center app for queuing calls may see a phone number as an atomic item, whereas the center's analytic unit may not (revenue by area code, etc.). A molecular biology db may see atoms as "atomic", a nuclear physics db, not.
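The phone number case, as a small Python sketch. PhoneNumber is a hypothetical domain type assuming ten-digit North American numbers; the queue treats it as one opaque value while the analytics code looks inside it.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PhoneNumber:
    """A ten-digit North American number, stored as one value."""
    digits: str

    @property
    def area_code(self):
        return self.digits[:3]


# Call queue: the number is an opaque, atomic key.
queue = [
    PhoneNumber("2125550100"),
    PhoneNumber("4155550101"),
    PhoneNumber("2125550102"),
]

# Analytics: the same value is decomposed to group revenue by area code.
revenue = [(queue[0], 20), (queue[1], 35), (queue[2], 5)]
by_area_code = defaultdict(int)
for phone, amount in revenue:
    by_area_code[phone.area_code] += amount
```

Neither use is wrong; they just draw the line for "atomic" in different places for the same stored value.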

I'm all for normalization (that being my tagline) even unto DKNF and 6th normal forms, but first one needs to determine what atomic means in each case for each table in each database, or you're just pretending to normalize your data and are tying the pant legs around your neck.

Jaime

@Eldelshell said in ORMs:

... no example you give of how cool LINQ or Hibernate are, while I, manually, have to do an ALTER TABLE every time I change a class model, I'll keep thinking that the system is broken in the same way it has been for the past 40 years.

Don't forget that the application you are modifying may not be the only one using that database. The ORM system can't know if it's safe to add a column with a NOT NULL constraint or if it has to preserve compatibility.
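A sketch of why the strict version of that change is risky, using Python's sqlite3 module and a hypothetical customers table that already holds data. SQLite rejects a NOT NULL column with no default on a populated table, while the nullable variant keeps old-style inserts from other applications working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (name) VALUES ('John Doe')")

# A strict column: existing rows and older writers have no value for it.
try:
    conn.execute("ALTER TABLE customers ADD COLUMN email TEXT NOT NULL")
    strict_error = None
except sqlite3.Error as exc:
    strict_error = str(exc)

# A compatible column: nullable, so other applications keep working.
conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")
# An insert written before the column existed still succeeds.
conn.execute("INSERT INTO customers (name) VALUES ('Mary Sue')")
emails = conn.execute("SELECT email FROM customers ORDER BY id").fetchall()
```

Which of the two is correct depends on who else writes to the table, and that is exactly the information an automatic migration tool does not have.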

Jaime

@xaade said in ORMs:

@Jaime Actually, I "let on" in the quote you just made.

Any advantage ORM would have, would be best solved by not tackling reverse engineering joins to determine how to do updates, by designing your database without using joins for anything other than display or to inform your updates. I strongly advocate against calculations and complex logic in the database, especially if you use ORM.

If you plan to use ORM, your database should be simple and normalized, and all of the heavy work should be done outside the database.

If you want to divide the work between your database and your middle-ware, you shouldn't use ORM.

That's just my opinion.

I never suggested doing anything in the database.

xaade

@M_Adams There's nothing saying you can't describe your types in such a way that a relational database can query a portion of that type.

Customer.Phone.AreaCode.

Ironically I work for a company that designed a database that can use SQL that does just that.

This is part of what I've been saying.

People are so stuck in the current SQL box, that they can't think abstractly about relational databases and expect something better.

dkf

@Jaime said in ORMs:

Don't forget that the application you are modifying may not be the only one using that database.

That's the exact issue that I dislike about ORMs. They are too often used to try to make the database definition be driven by the code, rather than the other way round. Starting code first is neat at first, but spirals towards disaster as most developers simply can't get their data definitions even close to correct first time as they've never been trained on how to do domain modelling.

Once you eliminate driving the DDL from code, ORMs are pretty simple stuff (classes model the relations induced by the queries; you're essentially just doing column mapping at that point). Yes, updates through a particular class won't always work, but that's just reality being its usual self.
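The "classes model the relations induced by the queries" part can be sketched with a row factory in Python's sqlite3 module. The namedtuple_factory function and the schema are illustrative assumptions; the class is derived from whatever columns each query returns, not from the table definitions.

```python
import sqlite3
from collections import namedtuple


def namedtuple_factory(cursor, row):
    """Build a row class from whatever columns the query returned."""
    fields = [column[0] for column in cursor.description]
    return namedtuple("Row", fields)(*row)


conn = sqlite3.connect(":memory:")
conn.row_factory = namedtuple_factory
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchases (id INTEGER PRIMARY KEY,
                            customer_id INTEGER,
                            item TEXT);
    INSERT INTO customers (id, name) VALUES (1, 'John Doe');
    INSERT INTO purchases (customer_id, item) VALUES (1, 'pizza');
""")

# The shape of the result follows the SELECT list, not either table.
row = conn.execute(
    """SELECT c.name AS customer, p.item AS item
       FROM purchases AS p JOIN customers AS c ON c.id = p.customer_id"""
).fetchone()
```

Building a new class per row is wasteful and only done here to keep the sketch short. The result is read-only by construction, which matches the point that updates through such a class won't always make sense.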

dkf

@ScholRLEA said in ORMs:

since Xanadu is all about flexibly tracking the relationships between irregular data, this sounds like a promising direction.

Sounds like you should be considering NoSQL for that; either a document store or a graph DB.