Normalisation



  • Just look at this:

    SELECT [RECNUM]
          ,[EXTRAID]
          ,[POSITION]
          ,[FieldName]
          ,[U_FIELD3]
          ,[XMLNODE]
          ,[FORBIKE]
          ,[FORCAR]
          ,[FORMACH]
          ,[FORPART]
          ,[FORTRCK]
          ,[FORTRLR]
          ,[FORZAAN]
          ,[FORZBUS]
          ,[FORCARP]
          ,[FORCARV]

          ,[WORD1]
          ,[WORD2]
          ,[WORD3]
          ,[WORD4]
          ,[WORD5]
          ,[WORD6]
          ,[WORD7]
      FROM [Trucks].[dbo].[EXTRA] 

    This is the structure of a table called Extra. These are extras that apply to different types of vehicles (VehicleTypes).

    So a CAR has a different set of extras than a TRCK.

    So the VehicleType Table has rows like this:

    TRCK
    MACH
    CAR
    PART
    BIKE
    TRLR
    ZAAN
    ZBUS
    CARV
    CARP

     

    So they hardcoded the VehicleTypes into the Extra Table Structure.

    Gimme a break. I am not that well educated, but even I got Normalisation at school.

    Can anyone give me a valid reason for such a structure???

    Performance or something???
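
    For comparison, here is roughly what I would have expected instead -- a minimal sketch, and the table and column names are my own invention, not from the actual system:

    -- Hypothetical sketch: one row per (extra, vehicle type) pair,
    -- instead of ten hardcoded FORxxxx flag columns.
    CREATE TABLE [EXTRA_VEHICLETYPE] (
        [EXTRAID]     int     NOT NULL,  -- references EXTRA
        [VEHICLETYPE] char(4) NOT NULL,  -- references the VehicleType table ('CAR', 'TRCK', ...)
        CONSTRAINT [PK_EXTRA_VEHICLETYPE] PRIMARY KEY ([EXTRAID], [VEHICLETYPE])
    )

    -- "Which extras apply to trucks?" then needs no knowledge of
    -- individual flag columns, and a new vehicle type is just a new row:
    SELECT [EXTRAID]
      FROM [EXTRA_VEHICLETYPE]
     WHERE [VEHICLETYPE] = 'TRCK'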


  • @Ice^^Heat said:

    Gimme a break. I am not that well educated, but even I got Normalisation at school.

    Can anyone give me a valid reason for such a structure???

    Performance or something???

    How far did they get teaching you normalisation? 3NF? 4NF? Well, after 6NF comes denormalisation: http://en.wikipedia.org/wiki/Denormalization

     
    That's not to say that's what's happened here, but there is a valid reason for doing that sort of thing. 



  • @Ice^^Heat said:

    Just look at this:

    SELECT [RECNUM]
          ,[EXTRAID]
          ,[POSITION]
          ,[FieldName]
          ,[U_FIELD3]
          ,[XMLNODE]
          ,[FORBIKE]
          ,[FORCAR]
          ,[FORMACH]
          ,[FORPART]
          ,[FORTRCK]
          ,[FORTRLR]
          ,[FORZAAN]
          ,[FORZBUS]
          ,[FORCARP]
          ,[FORCARV]

          ,[WORD1]
          ,[WORD2]
          ,[WORD3]
          ,[WORD4]
          ,[WORD5]
          ,[WORD6]
          ,[WORD7]
      FROM [Trucks].[dbo].[EXTRA] 

    This is the structure of a table called Extra. These are extras that apply to different types of vehicles (VehicleTypes).

    So a CAR has a different set of extras than a TRCK.

    So the VehicleType Table has rows like this:

     

    TRCK
    MACH
    CAR
    PART
    BIKE
    TRLR
    ZAAN
    ZBUS
    CARV
    CARP

     

    So they hardcoded the VehicleTypes into the Extra Table Structure.

    Gimme a break. I am not that well educated, but even I got Normalisation at school.

    Can anyone give me a valid reason for such a structure???

    Performance or something???

    No valid reason.  It's a horrible structure.  On top of that, I see Word1, Word2, Word3, etc. ... also a horrible violation of normalization.  Looks like you might have a mess on your hands.

    It's funny, people talk about "denormalizing" for performance, when all it does is make performance WORSE if you actually know how to write SQL .....



  • Those word columns are for an implementation of "Full-Text Search".

    When the user types in "Heater", and a Word column also has the value "Heater", then the user gets vehicles with a Heater. Word2, Word3, etc. are for the word "Heater" in other languages.

     

    I'd consider emailing the consultant about this mess, but I don't want to sound like an arrogant know-it-all smart-ass intern.

    *sigh* 
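
    If it were normalized, those Word columns would presumably be a translation table instead -- a rough sketch, with invented names:

    -- Hypothetical sketch: one row per extra per language,
    -- instead of WORD1..WORD7 columns.
    CREATE TABLE [EXTRA_WORD] (
        [EXTRAID]  int          NOT NULL,  -- references EXTRA
        [LANGUAGE] char(2)      NOT NULL,  -- e.g. 'EN', 'NL'
        [WORD]     nvarchar(50) NOT NULL,  -- e.g. N'Heater'
        CONSTRAINT [PK_EXTRA_WORD] PRIMARY KEY ([EXTRAID], [LANGUAGE])
    )

    -- The "full-text" lookup then works for any number of languages:
    SELECT DISTINCT [EXTRAID]
      FROM [EXTRA_WORD]
     WHERE [WORD] = N'Heater'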



  • @Ice^^Heat said:

    Those word columns are for an implementation of "Full-Text Search".

    When the user types in "Heater", and a Word column also has the value "Heater", then the user gets vehicles with a Heater. Word2, Word3, etc. are for the word "Heater" in other languages.

     

    I'd consider emailing the consultant about this mess, but I don't want to sound like an arrogant know-it-all smart-ass intern.

    *sigh*

    Normalization is a funny thing.  Most best practices are sometimes tedious to implement -- really good error checking, unit testing, not exposing public members in classes, etc. -- but normalization is odd in that if you do it, it costs NOTHING and makes everything easier, shorter, and quicker. There are literally no cons to a normalized database, assuming that you know SQL. You can always write denormalized views if the data needs to "look" a certain way (though that word -- "look" -- tells you that you shouldn't do this in your database at all, but at your presentation layer), so no one even needs to know that your data is "secretly" normalized behind the scenes.

    I suspect that if you explain to the consultant that his design makes simple SQL operations needlessly complex, and that a more normalized structure would have great benefits, he will probably respond that you are too "lazy" to work with "real SQL code that is complicated" and that "normalization theory" is just that -- theory -- with no place in the "real world".
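
    To illustrate the view idea: assuming a normalized junction table like the EXTRA_VEHICLETYPE sketched earlier in the thread (the names here are made up), the old flag-column "look" can be computed rather than stored:

    -- Hypothetical sketch: a denormalized view over normalized tables.
    -- The flag columns are derived on the fly, not stored.
    CREATE VIEW [EXTRA_MATRIX] AS
    SELECT e.[EXTRAID],
           e.[FieldName],
           MAX(CASE WHEN ev.[VEHICLETYPE] = 'CAR'  THEN 1 ELSE 0 END) AS [FORCAR],
           MAX(CASE WHEN ev.[VEHICLETYPE] = 'TRCK' THEN 1 ELSE 0 END) AS [FORTRCK]
      FROM [EXTRA] e
      LEFT JOIN [EXTRA_VEHICLETYPE] ev ON ev.[EXTRAID] = e.[EXTRAID]
     GROUP BY e.[EXTRAID], e.[FieldName]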



    Well, the query to get the vehicles is incredibly long. Dynamic SQL of course, what else... all in the business logic layer.

    With some pain I adapted the function so that paging is supported, but the performance is terrible. I already mailed the consultant about the performance; he hasn't mailed back yet.
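
    The paging I bolted on is roughly this shape -- simplified, and the vehicle table and columns here are placeholders, not the real schema:

    -- Hypothetical sketch: server-side paging with ROW_NUMBER()
    -- (SQL Server 2005+), instead of one giant dynamic SQL string
    -- paged in the business logic layer.
    WITH Numbered AS (
        SELECT v.[RECNUM],
               ROW_NUMBER() OVER (ORDER BY v.[RECNUM]) AS RowNum
          FROM [Trucks].[dbo].[VEHICLE] v  -- table name assumed
    )
    SELECT [RECNUM]
      FROM Numbered
     WHERE RowNum BETWEEN 21 AND 40  -- page 2 at 20 rows per page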

    This is a 9-year-old legacy crap system. Every time a new requirement came in, it went like:

    My Company: "We need an extra Extra!!"

    Software Company: "Sure I'll make another Extra Column!! Done in a sec! And I'll add a row to the Extra Table!"

     

    If it were up to me, I would restructure the whole damn Data Model, completely redo it so that there is no need to alter it in the future. But it's not up to me, and the project is approaching the deadline.



  • @Jeff S said:

    @Ice^^Heat said:

    Those word columns are for an implementation of "Full-Text Search".

    When the user types in "Heater", and a Word column also has the value "Heater", then the user gets vehicles with a Heater. Word2, Word3, etc. are for the word "Heater" in other languages.

    I'd consider emailing the consultant about this mess, but I don't want to sound like an arrogant know-it-all smart-ass intern.

    *sigh* 

    Normalization is a funny thing.  Most best practices are sometimes tedious to implement -- really good error checking, unit testing, not exposing public members in classes, etc. -- but normalization is odd in that if you do it, it costs NOTHING and makes everything easier, shorter, and quicker. There are literally no cons to a normalized database, assuming that you know SQL. You can always write denormalized views if the data needs to "look" a certain way (though that word -- "look" -- tells you that you shouldn't do this in your database at all, but at your presentation layer), so no one even needs to know that your data is "secretly" normalized behind the scenes.

    I suspect that if you explain to the consultant that his design makes simple SQL operations needlessly complex, and that a more normalized structure would have great benefits, he will probably respond that you are too "lazy" to work with "real SQL code that is complicated" and that "normalization theory" is just that -- theory -- with no place in the "real world".

    Well, you're mostly right.  In an OLTP database, you should normalize like crazy.  Updates, Inserts and Deletes are much faster in narrow, highly normalized tables with very few indexes on them.  In an OLAP database, you often want to denormalize a lot of the data so that Selects run faster, and don't have as many Joins in them.  There are serious cons to an overly normalized OLAP database, in terms of speed.  (When you have to join 20 tables to generate a report, you're better off warehousing the data into a single denormalized table, or a set of denormalized measures and dimensions, depending on how flat the report is going to be.)
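
    A toy example of the warehousing step (every name here is invented): flatten the joins once, on a schedule, so the report SELECTs stay simple:

    -- Hypothetical sketch: build a flat, denormalized reporting table
    -- from the normalized OLTP schema.
    SELECT v.[RECNUM],
           vt.[DESCRIPTION] AS [VehicleType],
           e.[FieldName]    AS [ExtraName]
      INTO [dbo].[VEHICLE_REPORT]  -- the denormalized copy reports run against
      FROM [VEHICLE] v
      JOIN [VEHICLETYPE]   vt ON vt.[CODE]    = v.[VEHICLETYPE]
      JOIN [VEHICLE_EXTRA] ve ON ve.[RECNUM]  = v.[RECNUM]
      JOIN [EXTRA]         e  ON e.[EXTRAID]  = ve.[EXTRAID]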



  • @Gsquared said:

    In an OLAP database, you often want to denormalize a lot of the data so that Selects run faster, and don't have as many Joins in them.  There are serious cons to an overly normalized OLAP database, in terms of speed.  (When you have to join 20 tables to generate a report, you're better off warehousing the data into a single denormalized table, or a set of denormalized measures and dimensions, depending on how flat the report is going to be.)

    Sometimes. It all depends.  I have found that good indexes and clean, efficient, well-written SQL statements are often all you need -- but of course, not always. The key is to try it with good SQL and indexes and all that FIRST, and then consider a denormalized warehouse.  But if you ever query that denormalized warehouse in a way it was not originally set up for, which invariably happens of course, querying those denormalized tables and then transforming them so you can do what you need is often much more expensive (and lots more complicated, of course) than simply querying the normalized schema.

    So, it all depends, but overall that is a good point to mention.  It doesn't affect a database model, though -- it only affects how you might decide to design a supplemental reporting data warehouse.
     



  • @Ice^^Heat said:

     

    If it were up to me, I would restructure the whole damn Data Model, completely redo it so that there is no need to alter it in the future. But it's not up to me, and the project is approaching the deadline.

    And your application would suffer the same things this one does now.

    In the beginning (9 years ago) this application could have followed all best practices and looked real nice.  Through use and constant modification you have what you have. This is simply software aging, and every application goes through it.  There will always come a time when you have to alter an application, unless it is no longer being used.

    Go google "Big Ball of Mud" and read up on the software life cycle (not the software development life cycle); it might open your eyes a little.



  • @KattMan said:

    @Ice^^Heat said:
     

    If it were up to me, I would restructure the whole damn Data Model, completely redo it so that there is no need to alter it in the future. But it's not up to me, and the project is approaching the deadline.

    And your application would suffer the same things this one does now.

    In the beginning (9 years ago) this application could have followed all best practices and looked real nice.  Through use and constant modification you have what you have. This is simply software aging, and every application goes through it.  There will always come a time when you have to alter an application, unless it is no longer being used.

    Go google "Big Ball of Mud" and read up on the software life cycle (not the software development life cycle); it might open your eyes a little.

    Uh ... no. He is talking about creating a clean, normalized data model.  That doesn't change as long as relational databases are still in use.  Normalization existed 10 years ago, you know!   This database design was bad way back then, and that made it ugly, sloppy, inefficient and redundant back then as well -- and all of that still applies today, and will in the future, as long as a relational database is used.

     



  • @KattMan said:

    @Ice^^Heat said:
     

    If it were up to me, I would restructure the whole damn Data Model, completely redo it so that there is no need to alter it in the future. But it's not up to me, and the project is approaching the deadline.

    And your application would suffer the same things this one does now.

    In the beginning (9 years ago) this application could have followed all best practices and looked real nice.  Through use and constant modification you have what you have. This is simply software aging, and every application goes through it.  There will always come a time when you have to alter an application, unless it is no longer being used.

    Go google "Big Ball of Mud" and read up on the software life cycle (not the software development life cycle); it might open your eyes a little.

    There is no design, there is no documentation. And it is made with a RAD tool. Clicky, clicky, wooh! An application!

    I highly doubt any best practice was used at all when they created this.

    The application is actually quite simple; it just stores vehicles and their associated information. 10 tables at most.

    I know no application can live without being altered, but good design would actually make things a little easier.



  • @Jeff S said:


    Uh ... no. He is talking about creating a clean, normalized data model.  That doesn't change as long as relational databases are still in use.  Normalization existed 10 years ago, you know!   This database design was bad way back then, and that made it ugly, sloppy, inefficient and redundant back then as well -- and all of that still applies today, and will in the future, as long as a relational database is used.


    @Ice^^Heat said:

    There is no design, there is no documentation. And it is made with a RAD tool. Clicky, clicky, wooh! An application!
    I highly doubt any best practice was used at all when they created this.
    The application is actually quite simple; it just stores vehicles and their associated information. 10 tables at most.
    I know no application can live without being altered, but good design would actually make things a little easier.

    I only stated that the application "could have followed" practices, not "did follow".  After this much time you cannot prove that documentation did not exist at some point; it could have been lost, or never existed at all.

    Ice Heat stated he would create one that would never have to be modified and would always look good.  That is an impossibility.  If either of you had actually read the Big Ball of Mud, you would understand why I stated that.



    I get what you are saying, but at least the previous developers could have realized that vehicle features change over time, and built their data model so that it could support those changes.

    I just want an effective data model that supports future changes to some degree.

    And minimize the number of columns for performance reasons.

     

    But whatever, now I have plenty to write about in my final thesis. I will write a new chapter about Normalisation.

