SQL Azure

arh

Ok, so we're moving stuff to the cloud. Great, I love it, there's very little not to love about the concept. Azure is cheap and nice and suddenly you end up getting married before inspecting the goods thoroughly first.

Let's deploy to Azure. Guess what: SQL Server in Azure doesn't support extended properties. WTF???? And why is that? No word. Exactly HOW do you expect me to document my database? MS Office??? And MS fanboys try to sugarcoat it by saying "well, most people don't use extended properties anyways". WHAT THE HELL IS THIS UNPROFESSIONAL SHITTY ATTITUDE. Do I look like a script kiddie with a MySQL T-shirt? This is on the level of XCode not have automatic streamlined support for class and method documentation. Are we going back in time?? Seriously???? Why do people accept this?

henke37

What's an "extended property"?

PJH

@henke37 said:

What's an "extended property"?

I take it Google's Rorschach tests today are preventing people from using it...

RaceProUK

@arh said:

Exactly HOW do you expect me to document my database? MS Office???

Corporate wiki???

TGV

@henke37 said:

What's an "extended property"?

Table and column comments, I suppose. I don't particularly like them. In Oracle, at least, they disappear as soon as you think about an ALTER or DROP statement.

Ronald

@arh said:

Ok, so we're moving stuff to the cloud. Great, I love it, there's very little not to love about the concept. Azure is cheap and nice and suddenly you end up getting married before inspecting the goods thoroughly first.

Let's deploy to Azure. Guess what: SQL Server in Azure doesn't support extended properties. WTF???? And why is that? No word. Exactly HOW do you expect me to document my database? MS Office??? And MS fanboys try to sugarcoat it by saying "well, most people don't use extended properties anyways". WHAT THE HELL IS THIS UNPROFESSIONAL SHITTY ATTITUDE. Do I look like a script kiddie with a MySQL T-shirt? This is on the level of XCode not have automatic streamlined support for class and method documentation. Are we going back in time?? Seriously???? Why do people accept this?

Extended properties are a lousy way to document a database. The typical and more effective way to do it is to use the proper features in your database IDE (like SSDT) or any other modeling/design tool that maintains a separate repository for metadata.

If you use SQL Azure I strongly suggest to rely on SSDT, it can be configured to target any version of SQL Server (including Azure) when you run the build, and it has advanced features for pre- or post-deployment scripts so you can load/update master data if needed. The deployment features also allow you to ignore stuff that may cause problems but that you missed in code review, like conflicting collations. It's free and immensely superior to Management Studio for design, coding and deployment.

arh

@Ronald said:

Extended properties are a lousy way to document a database. The typical and more effective way to do it is to use the proper features in your database IDE (like SSDT) or any other modeling/design tool that maintains a separate repository for metadata.
If you use SQL Azure I strongly suggest to rely on SSDT, it can be configured to target any version of SQL Server (including Azure) when you run the build, and it has advanced features for pre- or post-deployment scripts so you can load/update master data if needed. The deployment features also allow you to ignore stuff that may cause problems but that you missed in code review, like conflicting collations. It's free and immensely superior to Management Studio for design, coding and deployment.

SSDT looks really nice. I'll check that out. However, the great thing about extended properties/comments is that the documentation stays with the database. When you jump into some system some other people made, you might not have the IDE files and whatnot. It all depends on the environment and users. With table comments, five or ten years down the road people can still see what is what. It's simple and works great for all system sizes.

Databases tend to be painfully neglected in far too many environments, and having to rely on external tools to document it just makes it that bit worse. You have JavaDoc for Java, XMLDoc for .NET and all these fancy things. You even had PLDoc for PL/SQL, for what it was worth, when it existed. Reading a codebase and building documentation files from that is worth its weight in gold. Moving documentation out of the code/original system is a decision you make, for better or worse, but disallowing documentation in the database has no point and will definitely make things more difficult.

Ronald

@arh said:

@Ronald said:
Extended properties are a lousy way to document a database. The typical and more effective way to do it is to use the proper features in your database IDE (like SSDT) or any other modeling/design tool that maintains a separate repository for metadata.
If you use SQL Azure I strongly suggest to rely on SSDT, it can be configured to target any version of SQL Server (including Azure) when you run the build, and it has advanced features for pre- or post-deployment scripts so you can load/update master data if needed. The deployment features also allow you to ignore stuff that may cause problems but that you missed in code review, like conflicting collations. It's free and immensely superior to Management Studio for design, coding and deployment.

SSDT looks really nice. I'll check that out. However, the great thing about extended properties/comments is that the documentation stays with the database. When you jump into some system some other people made, you might not have the IDE files and whatnot. It all depends on the environment and users. With table comments, five or ten years down the road people can still see what is what. It's simple and works great for all system sizes.

Databases tend to be painfully neglected in far too many environments, and having to rely on external tools to document it just makes it that bit worse. You have JavaDoc for Java, XMLDoc for .NET and all these fancy things. You even had PLDoc for PL/SQL, for what it was worth, when it existed. Reading a codebase and building documentation files from that is worth its weight in gold. Moving documentation out of the code/original system is a decision you make, for better or worse, but disallowing documentation in the database has no point and will definitely make things more difficult.

I guess it depends on the context and i don't want to start a religious war, but IMHO in any given closed environment a database is a code artifact, not an infrastructure service, and therefore can/should follow the same rules as any other component that can be subjected to CI. This was not easy to achieve before VS 2008 Database Edition (aka DataDude) but things have evolved rapidly since then. It's now pretty easy to orchestrate complex builds including the data persistence layer, and both database objects AND master data (if applicable) can be versioned and packaged as easily as C# classes.

This being said, I agree with you that it seems odd that for SQL Azure they would remove a specific feature that does not appear to cause any performance issue, especially since SQL Azure is basically SQL 2008, it's not like they created the engine from scratch. However extended properties *can* be queried and I suspect that there is no easy way to optimize them; since SQL Azure is a shared platform it would make sense for Microsoft to take out of the equation possible issues with morons building huge resources hogs by abusing extended properties to create meta-crawlers or who knows what. It's a pattern with SQL Azure; they enforce a few rules (like requiring a clustered index) to avoid obvious idiotic use cases. Maybe I'm wrong but I've seen pretty horrible stuff in Access extended properties years ago and I know people can come up with pretty "clever" ideas with them.

TDWTF123

@Ronald said:

Extended properties are a lousy way to document a database. The typical and more effective way to do it is to use the proper features in your database IDE (like SSDT) or any other modeling/design tool that maintains a separate repository for metadata.

I don't do any database design stuff, and I don't really understand the concepts involved, so this is probably a stupid question, but why wouldn't you add a 'metadata' table to every database for metadata? How does that differ from extended properties?

@arh said:

Exactly HOW do you expect me to document my database?

Excel.

@El_Heffe said:

@arh said:
Exactly HOW do you expect me to document my database?
Excel.

Excel web app, to be precise. We want to keep things in the cloud don't we?

Ronald

@TDWTF123 said:

@Ronald said:
Extended properties are a lousy way to document a database. The typical and more effective way to do it is to use the proper features in your database IDE (like SSDT) or any other modeling/design tool that maintains a separate repository for metadata.
I don't do any database design stuff, and I don't really understand the concepts involved, so this is probably a stupid question, but why wouldn't you add a 'metadata' table to every database for metadata? How does that differ from extended properties?

In SQL Server every table has a record in a built-in system table (sys.table) which is managed by the database engine. There are other tables for fields, stored procedures, etc. and a few legacy tables who also contain most of the information (like sysobjects). All of those tables can be queried to "explore" the structure (similar to reflection), build dynamic queries, run maintenance jobs, etc. They offer a decent performance as they are basically regular tables with at most a few thousands records (unless the database is insanely huge).

Extended properties are something else; they are free-form text tags that can be associated with any database object (typically tables), and can contain notes from the developer, or stuff like the x:y position of the table in a schema diagram. They are like the properties found in Word document (author, description, etc). They used to be popular because Microsoft database management tool (Management Studio) has built-in fields in the GUI allowing people to feed extended properties (Description, etc) when they design a table.

It's not a shameful idea to store database documentation in the database. It's like packaging javadoc output with a java application or those .xml files for sandcastle with a .Net application. But the way extended properties are designed, they can be actively queried and this can be a problem because they do not offer a lot of room for performance tuning. That would be like writing a java app that uses its own javadoc output as part of the app; this could be a problem because it's not designed for that (unlike reflection or similar technologies).

So as far as database documentation goes, IMHO it's better not to store it in the database itself but instead rely on a ERD tool, or if push comes to shove to use actual tables (but those are basically as disconnected as a wiki even if they are within the database). But to each his own.

TDWTF123

@Ronald said:

Extended properties are something else; they are free-form text tags that can be associated with any database object (typically tables), and can contain notes from the developer, or stuff like the x:y position of the table in a schema diagram.

Perhaps I wasn't very clear in my question, but what I'm failing to understand is where that differs from storing all that information in a separate table in the database, other than that the data is kept differently.

Ronald

@TDWTF123 said:

@Ronald said:
Extended properties are something else; they are free-form text tags that can be associated with any database object (typically tables), and can contain notes from the developer, or stuff like the x:y position of the table in a schema diagram.
Perhaps I wasn't very clear in my question, but what I'm failing to understand is where that differs from storing all that information in a separate table in the database, other than that the data is kept differently.

Extended properties are stored in a system table (sys.extended_properties) which is roughly a key/value store that contains the following information: the table Object ID (a database-scoped unique integer), the type of property, the name and the value. Because they are attached to a table (or another object), the extended properties will automagically show up when you script that object, they will stick with it if you rename the object, and they will disappear if you delete the object because the database engine is aware of their existence.

There are three main issues with those extended properties that prevent them from scaling (up or out):

They cannot be indexed.</il>
They do not survive replication
They are filled with poorly documented meta-metadata (e.g: "MS_Description" is the name of the property created by Management studio when you fill the Description textbox in the Table Wizard, but nothing prevents you or your third-party tool from using a different "standard" if you want).

On the other hand, if you really want the description of your tables to be in the database but choose to store that information in your own separate table, you will need fancy stuff like DDL triggers if you want to keep your metadata up-to-date without requiring a manual reconciliation process - and you still cannot replicate this data in a reliable way. I can't think of a situation where a separate table can add value, unless you get into advanced data mining capabilities or want to implement sci-fi stuff like natural language queries or ontology-driven reporting. IMHO both extended properties and custom metadata tables are poor ways to maintain database documentation.

Jaime

@arh said:

SSDT looks really nice. I'll check that out. However, the great thing about extended properties/comments is that the documentation stays with the database. When you jump into some system some other people made, you might not have the IDE files and whatnot. It all depends on the environment and users. With table comments, five or ten years down the road people can still see what is what. It's simple and works great for all system sizes.

With any custom software if you can't keep track of you source code and the history of changes, you are screwed. There is no need to come up with a system that is better than nothing, but not quite as good as real source control, because real source control is so easy and cheap. Putting your documentation in the database is like embedding your application documentation in the compiler output.

With any database, the actual instance of the database and the scripts that created it should be two totally different things. The documentation belongs with the latter.

Also, I can rewrite your statement to look like this by just changing the subject:

@arh said:

Git looks really nice. I'll check that out. However, the great thing about putting the source code change history in embedded resources is that the documentation stays with the program. When you jump into some system some other people made, you might not have the IDE files and whatnot. It all depends on the environment and users. With embedded resources, five or ten years down the road people can still see what is what. It's simple and works great for all system sizes.

TheCPUWizard

Ronald's posts on this particular subject are spot on...although it did take him a bit to get to the primary (but not sole) reason they are not in Azure....Replication!