Table Naming Conventions

element_0

I couldn't find anything in the search on this topic but my apologies if it's been posted before.

I recently had to do some bug fixing/feature requests for a .net + SQL Server application written by a contractor at another company who had since gone overseas. Thankfully the code was quite neat, well commented and easy to understand (unlike most maintenance jobs). There was one thing i found strange though about his database naming conventions, all the tables and field names had spaces in them ie. "Customer Request Information" which has a field like "Date Of Request". I usually try to avoid spaces in my field names and table names and use camel casing for tables and usually _ for stored procs. I can see where his approach has merit though, it is more readable and, this is slightly dodgey, but if you were doing some kind of dynamic data binding the db field name would be the label you would actually want for that information at the presentation layer. The only real annoyance i can see is having to put everything in [] but then again in the production code all names should probably be wrapped in [] anyway.

I was just wondering what the general feeling was with database naming conventions? I'm wondering if i should switch my naming conventions to that style..

Pap

Don't put spaces in the table names. Not all DBMSs support it, and if you're creating an application that is for sale it's always a benefit for it to work with the customers' choice of DBMS (abstraction layer!). I don't think there is any reason why all names need to be wrapped in []'s for SQL Server. This is not a substitute for properly escaping any input used in dynamic queries. Just avoid spaces and special characters!

We have a "temporary" database at work where all the tables are prefixed by "tbl_". Drives me friggin nuts.

lpope187

I agree, do not put spaces in database object names. It really just clutters the queries with useless characters. As far as [], I wouldn't call that a requirement - I'd be more concerned with fully qualifying the object names. And never prefix your procs with "sp" - those are reserved for system procedures. You can use it, but it takes longer for the engine to find them.

If you are working with just SQL Server, I'd go with camel casing. If your application is database agnostic, you might want to use an Oracle convention with underscores "MY_TABLE" so that query syntax is closer. I just hope you don't have to use DB/400, because you'll be stuck with 8 character table/field names and 20 character proc names. Nothing bothers me more than having cryptic table/field names like AVPART, STKMM, and BINB.

RayS

Spaces in table names? noooooooooooooooooooooo! Not only is it an ugly hack that should never have been allowed, it also causes confusion as table/column names propogate up through the application layers.

Datatables in the UI that come straight fom the DB should come from views, not raw tables. If the libraries you're using in your front end don't allow for renaming columns, just use something like this in the SQL for your view:

SELECT Table.CamelColumnName AS [Friendly column name]

Use CamelCase or some_underscores instead, as has been said above. Don't dirty the data itself.

ammoQ

@lpope187 said:

Nothing bothers me more than having cryptic table/field names like AVPART, STKMM, and BINB.

In some cases, they are the source of fun, like OPORNO

(short for "order position order number" or so)

Hellz99

@Pap said:

We have a "temporary" database at work where all the tables are prefixed by "tbl_". Drives me friggin nuts.

I feel your pain.

I also hate when each column name begins with the table name. Ex: Table name is "people". Columns are people_lastname, people_firstname, people_dob, people_ssn, etc.

ammoQ

@Hellz99 said:

I also hate when each column name begins with the table name. Ex: Table name is "people". Columns are people_lastname, people_firstname, people_dob, people_ssn, etc.

This is made by people who think that typing _ is less work than typing .

For example:

select people_whatever from people, team where people_team_id = team_id and team_name = 'whocares'

compare that to:

select people.whatever from people, team where people.team_id = team.id and team.name = 'whocares'

To be fair, some really stupid database systems require column names to be unique over all tables.

Hellz99

@ammoQ said:

To be fair, some really stupid database systems require column names to be unique over all tables.

Oh I know, but it still give me pain.

element_0

@Pap said:

We have a "temporary" database at work where all the tables are prefixed by "tbl_". Drives me friggin nuts.

Yeah i know what you mean about that i've had to maintain a few old legacy db's with that particular naming convention, i've also had one where every field in the db was prefixed with "fld_", certainly gives you no extra information but makes you type a heap more in your queries. I could maybe see the tbl_ one if you had lots of queries selecting from views as well as tables and wanted to have some way of distinguishing them in your queries...maybe. but i wouldn't do it.

ammoQ

@element[0] said:

Yeah i know what you mean about that i've had to maintain a few old legacy db's with that particular naming convention, i've also had one where every field in the db was prefixed with "fld_", certainly gives you no extra information but makes you type a heap more in your queries. I could maybe see the tbl_ one if you had lots of queries selecting from views as well as tables and wanted to have some way of distinguishing them in your queries...maybe. but i wouldn't do it.

In most of our systems, we have ~100 tables and maybe 10 views. Tables have no prefix, but the views have the prefix vw_. No particular reason for that, but it doesn't hurt either and it helps to find the object in Toad resp. Tora.

Oscar_L

@element[0] said:

@Pap said:

We have a "temporary" database at work where all the tables are prefixed by "tbl_". Drives me friggin nuts.

Yeah i know what you mean about that i've had to maintain a few old legacy db's with that particular naming convention, i've also had one where every field in the db was prefixed with "fld_", certainly gives you no extra information but makes you type a heap more in your queries. I could maybe see the tbl_ one if you had lots of queries selecting from views as well as tables and wanted to have some way of distinguishing them in your queries...maybe. but i wouldn't do it.

I remember one developer who worked with me on a database design. He insisted on prefixing all table names with "tbl" because, in his words, how would you know that the thing you're working with is a table.

In my opinion, if you're in a database and at any moment you don't know whether or not you're manipulating a table, you need to take your hands off the keyboard and step away from the computer. No sudden moves, because this programmer shoots to kill.

Oscar_L

@element[0] said:

I couldn't find anything in the search on this topic but my apologies if it's been posted before.

I recently had to do some bug fixing/feature requests for a .net + SQL Server application written by a contractor at another company who had since gone overseas. Thankfully the code was quite neat, well commented and easy to understand (unlike most maintenance jobs). There was one thing i found strange though about his database naming conventions, all the tables and field names had spaces in them ie. "Customer Request Information" which has a field like "Date Of Request". I usually try to avoid spaces in my field names and table names and use camel casing for tables and usually _ for stored procs. I can see where his approach has merit though, it is more readable and, this is slightly dodgey, but if you were doing some kind of dynamic data binding the db field name would be the label you would actually want for that information at the presentation layer. The only real annoyance i can see is having to put everything in [] but then again in the production code all names should probably be wrapped in [] anyway.

I was just wondering what the general feeling was with database naming conventions? I'm wondering if i should switch my naming conventions to that style..

The important thing to remember is that you name your tables and columns appropriately. This means that if the table you're working with contains customers, your table is named Customer. Not Customers, tblCustomer, or Customer_Data. Your column names should similarly express what's in the column. Since you're working with SQL Server, take advantage of the environment's expressiveness by using those spaces. After all, as you pointed out, production code should enclose table names in square brackets, and as AmmoQ suggested, your table names should be properly qualified. IIRC, SQL Server will otherwise assume you specifically want it to search the available schemas for the table you want, rather than specifically referencing the particular schema you're using - probably dbo, if the author left things fairly default. Of course, maybe you really do mean that, in which case it's just a matter of knowing what the difference is and why you would choose one rather than the other.

I recommend getting comfortable with spaces and square brackets because many of the concerns others have posted are along the lines of "but what if you want to port it to x?". As AmmoQ again pointed out, some RDBMSs require globally unique table names and a max of 8 characters. Just how backward compatible do you want to get? Do you want to start naming all your tables A, B, C, ..., you know, just in case a hypothetical future or past platform only supports or will support single character, case senstive, alpha only, globally unique tables? What if someone is actually stupid enough to tell you to port the system to that platform? And what if rather than .net you have to port your codebase to COBOL? Better watch those variable names!

lpope187

@Oscar L said:

The important thing to remember is that you name your tables and columns appropriately. This means that if the table you're working with contains customers, your table is named Customer. Not Customers, tblCustomer, or Customer_Data. Your column names should similarly express what's in the column.

In terms of plural versus singular, that is really a matter of style. If you look at the tables within msdb, you'll find tables named both plural and singular (dbo.backupfile vs dbo.sysjobs). But if you look at the INFORMATION_SCHEMA and views, you'll find most are plural. Does this mean that plural is the way to go? I don't think MS has defined a object naming convention for SQL, at least not as well as they have with C# and VB.Net (No hungarian notation). It is more important to choose one naming convention (singular or plural) and stick with it.

@Oscar L said:

I recommend getting comfortable with spaces and square brackets because many of the concerns others have posted are along the lines of "but what if you want to port it to x?". As AmmoQ again pointed out, some RDBMSs require globally unique table names and a max of 8 characters. Just how backward compatible do you want to get? Do you want to start naming all your tables A, B, C, ..., you know, just in case a hypothetical future or past platform only supports or will support single character, case senstive, alpha only, globally unique tables? What if someone is actually stupid enough to tell you to port the system to that platform? And what if rather than .net you have to port your codebase to COBOL? Better watch those variable names!

Here, I think you'll be in the minority but that's just a gut feel and I have no empirical evidence to back it up. I will however provide a reason not to use spaces that isn't related to portability. Quoted Identifiers are not an option for procedure, function, and variable names - they must conform to regular identifier rules. So you could end up with object names adhering to two different conventions which IMO is a bad thing. It is a weak argument, I know.

As for the 8 character limit, I certainly don't recommend designing to the lowest common denominator. I mentioned the 8/30 character limit more out of my disgust with the ancient limits still imposed by AS/400 platform.

BTW, Does anybody know what the AS stands for? Answer: Agonizingly Slow

lpope187

Quoted Identifiers are not an option for procedure, function, and variable names - they must conform to regular identifier rules.

That's a little misleading. The rule applies to variable type objects ie procedure/function parameter names and variables. Actual function and proc names can be quoted.

ammoQ

@Oscar L said:

and as AmmoQ suggested, your table names should be properly qualified.

... no, I haven't suggested that. Honestly, I wouldnt suggest that anyway. I'm pretty confident that SQL*Server - like any sane DBMS - looks in the current schema first.

As AmmoQ again pointed out, some RDBMSs require globally unique table names and a max of 8 characters.

... no, I've pointed out that some retarded DBMSs require globally unique column names.

lpope187

@ammoQ said:

@Oscar L said:
and as AmmoQ suggested, your table names should be properly qualified.
... no, I haven't suggested that. Honestly, I wouldnt suggest that anyway. I'm pretty confident that SQL*Server - like any sane DBMS - looks in the current schema first.

You are correct, SQL Server looks in the schema owned by the user first and then elsewhere. For example, If Bob connects to MyDatabase and issues SELECT MyColumn FROM MyTable, the engine will look for MyDatabase.Bob.MyTable and then to MyDatabase.dbo.MyTable if the first is not found. I suggested qualifying the table name for several reasons. First is clarity as there is no doubt as to what object you are referencing - although there are times when the design requires not qualifying the schema. The second reason is performance as the engine doesn't need to search for the object - the same reason I gave for not naming procs with "sp" + "whatever".

This is one of the rare instances where MS has actually provided guidance in the BOL. They also recommend using the full table/view name. IMO that's a good rule of thumb but not always possible.

ammoQ

@lpope187 said:

The second reason is performance as the engine doesn't need to search for the object

Any idea how much faster it is when you write "select * from myschema.mytable" than "select * from mytable"?

lpope187

This is what I ran on a 2005 SP2 and 2000 SP4 box. On 2005, It yielded 0.327s versus 0.127s. On 2000, it yielded 0.343s vs 0.330s. The difference between 2005 and 2000 is understandable considering the radical difference between schema implementations. In either case, not a big enough difference to worry about unless your box is handling millions of queries per second.

DECLARE @counter int
SET @counter = 0

DECLARE @StartTime datetime
DECLARE @EndTime datetime
DECLARE @uid int

SET @StartTime = GetDate()

WHILE @counter < 10000
   BEGIN
       SELECT @uid = [uid] from Table_1
       SET @counter = @counter + 1
   END

SET @EndTime = GetDate()

SELECT @EndTime - @StartTime

SET @counter = 0
SET @StartTime = GetDate()

WHILE @counter < 10000
   BEGIN
       SELECT @uid = [uid] from dbo.Table_1
       SET @counter = @counter + 1
   END

SET @EndTime = GetDate()

SELECT @EndTime - @StartTime

KenW

@lpope187 said:

This is what I ran on a 2005 SP2 and 2000 SP4 box.

Your numbers aren't valid, though, as you didn't take the caching of the query into consideration.

You can clear the cache with a couple of SET statements. Can't remember which two off the top of my head, though. You can probably find them at http://www.sqlteam.com by searching the forums there.

ammoQ

SELECT @uid = [uid] from dbo.Table_1

What if you replace dbo with the actual user's schema? (Of course you have to create the table there, too)

Still any performace gain or not?

sootzoo

You can clear the cache with a couple of SET statements. Can't remember which two off the top of my head, though.

DBCC DROPCLEANBUFFERS ?

xtremezone

I'm a young developer so take my opinion for what it is.

Personally, I don't like spaces in database object names (nor do I like spaces in filenames). It makes coding for them annoying and I must say looking at large queries can be overwhelming enough without seeing a million [ and ] pairs, just like writing commands is troublesome when every file needs to be enclosed in quotes:

SELECT * FROM [Customer Request Information] WHERE [Customer Request Information].[Date Of Request] = '03/09/2007';

SELECT * FROM CustomerRequestInformation WHERE CustomerRequestInformation.DateOfRequest = '03/09/2007';

cp "Filename with spaces" "Another filename with spaces"

cp FilenameWithoutSpaces AnotherFilenameWithoutSpaces

The attempt to group the words into a single object isn't immediately obvious unless you trace the enclosing characters from the beginning to the end. This is more and more of a problem as the length of the name increases. Quotes are a worse example because there is no directionality, but you get the point.

As far as plural or singular table names, I was taught to use singular and I do prefer it. If you look at the definition of a table (CREATE TABLE syntax) it is defining a single record/instance. For example, a Customer, not Customers.

I also name arrays in the same way because generally I am accessing a single element by it's index.

CUSTOMER *objCustomer = new CUSTOMER[200];

objCustomer[0]->FirstName("John"); // Set customer 0's FirstName to "John"
objCustomer[0]->LastName("Doe"); // Set customer 0's LastName to "Doe"

// Otherwise you would be saying objCustomers[0]->FirstName("John"); which to me isn't logically correct.

In the above example objCustomer represents only an array of data. A collection contains the array of data, as well as it's own properties, such as the number of elements in the collection, etc., and often methods to add and remove elements. In that case, I think I would prefer to call something plural.

Dim objCustomers As Collection

lpope187

@KenW said:

Your numbers aren't valid, though, as you didn't take the caching of the query into consideration.
You can clear the cache with a couple of SET statements. Can't remember which two off the top of my head, though. You can probably find them at http://www.sqlteam.com by searching the forums there.

Good point, I was trying to show only the difference between finding the tables. Interestingly enough, granting the user sysadmin rights (for DBCC DROPCLEANBUFFERS) screws up the default schema ie a Select * from Table_1 actually selects the table in dbo not from the tooling schema I setup. I take off the sysadmin role and the user selects the table from the tooling schema. It could be because because I'm using my local sql server and I'm forever screwing around with it. Maybe security is all hosed.

@ammoQ said:

What if you replace dbo with the actual user's schema? (Of course you have to create the table there, too)

Still any performace gain or not?

Disregarding the caching, here's what I got. User was setup with default schema of tooling.

0.310s SELECT @test = test from Table_1 where uid = 1

0.157s SELECT @test = test from dbo.Table_1 where uid = 1

0.173s SELECT @test = test from tooling.Table_1 where uid = 1

I verified they were selecting the correct table by printing out the value when counter was zero. Two tables with different values for test when uid = 1.

Perhaps this weekend, I'll clean up my local sql server and see if I can get the caching issue worked out. Might give me the excuse to play around with the new EXECUTE AS feature.

Oscar_L

@ammoQ said:

@lpope187 said:
The second reason is performance as the engine doesn't need to search for the object

Any idea how much faster it is when you write "select * from myschema.mytable" than "select * from mytable"?

I was going to say, "of course not, you're not allowed to benchmark MS products without jumping through a lot of hoops," but it looks like I'm a little late for that one.

Oscar_L

@ammoQ said:

@Oscar L said:
and as AmmoQ suggested, your table names should be properly qualified.

... no, I haven't suggested that. Honestly, I wouldnt suggest that anyway. I'm pretty confident that SQL*Server - like any sane DBMS - looks in the current schema first.

As AmmoQ again pointed out, some RDBMSs require globally unique table names and a max of 8 characters.

... no, I've pointed out that some retarded DBMSs require globally unique column names.

Sorry, it looks like I meant to misquote lpope. On the bright side, according to recent press, I could still get a job as a fact checker.

ammoQ

@Oscar L said:

I was going to say, "of course not, you're not allowed to benchmark MS products without jumping through a lot of hoops," but it looks like I'm a little late for that one.

Oooooooops.

lpope187

I finally got around to trying this after clearing the cache after every select. From what I remember the results were similar (I don't have hard numbers offhand). Not a real big advantage to fully qualifying the tables from a performance perspective. Not that it matters much anyway, in a production environment you aren't going to be clearing the cache after every statement anyways, but it gave me an excuse to play around with EXECUTE AS.

ammoQ, I am curious as to why you stated you wouldn't necessarily recommend fully qualifying objects. I'd be interested in hearing your perspective - perhaps you've run across some instances where not qualifying the objects has a clear advantage. Offhand, I can only think of a few times where doing so would be nice, but they don't seem like they would be typical applications.

ammoQ

@lpope187 said:

ammoQ, I am curious as to why you stated you wouldn't necessarily recommend fully qualifying objects. I'd be interested in hearing your perspective - perhaps you've run across some instances where not qualifying the objects has a clear advantage. Offhand, I can only think of a few times where doing so would be nice, but they don't seem like they would be typical applications.

First of all, lazyness. I write a lot of code in PL/SQL, and always writing the schema name means a lot more code to write.

Second, reusability. Some code I write goes into a schema called foobar_abc, but will later be reused in a schema called foobar_xyz. (The naming convention at this company says the schema name is the name of the product + a postfix for the customer; so foobar is the product, abc the customer). Using fully qualified object names means I always have to search-and-replace.

Third, in Oracle, there is no measurable performance gain for fully qualified objects, even when they are in a foreign schema and found through a public synonym.

xtremezone

@ammoQ said:

Third, in Oracle, there is no measurable performance gain for fully qualified objects, even when they are in a foreign schema and found through a public synonym.

It makes sense to me that there wouldn't be a noticable performance gain for fully qualified objects. There are only so many places to check and so many possibilities for the system. The objects would be qualified once before the query is executed and during execution the objects are known. The query itself should take WAY more time than the qualifying of objects, especially for large queries/systems.

It might be possible for unqualified objects to lag the query if the DBMS would search every schema until it found a match and the only match was in the last checked schema of a large system. I'm just speculating since I don't have enough experience to know how the DBMS would qualify the object names; nor how large an enterprise system can actually get.

It might help other coders maintaining queries though if you qualify the names. There are pros and cons, I guess.