Auto Increment is for WHIMPS!!!!

YishTish

Have you ever needed to make an id field that changes automatically every time you add a row? Isn't it frustrating to come up with an algorithm for it?<o:p></o:p>

<o:p> </o:p>

Well, worry you should not! I have inherrited some code that takes care of all that nonsense, that I am sure can be implemented in .NET and Java just as well as PHP:<o:p></o:p>

<o:p> </o:p>

<?<o:p></o:p>

require_once "real_mysql.inc";<o:p></o:p>

/**<o:p></o:p>

* Return the next id of a specified type<o:p></o:p>

*<o:p></o:p>

* @param string $type the group of the desired id<o:p></o:p>

*/<o:p></o:p>

function get_unique_id($site,$type) {<o:p></o:p>

$type_id=array(<o:p></o:p>

'billing'=>1,<o:p></o:p>

);<o:p></o:p>

if(!isset($type_id[$type])) {<o:p></o:p>

return false;<o:p></o:p>

}<o:p></o:p>

$id=$type_id[$type];<o:p></o:p>

$db=new sitedb($site,'master');<o:p></o:p>

$sql="update unique_id set row_id=last_insert_id(row_id+1) where id=$id";<o:p></o:p>

$db->query($sql);<o:p></o:p>

$ret=$db->last_inserted();<o:p></o:p>

return $ret;<o:p></o:p>

}<o:p></o:p>

?><o:p></o:p>

<o:p> </o:p>

On the other hand, you can try altering the table to have the field auto_increment…. I am sure it takes less CPU time, and is easier to maintain. Up to you. <o:p></o:p>

<o:p> </o:p>

Now – back to coding….<o:p></o:p>

<o:p></o:p>

Nelle

@YishTish said:

require_once "real_mysql.inc";

i would like to see what is in the "fake_mysql.inc" or does he have a "float_mysql.inc" as well ...

TheRider

I needed to implement several of these auto-incrementing algorithms. Of course, most of current database systems offer some way of automagically assigning unique ids ("sequencing") to new rows in a table, but only some of them are useful, at least in some contexts.

As far as I know, sequencing is implemented in Access and IBM DB2 such that you define a field in a table as being the unique index and auto-indexing, and when inserting new rows, you just leave out that field, and the database fills it in automatically. The problem is that you don't know what id has been assigned until "after the fact". If you don't have other unique field combinations in that table, then you don't have an easy way to find out the exact id assigned to the last row inserted. Which means that you can't easily insert rows in several, linked tables at once.

Oracle is the only database I know which has a way to find out a new id *before* adding the row, and thus you are able to use the new id both for inserting the row in table 1, and use that id as a foreign key to table 1 in a new record in table 2.

So, yes, I implemented sequencing mechanisms in several different projects several times, simply to be sure that whatever underlying database we use, this will always work, not only under Oracle. In that sense: No, I don't consider this a WTF.

PJH

@TheRider said:

The problem is that you don't know what id has been assigned until "after the fact".

Given a reasonable database schema, I'm failing to see the problem.

Do you have a live/contrived example where you feel this is a problem?

m0ffx

@Nelle said:

@YishTish said:
require_once "real_mysql.inc";

i would like to see what is in the "fake_mysql.inc" or does he have a "float_mysql.inc" as well ...

It's worse than than. real_mysql.inc contains some 'improvements' to the normal MySQL by, of course, RealMedia.

AbbydonKrafts

@TheRider said:

and thus you are able to use the new id both for inserting the row in table 1, and use that id as a foreign key to table 1 in a new record in table 2.

Some other databases have a query for retrieving the last ID generated, such as SELECT @@identity

TheRider

@PJH said:

@TheRider said:
The problem is that you don't know what id has been assigned until "after the fact".
Given a reasonable database schema, I'm failing to see the problem.
Do you have a live/contrived example where you feel this is a problem?

Yes, I do

Imagine the following two tables:

table ORDER, with fields id (autoincrement), customerid (foreign key to CUSTOMER), orderdate

table ORDERENTRY, with fields id (autoincrement), orderid (foreign key to ORDER), articleid (foreign key to ARTICLE), itemcount, price

To enter a new order with an order entry into the database, you have to do something like:

insert into ORDER (customerid, orderdate) values ('cust-1234', current date);

insert into ORDERENTRY (orderid, articleid, itemcount, price) values (?!?, 'article-5678', 5, 99.95);

==> what is the number that you insert into the orderid field? You cannot reliably find out what id was assigned to the new ORDER row, particularly if, for example, two orders have been entered for the same customer on the same date. Except, of course, if there is some mechanism like the one AbbydonKrafts is describing here:

@AbbydonKrafts said:

@TheRider said:
and thus you are able to use the new id both for inserting the row in table 1, and use that id as a foreign key to table 1 in a new record in table 2.
Some other databases have a query for retrieving the last ID generated, such as SELECT @@identity

I don't know by heart, but in oracle you would do something like this:

select new-sequence-number for table ORDER -- the exact syntax can surely be found in the Oracle manuals...

==> this would return a new uniqe key for said table, such as 4711.

insert into ORDER (id, customerid, orderdate) values (4711, 'cust-1234', current date);

insert into ORDERENTRY (orderid, articleid, itemcount, price) values (4711, 'article-5678', 5, 99.95);

Ragnax

@TheRider said:

Oracle is the only database I know which has a way to find out a new id before adding the row, and thus you are able to use the new id both for inserting the row in table 1, and use that id as a foreign key to table 1 in a new record in table 2.

I'm hoping the Oracle guys were smart enough to increment the ID the moment it is requested, because otherwise: race conditions ahoy!

DB        :  new ID;
DB->ProcA :  ID;
DB->ProcB :  ID;
ProcA->DB :  (ID, ...);
DB        :  new ID;
procB->DB :  (ID, ...);   // Uh-oh!

drbhoneydew

@PJH said:

Given a reasonable database schema...

Therein is often the problem. In my last job, I had to integrate with a rubbish call logging "system" that didn't do autoincrements. Instead, it stored a "last_used" value in a totally separate database. It was one of the many causes of my work-related Tourette's at the time.

KattMan

@TheRider said:

I don't know by heart, but in oracle you would do something like this:

select new-sequence-number for table ORDER -- the exact syntax can surely be found in the Oracle manuals...

==> this would return a new uniqe key for said table, such as 4711.

I don't know oracle, but doesn't this give the opportunity for two process to have the same ID?

Process1 askes for next ID, Process2 asks for next ID. Process1 saves, Process2 saves.

Do they have the same ID? If they don't, what happens to the ID process1 grabbed if it decides not to save or fails? This method seems to cause problems.

MS SQL uses @@Scope_Identity (don't use @@identity as that is not session specific) to grab the latest ID the current session created. So you insert your parent, request it's ID and save the child. No loss of ID's and no stepping on other process and all this without abiguity.

Nelle

@TheRider said:

insert into ORDER (customerid, orderdate) values ('cust-1234', current date);
insert into ORDERENTRY (orderid, articleid, itemcount, price) values (?!?, 'article-5678', 5, 99.95);

MySQL :

START TRANSACTION;
INSERT INTO ORDER (customerid, orderdate) VALUES ('cust-1234', current date);
SELECT @order_id = LAST_INSERT_ID();
INSERT INTO ORDERENTRY (orderid, articleid, itemcount, price) VALUES (@order_id, 11, 1, 2.0)
COMMIT;

TransactSQL:

DECLARE @order_id INT;
BEGIN TRANSACTION;
INSERT INTO ORDER (customerid, orderdate) VALUES ('cust-1234', current date);
SELECT @order_id = SCOPE_IDENTITY();
INSERT INTO ORDERENTRY (orderid, articleid, itemcount, price) VALUES (@order_id, 11, 1, 2.0)
COMMIT TRANSACTION;

So what exactly is the problem ?

Daid

@Nelle said:

So what exactly is the problem ?

Lack of knowledge of the tools people use.

PJH

@TheRider said:

To enter a new order with an order entry into the database, you have to do something like:
insert into ORDER (customerid, orderdate) values ('cust-1234', current date);

... retreive id from last inserted record and use in the next statement...

insert into ORDERENTRY (orderid, articleid, itemcount, price) values (?!?, 'article-5678', 5, 99.95);
what is the number that you insert into the orderid field?

The one retrieved after the first insert. I'm still failing to see the problem.

@TheRider said:

You cannot reliably find out what id was assigned to the new ORDER row,

In which database product? MySQL for example provides mysql_insert_id() which would be used in the above situtation

particularly if, for example, two orders have been entered for the same customer on the same date.

The two orders would be presumably be created in distinct threads/whatever with their own order ids and order items (and results from mysql_insert_id() since they are per connection, and one would assume the insert then id-retrieval above would be in the same context.) I don't see a conflict with this example.

MarcB

@m0ffx said:

It's worse than than. real_mysql.inc contains some 'improvements' to the normal MySQL by, of course, RealMedia.

Uhoh...

INSERT INTO sometable {BUFFERING.... BUFFERING... BUFFERING...} (id, somefield) {WOULD YOU LIKE TO UPGRADE TO MYSQLPLUS?} VALUES (12, '...{ARE YOU SURE YOU DON''T WANT TO UPGRADe!?!?!?!?! IT REALLY IS SO MUCH BETTER});

jnz

I don't think we can lay all the blame on the original coder. The MySQL manual actually [url=http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_last-insert-id]suggests this technique[/url] (scroll down a bit)

morbiuswilters

@jnz said:

I don't think we can lay all the blame on the original coder. The MySQL manual actually suggests this technique (scroll down a bit)

It looks like that's just a method to emulate sequences which normally shouldn't be needed in MySQL. Auto-increment works just fine as-is and I can't imagine a situation where "rolling your own" would be preferable. Just because MySQL provides a way to do it doesn't mean it's a great idea.

PJH

@jnz said:

The MySQL manual actually suggests this technique (scroll down a bit)

Could you identify the start of the section you allude to? The closest I can see is the part starting

Important

          If you insert multiple rows using a single
          <code class="literal">INSERT</code> statement,
          <a href="http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_last-insert-id"><code class="literal">LAST_INSERT_ID(<wbr>)</code></a> returns
          the value generated for the <span class="emphasis"><i>first</i></span>
          inserted row <span class="emphasis"><i>only</i></span>.</blockquote></p><p>Which isn't a reason for the code used in the OP.&nbsp;</p>

AisA

@TheRider said:

select new-sequence-number for table ORDER -- the exact syntax can surely be found in the Oracle manuals...
==> this would return a new uniqe key for said table, such as 4711.
insert into ORDER (id, customerid, orderdate) values (4711, 'cust-1234', current date);
insert into ORDERENTRY (orderid, articleid, itemcount, price) values (4711, 'article-5678', 5, 99.95);

Oracle has it much easier. You can always do...

insert into table ... returning fieldlist into variablelist;

eg: insert into ORDER (customerid, orderdate) values ('cust-1234', current date) returning id into varID;

the returning clause will fill the variables after the row has been inserted, all default fields populated and all triggers have run

Cheers!

ammoQ

@Ragnax said:

I'm hoping the Oracle guys were smart enough to increment the ID the moment it is requested, because otherwise: race conditions ahoy!

You must be joking.

Of course sequences (that's how those number generators are called) are smart enough. In fact, it would be hard (though not impossible) to build something like that on your own. Sequences are incremented the moment they are queried, but the access to the sequence is non-blocking. So another transaction can query the next number withouth ever creating a collission. If the transaction is rolled back, the number is still gone, obviously. If a sequence is queried very often, it can be set to "caching"; so if you retrieve one number, the sequence is increased by e.g. 20; the next 19 numbers can than be retrieved with less effort. Sequences can have a lower bound and an upper bound, they can cycle or simply stop if they reach the upper bound.

TheRider

@Nelle said:

MySQL :
SELECT @order_id = LAST_INSERT_ID();
TransactSQL:
SELECT @order_id = SCOPE_IDENTITY();
So what exactly is the problem ?

Ok, I fold.

TRWTF is, then, that there is no standardized way to do it (like, in a SQL standard) so that every db maker had to roll their own. And I still maintain that MySQL and TransactSQL's way is more quirky than Oracle's.

@Ragnax said:

I'm hoping the Oracle guys were smart enough to increment the ID the moment it is requested, because otherwise: race conditions ahoy!

Yes, they were. That function is not implemented as a simple SELECT MAX(id)+1 FROM TABLE. Thanks, ammoQ, for more in-depth.

AbbydonKrafts

@TheRider said:

TRWTF is, then, that there is no standardized way to do it (like, in a SQL standard) so that every db maker had to roll their own. And I still maintain that MySQL and TransactSQL's way is more quirky than Oracle's.

Oh, really? Please illustrate how Oracle isn't fubar'ed with all their nonsensical crap like date handling.

TheRider

Don't get me wrong. I ain't saying Oracle is the best database on the market. Just that I like its implementation of sequences best. Thats all.

jnz

@PJH said:

@jnz said:
The MySQL manual actually suggests this technique (scroll down a bit)
Could you identify the start of the section you allude to?

I was referring to the following section:

If expr is given as an argument to LAST_INSERT_ID(), the value of the argument is returned by the function and is remembered as the next value to be returned by LAST_INSERT_ID(). This can be used to simulate sequences:
Create a table to hold the sequence counter and initialize it:
mysql> CREATE TABLE sequence (id INT NOT NULL);
mysql> INSERT INTO sequence VALUES (0);
Use the table to generate sequence numbers like this:
mysql> UPDATE sequence SET id=LAST_INSERT_ID(id+1);
mysql> SELECT LAST_INSERT_ID();
The UPDATE statement increments the sequence counter and causes the next call to LAST_INSERT_ID() to return the updated value. The SELECT statement retrieves that value. The mysql_insert_id() C API function can also be used to get the value. See Section 24.2.3.37, “mysql_insert_id()”.

I assume the $db->last_inserted() line from the PHP code calls down to mysql_insert_id().

morbiuswilters

@jnz said:

I assume the $db->last_inserted() line from the PHP code calls down to mysql_insert_id().

Obviously it does. Could you please answer my question and tell me why this is a problem? Nobody should be creating their own sequences like this.

jnz

@morbiuswilters said:

Could you please answer my question and tell me why this is a problem? Nobody should be creating their own sequences like this.

Tell you why what is a problem? I don't know why somebody would want to create their own sequence like this except if they found the section I quoted and thought, "Here is a solution to my problem from an authoritative source, complete with code snippets. I guess I'll use it."

ammoQ

@AbbydonKrafts said:

Oh, really? Please illustrate how Oracle isn't fubar'ed with all their nonsensical crap like date handling.

What's wrong with Oracle's date handling?

AbbydonKrafts

@ammoQ said:

What's wrong with Oracle's date handling?

In order to insert a date, it has to be cast to a string of a certain format, then use TO_DATE() with the format provided.

@TheRider said:

TRWTF is, then, that there is no standardized way to do it (like, in a SQL standard) so that every db maker had to roll their own.

SQL is just a grammar. The reason that there is no "standardized" way to do it is that identity columns themselves are non-standard. MS SQL uses identity columns. Oracle uses sequences instead. DB2 supports both now, but originally had neither and required using a special function called GENERATE_UNIQUE. Postgres uses some weird combination of a generated column and a sequence. Mysql is kind of similar to MS SQL Server because, well, they pretty much copied SQL Server.

Given all of these differences in addition to the fact that there are some fanatical whiners out there who eschew surrogate keys and insist on "natural" keys, it's not hard to see why the standard doesn't include it. It's simply out of scope. You think it should be standard because you're accustomed to one specific product.

morbiuswilters

@Aaron said:

Mysql is kind of similar to MS SQL Server because, well, they pretty much copied SQL Server.

Really? You got some proof to back that up? Considering MySQL and SQL Server are probably about as far apart as you can get in terms of design, I'm calling bullshit.

@Aaron said:

Given all of these differences in addition to the fact that there are some fanatical whiners out there who eschew surrogate keys and insist on "natural" keys, it's not hard to see why the standard doesn't include it. It's simply out of scope. You think it should be standard because you're accustomed to one specific product.

What are you babbling about? TheRider said the exact same thing -- this should already be part of the SQL standard so each vendor didn't have to roll-their-own. The reason it's not part of the standard is because auto-increments (despite being quite fantastic) are not allowed in a strict relational database.

bstorer

@morbiuswilters said:

What are you babbling about? TheRider said the exact same thing -- this should already be part of the SQL standard so each vendor didn't have to roll-their-own. The reason it's not part of the standard is because auto-increments (despite being quite fantastic) are not allowed in a strict relational database.

Yeah, but strict relational databases, like everything else that clings to nice, clean theory, are completely worthless. I say we get our heads out of our ass and define a useful standard.

morbiuswilters

@bstorer said:

Yeah, but strict relational databases, like everything else that clings to nice, clean theory, are completely worthless. I say we get our heads out of our ass and define a useful standard.

I agree, I'm just pointing out why it's not in the SQL standard. There's an old saying in Massachusetts: In theory, theory and practice are the same. In practice, they are not. You'd probably have it in Virginia, too, if anyone there had any teeth left to actually say it with.

bstorer

@morbiuswilters said:

@bstorer said:
Yeah, but strict relational databases, like everything else that clings to nice, clean theory, are completely worthless. I say we get our heads out of our ass and define a useful standard.
I agree, I'm just pointing out why it's not in the SQL standard. There's an old saying in Massachusetts: In theory, theory and practice are the same. In practice, they are not. You'd probably have it in Virginia, too, if anyone there had any teeth left to actually say it with.

And I'm just saying I'm sick of the stupid standards that don't have any practical use. Also, I have teeth right here in this jar next to me.

medialint

When we implemented (peoplesoft) eRecruit a few years ago for reasons I will probably never understand the numbering method for the requisitions went like this:

1. Increment a counter

2. Take that number, zero pad it to 7 places and ...

3. Reverse it

So in sequence:

1000000

2000000

3000000

...

0100000

1100000

2100000

etc.

MarcB

@medialint said:

2. Take that number, zero pad it to 7 places and ...

One client was quite upset when the ecommerce system I'd built for them was issuing invoice numbers in the low 100's (using the order's DB primary key as the invoice number). They were concerned that their clients would see the low numbers and take them for amateurs who'd just started out (never minding this company celebrated its 100th anniversary not too long ago).

So the solution was to reset MySQL's auto_increment counter for that table to start in a random 100,000's spot.

So maybe the reverse/zero-pad thing was eRecruit's way of solving their version of this problem.

Or maybe their OCR scanner could only go R->L on the wooden table the requisitions were placed on after printing.

Marshall

I'm using (open source) Firebird which implements SQL 2003's functionality in this respect.

(From Google: see http://www.sigmod.org/record/issues/0403/E.JimAndrew-standard.pdf .. SIGMOD Record, Vol. 33, No. 1, March 2004 p124

Sequence Generators

.. . or one can get SQL to generate unique values automatically. SQL:2003 provides a new feature, sequence generators, for this purpose.

CREATE SEQUENCE PARTSEQ AS INTEGER
START WITH 1
INCREMENT BY 1
MINVALUE 1
MAXVALUE 10000
NO CYCLE

(Marshall again)

A single call of "NEXT VALUE FOR PARTSEQ" gives you a unique value which can be then used multiple times to insert related records (the original question). Including "NEXT VALUE FOR PARTSEQ" in a command that operates against multiple records gives a different number for each record - I used this aspect in a script used to move data from a heritage table with multiple keys to one with a sequence no as primary key.

SQL 2003 also implements auto-generation via "Identity Columns" (p125) but my need is program-driven so I prefer to retain control in the application code.

AbbydonKrafts

@Marshall said:

"NEXT VALUE FOR PARTSEQ"

WTF. At least Whoracle provides a saner way: PARTSEQ.NEXTVAL. That's much easier to use in Inserts, etc.

@medialint said:

Or maybe their OCR scanner could only go R->L on the wooden table the requisitions were placed on after printing.

[img]http://www.zweg.com/ico/potd.gif[/img]

Eternal_Density

@morbiuswilters said:

There's an old saying in Massachusetts: In theory, theory and practice are the same. In practice, they are not.

I first heard that from a Romanian who lives in Canada.

AisA

@AbbydonKrafts said:

@ammoQ said:
What's wrong with Oracle's date handling?

In order to insert a date, it has to be cast to a string of a certain format, then use TO_DATE() with the format provided.

There is a global setting variable, forgot the name, but if the date is in that format, you don't need to do what you said above. In my system it is dd-mon-yyyy. So I always use, say '1-Jan-2008' directly and Oracle handles it correctly. As for the use of to_char and to_date, IMO, it is more error free and a better option. Imagine someone entering 5/6/2008. Is that 5th June or 6th May. Using format strings it makes it clearer.

Cheers!

medialint

@MarcB said:

One client was quite upset when the ecommerce system I'd built for them was issuing invoice numbers in the low 100's (using the order's DB primary key as the invoice number). They were concerned that their clients would see the low numbers and take them for amateurs who'd just started out (never minding this company celebrated its 100th anniversary not too long ago).
So the solution was to reset MySQL's auto_increment counter for that table to start in a random 100,000's spot.
So maybe the reverse/zero-pad thing was eRecruit's way of solving their version of this problem

No this wasn't an issue at all. These numbers were job requisitions so if you need a Senior Lead Janitor you'd use the system to post this job opening. There was some exposure to the outside world of these numbers, especially in regard to sites like monster (because when first deployed--since fixed--you couldn't just hyperlink to the job post on our site you had to instruct people to search for this number) but the reason I was told had to do with the way they were storing the data. Because the amount of data is massive (and by massive I mean you're collecting a few million job applications a year) having nice ordered sequential indices would create clusters of data that would clog up the proverbial tubes or something. I was glad I wasn't on the implementation team but it's not like I haven't [url=http://www.sumtotalsystems.com/]more perplexing data practices[/url] since then ...

ammoQ

@AisA said:

There is a global setting variable, forgot the name, but if the date is in that format, you don't need to do what you said above.

NLS_DATE_FORMAT

It's pretty okayish to use it in ad-hoc statements, I guess. (Since I've learned typewriting (*), a little to_date(xx,'dd.mm.yyyy') means nothing to me)

You would be pretty ignorant to rely on NLS_DATE_FORMAT in your programs, though.

BTW, a few years ago (as in: March, 1995), I learned (*) that it's not a good idea to convert Oracle dates to strings and back. Maybe it's because the ä in "März" (German for "March") cause a little program to fail a little bit.

(*) the hard way. You guess.