From: tom <tmb at ncal point verio point com x@x.x>
Subject: Re: automatically calculating ID field on INSERT
Date: 2000/06/23
Message-ID: <pb8aegb7nb1.fsf@aimnet.com>#1/1
Sender: tmb@aimnet.com
References: <8di290$iq1$1@sunsystem5.informatik.tu-muenchen.de> <8dij02$gnu$1@nnrp1.deja.com> <38FD41AB.813F1C40@elbanet.co.at> <8dkdgl$gin$1@nnrp1.deja.com> <3950CFB9.869D8FC7@ieee.org> <3951B725.F95EA9F6@hp.com>
Content-Type: text/plain; charset=us-ascii
X-Complaints-To: abuse@verio.net
X-Trace: nuq-read.news.verio.net 961818655 206.184.58.82 (Sat, 24 Jun 2000 03:50:55 GMT)
Organization: Verio
User-Agent: Gnus/5.0803 (Gnus v5.8.3) XEmacs/21.1 (Arches)
MIME-Version: 1.0
NNTP-Posting-Date: Sat, 24 Jun 2000 03:50:55 GMT
Newsgroups: comp.databases.theory


Ruud de Koter <ruud_dekoter@hp.com> writes:
> William Boyle wrote:
> > Joe Celko wrote:
> > > Here is the SQL-92 version:
> > >
> > > INSERT INTO MyTable (id, a, b, c, ...)
> > > VALUES (COALESCE(SELECT MAX(id) FROM MyTable) + 1, 1),
> > >           new_a, new_b, new_c, ..);
> > [...]
> > IdTable has 2 columns: tablename (char or varchar) and id (number or
> > int)
> > 
> >         update IdTable set id = id + 1 where tablename = 'MyTable';
> >         insert into MyTable (id, a, b, c, ...)
> >                 values ((select id from IdTable where tablename='MyTable'),
> >                         new_a, new_b, newc, ...);
> >         commit;
> > 
> 
> Though this approach will work in a lot of situations, is main
> drawback is that in the end it lacks scalability. For every insert
> being done the IdTable will be locked, and other transactions 
> will have to wait till the transaction commits.

In most applications, it isn't necessary for IDs to be sequential and
without gaps.

A good solution is therefore to preallocate a range of IDs to each
client.  Each client just consumes its range of IDs as it creates new
records, and when they are used up, it allocates a new range of IDs.

Even simpler, you can pick just pick a random 32 bit or 64 bit number
as an ID on insert; if the ID is a primary key, if you accidentally
pick an existing ID, you will get an error from the database and you
simply try the insert again with another random number.  If you use a
good random number generator, you get the additional advantage that
IDs cannot be forged easily (a problem with social security numbers,
as well as most employee id numbers, which are allocated
sequentially).

That kind of approach is going to be faster than any of the SQL-based
solutions and it will work even with simple relational databases like
MySQL.  Furthermore, in many applications, the client that is doing
the insert actually needs to know the ID that it just allocated,
something that neither of the methods suggested by Celko or Boyle
addresses (although they could be patched up to handle that case).

Tom.


