Re: UUID vs Compound Key

From: Roy Hann <specially_at_processed.almost.meat>
Date: Mon, 12 Jul 2010 18:19:23 -0500
Message-ID: <q_6dnfLVvNpmP6bRnZ2dnUVZ8o2dnZ2d_at_pipex.net>


Mark S. (UK) wrote:

> Hi all,
> I'm currently designing a system that has many databases (1 at each
> "site"), and the data from each site will be periodically backed up onto
> a single centralised database. The centralised database will also allow
> users to access the data via the Internet.
>
> Each "site" has a unique alphanumeric identifier. I use this to
> uniquely identify the "site" in the centralised database by making a
> compound key out of each table's primary key plus the identifier:
>
> For example:
>
> SiteUserTable
>
> sitekey CHAR(16),
> id AUTO INCREMENT,
> firstname VARCHAR(255),
> surname VARCHAR(255),
> timestamp TIMESTAMP,
> etc..
> PRIMARY KEY sitekey, id;
>
> Please note this is not the actual table definition - it's for
> illustration only. Also, there are *many* tables using this composite key.
>
> So basically, I end up with data like this on the centralised database:
>
> SITE_1, 1, Fred, Flintstone
> SITE_1, 2, Wilma, Flintstone
> SITE_2, 1, Homer, Simpson
> SITE_3, 1, Foo, Bar
> SITE_1, 3, Barny, Rubble
>
> Now, the only downside I've found with this is that because I'm using a
> compound key, it makes me queries more complicated. This is fine, but
> I'm wondering if I'm missing a trick.
>
> If I use GUID's then I can just use that as the primary key, and I won't
> have to constantly use the sitekey.
>
> However, I'm then storing a much larger primary key and I've read this
> can introduce performance problems amongst other things.
>
> Does anyone have any thoughts on this? Does the way I'm doing it seem
> sensible? I have a mock up being tested at a few sites and so far so
> good - there's been no issue with replication so far, and I don't expect
> there to be because the compound key is guaranteed to be unique because
> of the sitekey (and there's only ever one database per sitekey).
>
> PS - I would be wrong to rely on the ID being auto-incrementing wouldn't
> I? My original replication algorithm worked by looking at the latest ID
> I had, and then requesting all ID's greater than that. I've since
> changed it to use timestamps.. Better?

Ignoring all the myriad objections to spurious synthetic keys and how they will require additional constraints to prevent otherwise undetectable duplication/contradiction of facts, I will just point out that UUIDs come in five "flavours". If you want to have a timestamp element that suggests you'd want to be looking at a Version 1 (MAC address) UUID. What kind of UUIDs does your system provide?

If you are seriously concerned about the performance implications of a 16-byte key versus a smaller one, you must have a system of unprecedented efficiency. Normally it takes just one half-wit programmer to use a cursor to write some row-oriented code and every transaction it does will cost you a thousand times more than the extra bytes of the UUID ever will.

-- 
Roy
Received on Tue Jul 13 2010 - 01:19:23 CEST

Original text of this message