Re: Natural keys vs Aritficial Keys

From: Walter Mitty <>
Date: Sat, 16 May 2009 17:22:09 GMT
Message-ID: <5DCPl.1715$>

"paul c" <> wrote in message news:0drPl.28343$PH1.24283_at_edtnps82...
> Tony Toews [MVP] wrote:
> ...
> Basically, it's just a phony issue foisted by db semi-literates, usually
> product-oriented to the extreme, like the kind who get into arguments
> about whether table names should have an 's' at the end or who grew up
> with 1980's 4GL's and their puerile successor products that couldn't
> support composites. A table/relation/relvar should have exactly the
> attributes and exactly the keys that the app needs. It is just
> self-flagellation to ignore a dbms feature that will generate some of them
> (assuming the dbms doesn't insist on them).
> The so-called 'natural' key for all the airwaybills in a ULD on a typical
> airline flight segment is about seven or eight attributes. When the cargo
> master, at the last minute, offloads a container or two onto the tarmac,
> you don't want a ponderous logic to un-assign it and put it on the plane
> at the next gate. You need a form of indirection, for that matter the
> same is needed in any robotic application such as a freight warehouse.
> Somebody in this thread suggested, if I took the meaning right, that a
> design could actually profit from using a single attribute to stand in for
> a composite key, where both are part of the design. I've seen such
> systems and considered that to be perfectly reasonable, although one of
> the big names in the field told me I was confusing physical with logical.
> But I say the big names aren't right all the time. I think there is such
> a thing as logical efficiency, not just physical efficiency and sometimes
> it is a requirement that a 'time-critical' massive update be coded as
> simply as possible, leaving the slow-poke work-a-day requirements to
> handle the ponderous side. I admit that doesn't happen very often now
> with all the fast machinery around. The decision for me would be based
> entirely on whether the application required it, and nothing to do with
> any dogma.

Thanks for your reply. I can't tell you the number of times when I've read one of your comments, agreed with it completely, and never bothered to write a response.

In the case of stackoverflow, I don't feel quite comforatble dismissing those who hold opinions that differe from mine as "semi literates". The rest of what they write suggest that they have been building databases for some time. At least, that's true for some of them.

I have to admit that I've got a firm opinion on the "tables should be plural" controversy myself. Furthermore, my own practice is inconsistent because when I use the "table_name AS table_alias" construct, the alias is almost always singular, or an abbreviation of a singular, even though my table names are almost always plural. I don't regard that opinion of mine to be product driven, and I don't think it's much more than esthetic, and ever so slightly mnemonic.

I agree that the big names aren't right all the time. The ones who really deserve the respect they claim welcome somebody who questions the truth of the wisdom they expound. If they are proven right, it just confirms the value of their work. If they are proven wrong, everybody wins. The big names who adopt the rhetoric of "because I'm an expert and you're not" have gotten too big for their britches.

I also have seen cases where concocting a simple key to stand in for a very complicated compound key makes sense. There's even an academic term for that, at the conceptual level. It's called "reifying a relationship." The confirmation number I get from Hertz rental, and that I use when I approach the counter at my destination airport, is an example of such a key, although it is obviously not hidden from the users. And the confirmation now takes on the role of an entity, instead of a relationship,. But the existence of such cases is a far cry from "always create an autosequnce field, called ID, for every table".

I think the desire for a unique ID for every row in every table pursues the same dream as accessing rows by address instead of by contents. And while an ID column doesn't pin a row the way an address stored in other rows would, it still reduces the relational model to the graph model in some way that I can't describe.

My big bugaboo with the ID field is that if two rows have different IDs but otherwise identical data, there tends to be a disconect between developers and analysts as to whether this is harmful duplication or not. It gets even worse if the two rows have different ID fields, identical key values, and at least one non key value different. I've actually seen a case like this in the field. In this case, the two rows are contradicting each other in the anlayst's view, but not in the developer's view. There's something wrong with that.

Again, thanks for a considered response. Received on Sat May 16 2009 - 19:22:09 CEST

Original text of this message