Re: What are the design criteria for primary keys?

From: Cimode <cimode_at_hotmail.com>
Date: Fri, 3 Sep 2010 15:21:11 -0700 (PDT)
Message-ID: <bf8bdc43-1f82-4db3-bdcb-acc4d98bd207_at_l17g2000vbf.googlegroups.com>


On 3 sep, 22:48, paul c <toledobythe..._at_oohay.ac> wrote:
> On 02/09/2010 9:20 PM, Bob Badour wrote:
>
>
>
>
>
> > Choosing good *primary keys* and candidate keys is a vitally important
> > *database design* task--as much art as science. The design task has very
> > specific design criteria.
>
> > *What are the criteria?*
>
> > As an experiment, I asked the above question on StackOverflow.com having
> > first verified it wasn't already answered on the site. It is a very
> > important question that has a very simple and clear answer. I had
> > planned to offer a sizable bounty if nobody gave the correct answer
> > after the 1st day and answer it myself if nobody claimed the bounty.
>
> >http://tinyurl.com/ignorancetothefifth
>
> > Five people identified as John Saunders, David Stratton, Claudio Redi,
> > wallyk, and rockinthesixstring voted to close it with nothing
> > approaching a correct answer supposedly because "It's difficult to tell
> > what is being asked here. This question is ambiguous, vague, incomplete,
> > or rhetorical and cannot be reasonably answered in its current form."
>
> > I thought the question was clear enough. The answer, of course, is:
> > uniqueness, irreducibility, simplicity, stability and familiarity.
>
> > To anyone who uses that site as a resource, all I can say is: Caveat
> > lector!
>
> Same goes for most sites to do with db, I'd say.  These five are typical
> of most of those in IT, eg., any of inability to abstract, to focus,
> considerable ability to produce so much verbiage that the overall
> meaning becomes vacuous, in other words over-developed talent for
> obfuscation combined with a protective herd mentality that shouts down
> the occasional sane minority voice which threatens the courtiers by
> pointing out that the Emperor has no clothes.  Apparently it's not
> illegal to use whatever steroids improve the obfuscation faculty.
>
> I believe that the coherent, concise answer (which you have given before
> here I think) could be assessed by any non-IT person who has had an
> education that cultivated a modest level of literacy.
>
> Ever since I first saw it mentioned I've especially liked 'familiarity'
> but it really surprises me when people compare that with
> 'not-invented-here'.  If one has never seen a particular system,
> everything, with a capital E, is not familiar but soon becomes so.
> Also, I don't see anything wrong with every relation having two or more
> candidate keys, the ones that aren't primary being very useful
> connections to useful features that are outside the RM, but innate to
> most physical implementations, such as ordering.  No matter what of
> several definitions one uses for 'surrogate', they will soon become
> familiar.  Celko wants everybody to use published keys, sometimes that's
> advantageous but it's not essential, after all the published keys one
> has never seen before aren't familiar before one adopts them.  Part of
> the power of the Codd's RM (pick any version) is that it's open-ended
> about such bells and whistles.
For a change from the usual sterile rantings about current state of contemporary database design normative habits, and to play somehow devil's advocate, I'd have to say that the concept of *key* is almost orthogonal to RM. After all, the concept of *keys* mostly embodies the IBM context in which RM was theorized by Codd rather than the logical concept of unique tuple identifier in an algebraical sense.

Additionally, defining quasi metaphysical subjective criteria such as *familiarity* somehow sounds wrong as opposed to the rigor of the fundamental mathematics that are supposed to act as a foundation for RM structural information representations. For some reason, I noticed such criteria is oftent an open door to diversion from the concept of *distinguishability* (which at least can be expressed algebraically) which seems more important than others.

Finally, I'd have to say that looking for database theory definition into a board, we know is filled by undeducated audiences is at best pointless, at worst, sterile.

In conclusion, I think uniqueness (I'd add distinguishability), irreducibility, stability are more useful for definition rather than *simplicity*, *familiarity* since the last two can hardly be formalized. Received on Sat Sep 04 2010 - 00:21:11 CEST

Original text of this message