Re: candidate keys in abstract parent relations

From: vldm10 <vldm10_at_yahoo.com>
Date: 22 Jan 2006 12:18:24 -0800
Message-ID: <1137961104.887949.319850_at_f14g2000cwb.googlegroups.com>


David Portas wrote:

> Yes it's a bad idea. Here's a generic example. Consider two tuples:
>
> a tuple = {'A','B','C'}
> a tuple = {'A','B','C'}
>
> They are the same. It serves no purpose to represent the same
> information twice, which is why we can safely say that a relvar must
> have at least one candidate key - the generality of the model isn't

Here we have the following problem: Is it always possible to represent the entity's ( or relationship's) instance in a RM? If you know for some theory which support these ideas (even if it is a philosophical theory) let us know.
I don't think that regarding the real world situation "we can safely say that relvar must have at least one candidate key" as you stated. Regarding some abstract mathematical relation, which doesn't represent reality, your sentence is true.

Example:
In the case that the entity's (or relationship's) instance repeats the same values or in the case when we have the two entity's (relationship's) instances with the equal values and the relation which has "natural key", then it is not possible to represent these entities' (relationships') instances in the relation. But, if the key is "surrogate key", then we can represent mentioned instances in the relation of the RM.

In the case that your database is in a production and that you have "natural key" which can't solve above mantioned problems, you can consider to redesign your application (this can be very complex and with millions of data). This redesign can have the limitations what is another long story.

> Now let's add an add an arbitrary
> artificial key on the understanding that this key has no external
> meaning whatever:

Do you mean key which is not externally verifiable?

>
> a tuple = {1,'A','B','C'}
> another tuple = {2,'A','B','C'}
>
> Isn't this still just as redundant as in our first case? The artificial
> key isn't supplying any extra information in either tuple. In fact the

As you explained here the externally not verifiable keys haven't some great usage. Key should be externally verifiable, but this is not a "must" - it is not even mentioned in the definition of the key.

If we can connect our database objects with reality on some other way, it is fine. (You can see my solution regarding this on www.dbdesign10.com). I also think that here the best term is "identify", i.e. we are trying to identify the real world objects.

> artificial key usually isn't exposed to business users AT ALL. A

Let me give you one example from the business environment but from a different point of view.
A Honda dealer has received 100 new 2006 Honda civic cars and these 100 cars have the list with the 50 same attributes. You should design the Database for these entities using "natural key" and without using Vehicle Identification Number (VIN) or any other "surrogate
key" or "artificial key". And you will see that this is impossible, because all "fields" have the same corresponding values. The point is not that you can't work here with "natural key", the point is that you can't even represent these entities in the RM if you are using "natural key". Same thing if you are selling the Microsoft Windows on CD or the desktop computers...

> relation containg these two tuples won't violate any principle of RM
> but it would cause us a major problem - the redundancy means that we
> have the potential for anomalies to arise and so we may get incorrect
> results from our database.
>
> Possibly {'A','B','C'} represents some entity that we are interested in
> counting - we want to know how many there are in our universe - but we
> don't need to add an artificial key to do that. We can just add another
> attribute ("quantity") and then a single tuple would be suffficient to
> represent any quantity of whatever we are modelling.
>
> So conceptually nothing is gained by adding the artificial key and
> nothing is lost by removing it. An artificial surrogate key may have
> certain practical advantages but an artificial key is redundant in the
> logical model by definition.
>
> The candidate key is defined by a subset of the attributes, so the real
> question is: what attributes do you need to model about an entity? If
> you aren't sure what the key is then you probably haven't answered that
> question properly.
>
> Now let's reconsider your example. You define an album by the tracks
> that it contains. There may be more than one album with the same name
> and in fact with all other attributes in common and those albums will
> be distinguishable only by the fact that they represent a different
> collection of tracks. Your mistake though is to identify a one-to-one
> correspondence between rows in your Album table (information about an
> album) and the album itself (a collection of tracks). Those are two
> different things. The missing piece in your model is a table or
> attribute(s) to supply the information that relates the two entities. I
> would have expected to see a "joining" table to represent the concept
> of a many-to-many relationship between the Album table and the Tracks
> table.
>
> --
> David Portas

Vladimir Odrljin Received on Sun Jan 22 2006 - 21:18:24 CET

Original text of this message