Re: Clean Object Class Design -- What is it?

From: Bob Badour <bbadour_at_golden.net>
Date: Thu, 19 Jul 2001 13:53:10 -0400
Message-ID: <0gF57.164$rT5.43127953_at_radon.golden.net>


>> >What should be used for a primary key in a relational design? Is it
 better
>> >to use real-world values (e.g. SSN, although this would only work for US
>> >employees) or is it better to generate a key?
>>
>> Fabian Pascal once posted a comprehensive list of primary key selection
>> criteria. I believe there were five elements to it, but I am not sure
>> whether I remember it correctly:
>>
>> Familiarity, Simplicity, Size, Generality, Stability ? I just cannot
>> remember whether that's the list; although, at the moment, it seems close
 to
>> me but perhaps not completely correct. (If Lee is watching, maybe he can
 ask
>> Fabian for a refresher.)
>
>I checked the firstsql site and couldn't find anything (a search feature
>would be nice).

That wouldn't help. If I recall correctly, Fabian posted the list on a compuserve forum six or seven years ago and not on his (much more recent) website. He may have changed his opinions regarding primary keys entirely in the intervening years.

I see that Chris Date considers the choice of primary key "essentially arbitrary."

>But practically speaking, aren't multi-valued or
>string-based keys much slower than using integer values?

Hence the criteria of Simplicity and Size above.

Compression techniques for indexing reduce the performance differences in more recent DBMS products, and hashing also reduces the performance difference.

In any case, candidate keys are logical constructs. Nothing prevents the DBMS from exposing one thing as the key to users and internally using another thing for performance reasons.

>I think a case
>could be made that almost any domain attribute would fail the "stability"
>test.

Stability is not the only criterion. See above.

>Someone told me once that you should always generate
>integer keys when using a RDBMS,

Have you ever entertained the idea that the person was just a fool?

>> I find it annoying that SQL combines a logical constraint (uniqueness)
 with
>> a physical structure declaration (create index).
>
>I find SQL quite cumbersome in general. Fabian Pascal doesn't have much
 good
>to say about it either, but we are stuck with it.

For now. I still have hope that we will eventually cut the dead weight loose. I know that legacy SQL systems will always remain, but I hope another truly relational language eventually overtakes it in terms of commercial acceptance.

>for application
>development I think a better alternative is needed.

Of course. A truly relational DBMS is needed.

>> >How is generating a key better than the ODBMS equivalent of a
>> >system-generated Oid?
>>
>> The non-relational ODBMS always creates the OID and requires one to use
 the
>> OID whereas the relational ODBMS allows one to use a familiar key when
>> available and appropriate.
>>
>> The OID approach encourages designers to ignore other identifying
>> attributes, which can sometimes lead to disastrous duplication.
>
>My feeling is that SQL databases also require a lot of application code to
>keep duplication down, and some duplication is unavoidable without expert
>audits of the database.

Not if you declare the uniqueness of all identifying attributes.

>I was surprised to find that Oracle does not
>advocate 3NF (see http://www.firstsql.com/dbdebunk/fp4a.htm).

I'm not surprized. I long ago recognized the widespread ignorance in the industry and the role vendors play in reinforcing ignorant prejudice and misconception.

>One common
>mistake in building relational databases is using inappropriate keys.

I disagree. On what basis do you make such an extraordinary claim?

>generated keys like OIDs are really useful for implementation issues like
>efficient Set operations

Could you define what you mean by a Set operation? Are you referring to assignment, to set operators (union, intersection, join etc) or to set-at-a-time operations?

>primary keys are really important for logical
>issues like uniqueness of values.

All candidate keys are really important for logical identity. The distinguishing feature of a primary key is a convenient short-hand for declaring referential integrity; however, I see no reason the DBMS cannot support referential integrity for references to any candidate key.

>Maybe instead of "post-relational" people should be talking about
>"post-SQL".

There is no such thing as post-relational. So far, all of the non-relational crap just regresses to pre-relational nagivational models.

> why not use a real
>language? ;-)

Indeed. I suggest we overcome "impedance mismatch" by improving our application programming languages rather than retarding our DBMSs.

>But more seriously, I had never heard of 5NF until I saw an earlier post
 you
>made -- do you have an online reference I can look at? My database textbook
>is rather old and only talks about BCNF and 4NF (Elmasri/Navathe 1989).

I am certain that 5NF predates 1989 by a decade or more. You probably just need a better textbook. You cannot go wrong with Chris Date's _An Introduction to Database Systems_ which is currently in its seventh edition.

I see that Toby Teorey updated his book on the subject in 1998. I have an older edition that covers normalization quite well.

>One
>of the comments it makes (I just noticed this morning for the first time)
 is
>"Recent research in the relational model attempts to remove [restrictions
 on
>mult-valued and composite attributes] by using the concept of *non-first
>normal form* or *nested* relations." (their emphasis).

I would not suggest using NF^2 databases. If you need to nest a relation, use a relation valued domain.

>In using GemStone/S
>for the past 10 years, I have come to depend on the fact that I can have an
>object with a Set-valued attribute. How does your position that the
>relational model subsumes the OO model address set-valued attributes?

All domains have internal structure. Domains can have or can be relation values.

>Does
>PostGreSQL (for example) support set-valued attributes?

I doubt it. Does it support user-defined data types of any kind?

>How does SQL deal
>with these things (and if non-standard extensions are required, why use SQL
>at all?).

SQL provides little or no support for domains at all. SQL is the most commercially accepted DBMS language based on relational principles; other than that, I see no reason to use SQL.

I consider SQL the most disastrously successful experiment in our industry. Received on Thu Jul 19 2001 - 19:53:10 CEST

Original text of this message