Re: identity columns

From: Jan Emil Larsen <jel_at_g-it.dk>
Date: Sun, 13 Jan 2002 11:59:16 +0100
Message-ID: <3c41686e$0$89082$edfadb0f_at_dspool01.news.tele.dk>


"--CELKO--" <71062.1056_at_compuserve.com> skrev i en meddelelse news:c0d87ec0.0201121445.4c0ce7b7_at_posting.google.com...
> >> I disagree. The entity type and the base table derived from it
> should by named as an abstract representation of its instances, thus a
> singular noun, such as Employee.
>
> I noticed that you said "representation of its instances" and not
> "set" or "class" and that might be important. OO people think of
> instances rather than sets. I would say that a forest contains trees;
> you would say that there is no such construct as a forest, but only
> instances of trees. This goes back to the concepts that Cantor
> introduced with set theory.

I don't understand what you mean by "OO people think of instances rather that sets".
Certainly, if the instances forms a collection (eg a "Bag"), we don't think of it (the collection) as a set. But what is the point? In OO-terminology "class" is the abstract representation of its "objects". In the context of this debate, "class" and "entity type" may be regarded as synonyms.
Why do you think that I would say, "that there is no such construct as a forest"?
You are right that I might say "TREE" and not "FOREST" ("TREES") in a semantic model for trees; that depends on whether the "unit of thought" is "tree" or not in the problem domain.

> >> Consult Date or another standard textbook. <<
>
> Date uses both plural and singular names. I am re-writing mine to
> favor collective nouns. A quick look at the stuff in the SQL books on
> my shelf shows more plurals in later books. Try the INCITS L8
> Metadata Standards Committee data element naming rules. Scalars are
> singular in their rules.

Please acknowledge that I corrected my self; indeed Date is not quite consistent.
In his introduction (Ch. 1, An overview ...) he shows and E/R-diagram with plural nouns (Fig 1.6).
But anywhere else (as far as I can see), he uses the singular form, in special in
his more complete treatment of semantic modelling and E/R (eg. Fig. 13.2). I'm not quite sure what you mean by "SQL books". For me, SQL is a language, not a tool for semantic modelling, and not a synonym for (relational) databases
and data modelling. In any case, my shelf shows the opposite, and if I take modern books on modelling in general, and OO and UML in particular, in account, a reach the conclusion that the semantic entity-types/classes is proposed to be named in the singular form. There are very good reasons for that,
but this would lead this thread too far. When it comes to naming the (relational) tables, I think it is at matter of opinion
and taste; but it is certainly valid (and for me: advantageous) to make a smooth
transition from the semantic model by keeping a singular wording. If need be,
one could go EMPLOYEE -> EMPLOYEE_TBL (or even ->EMPLOYEES).

>> I disagree. Your argument shows exactly why a key should not carry
>> information: what if the company wants to keep a record of fired
>> employees.
>> A record in a table should be "a fact"; that is not the case if the
>> person
>> no longer holds the email-address. A key should be immutable. <<

> A data warehouse is a different problem. If you REALLY want an
> immutable key for a person, then we use fingerprints and DNA -- the
> prison system again. Most employers are not that concerned about the
> location of ex-employees; most of them are not crooks <g>.

I take it as you agree that no usable, non-mutable natural key exists (DNA patterns might be expressed as a number, but not in a practical way <g>).
I think you take the example too far in the wrong direction regarding the topic
of this exchange - my point does not relate to whether employers are concerned
about their ex-employees, but should be understood for the sake of the argument:
the key should not change, and should certainly not lead to a "non-fact" in the DB.

> But a company assigned email address can be a unique string that is
> checked to see that it is issued only once. AOL, Yahoo, et al seem to
> be able to do this with lots of people by adding a number after a name
> the customer can remember. They have a high turn over rate as well.

I frankly don't see your point here.

>> As an example you might think of records of invoices, which may
>> have a
>> natural candidate key combined of foreign keys to Customer and Order.
>> Nevertheless, a running integer is often used as a primary key, eg. to
>> document (for the auditor) that no sale is forgotten. <<
>
> If my business rules say that an invoice has one customer number and
> one order number, then I want to have a UNIQUE(customer-nbr,
> order_nbr) constraint in my schema.

Certainly. This was also implied in my presentation: ("But you must ... declare constraints (such as unique index) on the (natural) candidate-keys to keep the database healthy")

> The advantage of the separate invoice number is that you can relate it
> to a physical piece of paper or to multiple orders on the same invoice
> or multiple invoices for a single order. The customer has his own
> purchase order number on the invoice, but it is possile that two
> customers might duplicate a purchase order number.

I think this issue is covered in another sub-thread. Received on Sun Jan 13 2002 - 11:59:16 CET

Original text of this message