Re: I am confused as to whether I should use numeric IDs or not

From: Joseph Godfrey <jgodfrey_at_ecologic.net>
Date: 2000/02/05
Message-ID: <389C8436.427170AD_at_ecologic.net>#1/1


I think this pretty much covers it. I'll just add a few more comments to clarify...

Harry Chomsky wrote:

> Joseph Godfrey wrote in message <389A54CD.415CF4B5_at_ecologic.net>...
> >I think the point about an OID not being an explicit attribute is one way
 to
> >approach the question. But in object-relational databases OIDs are
> explicit.
>
> Are they explicit _attributes_ that work just the way other attributes work,
> with all the same syntax and capabilities? I'm not familiar with any actual
> ORDBMSs so I honestly don't know the answer to this question. Date seems to
> say that OIDs are "special", different in some way, and that's why he
> objects to them.
>

ORDBMS treat OIDs as unique, immutable identifiers of an object. What Date most has a problem with, I believe, is that OIDs can be used as pointers, i.e., hidden, hardwired joins. He regards this as repeating the CODASYL mistake.

>
> >Furthermore, autonumbering (as in MS Access, Informix, Oracle, etc) is not
 the
> >same thing as an id that has semantics, e.g., a social security number,
 order
> >id, customer number, etc.
>
> I'm not sure I understand the distinction. What semantics does my SS# have?
> As far as I know it's just a unique id that the government uses to keep
> track of my identity. I don't know what algorithm was used to generate it,
> but I'd assume it's something resembling an autonumbering technique. What's
> the difference between a unique id generated by autonumbering, and a unique
> id like SS#?
>

I am sensitive to your question here. And I admit that I do not have an answer with which I am satisfied. This is why I posted my remarks as a question. I even went so far as to note that a person's name is essentially equivalent to an autonumber. But where names and SS# seem to differ from the autonumber concept is that these are external and independent of the database system. Attributes such a product number, customer id, etc., if these are directly linked to the database implementation, begin to blur the distinction between logical and physical models. The observer and the observed, if you will, become linked. The danger, I suppose, is that this link could become inextricable. Why should the business process and rules depend in an essential way on the tools used to run the business? Again, I'm posing the question theoretically not pragmatically. (As an aside, my attitude is that one should feel free to take short cuts so long as one understands what one is doing -- the risks and potential ramifications. Hence my concern here with the theory).

>
> >Very often the ID is hidden from view, as in the
> >question that initiated this thread. An autonumber in this case is not an
> >attribute of the data but a bookkeeping deviceof the database system very
 much
> >akin, I think, to an OID.
>
> I guess the question here is, whose view is it hidden from? Anybody
> manipulating the data at the SQL level certainly needs to "see" the id
> attribute, whether they're writing an application or just typing in queries
> interactively. What about users who enter and browse data using forms:
> should they see the id attribute or not? Well, I think they should be able
> to see it if they want to, but that's probably a question best left to the
> UI designers. The most important thing is that the _programmers_ be able to
> use the id just like every other attribute.
>

Yes, this is the key question. If the autonumber ids are hidden, they function very much like OIDs, aside from enabling hidden, hardwired joins, and can lead to inconsistent views of data -- the very same paradoxes one can get from duplicate rows. It all depends on the UI designers, as noted. Data integrity, however, should probably not be left to UI designers, especially if they do not fully understand the data model.

>
> >Furthermore this way of using ids can circumvent
> >referential integrity constraints. If one includes the abstract ID as part
 of
> >the primary key and keeps the entire key explicit, I understand that this
> >conforms with relational theory and referential integrity can be enforced..
 But
> >autonumbering is often not used this way -- and that I think is trouble.
>
> Agreed. The id absolutely needs to be declared as a key. Usually it should
> be the primary key, I would think (though relational theory doesn't have
> much to say on the issue of _primary_ keys per se). The post that
> originated this thread did assume that the id is the primary key, so in this
> case I don't think the trouble you're concerned about will occur.

Of course I agree here. In saying this I'm hoping to make clear that my struggle is more theoretical and not with the responses to the original posting. Thanks! Received on Sat Feb 05 2000 - 00:00:00 CET

Original text of this message