Re: database design method

From: Jan Hidders <hidders_at_REMOVE.THIS.uia.ua.ac.be>
Date: 16 Nov 2002 11:12:02 +0100
Message-ID: <3dd619f2$1_at_news.uia.ac.be>


Lauri Pietarinen wrote:
>Jan Hidders wrote:

>> What I meant could also be called "abstract value" and is in some sense
>> the theoretical clean version of surrogate identifiers. The big
>> difference with surrogate identifiers is that the user can never get to
>> see their value and the only operation that is defined for them is
>> equality. Such a restricted type may seem a bit strange and not very
>> useful at first sight but it makes the life of the DBMS easier (for
>> example the DBMS can reuse identifiers that are no longer used) and
>> allows the user to define views and queries in which new identifiers are
>> generated without using some clever arithmetic to generate new surrogate
>> identifiers. In fact, if such object identifiers are possible (including
>> special operators to generate them) IMO the good old flat relational
>> model would be all we need.
>
>So is the function of these identifiers to act as an "internal glue"?

In a sense, yes. Suppose you want to represent a nested relation R(a,B) (B is the nested column) with the flat relations R1(a,c) and R2(c,b) where in R1.c you use an identifier to replace the set in R.B, and in R2 you record what the elements of the sets. You could indeed say that the identifiers in colomns c is what allows us to glue the two relations together and reconstruct the nested relation R.

>What is the problem with generating such values? If they are just integer
>values (say) and there is a process inside the engine that gives the system
>the next integer? Or maybe generates a random value within a large space
>and just quickly checks if it is reserved?

For surrogate identifers the DBMS must somehow remember all identifiers that have ever been used, also those that are no longer used in any relation. It could do this by either storing the set of all identifiers that were ever used (could be big) or remembering the highest integer that it has every generated, but then it cannot reuse identifiers that are no longer used.

Another problem is if you define views with new identifiers in them. Should these be regenerated every time the view is accessed? What would these identifiers then mean?

>And one more question: what would be a practical query in which
>these abstract identifiers would be needed.

Suppose I have a relation

  Bought(customer, product)

and I want as the result a relation

  Cust_class(class_id, customer)

that groups the customers into classes that have bought the same set of prodcucts and gives them a new identifier.

  • Jan Hidders
Received on Sat Nov 16 2002 - 11:12:02 CET

Original text of this message