Re: some ideas about db theory

From: none <rp_at_raampje.>
Date: 20 Jul 2009 21:09:48 GMT
Message-ID: <4a64dd1c$0$27267$703f8584_at_news.kpn.nl>


vldm10 wrote:

>Recently I found this article on the internet
>http://www.cs.mu.oz.au/~rui/publication/vldb08_TransactionTimeIndexing.pdf
>and have decided to write a reply on the following Reinier’s message
>from a union is always a join! :

Thank you ...

>On Apr 1, 11:31 pm, rp..._at_pcwin518.campus.tue.nl (rpost) wrote:
>> If I understand it correctly, you propose to annotate all facts
>> (tuples) in the database with metadata about their insertion and
>> retraction (when, by whom, possibly more). Every INSERT, UPDATE
>> and DELETE becomes an INSERT. The database records not only the
>> present state of affairs, but also all past states, and more
>> (e.g. for every state change, who made it). Essentially you
>> add version control to the database. Systems exist that do this.
>
>The above paper uses the terms “Immortal Database” and “record
>version”, and in my opinion, this solution is technical rather than
>theoretical. Audit files and “logical delete” also saved data and
>they are in my opinion also technical solutions.
>I am afraid you do not differentiate between a technical solution and
>a data model.

I am afraid I do not see the difference. I wrote that reply to ask you what the difference is. First, as far as I can see, your model can be implemented as a particular way of using relational databases. Second, it seems to me that your use of identifiers can be eliminated systematically and *losslessly*, i.e. without diminishing expressive power. I'm curious whether this is the case. These are mathematical propositions that either hold or do not hold. Simply *declaring* your model to be 'theoretical' rather than 'technical' neither confirms nor refutes it.
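To make the first proposition concrete, here is a minimal sketch of the "every INSERT, UPDATE and DELETE becomes an INSERT" idea as a plain relational table, written in Python with the standard sqlite3 module. The table person_fact, its columns, the helper names and the sample data are all assumptions made for illustration; none of them come from vldm10's model or from the cited paper.

```python
import sqlite3

# Hypothetical append-only table: rows are only ever inserted, never changed.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person_fact (
        name       TEXT NOT NULL,
        city       TEXT NOT NULL,
        op         TEXT NOT NULL CHECK (op IN ('assert', 'retract')),
        entered_by TEXT NOT NULL,
        entered_at INTEGER NOT NULL  -- logical timestamp (assumed)
    )
""")

def record(name, city, op, who, when):
    """Every logical change is stored as an INSERT."""
    conn.execute(
        "INSERT INTO person_fact VALUES (?, ?, ?, ?, ?)",
        (name, city, op, who, when),
    )

record("Ann", "Delft", "assert", "alice", 1)   # logical INSERT
record("Ann", "Delft", "retract", "bob", 2)    # logical UPDATE: retract ...
record("Ann", "Breda", "assert", "bob", 2)     # ... plus assert
record("Ann", "Breda", "retract", "carol", 3)  # logical DELETE

def state_at(when):
    """Facts asserted at or before `when` and not retracted by then."""
    return conn.execute("""
        SELECT a.name, a.city FROM person_fact a
        WHERE a.op = 'assert' AND a.entered_at <= ?
          AND NOT EXISTS (
              SELECT 1 FROM person_fact r
              WHERE r.op = 'retract'
                AND r.name = a.name AND r.city = a.city
                AND r.entered_at >= a.entered_at
                AND r.entered_at <= ?)
        ORDER BY a.name, a.city
    """, (when, when)).fetchall()
```

With this data, state_at(1) yields Ann in Delft, state_at(2) yields Ann in Breda, and state_at(3) is empty, while the entered_by column still records who made each change. Nothing here goes beyond ordinary relational usage, which is the point of the question.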

>What I have done is a data model, in which time is not in a “record
>version”, but rather in the concept. In contrast to the existing
>theory, time in my data model is not an attribute of the concept, but
>the knowledge about an attribute. Also, time in my model is not part
>of the entity (this is a bad idea in the existing and current theory),
>but is located in the concept of state (the entity’s state).

You represent these 'meta-data' in a separate table and then link them to the 'concept' using an identifier attribute.

But you don't demonstrate that this representation is necessary, i.e. that it allows things to be expressed that representing the meta-data as extra attributes in the 'concept' table doesn't allow.
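The question can be phrased as a round trip between the two representations. The sketch below is Python, with relations modelled as sets of tuples; the relation names, the attribute layout and the sample rows are assumptions for illustration only. It joins a 'concept' relation with a separate meta-data relation on a state identifier, projects the identifier away, and then reintroduces fresh identifiers. It illustrates the claim for one example and proves nothing in general.

```python
# Representation A (assumed layout): meta-data in a separate relation,
# linked to the concept by a state identifier.
concept_a = {
    (1, "Ann", "Delft"),   # (state_id, name, city)
    (2, "Ann", "Breda"),
}
meta_a = {
    (1, "alice", 10),      # (state_id, entered_by, entered_at)
    (2, "bob", 20),
}

def to_wide(concept, meta):
    """Natural join on state_id, then project the identifier away."""
    return {
        (name, city, who, when)
        for (sid, name, city) in concept
        for (mid, who, when) in meta
        if sid == mid
    }

def to_split(wide):
    """Reintroduce surrogate identifiers by numbering rows in sorted order."""
    concept, meta = set(), set()
    for sid, (name, city, who, when) in enumerate(sorted(wide), start=1):
        concept.add((sid, name, city))
        meta.add((sid, who, when))
    return concept, meta

# Representation B: meta-data as extra attributes of the concept itself.
wide = to_wide(concept_a, meta_a)

# Splitting again and re-joining gives back the same wide relation.
concept_b, meta_b = to_split(wide)
assert to_wide(concept_b, meta_b) == wide
```

The regenerated identifiers differ from the original ones, but only up to renaming. The round trip would lose information only if two states agreed on every attribute and every piece of meta-data; whether the model needs to distinguish such states is exactly what would have to be demonstrated.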

[..]

>> Why do you describe this as a new data model? It seems more palatable
>> and more useful to describe it as a particular way of using relational
>> databases.
>
>It appears you do not understand the nature of relational (and
>conceptual) model. In conceptual model functional dependencies do not
>exist – at least not in my model. They have been replaced by the
>option “Intrinsic Properties”. “Intrinsic properties” is more general

How is it more general?

>option. For example, in the relational model in the design phase you
>often construct a wrong relation and fix it with the help of normal
>forms. With “Intrinsic Properties” I get the right entity right away

It seems to me that this is exactly what Brian and much of the OO literature are saying: the objects are 'out there', so we aren't going to make mistakes in modeling them. If this is true, there has got to be a way to support it *mathematically*. Give an example of a typical modeling error committed when using standard relational techniques, show how your approach avoids it, and (to justify calling your approach a 'model') show that this approach is not just a particular way of using relational modeling.

[...]

>> (I also notice that all of your tuples have an object id,
>> but that topic has been trampled to death in this group.)
>
>It seems to me that you have not understood the nature of my
>identifier of a state. In contrast to the Object ID, the identifier is
>always real and can always be determined in the real world.

I don't understand. Suppose your database has an entity Person. I am a person. What is my identifier?

Note that I do recognize your (and Brian's) desire to systematize the time-varying aspect of object identity. It can indeed be important to express the identity of objects across time (or rather, across changes). But that doesn't mean you need to introduce identifiers into the model.

>With the
>help of the identifier of state it can be determined who entered any
>piece of data, even if that person tried to break the system by
>purposely entering false data.

I don't know if you read it - Brian didn't - but this is why I wrote the story about the veterinarian and the goldfish. Its moral is that if you have no way to track identity across changes in real life, adding it as a modeling feature (either with explicit identities or by distinguishing between updates and deletes+inserts, as Brian proposes) isn't going to help a bit.

>I am not sure that you understood this
>idea. It deals with the following: in order to determine how every
>object in a small world is constructed, you have to first know who (or
>which procedure) constructed it.

This is possible in OO models, in which you control the actual objects: they have constructors that create the objects themselves, not merely statements about properties of objects. In (typical) relational databases you do not control the objects; you merely register statements about them. I also explained this to Brian, but he chose to ignore it.
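A small Python sketch of that contrast, with a hypothetical Fish class and made-up data: in the OO case the program creates the objects, so it knows who constructed each one and can tell two equal-looking objects apart; in the relational case it only holds statements, and two identical statements are one statement.

```python
class Fish:
    """OO view: the constructor creates the object, so its origin is known."""

    def __init__(self, colour, created_by):
        self.colour = colour
        self.created_by = created_by  # recorded at construction time

# Two objects with identical properties are still two distinct objects.
a = Fish("orange", "vet")
b = Fish("orange", "vet")
distinct_objects = a is not b

# Relational view: a relation is a set of statements about things that
# exist outside the database. Asserting the same statement twice adds nothing.
statements = set()
statements.add(("fish", "orange"))
statements.add(("fish", "orange"))
one_statement = len(statements) == 1
```

Both flags come out True. The identity of a and b is supplied by the runtime that created them; nothing comparable is available for things the database merely describes.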

>There are some advantages using the
>identifiers, for example these identifiers can be successfully used to
>classify database theory:
>1) DB theory for simple databases, which is about entities with only
>one state. Here the identifier of an entity is semantically equal to
>the identifier of a state. See section 1 in my data model. I have
>strong impression that existing DB theory mostly belongs to this case.

Well, you're not alone in the belief that the relational model requires an extension to properly deal with recording 'temporal' data; last time I checked, Chris Date was with you on this.

>2) general DB theory – where DB can maintain entities with many
>states. Here these identifiers are not same.

Demonstrate how identifiers are actually used, and how this helps.

-- 
Reinier
Received on Mon Jul 20 2009 - 23:09:48 CEST
