some ideas about db rheory
Date: Tue, 7 Jul 2009 11:50:59 -0700 (PDT)
Recently I found this article on the internet http://www.cs.mu.oz.au/~rui/publication/vldb08_TransactionTimeIndexing.pdf and have decided to write a reply on the following Reinier’s message from a union is always a join! :
On Apr 1, 11:31 pm, rp..._at_pcwin518.campus.tue.nl (rpost) wrote:
The above paper uses the terms “Immortal Database” and “record
version”, and in my opinion, this solution is technical rather than
theoretical. Audit files and “logical delete” also saved data and
they are in my opinion also technical solutions.
I am afraid you do not differentiate between a technical solution and
> If I understand it correctly, you propose to annotate all facts
> (tuples) in the database with metadata about their insertion and
> retraction (when, by whom, possibly more). Every INSERT, UPDATE
> and DELETE becomes an INSERT. The database records not only the
> present state of affairs, but also all past states, and more
> (e.g. for every state change, who made it). Essentially you
> add version control to the database. System exist that do this.
The above paper uses the terms “Immortal Database” and “record version”, and in my opinion, this solution is technical rather than theoretical. Audit files and “logical delete” also saved data and they are in my opinion also technical solutions. I am afraid you do not differentiate between a technical solution anda data model.
What I have done is a data model, in which time is not in a “record
version”, but rather in the concept. In contrast to the existing
theory, time in my data model is not an attribute of the concept, but
the knowledge about an attribute. Also, time in my model is not part
of the entity (this is a bad idea in the existing and current theory),
but is located in the concept of state (the entity’s state). In my
data model a new state is always initiated by an event in the real
world. This can happen, for instance, when John deposits $1000.00 into
his bank account. The real world event could also be when this deposit
is recorded in the database. In real world event could be when the
deposit is transferred from one database to another. Briefly, in my
model one pair of StartDate – EndDate can exist, or two pairs can
exist or three pairs can exist… See Example 9 at www.dbdesign11.com ,
version from August 21, 2008. Note that in Example 9 this is done on
the attribute level, that is, on the level of binary concepts (binary
Note that in 2005, I posted some of these ideas on my website (see www.dbdesign11.com). More than 20 years ago I worked in a company where all the data from every project were saved, that is, only the operation “add new data” existed, while delete data and update data did not.
Above mentioned technical solutions can have troubles in real world situations. For instance, if John deposited $1000.00 into his account on March 31, 2009, but this date is entered into the database 3 months later, on July 31, 2009, who will pay the possible late fees? Not to mention complex real world situations.
In my model, I get all the states of an entity immediately, with the help of the identifier of that entity – I don’t need trees, indexes or anything similar.
So I would say there is fundamental difference between my solution and version control.
> Why do you describe this as a new data model? It seems more palatable
> and more useful to describe it as a particular way of using relational
It appears you do not understand the nature of relational (and conceptual) model. In conceptual model functional dependencies do not exist – at least not in my model. They have been replaced by the option “Intrinsic Properties”. “Intrinsic properties” is more general option. For example, in the relational model in the design phase you often construct a wrong relation and fix it with the help of normal forms. With “Intrinsic Properties” I get the right entity right away (See intrinsic properties in Section 2). This is one of the reasons why I begin with a conceptual model, but it isn’t the only reason.
>You might add descriptions of transformations of database
> schemas, instances and queries that convert an arbitrary relational
> database into this form, and of the reverse transformations that,
> given a database in this form, convert it to one that only ever holds
> the current state of affairs (or transformations that take a read-only
> snapshot of any particular state, like in version control systems).
> Such transformations can help demonstrate that your database
> operates as designed.
> (I also notice that all of your tuples have an object id,
> but that topic has been trampled to death in this group.)
It seems to me that you have not understood the nature of my
identifier of a state. In contrast to the Object ID, the identifier is
always real and can always be determined in the real world. With the
help of the identifier of state it can be determined who entered any
piece of data, even if that person tried to break the system by
purposely entering false data. I am not sure that you understood this
idea. It deals with the following: in order to determine how every
object in a small world is constructed, you have to first know who (or
which procedure) constructed it. There are some advantages using the
identifiers, for example these identifiers can be successfully used to
classify database theory:
1) DB theory for simple databases, which is about entities with only one state .Here the identifier of an entity is semantically equal to the identifier of a state. See section 1 in my data model. I have strong impression that existing DB theory mostly belongs to this case. 2) general DB theory – where DB can maintain entities with many states. Here these identifiers are not same.
Vladimir Odrljin Received on Tue Jul 07 2009 - 20:50:59 CEST