Re: The (incomplete) Mathematical Foundation of OODBs (XDb)

From: Bob Badour <bbadour_at_golden.net>
Date: Sun, 9 Jun 2002 20:25:57 -0400
Message-ID: <vnSM8.183$qG1.10676338_at_radon.golden.net>


Well, your definition is close, but slightly misstated:

"James" <jraustin1_at_hotmail.com> wrote in message news:a6e74506.0206091520.53a1e416_at_posting.google.com...
> If a RDB can be defined as follows:
> ------------------------------------
> A RDB represents data as values in relations. A relation is
> approximately equivalent to the mathematical concept of relation. A
> relation has a header consisting of a set of N named, typed attributes
> and a body consisting of a set of N-dimensional tuples where each
> tuple dimension value corresponds to the appropriate named, typed
> attribute.
>
> One can represent a relation as a table of values where each row
> corresponds to a tuple, each column corresponds to a named attribute
> and each cell contains some value stored in the relation.
>
> Then an OODB can be "ROUGHLY" defined as follows:
> ---------------------------------------------
> An OODB represents data as values in single-column relations.

Directly transliterated that should read:

"An OODB represents data as values in one-dimensional relations."

You will recall that a single dimension is linear and is often called a domain or range. One should properly adjust the above to:

An OODB represents data as values of domains.

> A
> single-column relation is approximately equivalent to the mathematical
> concept of relation.

Should read: "A domain is approximately equivalent to the mathematical concept of domain."

> A single-column relation has a header consisting
> of ONE named, VARIANT type attribute and a body consisting of a set of
> tuples where each tuple's value corresponds to the named, variant type
> attribute.

There are no columns in relations -- only attributes. One can represent an attribute on a two-dimensional medium as a column. It's an important distinction, which one should strive to keep clear.

"A one-dimensional relation would have a header consisting of a single named, typed attribute and a body consisting of a set of singles where the value in each single corresponds to the named, typed attribute."

One would ordinarily call such a linear set an "extent".

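It's important to note that the relational operators are not closed over one-dimensional relations, and this has dire implications. For example, the product of two one-dimensional relations is a two-dimensional relation, which lies outside the set of one-dimensional relations.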
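To make the closure point concrete, here is a minimal Python sketch. The `Relation` class, the `product` function, and the sample data are hypothetical names invented for illustration, not part of any DBMS. It only shows that combining two relations of degree one can yield a relation of degree two.

```python
from dataclasses import dataclass
from itertools import product as cartesian


@dataclass(frozen=True)
class Relation:
    """Hypothetical toy relation: attribute names plus a set of tuples."""
    header: tuple      # attribute names, e.g. ("name",)
    body: frozenset    # set of tuples, one value per attribute

    @property
    def degree(self) -> int:
        # The degree is the number of attributes in the header.
        return len(self.header)


def product(r: Relation, s: Relation) -> Relation:
    """Cartesian product of two relations with disjoint attribute names."""
    if set(r.header) & set(s.header):
        raise ValueError("attribute names must be disjoint")
    body = frozenset(a + b for a, b in cartesian(r.body, s.body))
    return Relation(r.header + s.header, body)


# Two one-dimensional relations (assumed sample data).
names = Relation(("name",), frozenset({("Ann",), ("Bob",)}))
cities = Relation(("city",), frozenset({("Oslo",), ("Rome",)}))

# Their product has two attributes, so it is no longer one-dimensional.
pairs = product(names, cities)
print(names.degree, cities.degree, pairs.degree)
```

A model restricted to one-dimensional relations has nowhere to put `pairs`, which is the sense in which the operators are not closed.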

> One can represent a single-column relation as a single column of
> values where each row corresponds to a tuple, the column corresponds
> to the named attribute
> and each cell contains some value stored in the relation.

One can represent a domain or an extent as a single column of values where each row corresponds to a single value of the domain.

> For example, data viewed in XDb's Class/Instance Hierarchy

Let's not be bothered with such time-consuming nonsense.

> An OODB can be defined more precisely as follows:
> ------------------------------------------------
> OODB represents data as instances of classes.

Your terms are not very well defined.

What exactly is an instance? Instantiation is a well-known concept in logic, but it's not at all clear that you mean the same thing.

What exactly is a class? The literature has some controversy over whether class corresponds to type.

For instance, see Chris Date's recent article at http://www.pgro.uk7.net/c_substit.htm (One can navigate to this as an Article under the title "What Does Substitutability Really Mean? Part 1" in the Content section of http://www.dbdebunk.com/ -- in case the site moves to a new provider.)

> Each instance can store a value of variant type.

Isn't this the same as a weakly typed variable?

> Class and instances are approximately equivalent to the mathematical
> concept of single-column relation.

I disagree. I suspect your usage of class is approximately equivalent to type, and I suspect your usage of instance is approximately equivalent to variable.

Of course, the literature is sufficiently confused over these terms that nothing is really clear.

> But unlike RDB where each tuple in the relation is typically a simple
> value,

I think you need to reconsider the above statement. A tuple is a set of values and is not a simple value at all.

> in OODB each instance is actually an object which allows it to have
> its own instances.

You just lost data independence by exposing physical structure in the logical interface. That might work well in your application programming environment, but it sucks for data management.

Contrast this with Date's and Darwen's proposals in _The Third Manifesto_, where the logical interface exposes only operators and no physical representation. This, of course, allows for any possible physical representation, gaining us the physical independence we so desperately need for effective data management.
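A rough Python sketch of what "operators only" buys: two different physical representations answer to the same operators, and code written against those operators cannot tell them apart. The class and function names here are hypothetical, chosen for illustration; this is not code from _The Third Manifesto_.

```python
import math


class CartesianPoint:
    """Stores x and y directly (one possible physical representation)."""

    def __init__(self, x: float, y: float):
        self._x, self._y = x, y

    def x(self) -> float:
        return self._x

    def y(self) -> float:
        return self._y


class PolarPoint:
    """Stores radius and angle instead; the operators stay the same."""

    def __init__(self, r: float, theta: float):
        self._r, self._theta = r, theta

    def x(self) -> float:
        return self._r * math.cos(self._theta)

    def y(self) -> float:
        return self._r * math.sin(self._theta)


def distance_from_origin(p) -> float:
    """Uses only the x() and y() operators, never the stored fields."""
    return math.hypot(p.x(), p.y())


a = CartesianPoint(3.0, 4.0)
b = PolarPoint(5.0, math.atan2(4.0, 3.0))
print(distance_from_origin(a), distance_from_origin(b))
```

Swapping one representation for the other changes nothing for the caller, which is the independence at stake; exposing the stored fields in the logical interface would give that up.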

> ********************************************************************
> Summary:
> Pure OODB and RDB are fundamentally based
> on the same mathematical concept of relations.

OODB and RDB are fundamentally based on the same mathematical concept of domains; however, OODB lacks the very important concept of relation. The OODB model is the equivalent of nouns without sentences.

> In RDB, it is relation/values.

In an RDBMS, all data are logically accessed as strongly typed values in relations as described above. This empowers the DBMS with the might of modern mathematics -- set algebra or, equivalently, first-order predicate logic, to be exact.
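As a small illustration of that equivalence, the following Python sketch computes the same result twice: once by intersecting two sets, and once by restricting on the conjunction of two predicates. The `employees` data and attribute layout are assumptions made up for the example.

```python
# Hypothetical relation body: tuples of (name, dept, salary).
employees = {
    ("Ann", "Sales", 50),
    ("Bob", "Sales", 40),
    ("Cid", "Ops", 60),
}

# Two restrictions, each defined by a predicate over a tuple.
in_sales = {t for t in employees if t[1] == "Sales"}
well_paid = {t for t in employees if t[2] >= 50}

# Set algebra: intersect the two restrictions.
by_algebra = in_sales & well_paid

# Predicate logic: restrict by the conjunction of the two predicates.
by_logic = {t for t in employees if t[1] == "Sales" and t[2] >= 50}

print(by_algebra == by_logic)
```

Intersection on the algebra side corresponds to AND on the logic side; union and difference line up with OR and AND NOT in the same way.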

> In OODB, the equivalent is class/instances.

Your particular brand of OODB exposes all data as weakly typed, named variables as described above. It has no equivalent of either set algebra or predicate logic, and it exposes physical structure in the logical interface.

Any other brand of OODB will do things differently.

> In OODBs, the class (obj) and its instances (obj) are of the same
> type.

Here is a perfect example of the confused thinking in the OODB world. It's not clear at all what a class really is. Is it a type? Is it an object? An object was defined as an instance of a class above, but now instances are the same as classes. An object is an instance of an instance? A class is an object? Doesn't that make an object an instance of an instance of an instance? And so on ad infinitum?

All this and no generic operators too!



To finally restate things:

An RDBMS equates type with domain. Each scalar value in the RDBMS is a value of some domain. Appropriately enough, a domain is approximately equivalent to the mathematical concept of domain as pertains to relations.

I suggest we drop terms like "class" and "domain" to focus on "type".

Received on Mon Jun 10 2002 - 02:25:57 CEST
