The Fact of relational algebra (was Re: Clean Object Class Design -- What is it?)

From: Bob Badour <bbadour_at_golden.net>
Date: 8 Sep 2001 12:38:53 -0700
Message-ID: <cd3b3cf.0109081138.731c2192_at_posting.google.com>


"Adrian Veith" <adrian_at_veith-system.de> wrote in message news:<9na529$6u5$1_at_wrath.news.nacamar.de>...
> "Bob Badour" <bbadour_at_golden.net> schrieb im Newsbeitrag
> news:v2Vk7.768$LE3.134318184_at_radon.golden.net...
> >
> > Please show me a complex system so complex that it has an integrity
> > constraint one cannot declare with a well-formed-formula.
>
> There is an easy and complex answer to this:
>
> easy answer:
>
> If relational algebra is that powerful, that you can declare any possible
> integrity constraint, why is it not possible to have relational integrity in
> an OODB ?

I have previously stated several times that relational dbmses are object oriented dbmses. How do you arrive at the conclusion I think it impossible?

> solution a: relational algebra can describe anything - than it is possible
> to have relational integrity in an OODB

Since relational dbmses are object oriented dbmses, this is trivially true.

> solution b: relational algebra cannot describe the relations in an OODB -
> than we need a more powerful concept.

When one mistakenly equates an object collection with a relation, one loses the ability to describe relations. Relations can describe anything one can describe in a non-relational dbms; except that relations do a much better job.

> complex answer:
>
> I believe, that relational algebra works also in a proper defined OODB.

In that case, you merely agree with me that relational dbmses are object oriented dbmses.

> If we look a typical object hierarchy:
>
> class media
> title: string
> end
>
> class cd < media
> number_of_songs: integer
> end
>
> class book < media
> number_of_pages: integer
> end
>
> one possibility to implement this hierarchy in an RDMS is to flatten out the
> hierarchy (Not very effective, but possible)

The "possibility" you describe is nothing more than a very poor design, and it makes the fundamental mistake of equating object classes with relations. An object class describes an encapsulated data type, while a relation describes an unencapsulated set. Your example is nothing more than a straw man.

Further, no competent relational database designer would use the logical design you propose where a media relation has the union of all media subtype attributes. I suggest you get ahold of a copy of Fabian Pascal's latest book _Practical Issues in Database Management: A Reference for the Thinking Practitioner_ (Addison Wesley, 2000). Fabian wrote an entire chapter to properly explain the subject.

Your example further assumes NULL, which is controversial at best. NULL violates the information rule.

> What does this example show:

It shows only that you have a very confused understanding of relations. At a very fundamental level, your example confuses an unencapsulated set with an encapsulated data type.

> 1. You can transform any object hierarchy to a flat structure and some
> integrity constraints

While one can easily and simply represent a multidimensional construct like a relation in a two-dimensional "table", a relation is anything but flat. In fact, a single tuple contains an entire set of arbitrarily complex object values or variables.

> 2. You can translate a collection of objects into a table with 1:n relation

You can translate a collection of objects into a single table of degree 1. A table, which is a relation, cannot have a 1:n cardinality. Two tables associated by a foreign key can have 1:n relative cardinality.

> 3. You can translate a reference into a 1:1 relation

I am not entirely sure what you mean here. The statement above is nonsensical. A reference is a pointer. A relaion is what you call a table. A table or relation cannot have 1:1 relative cardinality, which requires an association between two relations.

> 4. The design with an OODB is much easier and more logical

The illogic of your straw man only demonstrates the weakness of your understanding of relations and the weakness of your own design skills.

> 5. In an OODB you need far less constraints, because they will be fulfilled
> automatically .

Statement #5 is simply untrue.

> 6. Not shown but obvious: Queries are much easier to formulate in an OODB,
> than in a RDBMS

Not shown nor obvious nor true.

> 7....100: OODB rules! :-)

It doesn't suprize me that your position arizes from fanatacism rather than reason.

> and finally. Of course is it possible to define relational constraints in an
> OODB.
Provided the object oriented dbms is a relational dbms. All truly relational dbmses are object oriented, while only some object oriented dbmses are relational.

> But most of the constraints are implemented automatically and you
> don't have to formulate all these things over and over again.

Given the weakness of your straw man, it does not surprize me that you might actually believe such absurdity.

> What about multiple inheritance ?
>
> I prefer to define multiple inheritance with the help of interfaces and
> interface delegation.

Since all relational domains are abstract data types physically independent from their physical representation, in a relational dbms, all inheritance is "interface inheritance".

> In this way, multiple inheritance does not change the
> structure of the object hierarchy.

Given the requirement for physical independence, the above statement has no real meaning unless one makes the mistake of equating the logical data model with the physical storage as non-relational odbmses do.

> Another myth states, that RDBMS are much more effective and quicker than an
> OODB.
Here you have combined two issues -- one of which is true and one of which is false. It is true that relational dbmses are much more effective than dbmses based on any other known logical data model. You apparently misunderstand the implications of physical independence. The actual performance is determined by the physical structures that the dbms supports and not by the logical data model. A relational dbms is capable of exactly the same execution performance characteristics of dbmses based on any other logical data model in any given context.

It is true that relational dbmses will perform better overall than non-relational dbmses, but that performance benefit includes application development time, facility of schema evolution etc.

> Often this myth is explained with relational algebra and the
> possibility to optimize queries in an RDBMS.

Your so-called myth is just another straw man. The relational algebra, however, does facilitate optimization.

> As with all myths, there is true part:
> Relational algebra gives you the possibility to transform a query in order
> to optimize it.

And a whole lot more...

> Wrong is:
> 1. That all optimizations of a query are done with relational algebra. The
> most common optimization is to use an index.

Since physical storage determines execution speed, all optimizations involve physical storage structures. Many different types of indexes exist to change performance characteristics. However, indexes only scratch the surface of performance altering physical structures. You ignore clustering, physical pointers, pointer pools, distribution, parallelization and everything else the human mind can imagine.

> Relational algebra is used to
> transform the query to use the index. But the index in itself is not defined
> by relational algebra.

You apparently misunderstand Dr. Codd's most fundamental goals when he proposed the relational data model. The relational model explicitly avoids any definition of physical structures for the very purpose of allowing physical independence. It permits any physical structure.

By defining physical structures in the logical data model, non-relational odbmses cripple themselves in comparison.

> (Since an index is normally a tree-structure, it has
> more in common with an OODB if you want)

While the most common indexes for databases are variants of the B+Tree and ISAM, many other indexing structures exist.

> But:
> 2. You can use these techniques with an OODB as well.

A non-relational oodb equates its logical data model with physical structures, which is the antithesis of physical independence. You cannot "use" relational algebra, physical independence, optimization etc. by directly tying logical interfaces to physical structures.

> 3. An OODB has build in optimizations like references (a controlled shortcut
> for a 1:1 relation).

A relational dbms has built-in optimizations too -- it just cannot expose them in the logical interface. The non-relational dbms harms its users by forcing them to navigate pointers.

Apparently, you do not even know what a relation is: "A set of named, typed attributes along with a set of tuples each having a set of corresponding attribute values."

You have used relation as an apparent synonym for foreign key reference. Redefining the term is a straw man at best and demonstrates very poor comprehension at worst.

Unfortunately, vendors actively promote this misconception. Microsoft, for instance, once published a "dictionary" that incorrectly equated "relation" with "foreign key reference", and I see they have recently done the same in their ADO.Net interface.

> Or collections as controlled shortcuts for joins

A "collection" is little more than a single-column relation (or table) with an arbitrary interface and extremely limited utility.

> 1:n relations.

Again, you confuse the term "relation" with the term "foreign key reference".

> 4. An OODB can outperform an RDMS.

A sadly common but untrue misconception. Received on Sat Sep 08 2001 - 21:38:53 CEST

Original text of this message