Re: The Fact of relational algebra (was Re: Clean Object Class Design -- What is it?)

From: Bob Badour <bbadour_at_golden.net>
Date: Sun, 30 Sep 2001 00:20:48 -0400
Message-ID: <A4xt7.1500$yU1.104969349_at_radon.golden.net>


"Adrian Veith" <adrian_at_veith-system.de> wrote in message news:9nl4hf$ndq$1_at_wrath.news.nacamar.de...
> From: "Bob Badour" <bbadour_at_golden.net>
> Newsgroups: comp.databases,comp.databases.object,comp.databases.theory
> Sent: Monday, September 10, 2001 3:22 AM
> Subject: Re: The Fact of relational algebra (was Re: Clean Object Class
> Design -- What is it?)
> > "Adrian Veith" <adrian_at_veith-system.de> wrote in message
> news:<9ngc8f$de9$05$1_at_news.t-online.com>...
> > > "Bob Badour" <bbadour_at_golden.net> schrieb im Newsbeitrag
> > > news:cd3b3cf.0109081138.731c2192_at_posting.google.com...
> > > > "Adrian Veith" <adrian_at_veith-system.de> wrote in message
> > > news:<9na529$6u5$1_at_wrath.news.nacamar.de>...
> > > > > "Bob Badour" <bbadour_at_golden.net> schrieb im Newsbeitrag
> > > > > news:v2Vk7.768$LE3.134318184_at_radon.golden.net...
> > > > > >
> > > > > If relational algebra is that powerful, that you can declare any
> > > possible
> > > > > integrity constraint, why is it not possible to have relational
> > > integrity in
> > > > > an OODB ?
> > > >
> > > > I have previously stated several times that relational dbmses are
> > > > object oriented dbmses.
> > >
> > > My point is, that an OODB is a superset (should be, because some are
> not) of
> > > a RDBMS.
> >
> > Actually, you have that backwards. Strictly speaking, a non-relational
> > OODB is neither a subset nor a superset of a relational dbms. However,
> > if we remove pointers, a non-relational OODB is very much a proper
> > subset of a relational dbms.

>

> I have never talked of a non relational OODB. My opinion is, that a OODB
> should be relational, but with enhancements to the functionalty that
todays
> RDBMS have.

The "enhancements" you desire render the dbms non-relational doing tremendous harm in the process.

> > For instance, a single column relation has all of the functionality of
> > a collection, while a collection lacks much of the functionality of a
> > single column relation.
>
> I said so too: a collection is a subset of a (single column) relation.

A collection is not a subset of a (single column) relation. A collection is different from a relation; although, it performs a tiny subset of the functionality of a single column relation.

> > The non-relational oodb lacks the equivalent of a "zero degree"
> > relation, and the non-relational oodb lacks the equivalent of any
> > relation with degree greater than 1.
> >
> > The non-relational oodb attempts to overcome the lack of higher degree
> > relations by adding pointers. However, the last three decades have
> > clearly demonstrated the abject inferiority of pointers. Of course,
> > that should be clear to any informed person: A pointer provides only
> > partial functionality of a "2 degree" relation, and a pointer locks
> > the dbms into a specific physical structure.
>
> I have never talked about pointers. I talked about a reference.

A reference is a pointer.

> The
> "reference" i am talking about has the following characteristics:

>

> - it identifies the ascotiated object like a primary key, but is generated
> by the database

It points to an object variable without providing any useful information for identification.

> - but unlike the primary key, it gives you the following information:
> - what kind (class) of object it identifies.

While a primary key does not provide type information, the combination of relation name and column name do. Since primary key value, relation name and column name uniquely identify an object instance, the relational equivalent to a pointer provides as much type information.

> - when is the object (two or more different transactions, can see an
> object at a different state)

I am not sure I understand your point here, but it sounds as though you are talking about concurrency issues.

> - any field (data) of the object can be adressed via the reference

Not only do you want to discard the information rule by introducing pointers, now you want to discard physical independence by unencapsulating domains (abstract data types).

> - it *must* never change throughout the lifetime of an object.

How does it achieve this when I embed the object in my Excel spreadsheet, print out the spreadsheet and fax it to someone else?

> - it should give you an error if try to access an object, that has been
> deleted.

As would any dangling pointer, which is just one reason why pointers prove so inflexible and ineffective.

> - i don't make any assumptions how this reference is implemented.

You don't have to make assumptions how the pointer is implemented. It remains a pointer nevertheless.

> > > Therfore, anything you can do with a RDBMS, you can do with an OODB
> > > as well.
> >
> > You cannot write views with a non-relational oodb. You cannot declare
> > univerally enforced constraints of arbitrary complexity in a
> > non-relational oodb. You cannot arbitrarily alter performance
> > characteristics without rewriting applications in a non-relational
> > oodb. etc. etc. etc.

>

> I can write views as well, because the reference, that is returned by a
> query (more correct: the collection of references), can identify a real
> object, or an interface to one or more objects.

How do you project over a pointer? What results?

How exactly does this new reference type differ from the relational method of simply exposing the object directly?

> Since it the job of the reference, to retrieve the actual data of the
> object, there is no need for a specific physical storage of the data.

What benefit does this new reference type provide over the relational method of exposing objects directly?

> But i am not limited to the references. If the database should return a
> collection of a set of attributes (what you would call a relation), this
is
> also possible.

Using a relational dbms that does not expose references, I am not limited in any way.

> > > A collection is a
> > > subset (sub not super) of a relation. But in real live databases, you
> are
> > > confronted with 80% of relations, that can be described by a
collection.
> >
> > A collection cannot really describe a relation of any degree. For
> > instance, collections do not support any closed relational algebra.
> > The fundamental importance of this should not go unnoticed.
> >
> > At best, a collection provides a very limited proper subset of a
> > relation of degree 1.
>
> A database, as a result of a query, returns a collection of a set (tuple)
of
> references and/or other attributes. I wouldn't call this limited.

I would call it needlessly complex. A relational dbms, as a result of a query, returns a set of a set (tuple) of objects. What can I do with a reference that I cannot do with the object itself?

> Since a table in itself is a collection of references to the objects,
there
> is no difference in querying a table, or the result of a query.

The relational model does not limit the use of pointers at the physical storage level. Since it exposes only objects at the logical level, there is no difference in querying any relation regardless of storage or derivation.

> The join of two or more tables (collections), is a set of references to
the
> result of the product of these collections, that meet a certain criteria.

Again, I see this as needlessly complex. What do references allow me that objects do not?

> What is wrong with this definition ?

Needless complexity.

> > > Therefore does a collection saves you from the task of doing stupid
> things
> > > again and again.
> >
> > "Encourages it", I would say.
>
> Please explain, or i will stay dumb.

A collection encourages one to do the stupid things repeatedly. On second thought, iteration requires it.

> The construct of a collection helps you to omit most of the relationship
> tables. Therefore it helps you to formulate queries without explicit joins
> (under the hood it is something like a relationship table).

Since a collection has no equivalent to the relational algebra, it prevents one from formulating queries. Views, themselves described as queries, help users formulate queries without explicit joins.

> The construct of the "reference" brings the task of key-generation to the
> DBMS.
Unfortunately, it hides the identity of data from the user at the same time. A DBMS can generate unique key values without resorting to pointers.

> You can look at a reference and it tells you: who or what it is and
> when it is.

I sincerely doubt that many human users can tell anything by observing a reference.

> Since the database can be optimized to use the "reference", it is also
> *allowed* to perform navigational tasks.

But you claim that a reference is not a pointer? Above you introduced the concept of an unencapsulated "field" in an object. Can a field hold a reference to an object? If so, does the user not have to navigate from one object to another to another?

> But YOU don't have to use this feature.

Unfortunately, by introducing unecessary pointers at the logical level, your DBMS would prevent me from doing many of the things that I do want to do. YOU don't need references to do anything. Why should your arbitrary desire limit other users?

> > > > > If we look a typical object hierarchy:
> > > > >
> > > > > class media
> > > > > title: string
> > > > > end
> > > > >
> > > > > class cd < media
> > > > > number_of_songs: integer
> > > > > end
> > > > >
> > > > > class book < media
> > > > > number_of_pages: integer
> > > > > end
> > > > >
> > > > > one possibility to implement this hierarchy in an RDMS is to
flatten
> out
> > > the
> > > > > hierarchy (Not very effective, but possible)
> > > >
> > > > The "possibility" you describe is nothing more than a very poor
> > > > design, and it makes the fundamental mistake of equating object
> > > > classes with relations. An object class describes an encapsulated
data
> > > > type, while a relation describes an unencapsulated set. Your example
> > > > is nothing more than a straw man.
> > >
> > > No it shows, that a OODB is compatible with a RDBMS, but a RDBMS is
not
> very
> > > compatible with a OO design.
> >
> > Again, it is a straw man. It might demonstrate aspects of your design
> > skills. It might demonstrate your skills at constructing fallacious
> > arguments, but it demonstrates no general point about RDBMS or ODBMS.

>

> Since my design skills are week ( and I am not an advocat for RDBMS like
> they exist now), please show us your simple approach in SQL.

Since I am not an advocate of SQL, why should I? If you want to criticize SQL databases for lack of domain support, be my guest. I would encourage you.

> > > Either you do it in the way I have shown, then you
> > > end up in a total inflexible and error prone design.
> >
> > Again, you have not demonstrated this or in any way attempted to prove
> > this. I suggest, again, that you read Fabian's book before making such
> > claims.

>

> I must confess, that there is some truth in the things you are saying, but
> if you have read the book, why don't you enlight us with your knowledge.

Fabian devoted an entire chapter to the topic. Since I cannot cover the topic as succinctly or as eloquently, I direct you to his book. If your local library does not have a copy, please suggest that they purchase one. You will do your community a service.

Fabian's book deals with many other practical issues as well.

> Since many books are wasting of paper, I would prefer a reflected
> description.

Fabian's book is one of the rare exceptions. You will gain much more from it that you will from usenet.

> > > Or you use a relation
> > > for each node in the class hierarchy, then you are correct, in the
means
> of
> > > a RDBMS. But what will you do, if I confront you with a class
hierarchy
> of
> > > 10 or more levels ?
> >
> > That depends on the full set of design criteria. I might declare a
> > hierarchy of relational domains for them.
> >
> >
> > > Yes, you can design it with a RDBMS, but is this useful?
> >
> > Yes, of course, it is. It is much more useful than any navigational
> > approach.
>
> There are some situations, where navigation is useful. If the task you
have
> to perform needs navigation, why should a DBMS (OO or R) hinder you ?

No task requires navigation. Why should any dbms require it?

> For example, how do you implement a topological sort with a RDBMS. (Maybe
> there is a solution, but why should I hire a relational algebra pro, if
the
> task is easy with navigation)

You do not need an expert on RDBMS to implement a topological sort, you need an expert on graph theory.

> > > > > 1. You can transform any object hierarchy to a flat structure and
> some
> > > > > integrity constraints
> > > >
> > > > While one can easily and simply represent a multidimensional
construct
> > > > like a relation in a two-dimensional "table", a relation is anything
> > > > but flat. In fact, a single tuple contains an entire set of
> > > > arbitrarily complex object values or variables.
> > >
> > > I am impressed, but with "object hierarchy", I ment "class hierarchy",
> sorry
> > > for the mistake. If you had read carfully, you would have realized it.
> >
> > I understood what you meant regarding "object hierarchy". I ignored
> > that, and I addressed your claim that a relation is flat.
> >
> > If appropriate, you can simply declare a class hierarchy in an RDBMS.
> > Remember: "Relational Domain" = "Object Class".
>
> What about Inheritance ?

What kind of inheritance? An RDBMS should support type inheritance. For an example of one relational proponent's views on type inheritance, see: http://www.firstsql.com/dbdebunk/cjd8a.htm

> > > > > 2. You can translate a collection of objects into a table with 1:n
> > > relation
> > > >
> > > > You can translate a collection of objects into a single table of
> > > > degree 1. A table, which is a relation, cannot have a 1:n
cardinality.
> > > > Two tables associated by a foreign key can have 1:n relative
> > > > cardinality.
> > >
> > > Yes, you proved, that your english is better than mine. But we ment
the
> same
> > > thing. If I use the term relation, I mean two tables, that are
> associated by
> > > a key. I hope you don't mind, that I will use the term relation in
this
> way.
> >
> > I do mind. I do not attempt to redefine programming language terms
> > when posting to programming groups, and I ask that you not attempt to
> > redefine database management terms when posting to database management
> > groups.
>
> I am sure, that I am not the only one, who uses "relation" in the way I
did.

Ignorance is no excuse. I already know that many people are unknowingly ignorant of the relational model, which perhaps explains their willingness to dismiss the relational model as impractical theory. Microsoft even has an incorrect definition in a so-called dictionary, and I see that they further propagate their ignorance in ADO.net. Nevertheless, Dr. Codd publicly applied the mathematical concept of "relation" to the problem of database management back in 1970. A relation is what you would call a table. Referential integrity and foreign key relationships are just that: referential integrity and foreign key relationships.

The ISO Standard Vocabulary for Databases, ISO/IEC 2382-17, defines "Relation" as:

"A set of entity occurences that have the same attributes, together with these attributes. Note -- In a relational database, a relation can be represented by a table with the rows corresponding to the entities and columns corresponding to the attributes." ISO/IEC 2382-17.04.01

Ich habe kein deutsches Normung. Ich habe nur das englisches/franzoesisches Normung, und mein deutsch-englisches Computerverzeichnis ist in New Jersey. Ich habe ein deutscher Freund gefragt und er hat gesagt dass die logische oder mathematische Ausdruecke "function" und "relation" sind "Funktion" und "Relation" ins Deutsch. Ich glaube dass das Problem nicht eine Uebersetzung Ausgabe ist.

Your english is much, much better than my german.

> > > > > 5. In an OODB you need far less constraints, because they will be
> > > fulfilled
> > > > > automatically .
> > > > Statement #5 is simply untrue.
> > >
> > > It is simple to say "untrue". But fact is, that an OODB knows, what a
> > > collection (or reference) is, and what actions must be taken if an
> object is
> > > deleted.
> >
> > Really? Without anyone telling it what actions to take? Does it always
> > cascade the delete? Does it always Set NULL? Does it always balk?
> > Which of the three does it automagically know to do?

>

> You can have different attributes for the collection. If it has the
> attribute OWNS, then it will delete the associated objects.

These arbitrary and ad hoc "attributes" are less functional than the general integrity function of an RDBMS.

> Anyway, the
> object, that is deleted, will be removed from all collections, that
> reference it.

Why on earth would the DBMS remove an object value from all relations just because I delete it from one???

> > > > > 6. Not shown but obvious: Queries are much easier to formulate in
an
> > > OODB,
> > > > > than in a RDBMS
> > > >
> > > > Not shown nor obvious nor true.
> > >
> > > I will present our query language on our web-site. But since it is
still
> > > under development, there are many things to change.
> >
> > I repeat: Not shown nor obvious nor true. I should have figured you
> > are a vendor given the ignorance you try to spread.
> >
>
> Is it wrong, beeing a vendor ?

No. It is wrong to spread ignorance and misconception. It's just less surprizing when done for financial gain.

> Or who do you think should produce the DBMS, you would like to have.

A knowledgeable, conscientious, educated vendor instead of the all too common variety I usually encounter.

> > > > It is true that relational dbmses will perform better overall than
> > > > non-relational dbmses, but that performance benefit includes
> > > > application development time, facility of schema evolution etc.
> >
> > What's the matter? Cat got your tongue?
> >

>

> This was absolute out of context, I have allways talked about a OODB that
> works relational. But anyway, you write "overall" and not "allways".

Overall implies always -- "always when all things are considered".

> If a
> RDMS would allow navigation (beside of description), there could be an
> "always" instead of an "overall".

Untrue. Exposing pointers in the logical interface offers no performance advantage over using physical pointers hidden from the user.

> > > > > Wrong is:
> > > > > 1. That all optimizations of a query are done with relational
> algebra.
> > > The
> > > > > most common optimization is to use an index.
> > > >
> > > > Since physical storage determines execution speed, all optimizations
> > > > involve physical storage structures. Many different types of indexes
> > > > exist to change performance characteristics. However, indexes only
> > > > scratch the surface of performance altering physical structures. You
> > > > ignore clustering, physical pointers, pointer pools, distribution,
> > > > parallelization and everything else the human mind can imagine.
> > >
> > > This all has nothing to do with relational algebra!
> >
> > It has everything to do with relational algebra. How else will you
> > specify your intent without restricting the possible physical access
> > methods?
> >

>

> Parallel access to your data is a problem of transactions and their
> interference. This is one of the reasons, why I designed the "reference"
> with the abilty to know "when" it is.

You have not answered my question. You have only further validated the contention that the introduction of pointers needlessly complicates all database functions: integrity, manipulation, definition, concurrency, security, etc. Incidentally, the idea of combining a pointer with a transaction identifier is not new -- SQLBase did that years ago.

> > > Of course, you can use all of these optimization techniques with an
> OODB -
> > > and we do.
> >
> > Since your oodb has no concept of view or of join or of projection
> > etc., I doubt your product uses even a small subset of the
> > optimization techniques available to an RDBMS. Since I expect your
> > product to force applications to change in order to alter performance,
> > I doubt your product optimizes at all.
> >
>
> You are talking about something you haven't seen.

I was talking about my expectations, which I know better than anyone. FWIW, you have raised my exectations to those typical for SQL databases.

> > > > > Relational algebra is used to
> > > > > transform the query to use the index. But the index in itself is
not
> > > defined
> > > > > by relational algebra.
> > > >
> > > > You apparently misunderstand Dr. Codd's most fundamental goals when
he
> > > > proposed the relational data model. The relational model explicitly
> > > > avoids any definition of physical structures for the very purpose of
> > > > allowing physical independence. It permits any physical structure.
> > > >
> > >
> > > I have understood very well and I disagree. It permits to retrieve
data,
> > > that fits into the model, from any physical structure. But the model
is
> > > *not* optimal for something like an object.
> >
> > Since a relational domain is an object class and since the values
> > exposed to the user in relations are object values, I can see no valid
> > support for the above statement. It is just another outrageous claim
> > by a vendor trying to sell a flawed product.
> >

>

> Can a relational domain give you these answers:
> - i am of type ClassXY and i am derived from ClassX ?

Yes, if you mean type inheritance when you say "derived".

> - i am not derived from ClassX, but i can look like and act like ?

If you look like and act like, you are derived from, or you are ClassX. Remember physical independence! Received on Sun Sep 30 2001 - 06:20:48 CEST

Original text of this message