Re: Lucid statement of the MV vs RM position?

From: <ralphbecket_at_gmail.com>
Date: 20 Apr 2006 17:28:03 -0700
Message-ID: <1145579283.132985.253890_at_i39g2000cwa.googlegroups.com>


Pickie wrote:
> There doesn't seem to be a formal model anywhere. Integrity
> constraints are not enforced at the database level.

Yikes.

> http://www.pickwiki.com/cgi-bin/wiki.pl?PhilosophyOfPick gives
> my views of the philosophy behind Pick.

I've just read your article. Two misconceptions about the RM seem to crop up again and again in comparisons between MV and the RM:

(1) The RM is *not* based on the idea of storing data in tables. Under the RM, a relation is a set of rows with the same signature, and a row is a partial function from column names to values. The signature of a row is the domain of the function (i.e., a set of column names). It is purely accidental that relations of this kind can be conveniently portrayed on the page as two-dimensional tables. The emphasis in the RM should be that a relation is a *set* of rows, a row is a *set* of (column name, value) pairs, and sets are unordered, duplicate-free collections.
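
To make that concrete, here is a minimal sketch in Python of rows and relations as plain sets. The helper names row() and signature() and the sample data are mine, purely for illustration; they are not part of any DBMS.

```python
# A row is a set of (column name, value) pairs; a relation is a set of rows.
# row() and signature() are hypothetical helper names used only for this sketch.

def row(**columns):
    """Build a row as an unordered, duplicate-free set of (name, value) pairs."""
    return frozenset(columns.items())

def signature(r):
    """The signature of a row is its set of column names."""
    return frozenset(name for name, _ in r)

# Column order within a row does not matter.
r1 = row(name="Ann", dept="Sales")
r2 = row(dept="Sales", name="Ann")
assert r1 == r2

# A relation is a set of rows sharing one signature: no order, no duplicates.
employees = {r1, r2, row(name="Bob", dept="IT")}
assert len(employees) == 2
assert {signature(r) for r in employees} == {frozenset({"name", "dept"})}
```

Nothing here says "table": the two-dimensional picture is just one way of printing such a set.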

(2) The RM says *nothing* about *how* a database should be implemented. It would be a mistake to think that because relations are often called "tables" and are often portrayed as 2D arrays, that is how they are stored in memory or on disc. Any implementation taking that route would have shocking performance. The point of the RM is to separate the model (how one thinks of the data) from the representation (how it is stored). A good RDBMS implementation should make good decisions concerning representation (perhaps under the guidance of the DBA), but that is purely an optimization issue. Conflating representation and model is akin to hand-optimizing a program before you're sure it's correct: it will surely lead to a world of pain.
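
As a rough illustration of the model/representation split, the Python sketch below stores the same data two ways (row-oriented and column-oriented) and shows that both denote the same relation. The storage layouts and function names are assumptions of mine for the example, not how any particular RDBMS works.

```python
# Two hypothetical physical layouts for the same data.

# Representation A: row-oriented storage (a list of dicts).
row_store = [
    {"name": "Ann", "dept": "Sales"},
    {"name": "Bob", "dept": "IT"},
]

# Representation B: column-oriented storage (one list per column),
# deliberately held in a different column and row order.
column_store = {
    "dept": ["IT", "Sales"],
    "name": ["Bob", "Ann"],
}

def relation_from_rows(store):
    """Read the row-oriented layout back as a set of rows."""
    return frozenset(frozenset(d.items()) for d in store)

def relation_from_columns(store):
    """Read the column-oriented layout back as a set of rows."""
    names = list(store)
    return frozenset(
        frozenset(zip(names, values)) for values in zip(*store.values())
    )

# Different representations, one and the same relation.
assert relation_from_rows(row_store) == relation_from_columns(column_store)
```

Anything written against the relation cannot tell which layout sits underneath, which is the whole point.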

> There is a mindset about the Relational Model that is
> disturbing. The point of view that says that there is no
> TRULY Relational DBMS because of incompetance or wickedness on
> the part of the SQL DBMS providers is just outright wrong.

My gut feeling is that it's partly to do with poor early choices having become the standard and partly to do with the fact that not many people finish a CS degree with any understanding of theory or how the careful application of theory can save huge amounts of time and effort. Given things such as the lack of any decent type theory or the addition of terrible ideas like NULLs into SQL, I'm inclined to think the latter is more significant than the former.

> The problem is that it is difficult in the extreme to build a
> data store of whatever size desired, that can have some
> arbitrarily huge number of people changing the data in it,
> and that will provide the answer to any conceivable query -
> as if the data store were to be frozen until the query is
> done.

The DBMS has to

- ensure data integrity
- ensure data availability
- protect against hardware failure
- manage distribution
- manage concurrent access
- optimize *dynamically* for *multiple* applications.

It makes no sense to reimplement each of these aspects in every new application. Implementing any one of them well is a huge undertaking.

> Every time you put an index in, or some other cute little
> wrinkle to more cleverly do this, you are argueably
> de-normalising your database. Well, you are storing data in
> multiple places, anyway.

If you bugger up your model ("denormalise" it) you should expect trouble. A good DBMS should allow the DBA to suggest optimizations, but the DBMS should be responsible for implementing those optimizations, which should not affect the model in any way.
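
For what it's worth, here is a small sketch of an index as a pure optimization, using Python's standard sqlite3 module only because it is handy (SQLite is an SQL DBMS, with all that implies). The table, column, and index names are made up for the example.

```python
import sqlite3

# In-memory database with a hypothetical emp table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT NOT NULL, dept TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("Ann", "Sales"), ("Bob", "IT"), ("Cy", "Sales")],
)

query = "SELECT name FROM emp WHERE dept = ?"

# Result of the query before any index exists.
before = set(conn.execute(query, ("Sales",)).fetchall())

# The DBA suggests an optimization; the DBMS implements and maintains it.
conn.execute("CREATE INDEX emp_dept ON emp (dept)")

# Same query, same answer: the logical model is untouched.
after = set(conn.execute(query, ("Sales",)).fetchall())

assert before == after == {("Ann",), ("Cy",)}
```

The index does store data in a second place, but the DBMS keeps it consistent, and no query or application has to change because of it.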

> The idea of having a horrendously complex physical
> implementation - in order to provide the appearance of a clear
> logical model - is uncomfortable to me. I question, not the
> Relational Model, but whether implementing this aspect of it
> in this way is worth the trouble.

As someone else said, the same could be said of compilers for high-level languages. But as I said above, there are things that you Just Have to Have in a DBMS, and it's better to get them right just once, in one place: the DBMS, not every application.

-- Ralph
Received on Fri Apr 21 2006 - 02:28:03 CEST
