Re: foundations of relational theory?

From: Dawn M. Wolthuis <dwolt_at_iserv.net>
Date: 12 Oct 2003 16:37:33 -0700
Message-ID: <6db906b2.0310121537.5892c25b_at_posting.google.com>


"Anith Sen" <anith_at_bizdatasolutions.com> wrote in message news:<b56ib.32551$mQ2.8873_at_newsread1.news.atl.earthlink.net>...
> >> ..are there some facts I am not getting right? <<
>
> Yes, in fact most of your understanding of 1NF is flawed. You seem to
> totally ignore the concept of domains, you ignore the fact that "atomicity"
> is a subjective concept (has no precise definition) and you seem to have
> apathy for the entire R&D done in the field of the RM in recent years.

Definitely NOT apathy. I have read many a relational DBMS treatise with great interest. It fascinates me that the model starts out from good mathematics, looking at mathematical relations, and then takes this big leap -- that we should use only simple mathematical relations (no embedded lists) AND that we should also ditch some standard concepts of mathematical relations, such as the ordering of domains (mathematical relations have ordered "columns"). The only rationale I can find from anyone on this list, or from Codd, Date, Pascal, and others, is that doing so gives us the simplest mathematics, and that there is no reason to persist data using a model that would have more complex mathematics behind it. If that really is the entire argument, then where is the science in that statement? KISS IS NOT a scientific argument, even if it is often a wise approach.
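To make the "ordering of domains" point concrete, here is a minimal sketch in Python. The names and data are made up for illustration and come from no DBMS or text; it only contrasts a mathematical relation (a set of ordered tuples) with the relational-model reading, where each tuple is a set of attribute/value pairs and position carries no meaning.

```python
# Illustrative only: all names and data here are hypothetical.

# A mathematical relation over (name, city): a set of ORDERED tuples.
math_relation = {("Ann", "Oslo"), ("Bob", "Lima")}

# The relational-model reading: each tuple is a set of
# (attribute, value) pairs, so attribute order means nothing.
rm_relation = {
    frozenset({("name", "Ann"), ("city", "Oslo")}),
    frozenset({("city", "Lima"), ("name", "Bob")}),
}

# Swapping positions yields a different mathematical relation...
swapped = {(city, name) for (name, city) in math_relation}
positional_differs = swapped != math_relation

# ...while writing the named attributes in another order changes nothing.
reordered = {
    frozenset({("city", "Oslo"), ("name", "Ann")}),
    frozenset({("name", "Bob"), ("city", "Lima")}),
}
named_same = reordered == rm_relation
```

This only shows what is being given up (position) and what replaces it (attribute names); it says nothing about which choice is better.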

> Date recently has an article that clarified some issues on this topic: What
> 1NF really means (Parts 1 & 2). As for a formal treatment of domains in
> relational model (including scalar & complex types and representations),
> read Date's Intro to database systems. Also TTM has a section on non-scalar
> types including tuple & relation types (IIRC, RM prescriptions 7, 9, 10).
> For a brief article on how propositions represented based on relational
> model are logically superior and structurally sound read Relational Database
> writings 94-97 article: Constraints & Predicates (Parts 1, 2, 3) &
> especially the concept of relvar constraint. For a grasp on the mathematical
> support for the notion of constraints, treatment of NFNF (non-first normal
> form) and how it violates PNF (by Roth, a proposed normalization goal for
> nested relations) refer to the chapter references in the Book by Atzeni &
> Antonellis. Pascal's 1993 book on relational databases has a simpler
> explanation on 1NF, which IMO is truly beneficial to an average database
> professional.
>
> BTW, it is hard to provide quick references to rectify every conceptual
> misunderstanding and counterpoint each fallacious MV argument. Generally,
> it just warrants the invocation of the Principle of Incoherence.
>
> >> Is there some mathematical theorem or any other proof that storing data
> (is NOT) in first normal form is bad or is this just a religion that the
> masses have been buying into for the past few decades? Clearly the XML doc
> specifiers have opted to leave that one in the dust. Is that because they,
> like me, are fools? <<
>
> If the "XML doc specifiers" are proposing an alternative data model, yes,
> they are taking a path, which was proven wrong already, but they aren't
> aware of it yet.

PROVEN? I would LOVE to see the proof -- I have begged to see a proof. There IS NO PROOF OF WHICH I AM AWARE, or can you point me to one? In fact, persisting data with a model that is influenced by an understanding of language (since the idea is that we not just store it, but also retrieve it again) does employ a more complex structure by some measures (mathematical relations, for example), but it does so for a reason. As I understand it, there is empirical evidence from contests performed over a decade ago (so we need to do these again) showing that systems that persist data with XML-like models (that is what PICK is) provide a more agile development environment.

Coming from a DBMS background, I was surprised to find that, in my experience, my developers who used PICK-based systems also had a more agile environment for maintaining applications over time. That, however, is anecdotal evidence, and I would need some empirical evidence or mathematical proof before I claim that I KNOW it is always better to persist data outside of the RDBMS model.

>
> >> I'm willing to be persuaded if there really is some scientific evidence
> to prove that the relational model is the WAY, the TRUTH, and ... <<
>
> You seem not. Moreover, why do you need persuasion?
>
I am sorry if I did not make this clear -- I am VERY interested in nailing this down, whatever the conclusion is. I am honestly seeking mathematical logic and/or scientific evidence to show me, either in general or in line with some common requirements for a moderately-sized system, whether an RDBMS is really a better environment (by a set of agreed-upon quality measures) in which to develop and maintain application software. My experience and that of others I have read, which is all anecdotal, tells me that at least the current RDBMSs out there are not even as good as persisting data using the old-fashioned, yet highly effective, Nelson-PICK model for data persistence (which is similar to a key-value database such as Berkeley DB, but more full-featured).

I want evidence, not just "you must not be reading" (I am) or "you are an idiot" (I might be that, too, but not such an idiot that I'm going to accept the religion of the relational database model without some proof that it is also scientifically defensible).

> >> My experience tells me that the relational model is not the best there is
> for ....Does that sound like math or religion to you? <<
>
> Neither. It sounds like the inability and/or unwillingness to devote time
> and effort to comprehend even the most basic concepts in data management.

Do you really think I don't comprehend the most basic concepts, or do you think it is possible that I really do comprehend them and just don't buy it? The more I ask RDBMS disciples this question, the more non-scientific replies I get -- that I'm an idiot, or have not read enough, or have not spent time understanding it, or have no experience. This is not about ME, folks. There is no mathematical theorem that I know of that says you pick the simplest mathematical model that resembles mathematical relations and stick with that for your theory because there is nothing to gain by added complexity. Golly gee -- why would mathematicians or scientists ever add complexity? Perhaps to have a better model?

So, does anyone have a proof, one that leaves out as much RDBMS religion as feasible, that we must not persist any data with a model that includes repeating groups? Until the XML folks came around, the PICK/MultiValue folks just figured we had an old model that we kept for one reason only -- it SEEMED TO BE a better bang for the buck. With XML being essentially the PICK model, perhaps we can get in a better position to either prove or disprove that theory?
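For anyone unsure what "repeating groups" means in practice, here is a rough sketch in Python, with hypothetical names and data and not tied to PICK or any RDBMS. It stores the same facts once as nested records with a multivalued field and once as two flat, 1NF-style relations, and converts between the two.

```python
# Illustrative only: record layout, keys, and values are hypothetical.

# Nested form: each record carries a repeating group (a list of phones).
nested = {
    "p1": {"name": "Ann", "phones": ["555-0100", "555-0101"]},
    "p2": {"name": "Bob", "phones": ["555-0199"]},
}

def flatten(records):
    """Split nested records into two flat relations (sets of tuples)."""
    people = {(pid, rec["name"]) for pid, rec in records.items()}
    phones = {(pid, ph) for pid, rec in records.items() for ph in rec["phones"]}
    return people, phones

def nest(people, phones):
    """Rebuild nested records; phone values come back in sorted order."""
    out = {pid: {"name": name, "phones": []} for pid, name in people}
    for pid, ph in sorted(phones):
        out[pid]["phones"].append(ph)
    return out

people, phones = flatten(nested)
round_trip = nest(people, phones)
```

The round trip reproduces the nested form here only because the sample lists happen to be sorted already. A flat relation is a set, so keeping an arbitrary list order would need an explicit position attribute. The sketch shows that the two forms can carry the same facts; it does not settle which is cheaper to develop against.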

Are there any RDBMS folks who are interested enough in the science and economics of the matter to help me design a contest to gain some empirical evidence?

By the way, do you know why it was over a decade ago that the contest I referred to up front took place? I understand it is because the RDBMSs did not end up beating the PICK folks and their vendors didn't want to lose another year, so the competition ended. I don't have the full facts on this either, so it is anecdotal too, but interesting nonetheless, don't you think?

--dawn

Dawn M. Wolthuis Received on Mon Oct 13 2003 - 01:37:33 CEST
