Re: foundations of relational theory?
Date: Mon, 20 Oct 2003 18:14:03 +0100
Message-ID: <bn15ap$g1a$1_at_gazette.almaden.ibm.com>
"Dawn M. Wolthuis" <dwolt_at_iserv.net> wrote in message
news:6db906b2.0310191644.13b47642_at_posting.google.com...
[snip]
> While this is not an academic statement of why PICK works nicely, it
> might give some hints on why folks who have put their dollars into the
> SQL-based RDBMS world can become born-again when they see the
> difference in dollars needed for a comparable PICK system.
Mind if I take it as an academic statement, and use it to give hints as to why
an MV system will be less useful than a relational system, everything else
being equal?
> Again, it is not that PICK is flawless (by any stretch), but as a
> basis for moving forward, I'd sure rather start with this big bang for
> the buck implementation than any SQL-based RDBMS I've seen. And, yes,
> I know this is a theory forum and not necessarily for opinions based
> on practice, so I'll go back to the claim, even though not fleshed
> out, that persisting data based on language -- such as modeling entire
> propositions together rather than piecing them apart to the extent
> done in an RDBMS -- makes sense because we are not trying to persist
> mathematical relations ultimately -- we are trying to persist
> propositions.
Guess what: "trying to persist propositions" is *exactly* what the relational
model is all about. A tuple (row) is a fact - a proposition.
> For example, "Jane Doe has three kids -- John, George, & Paul -- and
> also three cars -- a 1967 Mustang Fastback and a 1968 VW Bug, but the
> car she usually drives is her other car -- a 2002 Ford Thunderbird".
> A lousy sentence, but easy to image on a form. This sentence/form is
> about a single person -- Jane Doe
And why, pray, is that sentence not also about the person John (or George or Paul)? Or, equally, why is it not about 2002 Ford Thunderbirds?
The point of the relational model is that it is *democratic* - i.e. all data is treated equally. Cars have people, people have cars. We do not bias ourselves one way or another. Children are no more (or less) important than parents.
We would say that your single sentence above is in need of normalisation precisely because it favours some data over other data. Now if some data is truly and always more important than some other data, so that, say, you are only ever interested in the *set of kids* that a person has, then sure, model them as a set-valued attribute; but otherwise go the extra mile and make kids (and cars) first-class citizens.
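To make the "first-class citizens" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (person, parent_of, owns_car, usually_drives) and the helper functions are assumptions made up for illustration; they come from neither post. It stores the Jane Doe sentence as separate facts and shows that asking about a child or a car is no harder than asking about Jane.

```python
import sqlite3

# Hypothetical schema: each row is one fact (proposition) about the world.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (name TEXT PRIMARY KEY);
    CREATE TABLE parent_of (
        parent TEXT NOT NULL REFERENCES person(name),
        child  TEXT NOT NULL REFERENCES person(name),
        PRIMARY KEY (parent, child)
    );
    CREATE TABLE owns_car (
        owner          TEXT NOT NULL REFERENCES person(name),
        car            TEXT NOT NULL,
        usually_drives INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (owner, car)
    );
""")
conn.executemany("INSERT INTO person VALUES (?)",
                 [("Jane Doe",), ("John",), ("George",), ("Paul",)])
conn.executemany("INSERT INTO parent_of VALUES (?, ?)",
                 [("Jane Doe", "John"), ("Jane Doe", "George"), ("Jane Doe", "Paul")])
conn.executemany("INSERT INTO owns_car VALUES (?, ?, ?)",
                 [("Jane Doe", "1967 Mustang Fastback", 0),
                  ("Jane Doe", "1968 VW Bug", 0),
                  ("Jane Doe", "2002 Ford Thunderbird", 1)])

def children_of(parent):
    """People who are kids of the given person."""
    rows = conn.execute(
        "SELECT child FROM parent_of WHERE parent = ? ORDER BY child", (parent,))
    return [r[0] for r in rows]

def parents_of(child):
    """The same table, queried from the child's side."""
    rows = conn.execute(
        "SELECT parent FROM parent_of WHERE child = ? ORDER BY parent", (child,))
    return [r[0] for r in rows]

def owners_of(car):
    """The sentence is just as much 'about' the car as about Jane."""
    rows = conn.execute(
        "SELECT owner FROM owns_car WHERE car = ? ORDER BY owner", (car,))
    return [r[0] for r in rows]
```

Calling children_of("Jane Doe"), parents_of("John") and owners_of("2002 Ford Thunderbird") all use the same kind of one-line query; no direction of access is privileged by the way the facts are stored.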
> and would all be filed in a single
> "folder" in PICK except, possibly, for code files used for validation
> -- there could be a code file for makes & models of cars.
> Depending on the application, it might make sense to have 4 "forms" in
> this folder -- one for each person: Jane and each of her kids or we
> might decide it isn't important to treat the kids as separate "filed"
> persons in our system at this point. If they are filed as separate
> people, then the Jane Doe record (form/document/proposition) would
> have a multivalued foreign key (pointer) to each of the child
> (literally!) records, else it would store the actual data, such as
> first names of each child.
>
> Playing the game with language to parse it out, split it into many
> fragments (often based on the nouns) for the purpose of storing it,
> only to need to retrieve it again as a whole, doesn't gain us
> anything, on the face of it.
It gains your data a flat playing field. It allows you to formulate queries about (in your example) people who happen to be kids as easily as queries about people who happen to have kids. The relational model encourages this flat, democratic playing field. This ideal has sometimes caused people to say that non-flat data (to speak very loosely) has no place in the relational model, i.e. that 'multi-valued' (e.g. relation- or set-valued) attributes are not allowed. Nowadays we (and I think I speak for most relational advocates here) allow relation-valued attributes in the relational model. SQL implementations, however, I hardly need say, have (mostly) not caught up.
However, we would still generally caution against over-reliance on 'multi-values' in a relational database. If used 'badly', we begin to lose the flat playing field. Relation-valued attributes are OK. Shorthands for 'multivalued foreign key' constraints would make me nervous, and (infinitely) recursive relation-valued attributes seem to be particularly troublesome.
I would urge those who wish to understand more of these issues to read Chris Date's paper "What First Normal Form Really Means" (it will cost you, though):
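As a rough illustration of how a set-valued attribute relates to the flat form, the following Python sketch groups flat (parent, child) facts into one set of kids per parent and ungroups them again. The function names group and ungroup are chosen here for illustration only, and the data is the Jane Doe example from the quoted post.

```python
from collections import defaultdict

# Flat form: one (parent, child) fact per tuple.
flat = {("Jane Doe", "John"), ("Jane Doe", "George"), ("Jane Doe", "Paul")}

def group(pairs):
    """Nest the flat facts: one entry per parent holding the set of kids."""
    nested = defaultdict(set)
    for parent, child in pairs:
        nested[parent].add(child)
    return {parent: frozenset(kids) for parent, kids in nested.items()}

def ungroup(nested):
    """Flatten the nested form back into individual (parent, child) facts."""
    return {(parent, child) for parent, kids in nested.items() for child in kids}

def parents_of(nested, child):
    """In the nested form, asking 'whose kid is this?' means scanning every set."""
    return sorted(parent for parent, kids in nested.items() if child in kids)
```

Grouping and then ungrouping returns the original flat facts. The reverse round trip is not guaranteed: a nested entry with an empty set of kids has no flat counterpart in this sketch, which is one small example of the care such attributes need.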
http://www.dbdebunk.com/page/page/629796.htm
One thing the relational guys might learn from the experiences of MV systems is a better idea of when relation-valued attributes are OK and when they are not. I'm sure you guys know when MV attributes get misused, for example.
> [I understand the purpose is to be able to provide a simple language
> that enforces various data constraints, but I think that purpose is
> somewhat flawed too. Protecting data is a noble goal -- controlling
> data and programmers with fixed, global constraints isn't necessarily
> the best way to lend such protection in my opinion, but that's a
> tangent to this post, so later on that.]
If a constraint is not global, then it is not, in fact, a constraint.
E.g. these are global constraints.
"WHEN (In Rome) DO (As the Romans)"
The 'local' constraints
"DO (As the Romans)"
mean nothing (i.e. they always evaluate to TRUE, they constrain nothing).
AND
"WHEN (Not in Rome) DO (As you like)"
"DO (As you like)"
Regards
Paul Vernon
Business Intelligence, IBM Global Services
Received on Mon Oct 20 2003 - 19:14:03 CEST
