Re: foundations of relational theory? - some references for the truly starving

From: cmurthi <xyzcmurthi_at_quest.with.a.w.net>
Date: Mon, 20 Oct 2003 22:03:41 -0400
Message-ID: <3F9493FD.5080305_at_quest.with.a.w.net>


Firstly, thanks, Paul, for the non-polemic reasoning, and for explaining one difference between the way relational and [Pick] dbs treat data. It's interesting (and surprisingly fuzzy, using "democracy" and lack of parent-child constraints as a paradigm), but no less an important theoretical point. From a *practical* viewpoint, having been in the systems and applications trenches long enough using Pick, I'm not sure it's important enough, i.e., the need in Pick to establish a master-slave relationship among the data is both convenient and models the real world; it does not *necessarily* constrain you from viewing the data on a level, "democratic" field, though I will concede that there may be efficiency problems. As always, good design will triumph over most odds.
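
As a rough sketch of that last point (Python used purely for illustration; the record layout, the "Ann Roe" item, and the names `people`, `parents_of` and `invert_kids` are my assumptions, not Pick syntax): a nested, parent-owned record can still be read from the child's side, but only by scanning every parent or by building an inverted index first, which is where the efficiency concern comes in.

```python
from collections import defaultdict

# Hypothetical Pick-style items: each person record "owns" its multivalued fields.
people = {
    "jane": {"name": "Jane Doe", "kids": ["John", "George", "Paul"]},
    "ann": {"name": "Ann Roe", "kids": ["Paul"]},
}


def parents_of(kid, items):
    """Child-side query by full scan: cost grows with the number of parent records."""
    return sorted(key for key, rec in items.items() if kid in rec["kids"])


def invert_kids(items):
    """Build a child -> parents index once, so child-side lookups stop scanning."""
    index = defaultdict(list)
    for key, rec in items.items():
        for kid in rec["kids"]:
            index[kid].append(key)
    return index


kid_index = invert_kids(people)
```

The scan always works but touches every parent item; the index makes the child-side lookup cheap at the price of keeping a second structure in step with the first.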

It's obvious that there is not a lot of academic writing about Pick; for the most part it has been ignored by theorists. As one steeped in theory through my academic years, I've never shed any tears over this; but it's equally obvious that trying to explain the model to those who look at the world through a relational lens is difficult at best. Others in this thread have suggested downloading a copy of a Picklike and using it; I suggest that's difficult; using a db without a set of gui/rad tools fresh out of the box is a good road to frustration and failure. This is not, of course, any fundamental criticism of the Pick model, but of the marketing effort behind it, lukewarm at best.

It's late and I have no idea whether these are known to anyone or everyone, but I remember this paper, commissioned by Unidata, now an IBM db; maybe it's of use?

http://www-3.ibm.com/software/data/u2/pubs/whitepapers/nested_rdbms.pdf

Abstract
This paper discusses technical advances represented by nested relational database technology. The focus is the removal of the requirement that relational databases conform to the first normal form (atomic attributes) and the advantages thereof. Such databases are technically called nonfirst-normal-form (NF2) databases, but are commonly referred to as nested or extended relational databases. We explain the differences between traditional relational databases and the IBM nested relational databases (principally the ability to nest tables and store complex data structures). We then outline the implications of this advanced technology for the user of relational databases. We conclude by answering questions that are frequently asked about nested relational databases. We provide an extensive bibliography of published material concerning nested relational databases.
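
To make the 1NF versus NF2 distinction concrete, here is a minimal sketch in Python (not UniData syntax; the `nested` rows and the `unnest` helper are illustrative assumptions of mine, not taken from the paper): one nested row carries a table-valued attribute, and unnesting it yields the equivalent flat, first-normal-form rows.

```python
# One NF2-style row: the "cars" attribute holds a nested table, not an atomic value.
nested = [
    {
        "person": "Jane Doe",
        "cars": [
            {"year": 1967, "model": "Mustang Fastback"},
            {"year": 1968, "model": "VW Bug"},
            {"year": 2002, "model": "Ford Thunderbird"},
        ],
    },
]


def unnest(rows, attr):
    """Flatten a nested attribute into one flat row per nested tuple."""
    flat = []
    for row in rows:
        outer = {k: v for k, v in row.items() if k != attr}
        for inner in row[attr]:
            flat.append({**outer, **inner})
    return flat


flat = unnest(nested, "cars")
```

In this sketch a person with an empty `cars` list produces no flat rows at all, so the flat form cannot represent "has no cars" without a separate person table.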

We also have:
  http://www.prelude.com/white_papers/multidimensionalwhitepaper.doc. Not exactly academic, but...

Quote from Codd, 1993 [ellipsis in original, caveat emptor]: "The relational dbms...were never intended to provide the very powerful functions for data synthesis, analysis and consolidation that are being defined as multidimensional databases."

Chandru Murthi

Paul Vernon wrote:
> "Dawn M. Wolthuis" <dwolt_at_iserv.net> wrote in message
> news:6db906b2.0310191644.13b47642_at_posting.google.com...
> [snip]
>

>>While this is not an academic statement of why PICK works nicely, it
>>might give some hints on why folks who have put their dollars into the
>>SQL-based RDBMS world can become born-again when they see the
>>difference in dollars needed for a comparable PICK system.

>
>
> Mind if I take it as an academic statement, and use it to give hints at why
> an MV system will be less useful than a relational system, everything else
> being equal?
>
>
>>Again, it is not that PICK is flawless (by any stretch), but as a
>>basis for moving forward, I'd sure rather start with this big bang for
>>the buck implementation than any SQL-based RDBMS I've seen.  And, yes,
>>I know this is a theory forum and not necessarily for opinions based
>>on practice, so I'll go back to the claim, even though not fleshed
>>out, that persisting data based on language -- such as modeling entire
>>propositions together rather than piecing them apart to the extent
>>done in an RDBMS -- makes sense because we are not trying to persist
>>mathematical relations ultimately -- we are trying to persist
>>propositions.

>
>
> Guess what, "trying to persist propositions" is *exactly* what the relational
> model is all about. A tuple (row) is a fact - a proposition.
>
>
>>For example, "Jane Doe has three kids -- John, George, & Paul -- and
>>also three cars -- a 1967 Mustang Fastback and a 1968 VW Bug, but the
>>car she usually drives is her other car -- a 2002 Ford Thunderbird".
>>A lousy sentence, but easy to imagine on a form.  This sentence/form is
>>about a single person -- Jane Doe

>
>
> And why, pray, is that sentence not also about the person John (or George or
> Paul)? Or, equally, is it not about 2002 Ford Thunderbirds?
>
> The point of the relational model is that it is *democratic* - i.e. all data
> is treated equally. Cars have people, people have cars. We do not bias
> ourselves one way or another. Children are no more (or less) important than
> Parents.
>
> We would say that your single sentence above is in need of normalisation
> precisely because it favours some data over some others.
> Now if some data is truly and always more important than some other data, so
> that say you are only ever interested in the *set of kids* that a person
> has, then sure model them as a set valued attribute, but otherwise go the
> extra mile and make kids (and cars) first class citizens.
>
>
>>and would all be filed in a single
>>"folder" in PICK except, possibly, for code files used for validation
>>-- there could be a code file for makes & models of cars.
>>Depending on the application, it might make sense to have 4 "forms" in
>>this folder -- one for each person: Jane and each of her kids or we
>>might decide it isn't important to treat the kids as separate "filed"
>>persons in our system at this point.  If they are filed as separate
>>people, then the Jane Doe record (form/document/proposition) would
>>have a multivalued foreign key (pointer) to each of the child
>>(literally!) records, else it would store the actual data, such as
>>first names of each child.
>>
>>Playing the game with language to parse it out, split it into many
>>fragments (often based on the nouns) for the purpose of storing it,
>>only to need to retrieve it again as a whole, doesn't gain us
>>anything, on the face of it.

>
>
> It gains your data a flat playing field. It allows you to formulate queries
> about (in your example) people who happen to be kids as easily as formulating
> queries about people who happen to have kids.
> The relational model encourages this flat, democratic, playing field. This
> ideal has sometimes caused people to say that non-flat data (to speak very
> loosely) has no place in the relational model, i.e. that 'multi-valued' (e.g.
> relation/set valued) attributes are not allowed.
> Nowadays we (and I think I speak for most relational advocates here) allow
> relation valued attributes in the relational model. SQL implementations
> however, I hardly need say, have (mostly) not caught up.
>
> However, we would still generally caution against an over-reliance on using
> 'multi-values' in a relational database. If used 'badly', we begin to lose
> the flat playing field. Relation valued attributes are OK. Short hands for
> 'multivalued foreign key' constraints would make me nervous, and
> (infinitely) recursive relation valued attributes seem to be particularly
> troublesome.
>
> I would urge those who wish to understand more of these issues to read Chris
> Date's paper "What First Normal Form Really Means" (It will cost you
> though )
>
> http://www.dbdebunk.com/page/page/629796.htm
>
>
> One thing the relational guys might learn from the experiences of MV
> systems, is a better idea of when relation valued attributes are OK and
> when they are not. I'm sure you guys know when MV attributes get misused for
> example.
>
>
>
>
>>[I understand the purpose is to be able to provide a simple language
>>that enforces various data constraints, but I think that purpose is
>>somewhat flawed too.  Protecting data is a noble goal -- controlling
>>data and programmers with fixed, global constraints isn't necessarily
>>the best way to lend such protection in my opinion, but that's a
>>tangent to this post, so later on that.]

>
>
> If a constraint is not global, then it is not, in fact, a constraint.
>
> E.g. these are global constraints.
>
> "WHEN (In Rome) DO (As the Romans)"
> AND
> "WHEN (Not in Rome) DO (As you like)"
>
> The 'local' constraints
>
> "DO (As the Romans)"
> AND
> "DO (As you like)"
>
> mean nothing (i.e. they always evaluate to TRUE, they constrain nothing).
>
> Regards
> Paul Vernon
> Business Intelligence, IBM Global Services
>
>
>
Received on Tue Oct 21 2003 - 04:03:41 CEST
