Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> comp.databases.theory -> Re: A Normalization Question

Re: A Normalization Question

From: Tony <andrewst_at_onetel.net.uk>
Date: 7 Jul 2004 05:07:39 -0700
Message-ID: <c0e3f26e.0407070407.3d5b34ca@posting.google.com>


neo55592_at_hotmail.com (Neo) wrote in message news:<4b45d3ad.0407061932.34d640cb_at_posting.google.com>...
> > > "Having several copies" (ie 'brown', 'brown', 'brown') is redundant.
> >
> > a tuple that contained ('brown', 'brown', 'brown')
> > would be representing three independent, distinct facts,
> > therefore it would not be redundant.
>
> Although each cell of the tuple represents a different thing (ie
> fact), they are all named by the same thing (ie string 'brown'). The
> string 'brown' is redundant. Because RM is a limited data model, it is
> hard to see this fact. It is more obvious in XDb1/TDM.

You are of course confusing logical and physical issues - and your issue is absurd in any case. It is not LOGICALLY redundant to record that "Car X is brown" and "Dog Y is brown", because these are two different facts. You are presumably saying that it is PHYSICALLY redundant to store the 5 characters of "brown" twice on the disk, and so you want to physically store the word "brown" once and then point to it many times.

The relational model is LOGICAL and is not concerned with the physical issues. If your argument had any merit at all, an RDBMS implementation could in fact store every data value only once, and then physically point to it from all other records. For all you know, maybe that IS what some RDBMSs do (it isn't, but it would be impossible to tell using SQL).

Perhaps you might like to build an XDb1/TDM physical implementation of an RDBMS like that. It would have a big index file full of values like this:

Index# Value

1         "brown"
2         3.1415926535
3         DATE '2004-07-07'
4         "blue"
5         DATE '2004-07-01'
6         "Mary"

...

When the user inserts a row into a table using SQL like this:

INSERT INTO person (name, eye_color, date_of_birth) VALUES ('Mary','green',DATE '1970-01-01');

... your RDBMS can look up the values 'Mary', 'green' and DATE '1970-01-01' in your humungous VALUES list and either return the index# if it exists, or create a new entry and return its index# if not. Then it can store the record using only your index# pointers like (6,42,10232).

For a SELECT, a similar process will be invoked.

I suggest you go for it: clearly you think this will be a better RDBMS, and the SQL interface will make it marketable. Also it will keep you busy so you don't have so much time to post your ridiculous ramblings here. Received on Wed Jul 07 2004 - 07:07:39 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US