Re: foundations of relational theory? - some references for the truly starving

From: Anthony W. Youngman <thewolery_at_nospam.demon.co.uk>
Date: Sun, 26 Oct 2003 20:44:58 +0000
Message-ID: <fgg0BsAKJDn$Ewsx_at_thewolery.demon.co.uk>


In article <1b0b566c.0310222023.73e4ba5d_at_posting.google.com>, Mike Preece <michael_at_preece.net> writes
>>> In Pick the nested rows don't need pointers - they are in-situ. The
>item,
>>> or record, contains all of the related data - hence we get it all
>in a
>>> single read. No need to "dereference" anything. Not such a "plenty
>of
>>> complications".
>>>
>>
>> The "complications are with regards to integrity constraints,
>> normalization theory and query optimization.
>>
>
>Data integrity is simplified, normalisation is greatly simplified and
>there's no need to optimise anything. I know. Sounds incredible
>doesn't it? I'll try to remember to include an example at the end of
>this post.

I was trying to find that post where I said "MV is simpler than relational", and someone replied with the pantomime reply of "no it isn't" or words to that effect. I'll try and show it here, as this seems a good enough place to respond ...

Let's take an invoice. It exists in the real world as a piece (or possibly more than one) of paper. I'm an engineer. I'll use Occam's razor for proof (and if you want to argue, you'd better get some cotton wool. You'll need it :-)

In MV, if we've done it properly, we map one database object (the FILE INVOICES) to the entity "invoices", and we store one real-world invoice as one RECORD in INVOICES. In relational, we'd need a tbl_invoices, a tbl_invoice_addresses, a tbl_order_detail, and maybe more. I think the original razor (translated into English) said "do not multiply entities beyond necessity". Score one to MV - We need one db entity for one realworld  entity - in this example relational needs at least three and this is typical for everything relational.

The Einstein version of the razor redefines it as "make everything as simple as possible - BUT NO SIMPLER!". Well, we've got a "one to one" match, so we don't have a problem. But relational - there's the tbl_invoices table, which represents a real-world entity. There's the other two, which represent invoice *attributes*! Both contain a "foreign key", which represents "is a part of", and relational needs a cascading delete to maintain integrity. But tbl_invoices *also* contains a foreign key, the company_id. This however, actually means "is related to", and should NEVER have a cascading delete attached. Sod's law says a program will delete a company - and bang goes a large number of outstanding invoices! Yes, that's not the fault of the theory, but in the real world it's an accident waiting to happen! In MV, if it's a FIELD ("column") then it's an attribute, and if that attribute is a foreign key, it's a "these two entities are related", and a change to one entity does NOT necessarily affect the other.

So score two to MV - we map a FILE to a real world group of things, one RECORD to each real world instance of this thing, and one FIELD to each distinct attribute (including treating a relationship as an attribute, which Codd admits is a valid approach). A relationship links two entities. With relational, is a table an entity or an attribute or even a relation? An attribute can be a column or a table. A column can be an attribute or a relation (which are considered separate things). A relationship can join two entities, or an attribute to an entity. In short, it's a mess!

Cheers,
Wol

-- 
Anthony W. Youngman - wol at thewolery dot demon dot co dot uk
Witches are curious by definition and inquisitive by nature. She moved in. "Let 
me through. I'm a nosey person.", she said, employing both elbows.
Maskerade : (c) 1995 Terry Pratchett
Received on Sun Oct 26 2003 - 21:44:58 CET

Original text of this message