Re: Normalisation

From: Paul <paul_at_test.com>
Date: Tue, 05 Jul 2005 13:43:50 +0100
Message-ID: <42ca8086$0$30524$ed2619ec_at_ptn-nntp-reader01.plus.net>


Jon Heggland wrote:

>>OK I see. I've never understood the appeal of "nesting" or "unnesting"
>>relations. It doesn't seem to add anything to the relational model and
>>only serves to complicate things. Does anyone have a concrete example of
>>the usefulness of this? Any I've seen in the past seem to be trival to
>>implement in a standard relational model.

>
> You can do outer joins without nulls.

An outer join is just syntactic shorthand for an inner join followed by a union (of the relevant missing rows). It could easily be extended to remove the NULLs problem by forcing the user to specify default values instead.

I don't quite understand how nested relations gets round the problem, though. Because a nested empty relation takes the place of a NULL?

> Also, nested relations are used
> heavily in the definitions of the operators for handling interval types
> in temporal relational DBMSs (Date, Darwen and Lorentzos, "Temporal data
> and the relational model").

I confess I've not read this and can't find a link to a summary of why this requires nested relations. It seems to my uninformed mind that you can talk about time in first order logic, so what is the problem in dealing with temporal data using the standard relational model? I can understand how specialised domains like intervals might be useful, or even some syntactic shorthand for complicated constructs like overlapping periods etc., but not the need for serious amendments to the relational model like nested relations.

>>I agree that there is nothing to stop someone having a "relation" domain
>>but I think this should all be encapsulated in the domain and not
>>pollute the relational model. 

>
> I must admit I am unsure of the significance of encapsulation here.
> Would anyone care to explain?

Just that all the domain stuff is hidden away from view. So the domain values are just like "black boxes" to the relational part. The domain part knows that the internal structure of the values are relations, but the relational part doesn't. I'm not sure if "encapsulation" is the official terminology, but I'm just making up the jargon here.

> Also, I'm not sure what the "pollution" consists of. Is it the
> introduction of GROUP/UNGROUP (nest/unnest)operators? GROUP can be
> defined as a shorthand for a certain kind of EXTEND, iirc---I'm not sure
> about UNGROUP.

The pollution is the breaking of the idea that the values in the columns are handled by the domain part of the DBMS, and are atomic from the point of view of the relational part. (I've used the terms "relational engine" and "domain/type engine" in the past to describe this.)

Maybe this is all just a matter of perception though, and it's not really important to make this separation?

Looking at the Third Manifesto book, it looks like their GROUP/UNGROUP operators are shorthand for something involving EXTEND, but also "RELATION", which is a type generator. What I'm not getting is why the relational model needs a relation type generator - shouldn't that be part of the domain engine and kept separate from the relational engine?

>>Also, I question the practical sense in
>>doing this as well. It seems a bit like having a database-valued domain
>>and "simplifying" your database to be a single value in a one-rowed,
>>database-valued table.

>
> Yes, you shouldn't use relation-valued attributes just because you can.
> In most cases, they are a bad design, which Date is quick to point out.

OK, I'm going one step further here and claiming they are *always* a bad design. :) I'm offering no real proof of this (and I don't think one is really possible) other than a vague appeal to Occam's Razor.

Paul. Received on Tue Jul 05 2005 - 14:43:50 CEST

Original text of this message