Re: Normalisation

From: Jan Hidders <jan.hidders_at_REMOVETHIS.pandora.be>
Date: Thu, 07 Jul 2005 19:06:36 GMT
Message-ID: <03fze.139573$4A5.7335025_at_phobos.telenet-ops.be>


Jon Heggland wrote:
> In article <wLVye.138749$to.7294014_at_phobos.telenet-ops.be>,
> jan.hidders_at_REMOVETHIS.pandora.be says...
>

>>>The quote didn't mention nested relations, it is about sets. Do you make 
>>>that assumption about sets as well? Why?
>>
>>These sets are very similar to unary relations. Treating them 
>>differently would make not much sense because there are simple 
>>operations that transform one into the other.
>>
>>
>>>Why not about strings?
>>
>>They are not very similar to relations. :-)  

>
> A set can be transformed into a unary relation with a simple operation.
> A string can be transformed into a binary relation (integer and
> character) with a simple operation.

That requires logarithmic space, and not constant space as my transformation. So it is arguably more complex.

>>Besides, most nested 
>>relational algebras I know are not equipped with an operation for 
>>unnesting strings. 

>
> That's just because it's a pretty useless thing to do. :) My point it
> that the difference between sets and strings in this context is pragma,
> not logic.

My definition of 1NF doesn't make that distinction.

>>>Given the above definition, do you think that a relation with relation-
>>>valued attributes in the presence of nest/unnest should be "normalised" 
>>>to reach 1NF? Why?
>>
>>It ensures that the operations at relation level are essentially those 
>>of the flat relational algebra and that they all work, in some sense, at 
>>the same level.

>
> That does not help the least. Normalisation is for base relvars. Even if
> you normalise them all, derived relvars and query expressions might/will
> still be unnormalised if you allow nest/unnest.

Even then it still helps. You can always split the query in two parts: (1) the computation of the flattened final result and (2) nesting it to obtain the actual result. Those separate phases are easier to optimise than when they are mixed.

>>The theory of query optimization for these operation is 
>>reasonably well understood, which is far less the case for operations 
>>that mix the levels of computation such as the nest and unnest do. In 
>>some sense we are forcing here the user to keep things simple so that 
>>the job of the query optimizer becomes easier. 

>
> I think your objection here is to the existence and complexity of nested
> version of the standard relational algebra operators.

Not really. The operators are actually very simple (only nest and unnest in the usual algebra) but these simple operations can still lead to very complex optimisation problems.

> Date does, too---
> that's why he discards them. He writes:
>
> "Now, this idea has been extensively researched [...] under such names
> as 'nested relations' or 'NFNF relations' (NFNF stands for 'non first
> normal form' [...]). However, I do not take the approach of references
> [Jaeschke; Roth, Korth and Silberschatz], which add many new operators
> to the relational algebra and calculus to handle 'nested relations', and
> thus definitely do breach first normal form (as the term 'non first
> normal form' suggests). Rather, I propose a simple generalisation of the
> relational algebra extend operator to permit (among other things) the
> use of relation-valued expressions to define attribute values. In this
> way, I stay within the spirit of the classical relational model, while
> still obtaining some of the benefits, such as they may be, of 'nested
> relations.'"

*sigh* Let's just say I disagree with Date here on several points.

> Now, I think we surely agree that complicating the relational algebra by
> adding 'nested' versions of the classical operators is a bad idea.

Ok.

> I also think we can agree that set-valued or relation-valued attributes
> are perfectly okay if you exclude nest/unnest (and the 'nested'
> classical operators, of course).

Yes.

> Thus, I ask you if we can agree that generalising extend to support
> nest/unnest functionality in the way Date suggests is also okay, since
> it does not imply adding nested versions of the classical operators; and
> that it does not affect normalisation.

I'm sorry but no. It would still complicate optimization and in fact would it make even more difficult because it adds a calculus-aspect to the algebra that makes it harder to recognize optimizable operations.

  • Jan Hidders
Received on Thu Jul 07 2005 - 21:06:36 CEST

Original text of this message