# Re: Normalisation

Date: Thu, 07 Jul 2005 19:06:36 GMT

Message-ID: <03fze.139573$4A5.7335025_at_phobos.telenet-ops.be>

Jon Heggland wrote:

> In article <wLVye.138749$to.7294014_at_phobos.telenet-ops.be>,

*> jan.hidders_at_REMOVETHIS.pandora.be says...
**>
*

>>>The quote didn't mention nested relations, it is about sets. Do you make >>>that assumption about sets as well? Why? >> >>These sets are very similar to unary relations. Treating them >>differently would make not much sense because there are simple >>operations that transform one into the other. >> >> >>>Why not about strings? >> >>They are not very similar to relations. :-)

*>*

> A set can be transformed into a unary relation with a simple operation.

*> A string can be transformed into a binary relation (integer and*

*> character) with a simple operation.*

That requires logarithmic space, and not constant space as my transformation. So it is arguably more complex.

>>Besides, most nested >>relational algebras I know are not equipped with an operation for >>unnesting strings.

*>*

> That's just because it's a pretty useless thing to do. :) My point it

*> that the difference between sets and strings in this context is pragma,*

*> not logic.*

My definition of 1NF doesn't make that distinction.

>>>Given the above definition, do you think that a relation with relation- >>>valued attributes in the presence of nest/unnest should be "normalised" >>>to reach 1NF? Why? >> >>It ensures that the operations at relation level are essentially those >>of the flat relational algebra and that they all work, in some sense, at >>the same level.

*>*

> That does not help the least. Normalisation is for base relvars. Even if

*> you normalise them all, derived relvars and query expressions might/will*

*> still be unnormalised if you allow nest/unnest.*

>>The theory of query optimization for these operation is >>reasonably well understood, which is far less the case for operations >>that mix the levels of computation such as the nest and unnest do. In >>some sense we are forcing here the user to keep things simple so that >>the job of the query optimizer becomes easier.

*>*

> I think your objection here is to the existence and complexity of nested

*> version of the standard relational algebra operators.*

Not really. The operators are actually very simple (only nest and unnest in the usual algebra) but these simple operations can still lead to very complex optimisation problems.

> Date does, too---

*> that's why he discards them. He writes:
**>
**> "Now, this idea has been extensively researched [...] under such names
**> as 'nested relations' or 'NFNF relations' (NFNF stands for 'non first
**> normal form' [...]). However, I do not take the approach of references
**> [Jaeschke; Roth, Korth and Silberschatz], which add many new operators
**> to the relational algebra and calculus to handle 'nested relations', and
**> thus definitely do breach first normal form (as the term 'non first
**> normal form' suggests). Rather, I propose a simple generalisation of the
**> relational algebra extend operator to permit (among other things) the
**> use of relation-valued expressions to define attribute values. In this
**> way, I stay within the spirit of the classical relational model, while
**> still obtaining some of the benefits, such as they may be, of 'nested
**> relations.'"
*

*sigh* Let's just say I disagree with Date here on several points.

> Now, I think we surely agree that complicating the relational algebra by

*> adding 'nested' versions of the classical operators is a bad idea.
*

Ok.

> I also think we can agree that set-valued or relation-valued attributes

*> are perfectly okay if you exclude nest/unnest (and the 'nested'
**> classical operators, of course).
*

Yes.

> Thus, I ask you if we can agree that generalising extend to support

*> nest/unnest functionality in the way Date suggests is also okay, since
**> it does not imply adding nested versions of the classical operators; and
**> that it does not affect normalisation.
*

I'm sorry but no. It would still complicate optimization and in fact would it make even more difficult because it adds a calculus-aspect to the algebra that makes it harder to recognize optimizable operations.

- Jan Hidders