Re: Normalisation

From: Jon Heggland <heggland_at_idi.ntnu.no>
Date: Fri, 8 Jul 2005 12:06:19 +0200
Message-ID: <MPG.1d386e72cc1c7e369896ef_at_news.ntnu.no>


In article <03fze.139573$4A5.7335025_at_phobos.telenet-ops.be>, jan.hidders_at_REMOVETHIS.pandora.be says...
> >>These sets are very similar to unary relations. Treating them
> >>differently would make not much sense because there are simple
> >>operations that transform one into the other.
> >>
> >>>Why not about strings?
> >>
> >>They are not very similar to relations. :-)
> >
> > A set can be transformed into a unary relation with a simple operation.
> > A string can be transformed into a binary relation (integer and
> > character) with a simple operation.
>
> That requires logarithmic space, and not constant space as my
> transformation. So it is arguably more complex.

Please elaborate. Assuming for the sake of the argument that you are right, so what?

> >>Besides, most nested
> >>relational algebras I know are not equipped with an operation for
> >>unnesting strings.
> >
> > That's just because it's a pretty useless thing to do. :) My point it
> > that the difference between sets and strings in this context is pragma,
> > not logic.
>
> My definition of 1NF doesn't make that distinction.

Your definition of 1NF seems singularly useless if you cannot use it to determine the quality of a relvar in any way---unless you introduce a lot of unstated and pragmatic assumptions. It is also rather complicated, imo, since you have to refer to operations over signatures and proper classes as opposed to sets/domains.

Let me define Jon's Normal Form (JNF): A relation is in JNF if it has an even number of columns. If a particular DBMS cannot handle relations with odd numbers of columns (or is inefficient at it), you should normalise. :)

> Even then it still helps. You can always split the query in two parts:
> (1) the computation of the flattened final result and (2) nesting it to
> obtain the actual result. Those separate phases are easier to optimise
> than when they are mixed.
>
> >>The theory of query optimization for these operation is
> >>reasonably well understood, which is far less the case for operations
> >>that mix the levels of computation such as the nest and unnest do. In
> >>some sense we are forcing here the user to keep things simple so that
> >>the job of the query optimizer becomes easier.
> >
> > I think your objection here is to the existence and complexity of nested
> > version of the standard relational algebra operators.
>
> Not really. The operators are actually very simple (only nest and unnest
> in the usual algebra) but these simple operations can still lead to very
> complex optimisation problems.

More complex than (say) summarize / group by? Isn't a nested relation pretty equivalent to the "grouped table" that is produced in SQL?

I see your point, but for me it smacks of the kind of reasoning that leads you to "denormalise" relations because join is too complex and slow. It may be the right choice at the implementation level, but we should separate that from the logical level.

-- 
Jon
Received on Fri Jul 08 2005 - 12:06:19 CEST

Original text of this message