Re: Is this De-Normalization?

From: Bob Badour <bbadour_at_golden.net>
Date: Mon, 1 Dec 2003 20:10:30 -0500
Message-ID: <ipudnY0c5ZIQe1aiRVn-hA_at_golden.net>


"tom dorgan" <tdorgan_at_yahoo.com> wrote in message news:d4518ef4.0312011036.311b9a1a_at_posting.google.com...
> Hello,
>
> We have a table, call it DataParent. Another table, call it
> DataChildren, contains foreign keys to DataParent. You get the idea
> that this is a one-to-many relationship like the well-known Order and
> LineItem pair.
>
> In a discussion regarding increasing performance, the idea was brought
> up to pull each of the DataChildren items into the DataParent table by
> using a fixed number of columns, call them DataChildren1-N in
> DataParent. Leave unused slots null and only allow N DataChildren per
> DataParent.
>
> I understand this to be First Order de-normalization - we are putting
> an array, set or tuple into a table. I was told that this was not the
> case but did not get an explanation.
>
> Any insights appreciated.

Well, it is and it isn't. Having N attributes called attr1, addr2, ... addrN does not make a repeating group. It makes N independent attributes each functionally dependent on the key. To have a repeating group, one would have to change SQL into a different language that supports repeating groups. On the other hand, some would argue that having all those NULL attributes would violate 1NF and the information principle.

Of course, there are plenty of problems with what you propose. For instance, while the attributes are all independent per SQL, they are not really independent because they are all interchangeable. All permutations of the same values are equivalent leading to potentially N-factorial equivalent representations of the same data.

The design will necessarily complicate any query that needs to address data from the original 'child' table.

Given that all you really want to do is to cluster the data for performance, one must ask why you want to change the logical design at all? Why not have two tables and cluster the data? Received on Tue Dec 02 2003 - 02:10:30 CET

Original text of this message