Re: How to normalize this?

From: Jan Hidders <hidders_at_gmail.com>
Date: Sun, 5 May 2013 11:35:46 +0200
Message-ID: <518627f2$0$611$e4fe514c_at_dreader34.news.xs4all.nl>


On 2013-05-04 23:28:01 +0000, compdb_at_hotmail.com said:

> On Thursday, May 2, 2013 3:40:28 AM UTC-7, Erwin wrote:
>

>>>> It's often bothered me that normalization theory/procedure seems to 
>>>> quietly ignore the notion of "nullability or not" of any of the 
>>>> attributes in the "initial table design" ...
>>>> IOW, that normalization theory per se doesn't actually allow one to 
>>>> determine when the phenomenon is and isn't "an artifact of the 
>>>> initial table design".
>> However, I'd argue that there is indeed "a phenomenon", and it's 
>> exactly the phenomenon you described as the "...a...b...c...d...some 
>> e..." vs. "...a...b...c...d..." confusion.

>
> You have changed the meaning of "phenomenon" from some alleged
> constraint imposed by an original single-relation schema to certain
> difficulties with certain design processes or their presentation.
> (Presumably you were concerned with the latter but hypothesized the
> former as the cause.)
>
>> Jan observes that in practice, most of the time, making this "mistake" 
>> leads people to "discovering" the correct predicates for the business 
>> problem at hand, and seems to be on the side that therefore it must be 
>> a good thing. I personally am rather bothered by the fact that a 
>> theory would be useful because misunderstanding it, or only partially 
>> understanding it, in practice mostly leads to correct designs being 
>> derived from less correct ones ...

>
> I agree with you that most (all?) presentations of db design and of
> normalization and the role of the latter in the former are poor. But
> you seem to confuse the latter with the former.
>
> Eg normalization addresses replacing some relations with others but
> doesn't generally address information equivalence and constraint
> preservation, which are different from non-loss decomposition or FD
> preservation.

Agreed, although I'd add that FD preservation at least partially covers constraint preservation.
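
To make that concrete, here is a minimal Python sketch (my own illustration, all names invented) of the usual textbook fixpoint test for FD preservation. It decides whether an FD is still enforceable on the fragments of a decomposition without joining them back together:

    # Minimal sketch of the textbook fixpoint test for FD preservation.
    # Attribute sets are frozensets of single-letter names; all names and
    # FDs below are the standard street/city/zip illustration.

    def closure(attrs, fds):
        """Closure of an attribute set under FDs given as (lhs, rhs) pairs."""
        result = set(attrs)
        changed = True
        while changed:
            changed = False
            for lhs, rhs in fds:
                if lhs <= result and not rhs <= result:
                    result |= rhs
                    changed = True
        return result

    def preserved(fd, fragments, fds):
        """True iff fd is implied by the FDs that project onto the
        fragments, computed without materializing the projections."""
        x, y = fd
        z = set(x)
        changed = True
        while changed:
            changed = False
            for ri in fragments:
                gained = closure(z & ri, fds) & ri
                if not gained <= z:
                    z |= gained
                    changed = True
        return y <= z

    # Street (S) / city (C) / zip (Z): CS -> Z and Z -> C,
    # decomposed into (Z, C) and (Z, S). Lossless, but only one FD survives.
    fds = [(frozenset("CS"), frozenset("Z")), (frozenset("Z"), frozenset("C"))]
    frags = [frozenset("ZC"), frozenset("ZS")]
    print(preserved((frozenset("Z"), frozenset("C")), frags, fds))   # True
    print(preserved((frozenset("CS"), frozenset("Z")), frags, fds))  # False

The failing second check is the partial coverage in action: CS -> Z is a constraint the original schema enforced for free, and the fragments cannot, short of a join.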

But the main point of contention is whether in practice this is really such a big issue. I'd certainly agree that in particular cases it can be important that your normalizing schema transformation results in an information-equivalent schema, and I also agree that schema designers doing normalization should be aware that this does not always happen with the usual normalization algorithms.
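
To show the distinction with a deliberately tiny example, here is another sketch (invented data): R(A, B, C) with FD A -> B, split into R1(A, B) and R2(A, C). Relations are sets of tuples.

    # Lossless but not information equivalent: R(A, B, C) with A -> B,
    # decomposed into R1(A, B) and R2(A, C).

    def project(rel, schema, attrs):
        """Relational projection with duplicates removed."""
        idx = [schema.index(a) for a in attrs]
        return {tuple(t[i] for i in idx) for t in rel}

    def njoin(r1, s1, r2, s2):
        """Natural join of r1 (schema s1) and r2 (schema s2)."""
        common = [a for a in s1 if a in s2]
        extra = [a for a in s2 if a not in s1]
        return {t1 + tuple(t2[s2.index(a)] for a in extra)
                for t1 in r1 for t2 in r2
                if all(t1[s1.index(a)] == t2[s2.index(a)] for a in common)}

    sR = ("A", "B", "C")
    R = {("a1", "b1", "c1"), ("a1", "b1", "c2")}
    R1 = project(R, sR, ("A", "B"))
    R2 = project(R, sR, ("A", "C"))
    assert njoin(R1, ("A", "B"), R2, ("A", "C")) == R   # lossless

    # The decomposed schema can record "the B of a2 is b2" although a2
    # has no C; the original schema cannot state just that fact.
    R1 = R1 | {("a2", "b2")}
    assert njoin(R1, ("A", "B"), R2, ("A", "C")) == R   # the new fact vanishes

The two schemas have different information capacity: every original state survives the round trip, but the decomposed schema has states, like the one above, with no counterpart in the original. That is exactly the "...some e..." situation from earlier in the thread.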

> Eg when you normalize a relation variable based on FDs you have to
> know its FDs, so you have to know its predicate. (If you didn't have
> a predicate in mind, why would you even have a relation variable?)
> So normalization cannot be expected to generate proper predicates. I
> agree a design process, incorporating normalization, should.

This is a somewhat odd line of reasoning. It is not true that the schema designer can only think in terms of predicates associated with the relvars of the original database schema. So normalization *can* be expected to generate new predicates that are not expressible in terms of the old ones, and in practice it often does. Of course you are free to redefine the process of normalization so that this is not true, but then I'm afraid you would become too theoretical in the bad sense of the word.
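
For what it's worth, the mechanical way in which new predicates appear can be seen in the decomposition step itself. Below is a naive BCNF split, again just a sketch (it reuses closure() and the street/city/zip fds from the first sketch above); every fragment it emits is a new relvar whose predicate the designer still has to interpret, and often refine:

    # Naive BCNF decomposition sketch: repeatedly split a schema on a
    # violating FD from the given list. Reuses closure() and fds above.

    def bcnf_decompose(schema, fds):
        fragments, done = [frozenset(schema)], []
        while fragments:
            r = fragments.pop()
            for lhs, rhs in fds:
                cl = closure(lhs, fds) & r
                if lhs <= r and not rhs & r <= lhs and cl != r:
                    # lhs determines cl within r but is not a key of r: split.
                    fragments += [frozenset(cl), frozenset((r - cl) | lhs)]
                    break
            else:
                done.append(r)
        return done

    # Street/city/zip again: out come two new relvars, (Z, C) and (Z, S),
    # neither of which carries the predicate the designer started from.
    print(bcnf_decompose("CSZ", fds))

The split on (r - cl) | lhs is the standard one; the point is just that the fragments, and hence their predicates, fall out of the FDs mechanically.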

The key question is still whether being information equivalent (rather than just being lossless) should be the main goal when redesigning your schema / normalizing. You seem to think it is, but in my experience this is almost never the case. So I'm curious to know what it is you are basing that claim on.

  Jan Hidders