Re: Getting Normal Forms *wrong*. Is there actual disagreement on what 1NF > 3NF means or is this sloppiness or ignorance on authors' parts?

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Fri, 07 Aug 2009 14:41:09 -0300
Message-ID: <4a7c6737$0$23786$9a566e8b_at_news.aliant.net>


dana wrote:

> Is there "genuine" disagreement on what 1NF, 2NF and 3NF mean? Or is
> the variety of contradictory in-print definitions of Normal Forms due
> to sloppy thinking, slopping writing, or plain old ignorance from
> authors?

Yes.

> Also, where can I find an accurate, concise (less than 3
> sentences for each NF), and easy-for-any-IT-person-to-understand,
> definition of Normal Forms 1 through 3?

http://filedb.experts-exchange.com/incoming/2009/02_w06/102071/rettigNormalizationPoster.pdf

> My understanding of Normal Forms 1 through 3 is as follows. I look
> forward to being corrected--but hopefully in a polite, constructive,
> professional manner. I'm leaving out any mention of relations and
> relvars here, because I don't expect the typical IT person to know
> what those are (although it can be argued, convincingly I think, that
> they should):
>
> 1NF:
> 1) No multivalued attributes (e.g. every attribute in every
> tuple should be atomic)
> 2) No repeating groups of attributes are allowed (e.g.
> SKILL_1, SKILL_2, SKILL_n are out)

A repeating group is basically a synonym for a multivalued attribute. Whether SKILL_1, SKILL_2 etc. comprise a repeating group depends on what SKILL_1, SKILL_2 etc. are.

If SKILL_1 is always "reading" and SKILL_2 is always "horseback riding", for example, then they do not violate 1NF. If one is trying, on the other hand, to construct an array type where none exists such that sometimes SKILL_1 is "reading" and sometimes it is "horseback riding", then that is a problem. Some might argue it still does not violate 1NF and is just an incoherent design.

> Many common definitions I've read in books on database design miss
> point #1 completely. Is that because #1 is not part of 1NF, because
> the authors are "protecting" the reader, or because of author
> ignorance on what it is for something to be in 1NF?

I suspect it is just a little misunderstanding regarding the terms used.

> 2NF:
> 1) Must already be in 1NF
> 2) Every non-key attribute must rely exclusively on the
> key, e.g. no Functional Dependencies.

Statement 2) makes no sense to me. 1NF requires every non-key attribute is functionally dependent on each candidate key.

> Here's something I really need clarification on. Is it the case, or is
> it *not* the case, that 2NF "only applies" in cases where there is a
> composite key?

It is the case.

> This is my understanding, but it may be dead wrong. If
> so, I need to know sooner or later. I have seen this point glossed
> over or missed by various authors, some even using the words "fully
> dependent" when perhaps they should have used "Functionally
> Dependent"

Each non-key attribute has to be functionally dependent on the entire key -- not just part of a composite key. "Fully dependent" seems like a shorthand for that--perhaps an ill-advised shorthand.

> ; meaning they either don't know what a functional
> dependency is, or they're trying to "protect" the reader (and that may
> or may not be legit given the context and one's proclivities).

I suggest you stop and review what functional dependency means. Normalization is all about functional dependencies, and I suspect you would understand normalization better if you understood functional dependency better.

> 3NF:
>
> 1) Must already be in 2NF
> 2) No transitive dependencies. All non-key attributes must
> rely exclusively on the key.
>
> I would like some concise and clear definitions of functional and
> transitive dependencies, because 2NF and 3NF sound similar to me.

Functional dependency is closely related to the logical operation called "implication", and the same symbol is generally used for both. ie. the arrow.

B is functionally dependent on A if for every instance of the same value of A, one has the same value of B. Thus, any given value of A implies a specific value of B.

Note that two different values of A might imply the same value of B so the relation is not symmetric. B does not necessarily imply A when A implies B.

While not symmetric, implication and functional dependency are transitive. If B implies C and A implies B, then A implies C too.

A->B->C means A->C

In the latter case, the functional dependency is transitive. In a sense, A only implies C indirectly.

> Pet peeve:
>
> I dislike the mantra, "the key, the whole key, and nothing but the
> key" to characterize Normal Forms 1 through 3. What is evidently
> intended as a helpful mantra seems to give a false sense of
> understanding. I'm all about some mnemonics, but understanding must
> preceed the mnemonic, right?

It's not so much a mantra as a very accurate description of what the normal forms mean. 1NF means every non-key attribute depends on the key (and not some array index instead.) 2NF means every non-key attribute depends on the whole key (not just part of a composite key). 3NF means every non-key attribute depends on the key but not on any non-key attribute(s).

The only part of the mnemonic that is shaky is the last part. If there are multiple candidate keys, of course, the non-key attributes depend on each of the candidate keys-not just a single key.

> Finally, what are some good books out there on database design for
> neophytes and experienced folks alike that gets the theory correct
> while not getting mired in what relational databases *could* be. I
> read a really great book by Fabian Pascal, the title of which escapes
> me, which I need to go and buy--believe it had the word Practitioner
> in it.

The one you are thinking of is Fabian's /Practical Issues in Database Management: A Reference for the Thinking Practitioner/

His 2nd book is good too. I have never read his 1st book.

Every text I have found has some annoying errors, but many are still useful for learning 1st principles. I originally learned from one of Teorey's books.

> I want to be faithful to relational theory as much as possible while
> getting work done with products currently on the market. I believe
> Fabian Pascal stated that theory was immensely important for practice;
> that it was not about a bunch of curmudgeonly academics bickering
> about how many angels can stand on the head of a pin. Is he correct?

He is correct. Of course, he's correct. I don't know why that would surprize anyone or why anyone would have to ask.

Would you want to fly in an airplane designed by someone who was ignorant of the theories of fluid dynamics, statics or newton's laws? You might end up in an airplane with no usable control surfaces, with wings that fall off in mid-flight, or with no way on landing to stop before the end of the runway.

> While it's essential that methodologists and theorists are out there
> visioning what databases can or should be, practitioners have to get
> projects completed with existing resources.

How will they use the resources if they don't know how to use them? Received on Fri Aug 07 2009 - 19:41:09 CEST

Original text of this message