Re: Please help understanding normal forms

From: Roy Hann <rhann_at_globalnet.co.uk>
Date: Mon, 17 Nov 2003 21:34:33 +0000 (UTC)
Message-ID: <bpbet8$eti$1_at_sparta.btinternet.com>


"Ryan Gaffuri" <rgaffuri_at_cox.net> wrote in message news:1efdad5b.0311171212.4e8ec789_at_posting.google.com...
> Please let me know if Im correct and if not what Im missing.
>
> 1NF -- all tuples are atomic

Nope. All *attributes* are atomic, with respect to their domain. (No one seems to be quite sure if Codd knew quite what he meant by that at the time he first proposed the idea.)

> 2NF -- design allows transitive dependency?

Are you asking if 2NF allows transitive dependency? Yes it does; 2NF just means that all attributes are irreducibly dependent on the key.

> I have 3NF and BCNF in my book and I understand them.

I'll take your word for it.

> I dont understand the definitions of 4NF and 5NF. I cant find an
> english definition of them. The proof language is tough to follow. Can
> someone help me out?

Since you're evidently already looking at some formal treatment of the subject, perhaps a colloquial explanation will give you a toe-hold on 4NF and 5NF. 4NF means that there is no more than one independent multivalued attribute. For example if you imagine a table representing employee's skills, you might have three columns: employee number, language, and certificate. One employee might speak English and Chinese. So language is a multivalued attribute; the employee number multi-determines language. The same employee might also have a certificate in first aid, meat handling, and firearms safety. Hence the employee number also multidetermines certificate. This creates the risk of all sorts of update anomalies, which are eliminated by allowing only one multivalued attribute in the table.

5NF can be violated only when a certain kind of constraint exists. Suppose you want to represent the skills that employees supply to projects, and you model that information with a three attribute table. It is "obvious" that if an employee has a skill then they could supply that skill to any project they are assigned to (i.e. there is a rule that allows us to know what what rows to expect). Suppose that there are two rows in the table:

   HR-project     andy     SQL
   payroll        bob      Cobol

Now suppose that someone adds the row:

  payroll andy Cobol

This row implies that andy knows Cobol, so there should be another row showing that andy can supply Cobol skills to the HR project too (since--regardless of how much he'd like to--he doesn't forget his Cobol when he walks into the HR project's cubicles). The table design fails to prevent the absence of the row that says he can supply Cobol to the HR. By decomposing the table to eliminate join dependencies it becomes impossible to represent this kind of incomplete information because the join of the decomposed tables will automatically include all the necessary rows.

Now that was a bunch of hand-waving, but maybe it will give you a more intuitive grasp of what 4NF and 5NF is all about, as you read the "proof language".

> There is a 6th and 7th NF too right? Is that it?

There is a 6NF. 6NF means there is a key and at most one non-key attribute. There is no 7NF.

Roy Hann Received on Mon Nov 17 2003 - 22:34:33 CET

Original text of this message