Re: 4NF and 5NF

From: -CELKO- <jcelko212_at_earthlink.net>
Date: 4 Jan 2005 18:34:13 -0800
Message-ID: <1104892453.355647.121130_at_z14g2000cwz.googlegroups.com>


Fourth Normal Form (4NF) makes use of multi-valued dependencies. The problem it solves is that the table has too many of them. For example, consider a table of departments, their projects, and the parts they stock. The MVD's in the table would be:

department ->> projects

department ->> parts

Assume that department d1 works on jobs j1, and j2 with parts p1 and p2; that department d2 works on jobs j3, j4, and j5 with parts p2 and p4; and that department d3 works on job j2 only with parts p5 and p6. The table would look like this:

department job part



d1 j1 p1
d1 j1 p2
d1 j2 p1
d1 j2 p2
d2 j3 p2
d2 j3 p4
d2 j4 p2
d2 j4 p4
d2 j5 p2
d2 j5 p4
d3 j2 p5
d3 j2 p6

If you want to add a part to a department, you must create more than one new row.

Likewise, to remove a part or a job from a row can destroy information. Updating a part or job name will also require multiple rows to be changed.

The solution is to split this table into two tables, one with
(department, projects) in it and one with (department, parts) in it.
The definition of 4NF is that we have no more than one MVD in a table. If a table is in 4NF, it is also in BCNF.

Fifth Normal Form (5NF), also called the Join-Projection Normal Form or the Projection-Join Normal Form, is based on the idea of a lossless JOIN or the lack of a join-projection anomaly. This problem occurs when you have an n-way relationship, where n > 2. A quick check for 5NF is to see if the table is in 3NF and all the candidate keys are single columns.

As an example of the problems solved by 5NF, consider a table of house notes that records the buyer, the seller, and the lender:

HouseNotes
buyer seller lender


Smith     Jones     National Bank
Smith     Wilson    Home Bank
Nelson    Jones     Home Bank

This table is a three-way relationship, but because many CASE tools allow only binary relationships it might have to be expressed in an E-R diagram as three binary relationships, which would generate CREATE TABLE statements leading to these tables:

BuyerLender
buyer lender


Smith      National Bank
Smith      Home Bank
Nelson     Home Bank

SellerLender
seller lender



Jones National Bank
Wilson Home Bank
Jones Home Bank

BuyerSeller
buyer seller



Smith Jones
Smith Wilson
Nelson Jones

The trouble is that when you try to assemble the original information by joining pairs of these three tables together, thus:

SELECT BS.buyer, SL.seller, BL.lender
FROM BuyerLender AS BL,
SellerLender AS SL,
BuyerSeller AS BS
WHERE BL.buyer = BS.buyer
AND BL.lender = SL.lender
AND SL.seller = BS.seller;

you will recreate all the valid rows in the original table, such as
('Smith', 'Jones', 'National Bank'), but there will also be false rows,
such as ('Smith', 'Jones', 'Home Bank'), which were not part of the original table. This is called a join-projection anomaly.

There are also strong JPNF and overstrong JPNF, which make use of JOIN dependencies (JD for short). Unfortunately, there is no systematic way to find a JPNF or 4NF schema, because the problem is known to be NP complete.

As an aside, Third Normal Form is very popular with CASE tools and most of them can generate a schema where all of the tables are in 3NF. They obtain the FDs from an E-R (entity-relationship) diagram or from a statistical analysis of the existing data, then put them together into tables and check for normal forms.

The bad news is that it is often possible to derive more than one 3NF schema from a set of FDs. Most of CASE tools that produce an E-R diagram will find only one of them, and go no further. However, an ORM
(Object Role Model) tool properly, the schema will be in 5NF. I
suggest strongly that you get any of the books by Terry Halpin on this technique.

Ronald Fagin defined Domain/Key Normal Form (DKNF) in 1981 as a schema having all of the domain constraints and functional dependencies enforced. There is not yet a general algorithm that will always generate the DKNF solution given a set of constraints. We can, however, determine DKNF in many special cases and it is a good guide to writing DDL in the real world.

You also need to look up articles by Tom Johnston on non-normal form redundancy problems. Received on Wed Jan 05 2005 - 03:34:13 CET

Original text of this message