Re: BCNF: superkey or candidate key ?

From: Jan Hidders <hidders_at_gmail.com>
Date: 27 Sep 2006 03:00:05 -0700
Message-ID: <1159351205.410007.70750_at_h48g2000cwc.googlegroups.com>


Jan Hidders wrote:
> David Cressey wrote:
> >
> > Here's what's behind my question. relational tables with at most 1 non-key
> > column is where the great debate about NULLs becomes moot. If you want to
> > leave the non-key column NULL, then just omit the row. Of course, you need
> > another table with just the primary key to keep track of which of the
> > possible primary keys are in existence.
>
> Why the restriction to at most 1 non-key column? It doesn't matter how
> many non-key columns there are: you can always apply that
> transformation if the nullable column is not part of a candidate key.
>
> One might even argue that in theory the restriction to non-key columns
> is not really necessary. In that case you just need to introduce a more
> complex database constraint to replace the candidate key constraint.

That final statement of mine is actually a bit misleading, it is more the FDs that arrive in the nullable column that you should worry about, so let me clarify. An example:

R(A, B, C, D) with FDs AB->C and CD->B, and C nullable

Observe that the CKs are AB and AC. You can split this in two ways: Split 1: R1(A, B, C, D) contains all tuples of R where C is not null, and R2(A, B, D) contains all tuples in R projected on {A, B, D} where C is null
Split 2: R1(A, B, C, D) contains all tuples of R where C is not null, and R2(A, B, D) contains all tuples in R projected on {A,B}

Let's consider Split 1 first:

What is now the equivalent set of constraints? Clearly we should still

have:
- R1: AB->C
- R1: CD->B

For R2 we have:
- R2: D->B
We still need to ensure AB->C globally:
- R1[A,B] DISJOINT R2[A,B] (i.e. the projections on {A,B} share no tuples)
And voila, an equivalent schema.

On to Split 2:

We need:

- R1: AB->C
- R1: CD->B
- R1[A,B] SUBSET R2[A,B]
- the FD D->B holds for (R2[A,B,D] MINUS R1[A,B,D])
And again we have an equivalent schema.

Both splitting methods can be generalized. So you see that the price of splitting off a nullable key-columns comes either in the form of exclusion dependencies or embedded functional dependencies.

  • Jan Hidders
Received on Wed Sep 27 2006 - 12:00:05 CEST

Original text of this message