Re: BCNF: superkey or candidate key ?
Date: 27 Sep 2006 03:00:05 -0700
Message-ID: <1159351205.410007.70750_at_h48g2000cwc.googlegroups.com>
Jan Hidders wrote:
> David Cressey wrote:
> >
> > Here's what's behind my question. relational tables with at most 1 non-key
> > column is where the great debate about NULLs becomes moot. If you want to
> > leave the non-key column NULL, then just omit the row. Of course, you need
> > another table with just the primary key to keep track of which of the
> > possible primary keys are in existence.
>
> Why the restriction to at most 1 non-key column? It doesn't matter how
> many non-key columns there are: you can always apply that
> transformation if the nullable column is not part of a candidate key.
>
> One might even argue that in theory the restriction to non-key columns
> is not really necessary. In that case you just need to introduce a more
> complex database constraint to replace the candidate key constraint.
That final statement of mine is actually a bit misleading, it is more the FDs that arrive in the nullable column that you should worry about, so let me clarify. An example:
R(A, B, C, D) with FDs AB->C and CD->B, and C nullable
Observe that the CKs are AB and AC. You can split this in two ways:
Split 1: R1(A, B, C, D) contains all tuples of R where C is not null,
and R2(A, B, D) contains all tuples in R projected on {A, B, D} where
C is null
Split 2: R1(A, B, C, D) contains all tuples of R where C is not null,
and R2(A, B, D) contains all tuples in R projected on {A,B}
Let's consider Split 1 first:
What is now the equivalent set of constraints? Clearly we should still
have: - R1: AB->C - R1: CD->B
For R2 we have:
- R2: D->B
We still need to ensure AB->C globally:
- R1[A,B] DISJOINT R2[A,B] (i.e. the projections on {A,B} share no tuples)
And voila, an equivalent schema.
On to Split 2:
We need:
- R1: AB->C - R1: CD->B - R1[A,B] SUBSET R2[A,B] - the FD D->B holds for (R2[A,B,D] MINUS R1[A,B,D])And again we have an equivalent schema.
Both splitting methods can be generalized. So you see that the price of splitting off a nullable key-columns comes either in the form of exclusion dependencies or embedded functional dependencies.
- Jan Hidders