Re: Functional Dependencies > Uniqueness Constraints

From: Marshall <marshall.spight_at_gmail.com>
Date: 30 Aug 2006 07:00:20 -0700
Message-ID: <1156946420.708493.121420_at_m79g2000cwm.googlegroups.com>


Jon Heggland wrote:
> Marshall wrote:
> > Okay, fine. I propose that a good relational DBMS should
> > not have any special feature for enforcing uniqueness constraints.
> > Instead, it should be able to record and enforce functional
> > dependencies.
>
> I'd rather say that an RDBMS should be able to reason about FDs.
> Uniqueness constraints are convenient, and conceptually simpler than
> FDs, and half the point of normalisation is to enable all FDs to be
> specified as uniqueness constraints / keys.

Even accepting that all these things are true, it is also the case that if you don't have FDs explicitly, then you are losing some information and some chances for optimization. It is also true that a system that supported FDs explicitly could express everything that one that had only uniqueness could; in other words, one concept is strictly more powerful than the other, and I propose there is a language design principle that says that in that situation, you should not include the weaker concept.

> But what exactly do you mean? That the DDL of the DBMS shouldn't include
> the concept of uniqueness constraints?

That the kernel language shouldn't support them. If the user-visible level has syntactic shorthand for them, that's not an issue.

> And/or that the DBMS shouldn't
> bother optimising the enforcement of FDs that correspond to keys (i.e.
> where the closure is the entire relvar heading)?

I certainly don't mean that.

> > (And of course there must be a rule that
> > says every base table must have at least one functional
> > dependency in which the union of the determinant set
> > and the dependent set equals the set of attributes. (This
> > restriction is sufficient to ensure every base table is a
> > relation; is it necessary?))
>
> No more (or less) necessary than a rule that any relvar must have at
> least one key specified.
>
> > Why? Because uniqueness constraints by themselves
> > don't capture enough meaning.
> >
> > Example:
> >
> > Relation R1(a,b) with FD {a} -> {b}
> > Relation R2(c,d) with FD {c} -> {d}
> >
> > Derived relation J = R1 join R2
> > (this join is a cross product)
> >
> > FDs in J:
> > {a} -> {b}
> > {c} -> {d}
> >
> > What if we had just recorded uniqueness constraints in R1
> > and R2 (on a and c respectively?) We can't express anything
> > derived from those uniqueness constraints in J, at least not
> > as uniqueness constaints anyway.
>
> Yes, we can. Dataphor, for example, derives that the/a key for J is { a,
> c }.

Cool.

> In any case, from the posited uniqueness constraints on R1 and R2,
> we can infer the FDs {a} -> {b} and {c} -> {d}. Now, I agree that the
> DBMS should be able to infer that these FD still hold in J. (Dataphor
> does not do this, afaik, with the result that its key inference is
> incomplete/wrong/weak for more complex expressions.) But to discard the
> concept of uniqueness constraints is to throw the baby out with the
> bathwater---it seems detrimental to both usability and performance. The
> ability to specify constraints in the form of FDs might be useful, but
> marginally so---in most cases, keys are both sufficient (due to
> normalisation) and simpler; and in the cases where they aren't, FDs can
> be expressed as database constraints without special syntax.
>
> To sum up: A good RDBMS should be able to infer FDs from keys /
> uniqueness constraints (and inclusion dependencies), and reason about
> them; in particular, to infer FDs (and thus keys) for arbitrary relation
> expressions, e.g. in order to perform semantic optimisation.

Interesting. I'll have to think about that.

One side comment: I note that the use case for a means of abstracting constraints seems to come up repeatedly.

Marshall Received on Wed Aug 30 2006 - 16:00:20 CEST

Original text of this message