Re: Functional Dependencies > Uniqueness Constraints
Date: 4 Sep 2006 12:37:16 -0700
Message-ID: <1157398636.460809.244640_at_p79g2000cwp.googlegroups.com>
Jan Hidders wrote:
> Marshall wrote:
> > >
> > > Obviously a DBMS that can express general FDs can express more than one
> > > that can just express CKs, except if you alway normalize to BCNF. But
> > > that doesn't mean there shouldn't be a special notion of CK.
> >
> > Special notion or special notation?
>
> Both. Having this notion makes things easier for the user (DKs are
> easier to understand for most people than FDs)
> and for the DBMS (CKs
> are important for indexes and are easier to maintain then general FDs).
Sure, but in general optimizing for ease of implementation is the wrong choice in code that is going to be the foundation for lots of other code. The same thing is true with library code; good library code writers understand that if they can make things a little easier for client code by making things a lot harder for themselves, that's a good tradeoff.
> > > Determining if a set of fields is a CK on the basis of the known FDs is
> > > actually an NP-complete problem, so if the user is so kind to indicate
> > > them explicitly that is useful. :-)
> >
> > Hmmm. That doesn't seem right to me. (Although in an issue like
> > this, were I a betting man, my money would be on you.)
>
> You can look it up in Garey and Johnson under the list of NP-complete
> problems, along with checking whether a relation is in BCNF and if a
> field is in one of the candidate keys.
>
> > Wait, isn't it the case that all we really care about is *whether* we
> > have a candidate key?
>
> There is always one, by definition.
Yes, in a well-formed relation, there is always at least one candidate key. Now when they user gives us a table definition, we have to check whether it is well-formed. Which is to say, what we really care about is *whether* there is candidate key.
> > [...] Further, the number of FDs and
> > the number of attributes are typically going to be pretty small
> > numbers, wouldn't you think?
>
> Often they will, sometimes they won't.
What's the most you've seen in the field in a base table? My sense is the breakdown is roughly:
90%: 1 key
99%: 2 keys or fewer
1%: more than 2 keys
I'm trying to think of a table in any database at work that has 3 uniqueness constraints. No luck so far. Maybe I'll do a survey.
> > Perhaps I'm missing something, but I don't see the hard
> > in here. Maybe I should try to code it up.
>
> Pleas do. If you come up with a PTIME algorithm you would have proven
> P=NP.
The reason I couldn't see the hard at first is because the algorithms are easy and the inputs are tiny. If your worst-case scenario is n! but n is never > 4 then that's perfectly fine. There is ample precedent. For example, the SML type system is n^2 in the worst case but much more tightly bounded in all practical cases. The C++ template system is Turing-complete, so technically cannot even be said to necessarily terminate; this is addressed in practice by limiting the maximum recursion depth templates can use. We could likewise limit the number or complexity of FDs if necessary. Again, in base tables we would be surprised to see even 4. Or perhaps the better choice is to record FDs but require at least one of them to be a key; this eases our checking burden and enforces normalization.
Marshall Received on Mon Sep 04 2006 - 21:37:16 CEST