# Re: Constraints and Functional Dependencies

Date: Sat, 24 Feb 2007 15:55:47 GMT
Message-ID: <7eZDh.684\$PV3.10430_at_ursa-nb00s0.nbnet.nb.ca>

Marshall wrote:

> I hope no one will mind too much if I take a break from our
> usual sort of discussion and talk about database theory for
> a bit. I feel like free associating about functional dependencies.
>
> I like to consider a system under which there is only one type
> of constraint: a formula of first-order logic. That is to say,
> there are no uniqueness constraints, foreign keys, primary keys,
> etc., except as such can be expressed as FOL formulas.
>
> In a data management context, there is some value to restricting
> what we can quantify over as being only attributes of declared
> relations (whether variables or constants.) So, we can't express
> the "no-upper-bound" property of the natural numbers; they aren't
> a database table so we can't quantify over them.
>
> With such a system, a relation R with attribute a (which I will
> write as R(a)) having a as a foreign key into S(b) is expressed
> as follows:
>
> forall R(a): exists S(b): a = b
>
> So we can express foreign keys this way.
>
> In the context of relations with named attributes, it is not necessary
> to declare or make up logic variable names, the way we have to when we
> are using sets of ordered pairs. We can use the existing attribute
> names as the names of the logic variables. However that raises the
> question of what happens when we want to quantify over an attribute
> more than once in a given formula. In that case let a primed attribute
> name be considered a second instance of a logic variable quantified
> over the attribute. So if we want to say that for every row of R
> with attribute a there exists a row of R with an unequal value
> for attribute a, we can say:
>
> forall R(a): exists R(a'): a != a'
>
> The way quantification is usually expressed we can only introduce
> one quantified variable at a time, however it makes sense to relax
> this syntactic restriction when dealing with relations with
> named attributes. We can introduce quantified variables matching
> many attributes at once. However we have to be aware that when
> we do so, they are all quantified the same way, and they are
> all considered to come from the same row.
>
> What about candidate keys? Suppose we have a relation R with
> *sets* of attributes A and B: R{A, B}. Again, A is a set of
> attributes, possibly just one or even none; likewise B. Then
> if we can express the functional dependency:
>
> A -> B
>
> that is the same as expressing that A is a candidate key.
> How do we express functional dependencies as FOL formulas?
> Let's start with a simple case, where A and B have one
> attribute each, a and b.
>
> R(a, b)
> {a} -> {b} =def=
> forall R(a, b): forall R(a', b'): a = a' => b = b'
>
> " For all rows (a,b) in R, and for all rows (a', b') in R,
> if a = a', then b = b' "
>
> How does that look if R has many columns?
>
> R{A, B}
> A = {a1, ... an}
> B = {b1, ... bn}
>
> A -> B =def=
> forall R(a1, ... an, b1, ... bn)
> forall R(a1', ... an', b1', ... bn')
> (a1 = a1' and ... and an = an') =>
> (b1 = b1' and ... and bn = bn')
>
> Ugh, I wish that were simpler. But conceptually it's not too
> complex. Starting from the earlier a = a' => b = b' case,
> we have simply expanded the a = a' on the left to a big
> conjunction of a = a' clauses for each attribute in A, and
> done the same thing on the right for the attributes in B.
>
> It's always interesting and enlightening to consider the zero-case.
> Let's look at the B = {} case, and see how it turns out.
>
> R{A, B}
> A = {a1, ... an}
> B = {}
>
> A -> B =
> forall R(a1, ... an)
> forall R(a1', ... an')
> (a1 = a1' and ... and an = an') => ()
>
> Hmm, what's that () doing? It's an empty conjunction, so we can
> replace it with true, which is the identity for and. But what does
> "implies true" really mean? Let's transform it.
>
> We know that (P => Q) <=> (!P or Q), so
>
> forall R(a1, ... an)
> forall R(a1', ... an')
> !(a1 = a1' and ... and an = an') or true
>
> So we have a condition that ends with "or true" which is to say
> it is always true! In other words, the constraint collapses away
> and we are left with nothing. Which, when you think about it, is
> exactly what you would want: if A -> B, and B is empty, that just
> means that A is unique, and that's true by the definition of
> relation. So there shouldn't be anything to check.
>
> What if A = {}?
>
> R{A, B}
> A = {}
> B = {b1, ... bn}
>
> A -> B =
> forall R(b1, ... bn)
> forall R(b1', ... bn')
> () => (b1 = b1' and ... and bn = bn')
>
> transforming that last line:
>
> !true or (b1 = b1' and ... and bn = bn')
>
> simplifying:
>
> b1 = b1' and ... and bn = bn'
>
> What does that mean? For every two rows of R, every attribute
> is the same. That is to say, R has at most one row. Which again
> is exactly what you would expect from a uniqueness constraint
> on zero attributes.
>
> Questions, comments, corrections, criticisms? Anyone have any
> alternate formulas for functional dependencies or candidate
> keys?

forall R(a1,...an): a1 in d1 and ... an in dn

I don't know why you think one cannot express unboundedness--not that the constraint is meaningful for finite computers.

forall R(a): exists R(a'): a' = a + 1

or

forall R(a): exists R(a'): a' > a Received on Sat Feb 24 2007 - 16:55:47 CET

Original text of this message