# Re: Constraints and Functional Dependencies

Date: Sat, 24 Feb 2007 16:35:33 +0100

Message-ID: <45e05ad5$0$328$e4fe514c_at_news.xs4all.nl>

Hi Marshall,

> With such a system, a relation R with attribute a (which I will

*> write as R(a)) having a as a foreign key into S(b) is expressed
**> as follows:
**>
**> forall R(a): exists S(b): a = b
*

Just to be sure notationally: the first colon reads ' it is true that ', the second ' such that it is true that ', right?

*> So we can express foreign keys this way.
**>
*

> In the context of relations with named attributes, it is not necessary

*> to declare or make up logic variable names, the way we have to when we
**> are using sets of ordered pairs. We can use the existing attribute
**> names as the names of the logic variables. However that raises the
**> question of what happens when we want to quantify over an attribute
**> more than once in a given formula. In that case let a primed attribute
**> name be considered a second instance of a logic variable quantified
**> over the attribute.
*

> So if we want to say that for every row of R

*> with attribute a there exists a row of R with an unequal value
**> for attribute a, we can say:
**>
**> forall R(a): exists R(a'): a != a'
*

This looks like setting up a (unintended) trap of mixing attributes
and attribute values.

Why not prime the first instance^W^W^W all instances?

forall R(a): exists R(a'), exists R(a''): a' != a''
?

(saying that R has at least two rows with different
values for a, just like above.)

> The way quantification is usually expressed we can only introduce

*> one quantified variable at a time, however it makes sense to relax
**> this syntactic restriction when dealing with relations with
**> named attributes. We can introduce quantified variables matching
**> many attributes at once. However we have to be aware that when
**> we do so, they are all quantified the same way, and they are
**> all considered to come from the same row.
**>
**> What about candidate keys? Suppose we have a relation R with
*

... only the ...

> *sets* of attributes A and B: R{A, B}. Again, A is a set of

*> attributes, possibly just one or even none; likewise B. Then
**> if we can express the functional dependency:
**>
**> A -> B
**>
**> that is the same as expressing that A is a candidate key.
*

(if R has no attributes outside A u B)

> How do we express functional dependencies as FOL formulas?

*> Let's start with a simple case, where A and B have one
**> attribute each, a and b.
**>
**> R(a, b)
**> {a} -> {b} =def=
**> forall R(a, b): forall R(a', b'): a = a' => b = b'
**>
**> " For all rows (a,b) in R, and for all rows (a', b') in R,
**> if a = a', then b = b' "
**>
**> How does that look if R has many columns?
**>
**> R{A, B}
**> A = {a1, ... an}
**> B = {b1, ... bn}
**>
**> A -> B =def=
**> forall R(a1, ... an, b1, ... bn)
**> forall R(a1', ... an', b1', ... bn')
**> (a1 = a1' and ... and an = an') =>
**> (b1 = b1' and ... and bn = bn')
**>
**> Ugh, I wish that were simpler. But conceptually it's not too
**> complex.
*

So, let's have a notational shorthand: A' is an instance of A.

A = {a1, ... an}

A' = {a1', ... an'}

So it becomes:

R{A, B}

A = {a1, ... an}

B = {b1, ... bn}

A -> B =def=

forall ( R(A', B'), R(A'', B'') ): (A'=A'') => (B' = B'')

> Starting from the earlier a = a' => b = b' case,

*> we have simply expanded the a = a' on the left to a big
**> conjunction of a = a' clauses for each attribute in A, and
**> done the same thing on the right for the attributes in B.
**>
**> It's always interesting and enlightening to consider the zero-case.
**> Let's look at the B = {} case, and see how it turns out.
**>
**> R{A, B}
**> A = {a1, ... an}
**> B = {}
**>
*

*> A -> B =
**> forall R(a1, ... an)
*

> forall R(a1', ... an')

> (a1 = a1' and ... and an = an') => ()

would become

A -> B =

forall R(A'), forall R(A''): A'=A'' => ()

*>
*

> Hmm, what's that () doing? It's an empty conjunction, so we can

*> replace it with true, which is the identity for and. But what does
**> "implies true" really mean? Let's transform it.
**>
**> We know that (P => Q) <=> (!P or Q), so
**>
**> forall R(a1, ... an)
**> forall R(a1', ... an')
**> !(a1 = a1' and ... and an = an') or true
**>
**> So we have a condition that ends with "or true" which is to say
**> it is always true! In other words, the constraint collapses away
**> and we are left with nothing. Which, when you think about it, is
**> exactly what you would want: if A -> B, and B is empty, that just
**> means that A is unique, and that's true by the definition of
**> relation. So there shouldn't be anything to check.
*

Nice!

*> What if A = {}?
**>
**> R{A, B}
**> A = {}
**> B = {b1, ... bn}
**>
**> A -> B =
*

> forall R(b1, ... bn)

*> forall R(b1', ... bn')
**> () => (b1 = b1' and ... and bn = bn')
**>
**> transforming that last line:
**>
**> !true or (b1 = b1' and ... and bn = bn')
**>
**> simplifying:
**>
**> b1 = b1' and ... and bn = bn'
**>
**> What does that mean? For every two rows of R, every attribute
**> is the same. That is to say, R has at most one row. Which again
**> is exactly what you would expect from a uniqueness constraint
**> on zero attributes.
*

Very nice :-) Received on Sat Feb 24 2007 - 16:35:33 CET