# Re: Codd's Information Principle

Date: Sat, 31 Oct 2009 13:03:26 -0400
Message-ID: <1P6dnduaO7VD8HHXnZ2dnUVZ_tmdnZ2d_at_giganews.com>

> On Oct 29, 7:40 pm, "Mr. Scott" <do_not_re..._at_noone.com> wrote:
> >Suppose that a database embodies the
> > following sentence,
> > forall x forall y forall z Pxy \/ Qxz
> > The database has two tables, let's call them P and Q, with predicates
> > Pxy
> > and Qxz respectively.
>
> I don't understand what you are trying to say.
> The variable predicates and values determine the database proposition.
> If the predicates for variables P and Q are Pxy and Qxz
> then by definition the database proposition is (ie the database states
> that)
> (AND [for <x, y> in P] Pxy) AND
> (AND [for <x, y> typed by but not in P] ~Pxz) AND
> (AND [for <x, z> in Q] Qxz) AND
> (AND [for <x, z> typed by but not in Q] ~Qxz)
> (The ANDs with fors mean standard math series notation.)

Not so much, but this is an understandable misconception. The information content of the database is the logical sum (disjunction) of the propositions represented by the rows in the database. Under both the closed world and open world interpretations, only those propositions that are judged to be true are represented by rows. The truth value of the sum of just those propositions is the same as the truth value of the sum of all consistent propositions. That cannot be said if the logical connective is AND instead of OR.

closed world:
true \/ false = true
true /\ false = false

open world:
true \/ unknown = true
true /\ unknown = unknown
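
To make the tables above concrete, here is a rough sketch in Python. It assumes Kleene-style three-valued connectives and uses None to stand for "unknown"; the names or3, and3, logical_sum and logical_product are my own, purely for illustration. It compares the sum and the product over just the represented (true) propositions with the sum and the product over a larger set that also contains unrepresented ones.

```python
from functools import reduce

UNKNOWN = None  # assumption: None models the open-world "unknown"

def or3(a, b):
    """Three-valued OR: true dominates, then unknown, then false."""
    if a is True or b is True:
        return True
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return False

def and3(a, b):
    """Three-valued AND: false dominates, then unknown, then true."""
    if a is False or b is False:
        return False
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return True

def logical_sum(values):
    # Disjunction of all values; False is the identity for OR.
    return reduce(or3, values, False)

def logical_product(values):
    # Conjunction of all values; True is the identity for AND.
    return reduce(and3, values, True)

# Propositions represented by rows: all judged true.
represented = [True, True]
# The same propositions plus some that are not represented by rows.
closed_world = represented + [False, False]      # absent means false
open_world = represented + [UNKNOWN, UNKNOWN]    # absent means unknown

# The sum is unaffected by the unrepresented propositions.
print(logical_sum(represented), logical_sum(closed_world), logical_sum(open_world))
# The product is not.
print(logical_product(represented), logical_product(closed_world), logical_product(open_world))
```

With OR, all three sums come out true. With AND, the product drops to false under the closed world and to unknown under the open world once the unrepresented propositions are included.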

> This is equivalent to
> (forall <x, y> in P Pxy) AND
> (forall <x, y> typed by but not in P ~Pxz) AND
> (forall <x, z> in Q Qxy) AND
> (forall <x, x> typed by but not in Q ~Qxz)
> I can't make much sense of what you're writing, but it seems to
> be inconsistent with this.
>
> > The
> > dependencies defined on the database also have an impact.
>
> The dependencies are simply things that are true of the values that
> of P and Q will simultaneously hold. Their tuples are determined by
> their predicates and the way the world is.
> So there's no effect to adding them to the database proposition;
> they're always true.

Are you saying that there is no point in defining constraints?

>
> > inclusion dependency from P[x] to Q[x].
> >forall x forall y forall z Pxy iff Qxz
>
> If P's xs must be in Q then
> forall x forall y (Pxy -> exists z Qxz)

Again, not so much. Pxy and Qxz have x in common. Consequently, to satisfy Pxy -> Qxz, there cannot be a true instance of Pxy without a true instance of Qxz that has the same value for x, but there can be an instance of Qxz without a corresponding instance of Pxy. Similarly, to satisfy Pxy iff Qxz, there cannot be an instance of Pxy without a corresponding instance of Qxz, and there cannot be an instance of Qxz without a corresponding instance of Pxy. But for Pxy \/ Qxz, there can be an instance of Pxy without a corresponding instance of Qxz, and there can be an instance of Qxz without a corresponding instance of Pxy.
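
A small sketch in Python may help show the three cases described above. It only models the informal reading given in this post, where "corresponding" means "has the same value for x"; it is not a formal semantics for the formulas, and the helper names and sample rows are hypothetical.

```python
def x_values(rel):
    """The set of x values appearing in a relation of (x, other) rows."""
    return {row[0] for row in rel}

def unmatched(left, right):
    """x values in `left` that have no corresponding row in `right`."""
    return x_values(left) - x_values(right)

def satisfies_implication(P, Q):
    # Pxy -> Qxz: no P row without a corresponding Q row.
    return not unmatched(P, Q)

def satisfies_iff(P, Q):
    # Pxy iff Qxz: no unmatched rows in either direction.
    return not unmatched(P, Q) and not unmatched(Q, P)

# Hypothetical sample rows, made up for illustration.
P = {(1, "a"), (2, "b")}
Q_loose = {(1, "u"), (3, "v")}            # x=2 only in P, x=3 only in Q
Q_superset = {(1, "u"), (2, "w"), (3, "v")}  # covers P, has an extra x=3
Q_exact = {(1, "u"), (2, "w")}            # same x values as P

for name, Q in [("loose", Q_loose), ("superset", Q_superset), ("exact", Q_exact)]:
    print(name,
          "P-only:", sorted(unmatched(P, Q)),
          "Q-only:", sorted(unmatched(Q, P)),
          "->:", satisfies_implication(P, Q),
          "iff:", satisfies_iff(P, Q))
```

Under the reading above, the "loose" pair of tables is ruled out by both -> and iff, the "superset" pair is allowed by -> but not by iff, and the "exact" pair is allowed by both. The disjunction places no correspondence requirement, so unmatched rows in either direction are tolerated in all three cases.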

> but that's not equivalent to what you wrote.
>
> I don't know what you mean by "embodies". Implies? Is thus
> constrained?

"Embodies" is easier to write than "is an arbitrary member of the set of all models of."

A sentence is a well-formed formula with no free variables.

Not "query result" but rather "what can be queried." I thought the context of my use of the term "database" was sufficient to disambiguate between the different senses of the word. I guess I was wrong. Sorry.