Re: TRUE and FALSE values in the relational lattice

From: Jan Hidders <hidders_at_gmail.com>
Date: Thu, 21 Jun 2007 18:57:41 -0000
Message-ID: <1182452261.505811.48150_at_n2g2000hse.googlegroups.com>


On 21 jun, 16:12, Marshall <marshall.spi..._at_gmail.com> wrote:
> On Jun 21, 3:14 am, Jan Hidders <hidd..._at_gmail.com> wrote:
>
>
>
> > On 19 jun, 22:32, Marshall <marshall.spi..._at_gmail.com> wrote:
>
> > > I am thinking about formalizing the lattice logic. That would
> > > involve a formal statement of the axioms and the rules of
> > > inference. We would then want a soundness proof, and
> > > a proof of consistency. We would also want to understand
> > > how complete such a logic would be, and how strong it
> > > was relative to FOL. And how decidable. My hope is
> > > that it is decidable even if that means it is weaker than
> > > FOL. We also have to think about what else besides
> > > & and | we're putting in to it. Clearly 00 is a strong
> > > candidate. I'm currently thinking that the negation
> > > operator I defined a week or two ago is *not* a good idea.
> > > The scary one is division/group by.
>
> > I would assume the following operations: (in yet another notation)
> > - E * E : natural join
> > - E + E : generalized union
> > - {()} : the relation with empty header and a single empty tuple
> > - [A,B,..D] : the empty relation with header {A, B, ..., D} (possibly
> > the empty set)
> > - A=B : the equality relation with header {A, B}
> > - A=c : the relation with header {A} and tuple (A:c)
> > - R : a relation with name R (and known header)
>
> > That would be roughly equivalent with unions of conjunctive queries,
> > for which equivalence is known to be decidable.
>
> Hmm. But without generalized distributivity, we can't always
> reduce an expression to conjunctive normal form nor
> disjunctive normal form.

Correct, but that is a matter of finding enough algebraic identities. I'm fairly certain I can come up with a set of identities that allows you to rewrite to a normal form r_1 + ... + r_n with the r_i's in the following syntax:

  E ::= (E + [H]) | (E * E) | [H] | (A=B) | (A=c) | R

where H is a set of attribute names, A and B are attribute names, c a constant and R a relation name. Note that you need the E+[H] to do the projection, so you cannot remove all +'s.

> > Adding the set difference or division
> > would make it relationally complete (so, FOL)
> > and make this undecidable.
>
> I would be interested to have the axioms for division
> especially.

Me too. :-) Finding a simple finite algebraic complete axiomatization for FOL is an open problem.

> > Anyone interested in thinking about a paper on that?
>
> Yes.

Ok. But if I'm going to invest time in this I want this to be a focused effort. The goal has to be to find a complete axiomatization, and nothing else. Otherwise I'm afraid it will cost too much time and not lead to any usable (= publishable in real database journal) results. I hope you will forgive me for being a bit bossy.

Right, so let's starts with a small hors d'ouvre. Let's take a very very small and trival part of the algebra and see what we can achieve. I suggest:

  E ::= (E * E) | R

So we can only do natural joins between given relations. What do we need? The obvious ones:

(1)   r * r = r
(2)   r * s = s * r
(3)   r * (s * t) = (r * s) * t

This is complete since two expressions are equivalent if the join the same set of relations, and that is exactly what these rules allow you to show.

Ok. In the next step we add the empty relations [H] with H a possibly empty set of attributes. So the syntax becomes:

  E ::= (E * E) | R | [H]

The extra rules:

(4) [H] * [S] = [H + S] (I'm overloading * and + as intersection and union of sets of attributes)
(5) [H] * R = [H + S] where S is the header of R

Is this sufficient? For the the case [H] did not appear we were already complete. If it does appear then the result will be empty, so equivalence only depends upon the header of the result. In this case you can use (2) and (3) to rewrite it to a form [H_1] * .. * [H_n] * R_1 * .. * R_m. With (4) and (5) you can then rewrite it to the form [H] which indeed gives you the header of the result.

We proceed with the {()} so the syntax becomes:

  E ::= (E * E) | R | [H] | {()}

The extra rules needed:

(6) r * {()} = r

This seems sufficient. Proof proceeds similar as in the previous addition.

What next? Let's take (A=B). So the syntax becomes:

  E ::= (E * E) | R | [H] | {()} | (A=B)

What this adds is that we can we identify certain sets of attributes in the join result that all have to be the same. With the rules we already have we can put all (A=B)'s next to eachother. So we need rules to make sure we can write the same sets to the same form:

(7)    (A=B) = (B=A)
(8)    (A=B)*(B=C) = (A=C)*(C=B)
(9)    (A=B)*(B=B) = (A=B)

Is this complete? You can think of the pairs as edges in an indirected graph, and the rules (7) and (8) are sufficient to allow you to order the nodes in the graph alphabetically if the graph is connected. Moreover, (9) allows you to remove duplicates, so sets of pairs that are connected and describe the same set of attributes can be rewritten to exactly the same form.

Ok. Next would be (A=c) but you can simulate that with a special relation C with header {A}. So all that remains is the + operator, but this is of course the most difficult one, not least because it combines the set union and the projection. A careful approach might be to start with allowing only projection, i.e., expressions of the form E +[H].

That's it sofar. Let me know if you can follow this.

  • Jan Hidders
Received on Thu Jun 21 2007 - 20:57:41 CEST

Original text of this message