Re: completeness of the relational lattice

From: Vadim Tropashko <vadimtro_invalid_at_yahoo.com>
Date: Sat, 23 Jun 2007 15:11:42 -0700
Message-ID: <1182636702.650529.155840_at_d30g2000prg.googlegroups.com>


On Jun 23, 6:53 am, Jan Hidders <hidd..._at_gmail.com> wrote:
> Btw. let me give a rough sketch of what my plan of attack is going to
> be. It takes very big steps and ignores important technical issues,
> but it should give you an idea of what my ideas are. The proof is
> going to proceed in several stages with several normal forms. The
> syntax I'll start from is:
>
> E ::= R | E /\ E | E \/ E | 11 | [H] | 'x=y'
>
> I allow H to be empty, so I omitted 00 = [], 01 = 11 \/ [] and 10 =
> 11 /\ []
>
> The first stage will be a normalization to a "union normal form" that
> I already talked about. This normal form means that the expression is
> in a form r_1 \/ ... \/ r_n with the r_i such that the contain no \/
> except in the form s \/ [H]. I will refer to these r_i's as the
> disjuncts.

Let see if I can follow this. Say we have an expression from my other post:

(`x=y`) \/ R \/ ( ((R /\ `y=z`) \/ [x,z]) /\ ((R /\ `x=z`) \/ [z,y]) )

It is in union normal form, right? And

( (`x=z`) \/ ( (R /\ `y=z`) \/ [x,z] ) /\ ( (`z=y`) \/ ( (R /\ `x=z`) \/ [z,y]) )

is not?

> Conjecture: we have enough rewrite rules to achieve this normal form.

While the above example can be worked out to demonstrate that both expressions are equivalent, it fairly easy to cook up the example which can't be reduced to union normal form:

Let R(u,w), S(u), Q(w), T(V), P(u,v), then we can't apply distributivity to simplify

( (R /\ (S \/ Q)) \/ T ) /\ P

> Some of the disjuncts will be conjunctive queries (CQ) and some will
> not. There are three reasons why some are not in CQ: (1) the result
> may be always empty, (2) the result is (infinitely) wide and (3) the
> result is (infinitely) long.

I don't follow this, the example maybe helpful. I can't resist a feeling that those appears to be merely some special cases.

> Conjecture: if a disjunct always returns an empty result it can be
> rewritten to []

In the example above

( (R /\ (S \/ Q)) \/ T ) /\ P

suppose the join of P with the rest of the expression is empty. How do we collapse it to []?

> That would take care disjuntionts with problem (1). For the other two
> problems I conjecture the following:
>
> Conjecture: if a disjunct returns a wide or long result it can be
> rewritten to r /\ W or r /\ W /\ 11 where r is a query in CQ and W a
> conjunction of 'x=y's such that x and y are not in A(r).
>
> If this holds then we can probably split the problem in showing
> completeness for queries in CQ and for special queries such as W.
>
> Conjecture: the rewrite rules are complete for boolean combinations of
> 'x=y's
>
> So we can the concentrate on expressions that represent normal CQ's.
> We can think of those queries as expressed by Horn clauses or:
>
> { (c=x d=y) | R(a=x, b=x) /\ S(a=x, b=y) /\ T(a=y, b=z) }
>
> In fact you might read that as a shorthand for our algebra where for
> example
> - S(a=x, b=y) denotes (S /\ 'a=x' /\ 'b=y') \/ [x,y]
> - { (c=x d=y) | r } denotes (r /\ 'c=x' /\ 'd=y') \/ [c,d]

I'm totally confused here because I don't see any distinction between the 2: the right sides are identical modulo variable names.

> Conjecture: all disjuncts in CQ can be written in this form
>
> For completeness we need to show that if two such queries subsume each
> other we can perform absorption. This is enough because for CQ's it
> holds that disjunctions of CQ's between which there is no subsumption
> that they are equivalent iff it describes the same set of CQ's. So
> what is needed for this? We need to be able to rename the variables,
> e.g., in the above example we should be able to replace x, y, z with
> u, v an w, for example. Of course this is only possible because these
> are projected out afterwards.
>
> Conjecture: the variables can be renamed with the rewriting rules.
>
> Probably this requires that we have or can derive a rule like:
>
> r /\ [H] = (r /\ 'x=y') \/ [H] where x and y not in H, and x in
> A(r) and y not in A(r)
>
> We should also be able to make two variables the same:
>
> r /\ [H] = ((r /\ 'x=y') \/ r) \/ [H]

There must be a typo here, because (r /\ 'x=y') \/ r evaluates to r by absorption. Then r /\ [H] can't be equal to r \/ [H].

> Also needed is a rule to merge two queries:
>
> { (c=x d=y) | r1 } \/ { (c=x d=y) | r2 } = { (c=x d=y) | r1 \/
> r2 }
>
> These would allow us to generate from a disjunct versions with renamed
> variables and versions where more variables are required to be equal,
> which are exactly all queries that are subsumed. So if another
> disjunct is subsumed we can do the reverse, i.e., have absorption.
>
> Until sofar. I hope this made some sense. If have to move to other
> work now, so I might be less present in the coming days.
>
> -- Jan Hidders
Received on Sun Jun 24 2007 - 00:11:42 CEST

Original text of this message