Re: Relational lattice completeness

From: Jan Hidders <hidders_at_gmail.com>
Date: 6 Apr 2006 06:05:03 -0700
Message-ID: <1144328703.855349.57740_at_i39g2000cwa.googlegroups.com>


Mikito Harakiri wrote:
> Jan Hidders wrote:
> > That is under the assumption you are asking whether you are complete
> > for all models of that cardinality. But here we only have *one* model,
> > namely the (infinite) lattice over all relations as defined by the two
> > operations. So, since there is only one, all the models you are
> > considering are isomorphic.
> >
> > Sorry for asking such hard questions.
>
> AFIR you mentioned long time ago that there is some kind of
> incompleteness result for RA. Do you have any reference? (Then it would
> be clear for me what kind of completeness you have in mind.)

I'm not sure I said that, but there is a strongly related result about the undecidability of equivalence of two relational algebra expressions. But I suspect that is not going to help you much. I already gave an almost formal description of the problem in sci.math so I'm not sure what more there is to explain. I will elaborate that a bit:

We are talking about expressions as described by the following syntax (where R is assumed to be the set of relation variables):

  E ::= R | (E /\ E) | (E \/ E)

We define a database as a tuple D = (S, f) where S is called the schema of D and a finite set of relation variables and f a function that maps each element in R to a relation. Clearly if an expression e contains only variables from S then it defines a query over D and its result is denoted as e(D).

Now we define two expressions e1 and e2 as query equivalent if (1) they contain the same set of relation variables, denoted as S, and (2) for every database D with schema S it holds that e1(D) = e2(D). We denote this fact as e1 =q= e2.

I think it is also clear that a set of algebraic identities of the form e1 == e2 defines an equivalence relationship over expressions as defined above by defining two expressions as equivalent iff we can rewrite one to the other with these identities. By rewriting I mean simply that if we apply e1 == e2 to e3 then we try to match e1 with a subexpression in e3 and replace it with e2 while replacing in e2 the variables with the subexpressions to wich they were mapped in e3 when matching e1. We denote this equivalence relation as e1 =a= e2.

We now call a set of algebraic identities *complete* if ifor all two expressions e1 and e2 that contain the same set of relation variables it holds that e1 =q= e2 iff e1 =a= e2.

If this holds then you can be sure that you have all the rewrite rules you need to do query optimization. That's why this is an interesting question. It would also establish that your set of operations is really interesting as it describes a large class of queries but allows to do something that is not possble for the full relational algebra.

Does that clear some things up?

  • Jan Hidders
Received on Thu Apr 06 2006 - 15:05:03 CEST

Original text of this message