Re: TRUE and FALSE values in the relational lattice

From: Vadim Tropashko <vadimtro_invalid_at_yahoo.com>
Date: Thu, 21 Jun 2007 21:52:11 -0700
Message-ID: <1182487931.339457.180120_at_j4g2000prf.googlegroups.com>


On 22 Jun 2007, at 03:20, Jan Hidders wrote:

> On 22 Jun 2007, at 02:51, Vadim Tropashko wrote:
>
>> Jan Hidders wrote:
>>>
>>> On 22 Jun 2007, at 02:36, Vadim Tropashko wrote:
>>>
>>>> I doubt a normal form for RL expressions exists. (If you insist otherwise, we can play a little game when I give you an example and you reduce it to a normal form).
>>>>
>>>> An example may be n-th iteration of transitive closure.
> So you are telling me to normalize:
>
> R(x,y) \/ ( ( (R(x,y)/\`y=z`) \/ [x,z] ) /\ ( (R(x,y)/\`x=z` ) \/ [z,y] ) ) \/ ...
>
> Let me first give you a smaller example. Suppose you have:
>
> R /\ ( S \/ 'z=y') with R(x,y) and S(y,z)

Careful there! You probably don't want to union anything with equality relation. You must have meant:

(R(x,y) /\ `x=z`) \/ [z,y]

where [z,y] is subsequntly generalized to S(y,z).

> We cannot distribute in general, but we have a specific distribution rule:
>
> (1) r /\ ((s \/ [H]) \/ (t\/[H])) = r /\ (s \/ [H]) \/ r*(t \/ [H])

Which is BTW a very limited case embraced by Spight criteria.

Once again, we are interested if union distributes over join, not if join distributes over union. The criteria is more restrictive in this case: if r has attributes that are not in [H] then union doesn't distribute over join. See proposition #2 http://arxiv.org/pdf/cs/0603044

> But for this the arguments of the \/ need to be in a specific form. We can use the following rules to make the headers explicit:
>
> (2) R = R \/ [H] if H is the header of R
> (3) 'x=y' = 'x=y' \/ [x,y]
>
> So we can get the equivalent:
>
> R /\ ((S \/ [y,z]) \/ ('z=y' \/ [y,z]))
>
> And now you can distribute.

> To explain a bit more where I want to go with this i'm going to give you my rough notes.

> How to get into union normal form? There is some form of distributivity for normal union, i.e., a union as in the classical relational algebra, which can be though of as a union of the form (r+[H])+(s+[H]):

> (8) r * ((s+[H]) + (t+[H])) = r*(s+[H]) + r*(t+[H])

so assume s+[H] = s and t+[H] = t

BTW, why don't we define square brackets [R] as an unary operator, expressed in my notation as

[R] = R /\ 00

> If all headers are made explicit, i.e., every subexpression is of the form r+[H], we can make all generalized unions normal by using the following rule:

> (10) [H] + [S] = [H * S] (We let * and + also denote intersection and union over headers)

In my notation:

(10) (H /\ 00) \/ (S /\ 00) = (H \/ S) /\ 00

Proof: apply Spight distributivity criteria.

> This gives us: (r + [H]) + (s + [S]) = r + s + [T] = (r + [T]) + (s + [T]) where T = H * S

So what is acieved here? Is that (r + [T]) + (s + [T]) simpler than (r + [H]) + (s + [S]) ? Yes it is, but it is more complex than r+s!

> How to make the headers explicit:

> (11) R = R+[H] if H is the header of R

R = R \/ (R /\ 00)

Absorption!

> (12) (A=c) = (A=c)+[A]
> (13) (A=B) = (A=B)+[A,B]
> (**) [H] = [H]+[H] derives from (4)
> (14) {()} = {()}+[]
> (**) (r+[H])+(s+[S]) = (r+s)+[T] where T=H*S which follows from (5), (6), > (10)
> (15) (r+[H])*(s+[H]) = ((r+[H])*(s+[H]))+[H]
  the latter gives us (r+[H])*(s+[S]) = ((r+[H])*(s+[S]))+[T] where T=H +S by (10) and (15)

I'm not sure of the significance of (15), but let me reiterate that it appears to be the distributivity that is the blocking factor, which prevents us from unnesting complex expressions. Even Spight distibutivity criteria is not a perfect technique, for example how do we handle the following:

(A /\ 00) \/ (A \/ 11)

On Jun 21, 11:57 am, Jan Hidders <hidd..._at_gmail.com> wrote:
> (7) (A=B) = (B=A)
> (8) (A=B)*(B=C) = (A=C)*(C=B)
> (9) (A=B)*(B=B) = (A=B)

What is the (B=B) (e.g. in set builder notation)?

> Is this complete? You can think of the pairs as edges in an indirected
> graph, and the rules (7) and (8) are sufficient to allow you to order
> the nodes in the graph alphabetically if the graph is connected.
> Moreover, (9) allows you to remove duplicates, so sets of pairs that
> are connected and describe the same set of attributes can be rewritten
> to exactly the same form.
>
> Ok. Next would be (A=c) but you can simulate that with a special
> relation C with header {A}. So all that remains is the + operator, but
> this is of course the most difficult one, not least because it
> combines the set union and the projection. A careful approach might be
> to start with allowing only projection, i.e., expressions of the form E
> +[H].

Not ready to digest this without understanding what (B=B) is. Received on Fri Jun 22 2007 - 06:52:11 CEST

Original text of this message