Re: Relational Lattice, what is it good for?

From: Marshall Spight <marshall.spight_at_gmail.com>
Date: 16 Feb 2006 23:19:33 -0800
Message-ID: <1140160773.918066.273190_at_g47g2000cwa.googlegroups.com>


Mikito Harakiri wrote:
> One of my goals indeed was approaching view updates with
> algebraic transformations method. Needless to add, that Marshall
> criteria itself is not something firmly established yet. (It would be
> as soon as we have a proof).

[Best viewed in a fixed width font.]

Background

I was rocked by the Tropashko paper, because I had been searching for an ideal design for a minimal relational algebra, and had been unable to get much handle on the algebraic issues up to that point. (My working algebra had 3 operators: natural join, outer union, and projection, but it was distinctly unsatisfying, in part because projection
was such an out-of-left-field thing compared to the other two.)

When I read the paper, I immediately threw out my old algebra and replaced
it with the Tropashko algebra, since (among other reasons) it had the desirable algebraic properties. But not distributivity.

I observed that the two operators, over operands of relations with zero attributes, was isomorphic to the boolean algebra. Hence, it was clear there were at least *some* conditions over which that algebra was distributive, since the boolean algebra is distributive. But was it distributive only in that one degenerate case, or most of the time, or what? How to tell?

Conventions

Relations are named with capital letters: A, B, C

Sets of attributes are given with lower case letters, according to which relations these attributes appear in. Thus, (ab) is the set of attributes that are common to A and B. This set may be empty. The letters of the name are normalized to be in alphabetical order.

The two operators of the Tropashko Algebra are written "&&" (natural join)
and "||" (inner union.)

We use set builder notation.

Definitions

A relation's attributes are unordered; we consider a relation (a,ab) to be the same type as a relation (ab,a) without further comment.

Given
A:(a, ab)
B:(b, ab)

The Natural Join, A && B is:

{ (a, b, ab) | (a, ab) in A and (b, ab) in B }

The Inner Union, A || B is:

{ (ab) | (ab) in A or (ab) in B }

Above, when we speak of the (ab) in A, we mean the projection of A over the attributes in (ab).

The two operators && and || have boolean operators at their heart, and we leverage known properties of the boolean algebra accordingly.

What are all the possible partitions of attributes between A, B, and C? There are seven partitions: a, b, c, ab, ac, bc, abc. (Draw the Venn diagram
if you need to convince yourself.)

Given:

A:(a, ab, ac, abc)
B:(b, ab, bc, abc)
C:(c, ac, bc, abc)

Distributive

Does natural join distribute over inner union?

Q: Under what circumstances does the below hold?

A && (B || C) = (A && B) || (A && C)

First, let's fully expand the left side.

  1. A && (B || C) = A && ( { (bc,abc) | (bc,abc) in B or (bc,abc) in C } )
  2. = { (a,ab,ac,bc,abc) | (a,ab,ac,abc) in A and ((bc,abc) in B or (bc,abc) in C ) }

Then let's fully expand the right side.

3) (A && B) || (A && C) = ( { (a,ab,ac,abc,b,bc) |

                              (a,ab,ac,abc) in A and (ab,b,bc,abc) in B
}
                          ||
                            { (a,ab,ac,abc,c,bc) |
                              (a,ab,ac,abc) in A and (c,ac,bc,abc) in C
} )
4)                      = { (a,ab,ac,abc,bc) |
                            ( (a,ab,ac,abc) in A and (ab,bc,abc) in B )
or
                            ( (a,ab,ac,abc) in A and (ac,bc,abc) in C )
}
5)                      = { (a,ab,ac,abc,bc) |
                            (a,ab,ac,abc) in A and
                            ((ab,bc,abc) in B or (ac,bc,abc) in C) }

Now let's push the expansion of A && (B || C) in 2 against the expansion of
(A && B) || (A && C) in 5. The formatting goes a little crazy here.

6) { (a,ab,ac,bc,abc) |       =   { (a,ab,ac,abc,bc) |
     (a,ab,ac,abc) in A and         (a,ab,ac,abc) in A and
     ((bc,abc) in B or              ((ab,bc,abc) in B or
      (bc,abc) in C ) }              (ac,bc,abc) in C) }

The right side has an additional constraint.

Can we describe this constraint precisely? Clearly, the constraint is satified when ab and ac are empty, my original conjecture. In that case, the two sides of the equasion are identical and the equality holds. (Hence: ab, ac empty => join distributes over union.)

But there appear to be other circumstances that will satisfy the equality.
(That is, I was wrong about the "if and only if" part.)

So, it's late and I'm exhausted from the effort so far, so I'm going to take my best stab and then go watch Smallville. Tonight's episode has a cyborg.

If,
for all (ab,ac) in A, (ab) in B or (ac) in C, then the equality holds.

Ugh, I don't think I got that quite right, but my brain is sending me the "I'm done for the night" signal.

(ab,ac) empty trivially satisfies the above. It would also be satisfied if ab was a foreign key from A to B, or ac from A to C.

Suggestions, comments, and particularly
corrections/refutations/confirmations
welcome.

Marshall Received on Fri Feb 17 2006 - 08:19:33 CET

Original text of this message