Re: Relational Lattice, what is it good for?

From: paul c <toledobythesea_at_oohay.ac>
Date: Fri, 17 Feb 2006 12:34:34 GMT
Message-ID: <upjJf.30924$sa3.29988_at_pd7tw1no>


Marshall Spight wrote:
> Mikito Harakiri wrote:
>

>>One of my goals indeed was approaching view updates with the
>>algebraic transformation method. Needless to add, the Marshall
>>criterion itself is not something firmly established yet. (It will be,
>>as soon as we have a proof.)

>
>
> [Best viewed in a fixed width font.]
>
> Background
>
> I was rocked by the Tropashko paper, because I had been searching
> for an ideal design for a minimal relational algebra, and had been
> unable to get much handle on the algebraic issues up to that point.
> (My working algebra had 3 operators: natural join, outer union, and
> projection, but it was distinctly unsatisfying, in part because
> projection was such an out-of-left-field thing compared to the other
> two.)
>
> When I read the paper, I immediately threw out my old algebra and
> replaced it with the Tropashko algebra, since (among other reasons)
> it had the desirable algebraic properties. But not distributivity.
>
> I observed that the two operators, over operands of relations with
> zero attributes, form an algebra isomorphic to the boolean algebra.
> Hence, it was clear there were at least *some* conditions over which
> that algebra was distributive, since the boolean algebra is
> distributive.
> But was it distributive only in that one degenerate case, or most
> of the time, or what? How to tell?
>
>
> Conventions
>
> Relations are named with capital letters: A, B, C
>
> Sets of attributes are given with lower case letters, according
> to which relations these attributes appear in. Thus, (ab) is the
> set of attributes that are common to A and B. This set may be empty.
> The letters of the name are normalized to be in alphabetical order.
>
> The two operators of the Tropashko Algebra are written "&&" (natural
> join) and "||" (inner union).
>
> We use set builder notation.
>
>
> Definitions
>
> A relation's attributes are unordered; we consider a relation (a,ab) to
> be the same type as a relation (ab,a) without further comment.
>
>
> Given
> A:(a, ab)
> B:(b, ab)
>
> The Natural Join, A && B is:
>
> { (a, b, ab) | (a, ab) in A and (b, ab) in B }
>
> The Inner Union, A || B is:
>
> { (ab) | (ab) in A or (ab) in B }
>
> Above, when we speak of the (ab) in A, we mean the projection of A
> over the attributes in (ab).
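Just to be sure I'm reading those definitions the way you intend, here
is how I'd render them in Python - a relation as a (heading, body)
pair, each tuple a frozenset of attribute/value pairs. The encoding
and the names are my own ad hoc choices, nothing official:

# Rough sketch of the two operators as defined above.
# heading: frozenset of attribute names
# body:    set of tuples, each a frozenset of (attribute, value) pairs

def project(rel, attrs):
    """Project a relation onto the given attribute names."""
    heading, body = rel
    attrs = frozenset(attrs)
    return (attrs, {frozenset(p for p in t if p[0] in attrs)
                    for t in body})

def natural_join(r, s):          # the "&&" operator
    """Paste together tuples of r and s that agree on the shared
    attributes; with no shared attributes this is a cross product."""
    common = r[0] & s[0]
    body = {t | u
            for t in r[1] for u in s[1]
            if {p for p in t if p[0] in common} ==
               {p for p in u if p[0] in common}}
    return (r[0] | s[0], body)

def inner_union(r, s):           # the "||" operator
    """Project both operands onto the shared attributes, then union."""
    common = r[0] & s[0]
    return (common, project(r, common)[1] | project(s, common)[1])

# e.g. A with one tuple (a=1), B with one tuple (a=1, b=2):
A = (frozenset({'a'}), {frozenset({('a', 1)})})
B = (frozenset({'a', 'b'}), {frozenset({('a', 1), ('b', 2)})})
print(natural_join(A, B))   # heading {a, b}, one tuple (a=1, b=2)
print(inner_union(A, B))    # heading {a}, one tuple (a=1)

If I've mangled the intent anywhere, that's my reading at fault, not
your definitions.
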
>
> The two operators && and || have boolean operators at their heart,
> and we leverage known properties of the boolean algebra accordingly.
>
> What are all the possible partitions of attributes between A, B, and C?
> There are seven partitions: a, b, c, ab, ac, bc, abc. (Draw the Venn
> diagram if you need to convince yourself.)
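As a quick aside, those seven regions are just the nonempty subsets of
{A, B, C}, so there's no need to draw anything; a throwaway Python
check, nothing more:

from itertools import combinations

# the seven regions: nonempty subsets of {A, B, C}
regions = [''.join(c).lower()
           for k in (1, 2, 3)
           for c in combinations('ABC', k)]
print(regions)   # ['a', 'b', 'c', 'ab', 'ac', 'bc', 'abc']
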
>
>
> Given:
> A:(a, ab, ac, abc)
> B:(b, ab, bc, abc)
> C:(c, ac, bc, abc)
>
> Distributive
>
> Does natural join distribute over inner union?
>
> Q: Under what circumstances does the below hold?
>
> A && (B || C) = (A && B) || (A && C)
> ...

Although I've been using truth tables instead of your set builder notation, I believe the above distribution is true when A, B and C have common columns, including when the common columns are just the empty set, which I think you've already pointed out. To put it another way that might be more useful: when A, B and C are first projected on their common columns, the distribution as well as de Morgan's laws seem to work okay.
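
To make that a bit more concrete, here is a small experiment I tried in
Python rather than with truth tables. The encoding and helper names are
ad hoc ones of my own, and the relations are just ones I happened to
pick, so this is a spot check rather than a proof:

# relation = (heading, body); tuples = frozensets of (attribute, value)

def project(rel, attrs):
    attrs = frozenset(attrs)
    return (attrs, {frozenset(p for p in t if p[0] in attrs)
                    for t in rel[1]})

def join(r, s):                                   # "&&"
    common = r[0] & s[0]
    return (r[0] | s[0],
            {t | u for t in r[1] for u in s[1]
             if {p for p in t if p[0] in common} ==
                {p for p in u if p[0] in common}})

def union(r, s):                                  # "||"
    common = r[0] & s[0]
    return (common, project(r, common)[1] | project(s, common)[1])

def rel(*dicts):
    """Build a relation from tuples given as dicts (all the same keys)."""
    heading = frozenset(dicts[0]) if dicts else frozenset()
    return (heading, {frozenset(d.items()) for d in dicts})

# Three relations picked more or less at random; x is the only column
# common to all three.
A = rel({'x': 1, 'p': 1})
B = rel({'x': 1, 'p': 2, 'q': 5})
C = rel({'x': 1, 'q': 7})

# The general distribution fails for these particular relations:
print(join(A, union(B, C)) ==
      union(join(A, B), join(A, C)))               # False here

# But after projecting all three onto their common columns it holds,
# since for same-heading relations && and || are just set intersection
# and set union:
common = A[0] & B[0] & C[0]
A2, B2, C2 = (project(R, common) for R in (A, B, C))
print(join(A2, union(B2, C2)) ==
      union(join(A2, B2), join(A2, C2)))           # True
print(union(A2, join(B2, C2)) ==
      join(union(A2, B2), union(A2, C2)))          # True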

I'm not crazy about calling it 'inner union' because that tempts me to compare it to 'inner join', which usually has a different context. I think of it as 'projection union'. So if there were an analogous 'projection join' (for want of imagination, let me use ||| and &&& instead of || and &&), then both

A &&& (B ||| C) = (A &&& B) ||| (A &&& C) and also
A ||| (B &&& C) = (A ||| B) &&& (A ||| C)

seem true to me.
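
At least, they seem true under one reading of what I mean: project both
operands onto their mutual common columns and then take a plain set
intersection or union of the bodies. The Python below is only my own
guess at those definitions plus a spot check on some arbitrary
relations, not a proof of anything:

# relation = (heading, body); tuples = frozensets of (attribute, value)

def project(rel, attrs):
    attrs = frozenset(attrs)
    return (attrs, {frozenset(p for p in t if p[0] in attrs)
                    for t in rel[1]})

def pjoin(r, s):                  # "&&&" - my 'projection join'
    common = r[0] & s[0]
    return (common, project(r, common)[1] & project(s, common)[1])

def punion(r, s):                 # "|||" - my 'projection union'
    common = r[0] & s[0]
    return (common, project(r, common)[1] | project(s, common)[1])

def rel(*dicts):
    heading = frozenset(dicts[0]) if dicts else frozenset()
    return (heading, {frozenset(d.items()) for d in dicts})

A = rel({'x': 1, 'p': 1}, {'x': 2, 'p': 1})
B = rel({'x': 1, 'q': 5})
C = rel({'x': 2, 'p': 1}, {'x': 3, 'p': 9})

print(pjoin(A, punion(B, C)) ==
      punion(pjoin(A, B), pjoin(A, C)))    # True for these relations
print(punion(A, pjoin(B, C)) ==
      pjoin(punion(A, B), punion(A, C)))   # True for these relations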

I imagine an analogous kind of equality would give the same effect, e.g.,

A && (B || C) === (A && B) || (A && C)

which seems sort of reminiscent of the 'semi-' mods to difference and join.

Maybe semi-union is what it is; I don't know if that term is already in use.

I suppose another way of describing it is that if one distributes projection as well as join and union, i.e., existentially quantifies before applying the other operators, then the distribution is true.

Whether this all has any useful purpose and why I should be titillated by it, I don't know; maybe it is just relational masturbation. I believe the 'generalized union' is the same as the common-column union in Hugh Darwen's BS12, even though Mikito seems to have had a different starting point. Do you think I've got this wrong?

I must admit that I've forgotten whether I had any strong reasons for being intrigued by it (as I said in an earlier note, I didn't pay much attention to the lattice idea because I don't have a good understanding of it, which isn't to say it mightn't be useful), but a minor one I remember is that it seemed just a little more general than the usual restriction of limiting union to common columns. (I take it that the really basic reason for having union is so that one can add a tuple to a 'relvar'.) I haven't tried to see how it might help optimizations.

Also, I found it easier to visualize materialized disjunctions in my head, and I'm still wondering whether it gives any aid when handling keys in views, or an indirect way of knowing which columns are common (the projection union's heading would always be the common columns). Maybe it's all just a curiosity and Codd knew this and discarded it long ago - I don't know.

pc

>
> First, let's fully expand the left side.
>
> 1) A && (B || C)
>      = A && ( { (bc,abc) | (bc,abc) in B or (bc,abc) in C } )
>
> 2)   = { (a,ab,ac,bc,abc) | (a,ab,ac,abc) in A and
>                             ((bc,abc) in B or (bc,abc) in C) }
>
> Then let's fully expand the right side.
>
> 3) (A && B) || (A && C)
>      =    { (a,ab,ac,abc,b,bc) |
>               (a,ab,ac,abc) in A and (ab,b,bc,abc) in B }
>        || { (a,ab,ac,abc,c,bc) |
>               (a,ab,ac,abc) in A and (c,ac,bc,abc) in C }
>
> 4)   = { (a,ab,ac,abc,bc) |
>            ( (a,ab,ac,abc) in A and (ab,bc,abc) in B )
>            or
>            ( (a,ab,ac,abc) in A and (ac,bc,abc) in C ) }
>
> 5)   = { (a,ab,ac,abc,bc) |
>            (a,ab,ac,abc) in A and
>            ((ab,bc,abc) in B or (ac,bc,abc) in C) }
>
> Now let's push the expansion of A && (B || C) in 2 against the
> expansion of (A && B) || (A && C) in 5. The formatting goes a little
> crazy here.
>
> 6) { (a,ab,ac,bc,abc) |            = { (a,ab,ac,abc,bc) |
>      (a,ab,ac,abc) in A and            (a,ab,ac,abc) in A and
>      ((bc,abc) in B or                 ((ab,bc,abc) in B or
>       (bc,abc) in C) }                  (ac,bc,abc) in C) }
>
> The right side has an additional constraint.
>
> Can we describe this constraint precisely? Clearly, the constraint
> is satisfied when ab and ac are empty, my original conjecture. In
> that case, the two sides of the equation are identical and the
> equality holds. (Hence: ab, ac empty => join distributes over union.)
>
> But there appear to be other circumstances that will satisfy the
> equality. (That is, I was wrong about the "if and only if" part.)
>
> So, it's late and I'm exhausted from the effort so far, so I'm going
> to take my best stab and then go watch Smallville. Tonight's episode
> has a cyborg.
>
> If,
> for all (ab,ac) in A, (ab) in B or (ac) in C,
> then the equality holds.
>
> Ugh, I don't think I got that quite right, but my brain is sending
> me the "I'm done for the night" signal.
>
> (ab,ac) empty trivially satisfies the above. It would also be
> satisfied if ab was a foreign key from A to B, or ac from A to C.
>
>
> Suggestions, comments, and particularly
> corrections/refutations/confirmations welcome.
>
>
> Marshall
>