Re: a union is always a join!

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sun, 1 Mar 2009 14:43:53 -0500
Message-ID: <_zBql.17414$YU2.17201_at_nlpi066.nbdc.sbc.com>


"paul c" <toledobythesea_at_oohay.ac> wrote in message news:Evlql.14337$Db2.10147_at_edtnps83...
> rpost wrote:
>> ...
>> As I already said in my previous message, I see the elegance,
>> but at the same time I find elegance in Codd's use of two kinds
>> of operations. Obviously if you rewrite to Codd's algebra (or do
>> something equivalent to that) you're not going to lose efficiency
>> on the cases where this is possible. But just rejecting the rest
>> is only going to confuse the user. Why not be honest and admit
>> that we really expect, in our implementations, relational queries
>> to work on finite, typically small relations, that WORKS_IN is one
>> and PLUS is not, and that NOT doesn't do so?
>> ...
>
> In the interest of keeping the pot stirred, here are some other
> provocations (basically one of my same old tired recurring themes) .
>
>
> As Fabian P and others have pointed out, a table (or should that be
> R-table?) that involves nulls obviously involves more than one of Codd's
> original relations. Given a smart enough implementation a relation that
> is defined using <OR> could be replaced by two or more relations (I seem
> to remember McGoveran stating similar somewhere or other. I'm guessing
> that Codd wouldn't have agreed but my hunch is that his objection would
> have been on practical, not theoretical grounds, so perhaps he would not
> have objected if somebody had made a practical implementation). As long
> as the closed-world-assumption is followed, a relation that is produced
> using the <NOT> operator doesn't generally need to be materialized because
> queries against it can be transformed as queries against its own negation.
> I'd say the importance of <NOT> has nothing to do with finiteness, rather
> that it parallels boolean algebra's NOT and thus allows a algebra to
> define retraction, aka 'deletion'.
>
>
> <OR> is more interesting (to me) than <NOT>. Without <OR>, RT has no way
> to 'add' a tuple to a relation variable. When one uses a language's
> "INSERT" verb, it must effect the logical equivalent of <OR>.

Paul,

I think you're on the right track if you can just separate what <OR> is from what is an application of <OR>. Of course, this particular application of <OR> is not limited to just INSERT, but also UPDATE and DELETE.

Relational Calculus, and therefore equivalent mechanisms such as Codd's or D&D's algebras, is closely related to if not based upon first order predicate logic. But while first order logic is fine for querying the picture of the world that is represented by a given database value, it is not sufficient for either the series of pictures represented by a succession of database values, or the series of events that describe not just what is changing between pairs of successive pictures, but exactly how what is changing is different.

This is the way I see it: That which states what is the case is the disjunction of all and only true propositions. Divide that into two parts: one that states what has been the case (not necessarily what had been the case--just that which obtained since the last change), and one that states what is happening. Whenever nothing is happening, what has been the case /is/ the case. For example, if Joe has been second in line at the bank and nothing is happening, then Joe is still second in line. On the other hand, if the person who has been first in line is now being served, then Joe is no longer second in line but is instead first in line. The disjunction of the two statements: "Joe has been second in line" and "Joe is no longer second in line but is instead first in line" yields "Joe is first in line." In symbols,

from p \/ (~p \/ q), derive (p \/ ~p) \/ q since (p \/ ~p) is vacuously true, eliminate it and what is left is just q

Now let's apply that to relational theory: the database states what has been the case; a transition states what is happening. The logical sum of these two statements asserts what is the case, and is therefore what is becoming the database: though there may be many databases that could be the database, only one of them can actually be the database at a time.

So the application of <OR> to combine the statement of what has been the case with the statement of what is happening is a fair description of a database update, but it is not limited to just "INSERT." The disjunction, p \/ ~p, when it appears in the logical sum of what has been the case and what is happening can be thought of as the equivalent of a "DELETE," and the disjunction, (p /\ q) \/ (p /\ ~q) \/ (p /\ r), the equivalent of an "UPDATE;" only the disjunction, False \/ p, would be the equivalent of an "INSERT."
> Codd's unadorned relation is a conjunction of simple propositions (by
> simple I mean each proposition contains no logical connectors). If a
> relation variable has the 'quality' of being a 'view' (or 'green' as I
> like to joke), some people say to the effect that the view is a
> conjunction of disjunctions (propositions involving the logical connector
> 'OR'). Whereas if the relation variable is 'base', they say it is just a
> conjunction of simple propositions. For years, I've been trying to
> understand how they can say this at the same time as they say that users
> should preferably see no difference between base and virtual/view relation
> variables (which I agree with, just as I would agree that there should be
> no appreciable difference between 'red' and 'green' relvars. Their
> argument then proceeds to claim that 'inserts' to some union views are
> indeterminate while 'inserts' to an equal 'base' view aren't. I'd say that
> if they are right, then we shouldn't be able to 'insert' to any variable
> unless it points to an empty relation which would mean that no relation
> could have more than one tuple - that bizarre situation would have exactly
> one advantage - no questions about finiteness!

Received on Sun Mar 01 2009 - 20:43:53 CET

Original text of this message