# Re: Relational query with path expressions

Date: Tue, 7 Apr 2009 09:57:01 -0700 (PDT)

Message-ID: <df8ce43c-6f92-4e35-ba89-140cd235bb36_at_n7g2000prc.googlegroups.com>

On Apr 6, 9:06 pm, David BL <davi..._at_iinet.net.au> wrote:

> I'm afraid I didn't follow all of that.

Let me translate into algebra of binary relations (aka relation algebra) along your explanation.

> I'll explain what I was

*> stating in more detail...
**>
**> Let R1(x,y) and R2(y,z) be two named binary relations on named
**> attributes.
*

Consider BR1 and BR2 as unnamed binary relations.

> Define unary function fxy that maps sets of x's to sets of y's using

*> R1 as follows:
**>
**> for all X, fxy(X) = {y | exists x in X, R1(x,y)}
**>
**> Similarly define fyz that maps sets of y's to sets of z's using R2
**>
**> for all Y, fyz(Y) = {z | exists y in Y, R2(y,z)}
**>
**> Let R3(x,z) be the join of R1,R2 on y, projected onto {x,z}. Let fxz
**> be derived from R3 as follows:
**>
**> for all X, fxz(X) = {z | exists x in X, R3(x,z)}
**>
**> Then it turns out that fxz = fyz o fxy
*

In algebra of binary relations we get composition in one step:

BR1 ; BR2

The middle attribute (the second one in the BR1 and the first one in BR2) is projected away automatically.

If BR1 and BR2 are function (that is a relation with functional constraint), their composition is function as well.

Why bother with binary relation algebras? Because the assertions (like composition distributing over union that you mentioned below) are relation algebra laws.

> This justifies the use of these unary functions to express certain

*> queries, and a functional notation can be simple and intuitive.
**> However there are serious limitations with the approach - due to the
**> fact that
**>
**> 1) there is often more than one useful binary relation between a given
**> pair of attributes; and
**>
**> 2) relations aren't generally decomposable into binary relations in
**> the sense of a JD.
**>
**> Given R(x,y), we can define 4 unary functions:
**>
**> for all X, f1(X) = {y | exists x in X, R(x,y)}
**> for all Y, f2(Y) = {x | exists y in Y, R(x,y)}
**> for all X, f3(X) = {y | for all x in X, R(x,y)}
**> for all Y, f4(Y) = {x | for all y in Y, R(x,y)}
**>
**> f1,f2 distribute over union (giving them an additive characteristic)
**> whereas f3,f4 distribute over intersection (giving them a subtractive
**> characteristic).
*

It seems that f3 and f4 correspond to residuation in relation algebra. Formally, BR1/BR2 the left residual of R1 by R2 is the relation consisting of all the pairs (x,y) such that for all z (y,z) in BR2 implies (x,z) in BR2.

> Each one of these preserves all the information in R:

*>
**> R(x,y) = { (x,y) | y in f1({x}) }
**> = { (x,y) | x in f2({y}) }
**> = { (x,y) | y in f3({x}) }
**> = { (x,y) | x in f4({y}) }
**>
**> It seems that most of the time, queries are concerned with the
**> existential form. However that is not always the case - such as Date's
**> example of finding a supplier that supplies /all/ the parts.
**>
**> These unary functions could be used to answer queries like: Find any
**> cousin of every friend of every spouse of any ...
**>
**> I find it curious that the simple "duality" between existential and
**> universally quantified forms for the unary functions don't seem to map
**> simply to two alternative join operators in the RA.
**>
**> Let R1(x,y), R2(y,z) be two relations where y stands for all the
**> attributes in common. In the query language I defined,
**>
**> R1.R2 = { (y,z) in R2 | exists (x,y) in R1 }
**>
**> This can be seen as navigating from R1 into those tuples of R2 that
**> are related to /any/ of the tuples of R1 according to the shared
**> attributes.
**>
**> Consider that we define a new operator * where R1*R2 navigates into
**> tuples of R2 that are related to /all/ tuples from R1 according to the
**> shared attributes.
**>
**> R1*R2 = { (y,z) in R2 | (x,y') in R1 => (y',z) in R2 }
*

This is set containment join operator. It is also known as left residuation operator (the definition above that I copied verbatim from Vaughan Pratt paper "The origins of calculus of Binary relation").

> That would allow for the path expression

*>
**> P*SP.S.S#
**>
**> which finds supplier numbers of suppliers that ships /all/ the parts.
**>
**> The following is interesting:
**>
**> P[COLOR='red']*SP.P#
**>
**> This finds part numbers for any parts shipped by any supplier that
**> ships all the red parts. This may include parts that are not red.
*

Received on Tue Apr 07 2009 - 18:57:01 CEST