Re: Relational query with path expressions

From: David BL <davidbl_at_iinet.net.au>
Date: Tue, 7 Apr 2009 21:02:20 -0700 (PDT)
Message-ID: <dff25bb0-4fbb-4c74-9df2-77080dc93161_at_f30g2000vbf.googlegroups.com>


On Apr 8, 3:40 am, rp..._at_pcwin518.campus.tue.nl (rpost) wrote:
> David BL wrote:
> >I get the impression that a lot of the time, a query (or a large
> >portion of a query) can be thought of as a navigation from attribute
> >to attribute using a sequence of joins. This made me think there may
> >be some merit in thinking of attribute names as global in nature and
> >the user only has to specify the start and end attributes and the DBMS
> >is able to deduce the joins that are required.
>
> I like this idea, but it doesn't generalize to joins in general.
>
> >I think the biggest problem is that quite often there are multiple
> >useful paths.
>
> [...]
>
> Yes, and global renaming isn't always possible:
>
> PERSON(P#, NAME, DAYOFBIRTH, FATHER#, MOTHER#)
>
> Here, every FATHER# and every MOTHER# is also a P#. Pretty much all
> of our queries involving FATHER# and MOTHER# are going to join
> them with P#. We can't rename to express those joins.
>
> Splitting it out doesn't help:
>
> PERSON(P#, NAME, DAYOFBIRTH, FATHER#, MOTHER#)
> FATHER(FATHER#)
> MOTHER(MOTHER#)
>
> We can name FATHER's attribute FATHER# or P#, but not both.
>
> (BTW replacing the # attributes with natural keys doesn't help either.)
>
> >After thinking about that problem for a while, it occurred to me that
> >access paths should emphasise navigation from relation to relation,
> >rather than from attribute to attribute (as suggested by the hyper-
> >graphs in the literature on the Universal Relation).
>
> Yes, but sometimes you're going to have to specify which attributes
> to join on.
>
> [...]
>
> >The idea that R1.R2 can drop attributes from R1 may seem strange. A
> >benefit arises for example in path expressions like R1.R2.R3 because
> >it can avoid joining on attributes which happen to have the same name
> >in R1 and R3. This can help to control the joins along a path.
>
> Nice idea, but not completely general:
> it cannot express joins on FATHER#/MOTHER# above.
>
> [...]
>
> >Renaming
> >--------
>
> >Renaming of attributes is supported. For example
>
> > S.(CITY as SCITY, SP.P.CITY as PCITY)
>
> >retrieves all pairs of city names (SCITY, PCITY) such that a supplier
> >in city SCITY supplies a part to city PCITY
>
> Now I'm happy :-)
>
> I initially assumed that SQL worked like this and was
> startled to discover that it doesn't.

Yes, I agree with all that (although I'm not sure what you meant regarding SQL. It does have the AS keyword for renaming. What did you mean by that comment?).

As an example, the names of fathers of people named 'Fred' could be found using

    PERSON[NAME='Fred'].(FATHER# as P#).PERSON.NAME

I think it can be useful to name reusable sub-sections of paths. For example, let

    FATHER = (FATHER# as P#).PERSON

allowing the fathers of people named Fred to be specified as

    PERSON[NAME='Fred'].FATHER.NAME

FATHER can be regarded as a unary function that maps an input relation to an output relation. R.FATHER represents the application of FATHER on input relation R. In this case it maps a set of PERSON tuples to another set of PERSON tuples (that are their fathers).

We could also define

    MOTHER = (MOTHER# as P#).PERSON

If f,g are unary functions that map relations to relations, then subject to typing rules we could define f union g as follows

    for all R, R.(f union g) = R.f union R.g

Similarly we can define f intersect g, f minus g.

This allows us to define

    PARENT = MOTHER union FATHER

    CHILD = (P# as FATHER#).PERSON union (P# as MOTHER#).PERSON

    SIBLING = PARENT.CHILD minus I

    COUSIN = PARENT.SIBLING.CHILD where 'I' is the identity function (that maps any relation to itself).

I think my comments elsewhere on "additive" versus "subtractive" unary functions are applicable in this setting. I think function composition preserves additivity and it also preserves subtractivity. However, for example union of two functions preserves additivity but not subtractivity whereas intersection of two functions preserves subtractivity but not additivity. The difference operator preserves neither. For example 'SIBLING' is not additive (even though PARENT.CHILD and the identity are additive), meaning that it doesn't distribute over union. That suggests it must be interpreted carefully when used in a path. Eg if you ask for the siblings of all the children of a given parent it will return the empty set!

Let C be a literal relation value. Then one can define f intersect C as follows:

    for all R, R.(f intersect C) = R.f intersect C

Similarly one can define

    f union C
    C minus f
    f minus C

If f is additive then I think the following are necessarily additive

    f intersect C
    f minus C

but not necessarily

    f union C
    C minus f.

Another interesting idea is whether it's possible to define operators that map between the forms f1,f2,f3,f4 that I described in another post. More to the point I'm thinking there is a concept of additive and subtractive closures as well as additive and subtractive inverses of any given unary function defined on relations.

Let f be a given unary function on relations. Let x stand for all the attributes in the input relations to f, and y stand for all the attributes in the output relations from f. Consider that we ignore the behaviour of f on input relations that don't contain a single tuple. In fact we define a relation

    G(x,y) = { (x,y) | exists y in {x}.f }

which is kind of like the graph of a function. From G, we can define four unary functions that map relations to relations:

  f1(X)={y| exists x in X, G(x,y)}   (additive closure)
  f2(Y)={x| exists y in Y, G(x,y)}   (additive inverse)
  f3(X)={y| for all x in X, G(x,y)}  (subtractive closure)
  f4(Y)={x| for all y in Y, G(x,y)}  (subtractive inverse)

For example, the additive closure of SIBLING is the same as PARENT.CHILD. The additive inverse of PARENT is CHILD and vice versa.

Just in case I didn't spell it out clearly enough, please note that additive/subtractive closures/inverses are defined on unary functions on relations irrespective of the number of dimensions of the input and output relations. Received on Wed Apr 08 2009 - 06:02:20 CEST

Original text of this message