Re: atomic

From: David BL <davidbl_at_iinet.net.au>
Date: Sun, 04 Nov 2007 19:59:50 -0800
Message-ID: <1194235190.192785.149160_at_q5g2000prf.googlegroups.com>


On Nov 4, 12:25 am, paul c <toledobythe..._at_ooyah.ac> wrote:
> David BL wrote:
> > On Nov 2, 6:28 am, paul c <toledobythe..._at_ooyah.ac> wrote:
> >> I'm wondering are there applications where RVA values that are "empty"
> >> make sense or are such values just a curious by-product of RVA's? If
> >> the latter, should they be suppressed or somehow prevented?
>
> > In January I started a thread called "RA with MV attributes", in which
> > I provided a formal definition of a join operator. It basically takes
> > set intersections on the joined attributes. Here is the example I
> > used to show it in action, where we are joining on cars...
>
> > r1(Names,Cars)
> > bill, car1,car2,car4
> > john,fred car3
>
> > r2(Cars,Colours)
> > car1,car3,car4 red
> > car2 green
>
> > r1 |X| r2 (Names,Cars,Colours)
> > bill car1,car4 red
> > bill car2 green
> > john,fred car3 red
> > john,fred green
>
> > Note the suspicious last tuple which has an empty set of cars that are
> > green.
>
> > Marshall pointed out that it's rather like a full outer join.
>
> > In that thread I suggested that we could interpret this in FOPL by
> > defining a mapping back to conventional predicates with single valued
> > attributes, based on taking cross products of the multi-valued
> > attributes. Furthermore under that interpretation the above result
> > comes back to an inner join!
>
> > Note therefore that the last tuple actually maps to zero predicates.
>
> ...
>
> I've sometimes wondered whether something like this might offer
> occasional convenience, but as stated I find it hard to reconcile with
> the "first principles" that I tend to follow (or like to think I
> follow).

I appreciate that point of view.

> Not to say D&D's basic definitions are the only ones that can
> be used, but I think we'd have to have agreement on how the operators
> apply. For example, if I understand D&D, r1 |x| r2 above would be empty.

Consider that we have a function I (an "interpretation") that maps a relation with MV attributes to an "equivalent" relation with SV attributes. Formally this simply takes cross products. For example, in the above example I(r1) is

    Name Car


    bill              car1
    bill              car2
    bill              car4
    john              car3
    fred              car3

and I(r2) is

    Car Colour


    car1              red
    car3              red
    car4              red
    car2              green

I think it is straightforward to prove the following in the general case

        I(r1 |x| r2) = I(r1) |x| I(r2)

where |x| on the LHS is the join operator on relations with MV attributes, and |x| on the RHS is the conventional inner join on relations with SV attributes.

So there is a rather simple relationship with the conventional join operator.

> Also I think it is risky in this particular newsgroup to use the term
> "MV" because there are some posters for whom that stands for something
> far afield from relational theory so the threads can get quite murky,
> full of cross-purposes.

Yes, unfortunate baggage!

> To me, it seems inherent in D&D's approach that when we name a relation
> and its attributes, we are also implicitly defining other relations that
> have only a subset of those attributes. But the predicates of the
> implicit relations are existentially qualified, ie., their possible
> extensions depend on the actual extension of the original relation.

Is this what you mean (in Prolog):

    r(Name,Car,Colour) :- Name owns Car with Colour.

    r1'(Name,Car) :- r(Name,Car,Colour).

or in natural language

    r1'(Name,Car) :- there exists Colour such that r(Name,Car,Colour).

A question that interests me is whether r1 = r1', where

    r1(Name,Car) :- Name owns Car.

It seems related to the problem of how to deal with missing information. Consider that we have recorded the fact that Sue owns car17 but we don't know what colour it is. Then this fact could appear in r1\r1'.

Mixed up in this question is the distinction between base relations and derived relations.

> For example, in your r1 |x| r2, it seems to mean among other things
> that there exist green cars that john and fred own at the same time as
> there exist no green cars that they could own because the only green car
> is car2 and they don't own it.

Excellent point. I have been thinking the same thing. To that end I have been looking at modifying my join operator to strip out the tuples where the intersection on the joined attributes is empty.

I think I was led astray because my main reason for considering all this is to support missing data.

Here's a proposal for a new MV join operator...

I'll define r1(x,y) |x| r2(y,z), where y is the common attributes to join on.

For each tuple t1 from r1, and each tuple t2 from r2 the MV join operator should potentially add as many as three tuples into the join result r = r1 |x| r2 as follows.

  1. if t1(y) intersect t2(y) <> {} r += (t1(x), t1(y) intersect t2(y), t2(z))
  2. r += (t1(x), t1(y)\t2(y), {})
  3. r += ({}, t2(y)\t1(y), t2(z))

[Aside: This could pollute the result with redundant information - no doubt a symptom of the non-uniqueness of representation, and for those reasons I'm liking this MV approach less and less. To treat missing data I'm probably better off defining an analogous relational interpretation of tables which may contain 0 or 1 value at a given row, column position]

Anyway, I think the following properties will be true...

	I(r1) =  I( proj( r1 |x| r2 , {x,y} ) )
	I(r2) =  I( proj( r1 |x| r2 , {y,z} ) )
	I(r1 |x| r2) = I(r1) |x| I(r2)

I really should write formal proofs before risking wasting your time!

If these results hold then we have an *information preserving* join operator. What interests me is to use this formalism as a basis for understanding missing information.

It seems that there can be lots of different projectioninterpretations  because in general projection and interpretation don't commute. I'm alluding to the distinction between r1,r1' above.

As another example consider the following distinctions

    person1(N) :- N is the name of a person

    person2(N) :- N is the name of a person known to own a car

    person3(N) :- N is the name of a person known to own a car

                  with a known colour.

The reason for wanting to normalise r(Names,Cars,Colours) into base relations r1(names,Cars) and r2(Cars,Colours) is to avoid redundancy and those nasty update anomalies. This only depends on 3NF.

I suspect at this point we need a different example to illustrate where the MV attribute formalism might be useful. ie one where we moving from 5NF to 6NF causes relations to be decomposed. I'm suggesting that we leave the base relations in 5NF and use empty sets to support missing information. The main departure with conventional RM/RA is the idea that the projection operator is parameterised by two sets of attributes!

        project'( r, X1, X2) = project( I (project(r, X1)), X2 )

Anyway, that's enough hand waving for now!

>
> Secondly, if I have a relation with the first three rows of r1 |x| r2
> above (header Names, Cars, Colours),
>
> > bill car1,car4 red
> > bill car2 green
> > john,fred car3 red
>
> and I wish to insert the fact, and only that fact, that john and fred
> own a green car (if that is what you intend the fourth row to mean),
> some language might allow me to do that without mentioning the Cars
> attribute, but I don't think it would be a D&D language because in D&D,
> "insert" is defined on <OR> and that would mean at minimum we'd need to
> insert tuples with all possible values of the Cars attribute, eg.,
>
> john,fred green
> john,fred car1 green
> john,fred car1,car2 green
>
> ... and so forth.

I agree. I'm suspicious of allowing for stating the fact that john and fred own a green car without providing the identity of the car that they own.

> Personally, I've thought for a couple of years that it might be useful
> to try to entertain a logical system that supports something akin to
> what you are talking about but first I think the underlying logical
> operators would need to be worked out including any precedences that
> might be required.
>
> When I see the word "space", I usually infer that that the writer is
> talking about a physical layer. If that's the case here, I still don't
> think that will wash away my logical questions unless the user interface
> includes physical operators, which I think is not a good idea.
>
> I suspect that you might be more technically capable than I am of
> revising the first principles in a way that might support what you are
> driving at and I hope you will try. Even if it doesn't result in a
> logical system, I think the exercise is good.
>
> ps: (I mention D&D because I find their definitions are more precise
> than Codd's and also because I'm not sure if he ever entertained subsets
> of headers - from what I've read I think he only talked of header
> elements and didn't worry about "empty" headers, but I could be wrong.)
Received on Mon Nov 05 2007 - 04:59:50 CET

Original text of this message