Re: RA with MV attributes

From: David <davidbl_at_iinet.net.au>
Date: 16 Jan 2007 07:36:24 -0800
Message-ID: <1168961783.053971.305700_at_l53g2000cwa.googlegroups.com>


Marshall wrote:
> Interesting post. Some comments inline.
>
> On Jan 16, 12:40 am, "David" <davi..._at_iinet.net.au> wrote:
> >
> > Definition: A relation r consists of a relation-type A(r) together
> > with a set of tuples T(r), where each tuple is a map from each
> > attribute a inA(r) to a subset of D(a).
>
> I realize it's becoming fashionable, but I still dislike the idea
> that a relation "consists of" its type and its value, both.
> It's type is its type and its value is its value; it doesn't
> make sense to me otherwise. If we consider the relation
> as being a combination of two things, we ought to be
> able to consider those two things separately. In which
> case we can ask the question, what is the type of a
> relation's value? Presumably that is also its type, and
> so we have infinite regress.
>
> Or consider how it sounds for a value of a non-parameterized
> type: "An integer consists of a numeric-type int together with
> an int." ???
>
> Yet another reason to dislike it is that it combines in a single
> construct both a static construct and a dynamic one. It is
> entirely reasonable to consider a system where types exist
> at compile time and do not exist at runtime, and values exist
> at runtime and do not exist at compile time. So how could we
> combine them?
>
> Anyway, this is a diversion from your main point.

I agree with your points. Note that we can't define a join on two relation values without access to their type information. A weakly typed language that allows for joins on two relations would be forced to explicitly store the type together with the relation-value as part of the run-time state. As you suggest above, a strongly typed language allows values to be stored without explicit run time type information.

I regard the combination of type and value into a single construct as a convenience for the exposition.

> > T(r) =
> > { t |
> > there exists t1 in T(r1) &
> > there exists t2 in T(r2) &
> > for each a in A(r1) \ A(r2), t(a) = t1(a) &
> > for each a in A(r2) \ A(r1), t(a) = t2(a) &
> > for each a in (A(r1) intersect A(r2)), t(a) = (t1(a) intersect
> > t2(a))
> > }
>
> Wow. Crazy.
>
> > This is actually an outer join.
>
> It's actually a sort of full outer join, yes?

Yes.

> I'm not sure why you'd want to *replace* regular natural join with
> this operator, assuming we consider it useful. It seems more like
> an add-on.

It is interesting that a full outer join has a simpler definition than a natural join.

> An inner join requires removal of
> > tuples with empty intersections on the common attributes.
> >
> > Some advantages of MV attributes
> >
> > 1. Avoids redundancy. This could be significant – say when there
> > are a large number of people that co-own a large number of cars.
>
> I don't see how this avoids redundancy.

MV attributes allow for storing the same relation with less stored values. I interpret that as meaning it can avoid redundancy.

Consider a single tuple with Ni values for the ith attribute. The number of values stored by the tuple is sum(Ni). Without MV attributes the number of values stored is numAttribs x product(Ni) which could be very large in pathological cases.

> > 2. It deals rather elegantly with missing information.
>
> I don't get this point either.

Subject to integrity constraints, a tuple can map an attribute to an empty set.

> > and some significant disadvantages
> >
> > 1. Each tuple only represents a *lower bound* on the known
> > relationships. Therefore it is important to avoid making assumptions
> > about what is not true when looking at a given tuple. For example the
> > last tuple of r1 |X| r2 above doesn’t imply that John and Fred
> > don’t own any cars.
> >
> > 2. The simple relationship back to propositions and FOPL is lost.
> > Perhaps the meaning of a relation could be understood in terms of
> > underlying 6NF predicates.
> >
> > 3. There is a non-uniqueness of representation because tuples can
> > group values in different ways yet achieve the same set-theoretic
> > result.
>
> 2 seems like a deal-breaker, doesn't it? Unless we can use something
> other than FOPL to assign meaning to terms.

I wonder whether a complete and elegant RA is enough to make MV attributes respectable. Furthermore I have no trouble seeing how it relates back to 6NF predicates. I expect such a mapping could be formalised.

> > In the end I’m not sure whether MV attributes are a good idea or not.
>
> Well, I'm not entirely sure whether your analysis was of MV or of this
> operator you came up with. I still see modest benefit to nested
> relations,
> for example. And they seem to work fine without any additional
> operators
> besides the usual ones, although I haven't analyzed this all that much.

Another interesting point of comparison relates to updates. I can see pros and cons of MV attributes.

I imagine a system using MV attributes should support cardinality integrity constraints on each attribute. For example,

    person(Person, Parents) where Person: 1..1 and Parents: 0..2

I expect referential integrity constraints will be easy to formalise. Received on Tue Jan 16 2007 - 16:36:24 CET

Original text of this message