Re: Types and "join compatibility"

From: Marshall Spight <marshall.spight_at_gmail.com>
Date: 30 Jul 2005 15:08:13 -0700
Message-ID: <1122761293.933576.29120_at_o13g2000cwo.googlegroups.com>


André Næss wrote:
>
> As I started considering how to add a simple type system to the
> implementation I stumbled upon something that my (possibly lacking)
> googling skills can't find an answer to, so I hope someone here might
> shed some light on it. The question is: given a relation A, with a
> single attribute a of type TypeA, and a relation B with a single
> attribute b of type TypeB, where TypeA is a subtype of TypeB, what (if
> anything) does it mean to perform the natural join of A and B (the same
> goes for union I guess...) Of course I here assume that you either
> rename A.a to b or B.b to a before performing the operation.

What an excellent question!

As to the rename, yes of course, otherwise it wouldn't be a *natural* join. In fact, I'm going to rename your whole example for clarity.

(read ":" as "has type"; P for parent, C for child) Relation PR : (a : P)
Relation CR : (a : C)

What is the type of PR join CR?

Date&Darwen's TTM says it's a relation of type (a : P). This has never made sense to me; the result can only contain values that correspond to C, so what's the point of having it be P?

Furthermore, if you want to have updatable views, the result attribute type *has* to be C, because otherwise the type would permit inserting values of type (a : P), which would violate the type of CR. (If the fact that this join is a union seems to invalidate this concern, recast the example such that PR and CR each have an additional, distinctly named column and you'll see it doesn't.)

My take: subtyping adds a huge amount of complexity to these issues and adds little of value. Subtyping is hugely valuable in OOP languages but I'm not convinced it adds much to relational languages. It's possible I'm wrong about the value added, but I'm certainly right about the complexity; it's worse than it looks.

> My take is that it should be possible to do so, and that the type of
> the single resulting attribute should be TypeA, as this is the most
> generic type. So the b attribute of B must be cast before the join is
> performed.
>
> But I also wondered about being able to cast the attribute a into
> TypeB. Because TypeB presumably constrains the possible values of
> TypeA, this would mean that if A.a was cast to TypeB, some tuples in A
> no longer satisfy the conditions defined by TypeB and should be
> discarded. So in theory you could perform a restrict on A where the
> restriction constraint would be the constraints of TypeB, leaving you
> with a relation with a single attribute whose values are all compatible
> with TypeB.

Allowing casting into your language means the type system is no longer sound. Try to avoid doing that if at all possible.

> I'm interesting in both theoretical considerations as well as practical
> experiences. Does there exist any (prototype) implementations of the
> relational algebra coupled with a fairly powerful type system?

You'd think there would be, but I'm not aware of any.

> If someone can provide straight answers that would be great, but I'm
> also very interested in pointers to material I might study to
> understand this in more detail. If something similar has been discussed
> here before (I'm sure it has) I would be very interesting in reading
> those threads. As I mentioned, except for an article about using a
> union type as the attribute type I've not been able to find any threads
> that cover this.

In general, the type theory people and the relational theory people don't mix much. You will find a few people here and there who are interested in both. (I'm interested in both but I'm no expert.)

If you want to read more about type theory, you can't do any better than "Types and Programming Languages" by Benjamin Pierce.

http://www.amazon.com/exec/obidos/tg/detail/-/0262162091/qid=1122761096/sr=8-1/ref=pd_bbs_1/002-4661063-2538456?v=glance&s=books&n=507846

It's not an easy read by any means.

By the way, I asked this question myself a few years back, both here and separately in comp.lang.functional, which is where a lot of type-theory gurus hang out. The rough consensus in c.l.f was the result should be "type error." I don't remember what the consensus was here, but I think it was split between the parent type and the child type.

Again, my advice is, don't allow subtyping into your language.

Marshall Received on Sun Jul 31 2005 - 00:08:13 CEST

Original text of this message