Re: type of result column after heterogeneous join

From: Paul Vernon <paul.vernon_at_ukk.ibmm.comm>
Date: Mon, 1 Sep 2003 14:12:29 +0100
Message-ID: <bivgp9$1672$1_at_gazette.almaden.ibm.com>


"Marshall Spight" <mspight_at_dnai.com> wrote in message news:pyp4b.315439$Ho3.44679_at_sccrnsc03...
> Hi all,

>
> Suppose there exists a relational database that supports
> a type system that includes union types. (For example,
> (int|string), "an int or a string" is a valid type.) Now
> suppose one had two tables, the first with one column
> named "a" of type int, the second with one column
> named "a" of type string. Table 1 has exactly one row,
> and that row has a=1. Table 2 has exacly one row, and
> that row has a="hello".
>
> What should be the result of a join/union of the two tables?
> (Join and union are the same in this example.) Specifically,
> what is the column type of the resulting relation?

Nice question Marshall.

To quote Cardelli
  "The fundamental purpose of a type system is to prevent the occurrence of execution errors during the running of a program." or to put it another way: A type system helps to refine what we consider an execution error.

A type system with multiple inheritance would hopefully provide for a more intricate (i.e. more usefull) definition of what is or is not an execution error, than would a type system without type inheritance. Date & Darwen don't, if I recall correctly, say much about the motivation for type inheritance, other than to suggest that it is more or less commonly agreed to be a good idea. They could benifit, I think, by picking up least the exectuion error point from Cardelli.

So the key to your dilemma is to ask which choice best helps define/prevent execution errors.

> 1. None; this causes a type error
> 2. (int|string)
> 3. Most specific supertype of int and string. (Possibly 'alpha'.)
> 4. alpha, the maximal supertype.

With 3 or 4, you effectively have an untyped relational algebra. You might not let function parameters get automatically cast up to the most specific supertype and therfore preserve some typing, but then you would have introduced a logical difference between functions and relations that I for one would prefer not to see.

The TTM answer is (I think - I don't have a copy to hand) that the declared type of the two attributes must either be the same, or one be a sub type of another (with the supertype becoming the declared type of the result). Therefore without explicit casting (TREAT_UP) before the join/union, the TTM answer is 1 - a *compile time* type error.

Now there might be a case for leting certain specified types automatically cast up to thier most specific common decared type. This could be done on a type by type basis. I would not know if this is a good idea however.

Regards
Paul Vernon
Business Intelligence, IBM Global Services Received on Mon Sep 01 2003 - 15:12:29 CEST

Original text of this message