Re: Nested Relations / RVAs / NFNF

From: Tony Douglas <tonyisyourpal_at_netscape.net>
Date: 29 Oct 2004 19:01:56 -0700
Message-ID: <bcb8c360.0410291801.4a915a5e_at_posting.google.com>


"Mikito Harakiri" <mikharakiri_at_iahu.com> wrote in message news:<IIegd.45$of1.104_at_news.oracle.com>...
> "Tony Douglas" <tonyisyourpal_at_netscape.net> wrote in message
> news:bcb8c360.0410281331.305f289e_at_posting.google.com...

> > For
> > example, how would you represent "two kilograms" in a program ? 2
> > doesn't cut it, because that's an integer. 2.0 is a real number (or a
> > float, or a double, or...) You can't typically readily say 2kg, so
> > you're left with awful stropping, like kg(2) or somesuch.
>
> Why not include 3 more columns in the dictionary table ALL_COLUMNS:
>
> power_kg RATIONAL,
> power_m RATIONAL,
> power_s RATIONAL
>
> ?

Why should we? This raises an interesting question: just how much information about types belongs in the system catalogue of the DBMS? I would argue next to none, as the DBMS has no business grovelling around in the guts of the types. The DBMS needs to know that the types exist, and which functions and operators are defined over them, but not how those types are constructed.

Also, this doesn't address the point above. What I was saying was that our languages are generally poor at adding new symbols. The set of easily usable alphabets is fixed when the compiler's parser is built, and that's it. Using integers, floats, strings, characters and enumeration members is easy enough, as the parser is prebuilt to know about them. Adding new alphabets, like floats-with-kg-appended to denote kilogrammes, or number-lb-number-oz to denote pounds and ounces, generally isn't possible, which pushes programmers towards using inappropriate general-purpose types like integers and reals to represent non-general-purpose data. If we want to get *really* serious about defining our own types, we have to think about this. I suppose that's a programming language issue more than a database theory one, although databases have to be aware of such types.
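
To make the stropping concrete, here is a minimal sketch in Python. The names Kilograms and kg are invented for illustration; nothing here comes from a real library. The language will not let us write 2kg, so the best on offer is a constructor call, although the type can at least refuse to be mixed with a bare number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kilograms:
    """Hypothetical user-defined type; the name is illustrative only."""
    value: float

    def __add__(self, other):
        # Kilograms may only be added to kilograms, never to a bare number.
        if not isinstance(other, Kilograms):
            return NotImplemented
        return Kilograms(self.value + other.value)


def kg(value):
    """Stropped constructor standing in for the literal 2kg we cannot write."""
    return Kilograms(float(value))


two_kg = kg(2)          # the closest we get to "2kg"
total = two_kg + kg(3)  # fine: Kilograms + Kilograms
# two_kg + 2 raises TypeError, because 2 is just an integer
```

The parser still only knows about integers and floats; everything else has to be smuggled in through function calls like kg(2).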

>
> In your example
>
> table Person (
> ...
> weight NUMBER
> ...
> )
>
> would be represented in the dictionary as a tuple
>
> <tab_name=Person, col_name=weight,
> power_kg=1, power_m=0, power_s=0>
>

Mmmm.

> This is standardization and [other things being equal] standardization is
> always good.
>

How standard is standard? How general is this type of response? Staying with the example to hand, what about imperial measurements? Would our triple become ever broader to handle pounds, ounces, hundredweights, tons, metric tonnes, or others?

> Next, we would be able to issue queries like this:
>
> "what is the minimal time spent by vehicle in a network of routes"
>
> select min(routes.distance/cars.velocity)
> from cars, routes
>
> The SQL type engine should be able to figure out that the query result column must
> have signature <power_kg=0, power_m=0, power_s=1>.
>
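
For what it's worth, the signature arithmetic in the quoted query amounts to subtracting exponents component by component. A rough sketch in Python follows, assuming distance is in metres and velocity in metres per second; Signature and divide_signature are names made up here, and Fraction stands in for the RATIONAL columns.

```python
from fractions import Fraction
from typing import NamedTuple


class Signature(NamedTuple):
    """Hypothetical <power_kg, power_m, power_s> triple."""
    power_kg: Fraction
    power_m: Fraction
    power_s: Fraction


def divide_signature(numerator, denominator):
    # Dividing two quantities subtracts their exponents component-wise.
    return Signature(*(a - b for a, b in zip(numerator, denominator)))


distance = Signature(Fraction(0), Fraction(1), Fraction(0))   # metres
velocity = Signature(Fraction(0), Fraction(1), Fraction(-1))  # metres per second

# <power_kg=0, power_m=0, power_s=1>, i.e. seconds
result = divide_signature(distance, velocity)
```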

Why should SQL per se have a type engine? The type engine exists outwith the relational operators. For example, I would see the above query being executed along the lines of ...

  1. relational engine performs a natural join of cars and routes
  2. it checks that a divide function has been defined over the types of distance and velocity, and that a less-than-or-equal-to operator has been defined on the result type of that divide
  3. for each tuple in the result set, it calls whichever function implements the division
  4. it calls whichever function implements less-than-or-equal-to on the results of (3) to find the lowest.

The relational/SQL operators have to know the divide and less-than-or-equal-to functions at the signature level, but not the detail of how they are implemented.
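
The four steps above can be sketched roughly as follows, in Python. Everything here is hypothetical: the catalogue layout, the type names, the sample rows and the function names are invented for illustration, and plain floats stand in for values whose representation the engine is never supposed to inspect. I also assume cars and routes share no attribute, so the natural join degenerates to a Cartesian product.

```python
# Hypothetical catalogue: all the relational engine sees is
# (operator, argument types) -> (result type, implementation).
catalogue = {
    ("divide", "Metres", "MetresPerSecond"): ("Seconds", lambda d, v: d / v),
    ("le", "Seconds", "Seconds"): ("Boolean", lambda a, b: a <= b),
}

# Declared column types, as the engine would know them.
column_types = {"distance": "Metres", "velocity": "MetresPerSecond"}

# Invented sample data.
cars = [{"velocity": 10.0}, {"velocity": 25.0}]
routes = [{"distance": 500.0}, {"distance": 1000.0}]


def lookup(op, *arg_types):
    """Find an operator by signature only; fail if none is defined."""
    try:
        return catalogue[(op, *arg_types)]
    except KeyError:
        raise TypeError(f"no {op} defined over {arg_types}") from None


def min_travel_time(cars, routes):
    # Step 1: natural join (no shared attributes, so a Cartesian product).
    joined = [{**car, **route} for car in cars for route in routes]
    # Step 2: check that divide and less-than-or-equal-to exist.
    result_type, divide = lookup(
        "divide", column_types["distance"], column_types["velocity"])
    _, le = lookup("le", result_type, result_type)
    # Step 3: call the division implementation for each tuple.
    times = [divide(t["distance"], t["velocity"]) for t in joined]
    # Step 4: use less-than-or-equal-to to find the lowest.
    lowest = times[0]
    for candidate in times[1:]:
        if le(candidate, lowest):
            lowest = candidate
    return lowest
```

Note that min_travel_time never looks inside a value; it only passes values to whatever the catalogue hands back.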

> I remind you that according to the naive type idea, you are not allowed to
> divide columns that have different types (that is, in my language, different
> <power_kg, power_m, power_s> signature) or, at least, have to wrap them into
> silly methods like this
>
> float divide(float distance_in_meters, float speed_in_meters_per_seconds)
>
> or this
>
> Seconds divide(Meters distance, MetersPerSeconds velocity)

Well, if you define new types, you're going to have to provide your own implementation of divide anyway (if division is sensible for your type), unless the type engine can reasonably derive it from the type's representation. And if values of your type can be multiplied or divided by, added to or subtracted from values of other types, you'll have to spell that out too to make it legal. Effectively, all you're saying is that you'll have to provide an implementation for the function signature

divide :: metres * metres_per_seconds -> seconds

That's ok, and to be expected.
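
As an illustration only, one way such an implementation might look in Python is below. The three classes are hypothetical wrappers invented for this sketch, and holding a float inside each is just one possible representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metres:
    value: float


@dataclass(frozen=True)
class MetresPerSecond:
    value: float


@dataclass(frozen=True)
class Seconds:
    value: float


def divide(distance: Metres, velocity: MetresPerSecond) -> Seconds:
    """divide :: metres * metres_per_seconds -> seconds"""
    # Python does not enforce annotations, so check the argument types by hand.
    if not isinstance(distance, Metres) or not isinstance(velocity, MetresPerSecond):
        raise TypeError("divide is only defined for Metres / MetresPerSecond")
    return Seconds(distance.value / velocity.value)


travel_time = divide(Metres(100.0), MetresPerSecond(20.0))
```

Only divide knows that the three types happen to wrap floats; callers see nothing but the signature.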

All of this raises another interesting question: who's the boss, the type engine or the implementation of relations? Are relations just another set of types, and so handled by the type engine like any other type, or are they something altogether more mystical than that? I'm leaning towards the former, personally, although I'm open to persuasion...

Cheers,

- Tony
Received on Sat Oct 30 2004 - 04:01:56 CEST
