Re: Distributed foreign keys (was Re: Category Types)

From: Paul Vernon <paul.vernon_at_ukk.ibmm.comm>
Date: Mon, 7 Jul 2003 00:03:52 +0100
Message-ID: <beaa1d$1atm$1_at_gazette.almaden.ibm.com>


"Bob Badour" <bbadour_at_golden.net> wrote in message news:0CONa.349$VF5.47393910_at_mantis.golden.net...
> You seem to suggest the system catalog should directly represent every value
> of the full range--that's more than just the extent--of every type known to
> the dbms

Directly as a relation literal, or indirectly as a relational expression over type literals used in the type's possible representations.

> The cardinality of values in the full range of an interval type or
> of a tuple type or of a relation type could be huger than huge. I don't
> think that makes much sense.

Most types would be represented as a relational expression. E.g. the type consisting of all intervals of positive integers less than 2^32 could be declared to the DMBS by inserting the following relational expression and relational expression name into the DBMS catalog.

Assuming the type INTEGERS_0-2^32 is already know to the DBMS (say as a relational expression over some inbuilt BIT type?), then INTEGER_0-2^32_INTERVALS could be defined by

INSERT INTO RE_TYPES ( Type_Name, Type_RE_Defn) VALUES ( "INTEGER_0-2^32_INTERVALS"
, "SELECT i.i as Bound_Lower, j.i as Bound_Upper

    FROM INTEGERS_0-2^32 i, INTEGERS_0-2^32 j     WHERE i.i >= j.i"
)

If a user then does something like say

    SELECT AVG(Bound_Lower)
    FROM INTEGER_0-2^32_INTERVALS and the DBMS decides it will need to fully generate the type relation to
compute the result, then I say the DBMS "safety system" (i.e. the time estimate from the query compiler) should restrict such a query from actually running (unless the user has the time spare to wait...)

The same "safety system" argument applies, for me, to the <OR> and <NOT> operators of D&D's "A" algebra. I would like to see them supported in the database language and leave the computational difficulties to the safety system and the query estimator.

> The system catalog must represent the name of the type, the subtypes, the
> supertypes, the possible representations and the constraints. I assume we
> can treat the operations separately.
>
> If the type is parameterized, how does the above change? If the type has an
> arbitrary number of parameters, how does the above change? For instance, how
> does one describe types like tuples and relations?

The tuple and relation type generators are surly defined by the DBMS outside of the database. My scheme might define particular, named tuple and relation types using e.g.

The binary tuple type

    TUPLE_X:INTEGER-Y:INTEGER
could be defined as

    SELECT i.i as X, j.i as Y FROM INTEGERS i, INTEGERS j

The relation type

    RELATION_TUPLE_X:INTEGER-Y:INTEGER
could be defined using a relation valued attribute (if you will excuse the crap syntax) as

    TABLE(POWER_SET(
        SELECT i.i as X, j.i as Y FROM INEGERS i, INTEGERS j     ))

?!

> > Operators are again relation constants or views in the catalog. The
> relation
> > name can be used as the operator name. A convention such as the attribute
> with
> > either a standard name such as 'result' or as the same name as the
> operator
> > could distinguish the operator result from the operands.
> > Any operators that are functions would have a functional dependence
> constraint
> > on their catalog relation. Functions that have properties such as being
> > commutive, associative, reflexctive, identity etc would have corresponding
> > relcon constraints.
>
> Are you suggesting that the system catalog describes operations and types
> exactly the same as it describes relations? I am just not satisfied with
> that. It seems to me there are logical differences among type, operation and
> relation.

It's just the information principle. The total information content of the database should be represented as relation values. I'm not saying that there are no logical differences (at a certain level of abstraction), I'm just saying that they can be all represented as relations, so maybe we should.

> What are the differences between operations that equate to assignment vs.
> operations that do not? How does that get represented in the catalog?
>
>
> > So those two seem a lot easier to get a grip on compared to the ideas
> around
> > database algebra (as I think you nicely demonstrated ;-)
>
> I think the jury is still out on a database algebra.

I didn't know it had gone to court yet ;-)

> It seems clear that the
> system catalog has to be at least partially defined operationally, but then
> again possible representations comprise operations, don't they? So, that
> probably is not as significant as it might seem.
>
>
> > Type generators? Maybe 'meta programming' has something to say on this
> one.
> > Not sure.
> >
> > P.S. I've not gone all that far down the database algebra path, although
> one
> > thing does look clear; that it works better if relvar names are dropped.
> > Maybe I'll post my ideas here, although as I said I'm not sure that this
> > newsgroup is the ideal outlet. Is there a sourceforge.net for logical
> > models???
>
> Drop the names of relations? How does one refer to them without names?

By their tuple type. Obviously that does mean enforcing POOD however.

Regards
Paul Vernon
Business Intelligence, IBM Global Services Received on Mon Jul 07 2003 - 01:03:52 CEST

Original text of this message