Re: Representing Types in the Catalog (was Re: Distributed foreign keys)

From: Bob Badour <bbadour_at_golden.net>
Date: Wed, 9 Jul 2003 00:42:04 -0400
Message-ID: <YY0Pa.509$5o.70409340_at_mantis.golden.net>


"Paul Vernon" <paul.vernon_at_ukk.ibmm.comm> wrote in message news:bef3uj$1vlg$1_at_gazette.almaden.ibm.com...
> "Bob Badour" <bbadour_at_golden.net> wrote in message
> news:1_DOa.435$%R5.65300577_at_mantis.golden.net...
> > A type declaration comprises possible representations and constraints.
One
> > can derive the relation literal as a cartesian product of the components
of
> > a possible representation restricted by the constraints.
> >
> >
> > > So it's both the components and the constraints on them that we need.
> >
> > Yes, I agree. The question is how best to represent components and how
best
> > to represent constraints in the catalog.
>
> OK, although I think we are suggesting roughly equivalent representations.

I don't think we are. I am suggesting a representation from which we can derive the relational literal, but we cannot necessarily derive the representation from the relational literal. The relational literal only describes the set of values in one possible representation. A type is more than a set of values.

> I guess I am bundling the components and the constraints into one
relational
> expression. You suggest holding the components separately from the
> constraints, then if we want a literal, building a single RE by cross
> producing the components and restricting the result with the constraint.
> Sounds OK for simple types, but how would you define how two poss reps for
a
> single type equate?

I am not sure I would--the equality operator already defines it at the logical level. If we have two possible representations, P1 & P2, with corresponding relational literals P1_LIT and P2_LIT, we can determine the equal values in P2 for a P1 literal by:

P2_LIT WHERE P2(/*...components from P2_LIT...*/) = P1(/*...literal components...*/)

And vice versa.

Due to implementation issues the expression above might not yield a relation with cardinality = 1.

Since a standard catalog necessarily describes the logical model and not any physical implementation artifacts, exposing more detail to equate possible representations would break physical independence.

> E.g. the boolean type poss rep literal (and, in this case it's defining
RE)
> could look like
>
> VAR Boolean CONST RELATION
> { Boolean Boolean SELF_REF // I *think* we need this self
> referencing attribute
> , Bit_Rep BIT_1
> , Eng_String_Rep STRING_5
> }
> = RELATION { // I ignore the self ref'ed column here..
> TUPLE ( Bit_Rep BIT_1('0'), Eng_String_Rep('False') }
> , TUPLE ( Bit_Rep BIT_1('1'), Eng_String_Rep('True') }
> }
>
> If BIT_1 and STRING_5 are noted separately as components of poss reps of
> Boolean, how do we link 'False' to '0' and 'True' to '1'. In which pos
rep's
> constraint do we make this link?

Since we are talking about a type and not a variable, the reference statement should declare a type:

TYPE Boolean
  POSSREP ( STRING_5 ) /* possrep and component name default to Boolean */
  POSSREP Bit ( BIT_1 ) /* component name defaults to Bit */   CONSTRAINT ( MATCHES(Boolean, RegExp('True|False') ) ) /* Boolean refers to component of the Boolean possrep */
;

Thus the Boolean possrep relation literal is: Boolean_LIT := ( ( STRING_5_LIT RENAME STRING_5 AS Boolean ) WHERE MATCHES(Boolean, RegExp('True|False') ) );

The Bit possrep relation literal is:
Bit_LIT := ( BIT_1_LIT RENAME BIT_1 AS Bit )

The equals relation between the possible representations is: EqualsBoolean_LIT := ( Boolean_LIT JOIN Bit_LIT ) WHERE Bit(Bit) = Boolean(Boolean);

I was tempted to append a _VAL suffix onto the attribute names to distinguish them as component names as opposed to possrep names, but I decided to keep the attribute names the same as the component names. It would not change much but it might be clearer as:

Boolean_LIT := ( ( STRING_5_LIT RENAME STRING_5_VAL AS Boolean_VAL ) WHERE MATCHES(Boolean_VAL, RegExp('True|False') ) ); Bit_LIT := ( BIT_1_LIT RENAME BIT_1_VAL AS Bit_VAL ) EqualsBoolean_LIT := ( Boolean_LIT JOIN Bit_LIT ) WHERE Bit(Bit_VAL) = Boolean(Boolean_VAL);

> [snip]
> >. That there is an equivalence between pos
> > > rep values (none of this where one POINT type has cartesiaon and polar
> > > poss reps that cannot be equivalent using RATIONAL numbers).
> >
> > Possible representation literals won't prevent that.
>
> See above. If a type has more than one poss rep, then the values of those
poss
> reps need to be explicitly linked (in the current scheme by sharing the
same
> tuple in the type literal)

You assume that each possrep is a candidate key of the equals relation, and I do not. I assume only that the equals relation is 1NF. While types whose possreps are all candidate keys of the equals relation probably have important and desirable properties just as relations in 5NF have important and desirable properties, I think it is a matter of design to choose those properties just as it is a matter of design to choose 5NF.

> > > We also get to formalise the "THE_" operators (aka. GET & SET) of poss
> > > representations, because these just become the relation attributes of
the
> > > pos rep literal.
> >
> > The "THE_" operators are already formalized without setting anything. If
a
> > variable's declared type has a possible representation, P1, with A and B
> > components, the "THE_" operator to the left of the assignment operator
is
> > just a shorthand for another assignment:
> >
> > THE_A(var1) := "NewAValue";
> >
> > is a shorthand for
> >
> > var1 := P1( A := "NewAValue", B := THE_B( var1 ) );
>
> Maybe, but I want to see the P1() function above be cast in purely
relational
> terms.

It is already. It is a domain value identifier.

> Functions are just shorthand for some <COMPOSE> with a relation
> literal. In this case, the relation literal for the function is the poss
rep
> literal for the type.

The relation literal is the <COMPOSE> of each of the component relation literals restricted by the constraints. Of course, <COMPOSE> degenerates to cartesian product when none of the tuples share attributes so this just restates an earlier observation. Unlike functions, a selector's parameters are the result so, in a sense, the relation describing the function of every selector is just TABLE_DEE.

> And why don't I like functions? Well it's just that they are one extra
concept
> that (at the level of abstraction for that I'm talking about here) are not
> needed IMO.
>
> > > Non-canonical representations (i.e. multiple poss rep values mapping
to
> > one
> > > type value) would, I guess, be held in a separate catalog literal that
> > maps to
> > > the canonical form for the particular poss rep.
> >
> > Catalog literals seem like a dead end to me. The catalog needs to
represent
> > the type, and from this representation the dbms can derive a literal as
> > necessary.
>
> In my scheme the literal is derived from a relational expression. The ER
> defines the type. What is it about types that you think cannot be defined
by a
> (set of) REs?

Operations and parameterized types.

> > It does not seem feasible to parameterize catalog literals, but it is
easy
> > enough to parameterize other representations of a type. Using the
> > parameterized representation, the dbms can instantiate a literal when
> > provided arguments for the parameters.
>
> Agreed and I suggest "parameterized" REs as such an representation. (Well,
OK,
> I've not yet defined what the basic RE type looks like, never mind
> parameterized REs, but that's to-do next).
>
>
> > > All in all, it looks to me like a type system that does have some
useful
> > > properties, and solves some of the problems that I am (and I think
that
> > > databases are) interested it.
> >
> > I am less interested in some of the properties you want, but I agree a
> > relation literal can represent the set of values of a possible
> > representation easily enough. However, I do not think a relation literal
is
> > the appropriate way to represent a type in the catalog.
>
> Again, the literal is (most times) derived. From REs such as:
>
> "SELECT i.i as Bound_Lower, j.i as Bound_Upper
> FROM INTEGERS_0-2^32 i, INTEGERS_0-2^32 j
> WHERE i.i <= j.i"

I want to add another level of operational indirection to support type generators. The literal is derived from an expression is instantiated operationally from some parameterized representation and a set of parameter arguments.

> > The catalog representation must suffice to derive the appropriate
relation
> > literal, of course.
>
> Agreed. Including the rule that each poss rep for a type has the same
number
> of values, and has one value per type value.

I disagree. Whether there is a 1:1 correspondence between values of implemented types is a choice of the type designer.

> > > [snip]
> > > > I suggest changing the focus from using expressions to define the
set of
> > > > values as a relational literal to using expressions to describe the
> > possible
> > > > representations etc.
> > >
> > > Same thing (if the expressions are non "parameterized")
> > >
> > > > From a description of the possible representations, one
> > > > can derive or generate an expression to define the set of values. A
> > > > parameterized type would use a parameterized expression for the
possible
> > > > representation etc.
> > >
> > > Sounds reasonable. What we need is the definition of a Relational
> > Expression
> > > as a data type. What are some plausible possible representations of
> > Relational
> > > Expressions? Are parameterised REs a super-type of a REs?
> >
> > Parameterised REs are types with operations that result in REs.
>
> Agreed.
>
> [snip]
> > > Agreed. I wonder if <TCLOSE> should be replaced with generalised
> > recursion?
> >
> > TTM has recursion as a relational prescription.
>
> True. Which must therefore be an extension of their algebra (assuming that
> <TCLOSE> cannot be used to do generalised recursion), unless they say that
D
> does the recursion rather than the algebra.

"Recursion shall be permitted in relational expressions"

That sounds like it is in the D algebra even if it is not in the A algebra.

> Even more reason to replace <TCLOSE> with generalised recursion.

Tutorial D includes both. Received on Wed Jul 09 2003 - 06:42:04 CEST

Original text of this message