Re: Implications of Relation-Valued Attributes

From: Bob Badour <bbadour_at_golden.net>
Date: Wed, 2 Jul 2003 03:27:10 -0400
Message-ID: <kIwMa.165$2q3.21361772_at_mantis.golden.net>


"Marshall Spight" <mspight_at_dnai.com> wrote in message news:jpsMa.1260$a45.3095_at_rwcrnsc52.ops.asp.att.net...
> Hi all,
>
> I've been thinking about relation-valued attributes lately. It strikes
> me that, while they have many useful properties, they also may
> have nasty implications.
>
> Consider: in current products, attributes are limited to "scalar" types
> in general. If I read my TTM correctly, (or just by following this group)
> that idea appears distinctly too narrow. It seems we need at least:
>
> scalars, including both system-defined and user defined
> sum types
> lists (or sequences)
> tuples
> relations
>
> and that these may be applied recursively. (TTM requires that the only
> globals be relations, but in light of having all the above type constructors,
> that begins to seem artificial to me.)

Even scalar values have internal structure or, more properly, operations that reveal internal structure. For instance, one can define operations on integers that expose a sequence of bits and other operations on integers that expose a sequence of decimal digits.
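
As a minimal sketch of that point (in Python, using the hypothetical operation names bits and decimal_digits, which do not come from any particular dbms), two operations on the same integer value can each expose a different sequence:

```python
def bits(n: int) -> list[int]:
    """Expose a non-negative integer as a sequence of bits, most significant first."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    return [int(b) for b in bin(n)[2:]]  # bin(10) == '0b1010'


def decimal_digits(n: int) -> list[int]:
    """Expose the same integer as a sequence of decimal digits."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    return [int(d) for d in str(n)]


# The value is still a single integer; only the operations differ.
print(bits(10))            # [1, 0, 1, 0]
print(decimal_digits(10))  # [1, 0]
```

Neither operation says anything about how the integer is stored; each merely reveals one view of the value.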

As a general principle, a dbms should support a variety of useful types, type generators and extensibility through user-defined types.

The relational model does not have any constructors. A constructor is an entirely physical artifact. It is a method called to allow a class to initialize the physical memory occupied by a variable when the memory for the variable is allocated and the variable enters scope for the first time.

Memory, allocation, and initialization are all physical concepts that have no business in the logical model. Instead, the relational model has selectors that identify object values according to one possible representation or another. Selectors may seem vaguely similar to constructors in their use, but they are decidedly different.
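
A rough illustration of selectors, sketched in Python under loudly stated assumptions: the Point type, the selector names cartesian and polar, and the the_x / the_r operators are all hypothetical, chosen only for this example. Python itself uses a constructor underneath, so the sketch can only imitate the logical idea: two selectors, one per possible representation, can identify the very same value.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    # Internal encoding picked for this sketch only; users of the type are
    # meant to see just the selectors and the_* operators below.
    _x: float
    _y: float


def cartesian(x: float, y: float) -> Point:
    """Selector for the Cartesian possible representation (hypothetical name)."""
    return Point(float(x), float(y))


def polar(r: float, theta: float) -> Point:
    """Selector for the polar possible representation (hypothetical name)."""
    return Point(r * math.cos(theta), r * math.sin(theta))


def the_x(p: Point) -> float:
    """Expose the x component of the Cartesian representation."""
    return p._x


def the_r(p: Point) -> float:
    """Expose the r component of the polar representation."""
    return math.hypot(p._x, p._y)


# Both selector invocations identify the same point value.
print(cartesian(1, 0) == polar(1, 0))  # True
print(the_r(cartesian(3, 4)))          # 5.0
```

Which encoding sits underneath (Cartesian here) is an implementation choice that the user of the type never needs to know.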

Memory, allocation, and initialization are all left to the optimizer to determine when it builds the physical access path for a query.

> Once we allow these to be used recursively, we get some complicated
> structures. That is, we may have relations with attributes that are relations
> with attributes that are tuples that contain lists, etc. (The kind of thing I did
> in C in college :-)

First, we do not get complicated structures. The only structures we have remain relations with single-valued attributes. Of course, the simplest possible representation for these single-valued attributes can become quite complicated.

Second, as designers, we need to know there is often a difference between "can" and "should". For instance, I suspect we should minimize complicated type systems in base relations even if we provide your C program a view of those base relations that uses complicated type systems.

> Once we have these complicated structures, we have to be able to query them
> and update them, etc. I will simply wave my hand around the update issue for
> now; as we all know, the only necessary update operator is dbvar assignment. :-)

I think the jury is still out on that one, but it does have a certain appeal.

> But even just the query issue worries me.
>
> I wonder if it might be legitimate, in light of the above, to want to query things
> according to structure. And now I start wandering into XQuery territory, and
> I get very uncomfortable.

I do not know what you mean by structure here. One uses operations on values in queries. Just as one uses a substring operation on a string, one can use a restrict or project operation on a relation-valued attribute. Updates assign new values to variables and one can nest operations on values in them. The only structures are relations.
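
To make that concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: relations are modelled as frozensets of tuples, tuples as frozensets of (attribute, value) pairs, and the helper names row, relation, restrict, project and extend as well as the depts/emps data are hypothetical. The same restrict and project that work on an outer relation are simply applied to the value of a relation-valued attribute.

```python
def row(**attrs):
    """A tuple value: an immutable set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def relation(*rows):
    """A relation value: an immutable set of tuples."""
    return frozenset(rows)


def restrict(rel, predicate):
    """Keep the tuples for which the boolean-valued predicate holds."""
    return frozenset(t for t in rel if predicate(dict(t)))


def project(rel, *names):
    """Keep only the named attributes; duplicates collapse automatically."""
    return frozenset(frozenset((k, v) for k, v in t if k in names) for t in rel)


def extend(rel, name, fn):
    """Add an attribute whose value is computed from each tuple."""
    return frozenset(t | {(name, fn(dict(t)))} for t in rel)


depts = relation(
    row(dept="D1", emps=relation(row(name="Ann", salary=50),
                                 row(name="Bob", salary=70))),
    row(dept="D2", emps=relation(row(name="Cy", salary=40))),
)

# Operate on the relation-valued attribute `emps` the way one would apply
# a substring operation to a string-valued attribute.
with_high_paid = extend(
    depts,
    "high_paid",
    lambda t: project(restrict(t["emps"], lambda e: e["salary"] > 45), "name"),
)
result = project(with_high_paid, "dept", "high_paid")
```

The result is still an ordinary relation with two single-valued attributes; one of those values happens to be a relation. The only thing restrict needs from the type system is a boolean result from its predicate.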

> One of my big objections to XQuery is how complicated it is. Worse, it misses
> out on one of the big wins of the RM: that structure and data are both expressed
> as data, which removes the need to have separate ways of querying and updating
> structure vs. data.

I am not sure what you mean here. In the relational model, the only structure is the relation, and all data are represented as values in relations. The relation is chosen as the only structure because simpler structures lack functionality, thereby inhibiting effective data management, and more complex structures provide no additional benefit. The relation is simple enough that it effectively prevents representing data as structure, so that all data are represented and managed similarly. More complex structures encourage people to represent data as structure, which requires redundant, arbitrary methods for managing data; if any of those redundant methods is missing, effective data management suffers.

From the above, I think the relational model makes a clear distinction between data and structure. The system catalog describes structure as data represented as values in relations and will continue to do so.
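
A small sketch of the catalog point, again in Python and again under assumptions: the CatalogRow shape, the relvar names and the type names below are invented for illustration and do not reflect any real product's catalog. Because the catalog is just another relation, a question about structure is answered with the same kind of restriction and projection used for any other data.

```python
from typing import NamedTuple


class CatalogRow(NamedTuple):
    # Hypothetical catalog heading: one tuple per attribute of each relvar.
    relvar: str
    attribute: str
    type_name: str


catalog = frozenset({
    CatalogRow("depts", "dept", "char"),
    CatalogRow("depts", "emps", "relation"),
    CatalogRow("emps", "name", "char"),
    CatalogRow("emps", "salary", "integer"),
})

# "Which relvars have a relation-valued attribute?" is a restriction on
# type_name followed by a projection on relvar.
relvars_with_rva = {r.relvar for r in catalog if r.type_name == "relation"}
print(relvars_with_rva)  # {'depts'}
```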

> Anyone have any thoughts? Does allowing recursive data structures take us down
> a rabbit hole of querying by structure and into XQuery land? Or can we express
> every query we need to with just the relational algebra?

The relational algebra remains unchanged no matter how many additional types we introduce provided we have at least a boolean type. However, we have to recognize that additional types mean additional operations and more for users to learn.

Received on Wed Jul 02 2003 - 09:27:10 CEST
