Re: Extending my question. Was: The relational model and relationalalgebra - why did SQL become the industry standard?

From: Bob Badour <bbadour_at_golden.net>
Date: Sun, 16 Feb 2003 13:34:11 -0500
Message-ID: <8bR3a.84$2W5.14858339_at_mantis.golden.net>


"Paul Vernon" <paul.vernon_at_ukk.ibmm.comm> wrote in message news:b2isne$16ui$1_at_sp15at20.hursley.ibm.com...
> "Bob Badour" <bbadour_at_golden.net> wrote in message
> news:l7W2a.1404$7C3.168426612_at_mantis.golden.net...
> > Paul,
> >
> > I can certainly understand your criticism of ORDER BY if it were applied
to
> > SELECT expressions nested within other operations such as INSERT INTO,
> > CREATE VIEW etc. -- if those are even allowed by the standard or by any
> > product.
> >
> > I am not sure I understand your criticism of ORDER BY for presentation.
In
> > this situation, doesn't ORDER BY apply to the operation that physically
> > encodes the result for transmission outside the DBMS? It seems to me the
> > operation in question is quite explicit even if the syntax requires no
> > keyword for it.
>
> Maybe it is a matter of taste, but I find it highly non-explict that
>
> select distinct SAL from emp
>
> returns a Relation, but
>
> select distinct SAL from emp order by SAL
>
> returns something else.

Except that 'select distinct SAL from emp' as a complete statement does not return a relation. It returns the external physical representation of a relation in an arbitrary order.

> > Even if one were to type-cast the result to an implicitly ordered
non-scalar
> > type such as an array, one would still require a final operation that
> > physically encodes the array for transmission outside the DBMS.
> >
> > To put it precisely, in both cases the DBMS starts with an internal
physical
> > encoding of an abstract logical value and converts it to an external
> > physical encoding of the same value.
> >
> > Why is it necessary or desirable to specify two operations instead of
one?
>
> Necessary for good language design and conceptual clarity I would say.
> Implicit casting is not a very good idea at the best of times and
certanlly
> not for something as big as a change from one non-scalar type to another.

We are not talking about implicit casting to a non-scalar type within the type system of the dbms; we are talking about an operation that encodes data into an external physical representation. While the language requires no keyword for this operation, it is quite explicit. A select statement is different from a select expression.

> > IE. one operation that changes the type using an explicit order followed
by
> > a second operation that physically encodes the ordered type VS. a single
> > operation that physically encodes the result in a specified order
>
> CAST_AS_ARRAY(select SUM(SAL), DEPT from emp)
> Y_ORDERED_BY SAL
>
> would be logicaly more correct that the SQL syntax, but finding a nice
syntax
> (which that above certainly is not) is another matter.

I disagree. What you have given above involves a redundant type conversion that the dbms should optimize away in any case. The final result of the operation is not an array--it is an external physical representation of a set of tuples in an explicit order.

> > Referring to the concepts in TTM, it seems like a point of design with
> > interesting design tradeoffs. When designing type generators for
relations
> > and arrays, what possible representations will the generated types have?
> > What are the pro's and con's of choosing one set of possible
representations
> > over another?
>
> Poss reps are really only for scalar types.

I don't see how one can have any usable type without at least one possible representation. However, after thinking about things, I realise the issue here involves operations for converting tuples and relations into external physical representations, and has little to do with internal possible representations. The possible representations of the values in a tuple or relation determine the possible representations of the tuple or relation.

> > It sounds like you may have already put more thought into these design
> > tradeoffs than I have, and I am interested in your opinions.
>
> Possibly. It's certainly an interesting area when one considers 'multiple
> selection' and then a mapping of databases and database subsets to other
> non-scalar variables. For example, what is the non-scalar type that is
> represented by say an OLAP reporting tool or maybe an email application?
How
> much independence is there between the graphical display of information
and
> non-scalar types. My suspision is that jumping straight from relations to
> graphical displays is too much of a leap, and what is required is
possiably a
> number of intermidiate non-scalar types that allow for data display
> specificaton and more limited but more intuative data manipulation
options.

Isn't the real issue here a question of how to convert end-user gestures into appropriate dbms queries? Aren't the types used in an OLAP tool or email application necessarily external types from the perspective of the DBMS? Granted, I would like to see systems built with sufficient integration that the internal/external distinction becomes almost moot. Programming environments sophisticated enough to work directly with the higher-level types of the dbms would be great, and I agree that operations to cast relations into arrays would require an explicit order specification. Received on Sun Feb 16 2003 - 19:34:11 CET

Original text of this message