Re: Extending my question. Was: The relational model and relationalalgebra - why did SQL become the industry standard?

From: Paul Vernon <paul.vernon_at_ukk.ibmm.comm>
Date: Fri, 14 Feb 2003 13:57:38 -0000
Message-ID: <b2isne$16ui$1_at_sp15at20.hursley.ibm.com>


"Bob Badour" <bbadour_at_golden.net> wrote in message news:l7W2a.1404$7C3.168426612_at_mantis.golden.net...
> Paul,
>
> I can certainly understand your criticism of ORDER BY if it were applied to
> SELECT expressions nested within other operations such as INSERT INTO,
> CREATE VIEW etc. -- if those are even allowed by the standard or by any
> product.
>
> I am not sure I understand your criticism of ORDER BY for presentation. In
> this situation, doesn't ORDER BY apply to the operation that physically
> encodes the result for transmission outside the DBMS? It seems to me the
> operation in question is quite explicit even if the syntax requires no
> keyword for it.

Maybe it is a matter of taste, but I find it highly non-explict that

  select distinct SAL from emp

returns a Relation, but

  select distinct SAL from emp order by SAL

returns something else.

> Even if one were to type-cast the result to an implicitly ordered non-scalar
> type such as an array, one would still require a final operation that
> physically encodes the array for transmission outside the DBMS.
>
> To put it precisely, in both cases the DBMS starts with an internal physical
> encoding of an abstract logical value and converts it to an external
> physical encoding of the same value.
>
> Why is it necessary or desirable to specify two operations instead of one?

Necessary for good language design and conceptual clarity I would say. Implicit casting is not a very good idea at the best of times and certanlly not for something as big as a change from one non-scalar type to another.

> IE. one operation that changes the type using an explicit order followed by
> a second operation that physically encodes the ordered type VS. a single
> operation that physically encodes the result in a specified order

    CAST_AS_ARRAY(select SUM(SAL), DEPT from emp)     Y_ORDERED_BY SAL would be logicaly more correct that the SQL syntax, but finding a nice syntax (which that above certainly is not) is another matter.

> Referring to the concepts in TTM, it seems like a point of design with
> interesting design tradeoffs. When designing type generators for relations
> and arrays, what possible representations will the generated types have?
> What are the pro's and con's of choosing one set of possible representations
> over another?

Poss reps are really only for scalar types. Although according to Celko there are many ways to represent a tree, however I can't see it being applicable to relations.

> It sounds like you may have already put more thought into these design
> tradeoffs than I have, and I am interested in your opinions.

Possibly. It's certainly an interesting area when one considers 'multiple selection' and then a mapping of databases and database subsets to other non-scalar variables. For example, what is the non-scalar type that is represented by say an OLAP reporting tool or maybe an email application? How much independence is there between the graphical display of information and non-scalar types. My suspision is that jumping straight from relations to graphical displays is too much of a leap, and what is required is possiably a number of intermidiate non-scalar types that allow for data display specificaton and more limited but more intuative data manipulation options.

Regards
Paul Vernon
Business Intelligence, IBM Global Services Received on Fri Feb 14 2003 - 14:57:38 CET

Original text of this message