Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

From: Bob Badour <bbadour_at_golden.net>
Date: Thu, 13 Feb 2003 18:23:14 -0500
Message-ID: <l7W2a.1404$7C3.168426612_at_mantis.golden.net>


"Paul Vernon" <paul.vernon_at_ukk.ibmm.comm> wrote in message news:b2g807$su6$1_at_sp15at20.hursley.ibm.com...
> "Anton Versteeg" <anton_versteeg_at_nnll.iibbmm.com> wrote in message
> news:3E4B9144.8CBC4296_at_nnll.iibbmm.com...
> >
> >
> > Lauri Pietarinen wrote:
> >
> > > I don't think that anybody is suggesting that intermediate results need
> > > to remove duplicates. It's the end result that counts. E.g. in the
> > > following code fragment
> > >
> > >
> >
> > Well, I think that even for end results duplicates can be useful.
>
> If you define 'end results' as not being the result of a relational query,
> but the result of some extra non-relational transformation of a relation
> then OK, but please don't try to argue that a database is best supported by
> a 'bag' algebra (or an array algebra, or a network algebra,...) rather than
> a relational algebra.
>
> > It is the difference between the set theory and query results in
> > practice.
>
> No. It is not a case of theory not matching practice (which, if true, tells
> us that a theory is broken), rather it is the lack of a clear understanding
> of where relational algebra ends and something else (like array variables
> and values) picks up.
>
> Of course theorists are sometimes guilty of overstating the scope of a
> theory, but much more usually it is the practitioners that try to apply
> a given theory outside of its bounds.
>
>
> > For instance: a set doesn't have an order but it would be impossible to
> > present results to a user of our database if we cannot order the end
> > result.
>
> We live in a four dimensional world, and there is order everywhere:
> up/down, left/right, before/after. Because of this we indeed cannot ever
> see or hear or feel or somehow sense a Relation per se. The best we can do
> is to logically transform a relation into say an array, then present such
> an array using some visual display unit, like a computer screen. Things
> like arrays, trees, lists etc are all slightly closer to being able to be
> sensed than relations (although I'm not sure one can really see an array
> either, all we see is light...)
>
> > To give an example of the use of duplicates:
> > Suppose we have a table that holds text (letters for instance).
> > We would probably have a line number field and a text field.
> > To improve readability we will have several occurrences of blank lines.
> > If we then select the text column ordered by the line number, we will
> > have (meaningful) duplicates in the end result.
>
> OK, but just be clear such an 'end result' is not a relation and so any
> syntax to help produce such 'end results' is not part of a relational
> algebra. A logically clean syntax would be to 'cast' a relational query
> result to say an array, then further 'select' certain array elements for
> display in some specified (or unspecified) order.
>
> Unfortunately SQL does not make such a clean separation. Regardless of it
> being a bag algebra (of sorts), it lets ORDER BY operate on its bags
> without any explicit casting to an orderable non-scalar type such as an
> array.
>
> Regards
> Paul Vernon
> Business Intelligence, IBM Global Services

Paul,

I can certainly understand your criticism of ORDER BY if it were applied to SELECT expressions nested within other operations such as INSERT INTO, CREATE VIEW etc. -- if those are even allowed by the standard or by any product.

I am not sure I understand your criticism of ORDER BY for presentation. In this situation, doesn't ORDER BY apply to the operation that physically encodes the result for transmission outside the DBMS? It seems to me the operation in question is quite explicit even if the syntax requires no keyword for it.

Even if one were to type-cast the result to an implicitly ordered non-scalar type such as an array, one would still require a final operation that physically encodes the array for transmission outside the DBMS.

To put it precisely, in both cases the DBMS starts with an internal physical encoding of an abstract logical value and converts it to an external physical encoding of the same value.

Why is it necessary or desirable to specify two operations instead of one?

That is: one operation that changes the type using an explicit order, followed by a second operation that physically encodes the ordered type, versus a single operation that physically encodes the result in a specified order.
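
As a rough sketch of the two alternatives, here is how they might look outside any DBMS. Python is used only for illustration. The sample rows and the names `to_array`, `encode` and `encode_ordered` are hypothetical and come from neither SQL nor TTM, and JSON merely stands in for whatever external physical encoding a product actually uses.

```python
import json

# A relation modeled as a set of tuples: no order, no duplicate tuples.
# Hypothetical sample data: (line_no, text) rows of a stored letter.
letter = frozenset({
    (1, "Dear Paul,"),
    (2, ""),
    (3, "Thanks for the note."),
    (4, ""),
    (5, "Bob"),
})


def by_line(row):
    """Ordering key: the line number attribute."""
    return row[0]


def to_array(relation, key):
    """Two-step form, operation 1: cast the relation to an ordered array."""
    return sorted(relation, key=key)


def encode(rows):
    """Two-step form, operation 2: encode an already ordered value."""
    return json.dumps([list(r) for r in rows])


def encode_ordered(relation, key):
    """One-step form: encode the relation in a specified order."""
    return json.dumps([list(r) for r in sorted(relation, key=key)])


two_step = encode(to_array(letter, by_line))
one_step = encode_ordered(letter, by_line)

# Projecting the ordered array onto the text column is where the
# (meaningful) duplicate blank lines show up; the result is a list,
# not a relation.
text_lines = [text for _, text in to_array(letter, by_line)]
```

In this sketch both forms emit the same bytes; the only difference is whether the ordered array exists as a named intermediate value that further operations could be applied to.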

Referring to the concepts in TTM, it seems like a point of design with interesting design tradeoffs. When designing type generators for relations and arrays, what possible representations will the generated types have? What are the pros and cons of choosing one set of possible representations over another?

It sounds like you may have already put more thought into these design tradeoffs than I have, and I am interested in your opinions.

Received on Fri Feb 14 2003 - 00:23:14 CET
