Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

From: Bob Badour <bbadour_at_golden.net>
Date: Thu, 13 Mar 2003 14:24:36 -0500
Message-ID: <Dn5ca.61$5Q2.8409221_at_mantis.golden.net>


"Lauri Pietarinen" <lauri.pietarinen_at_atbusiness.com> wrote in message news:3E709784.4030003_at_atbusiness.com...
>
>
> Lauri Pietarinen wrote:
>
> >jan.hidders_at_REMOVE.THIS.ua.ac.be (Jan Hidders) wrote in message
> >news:<3e6ef1b4.0_at_news.ruca.ua.ac.be>...
> >
> >
> >>Bob Badour wrote:
> >>
> >>
> >>>Jan is just being contrary to suit his emotional needs.
> >>>
> >>>
> >>Absolutely. I get highly upset if I see all the sloppy reasoning,
> >>sweeping generalizations and unwarranted assumptions in an area that
> >>I happen to love and know a thing or two about. I don't blame Lauri
> >>for making highly debatable claims such as that optimizing GOTO code
> >>is harder and compilers for GOTO languages are necessarily bigger
> >>and buggier, because he does not claim to be an expert.
> >>
> >>
> >Also of interest
> >is the Turing Award Lecture by Tony Hoare, which shows how
> >"featurism" killed Algol.
> >
> >See
> >http://www.braithwaite-lee.com/opinions/p75-hoare.pdf
> >
>
> Yes, and now I remember where I got the idea of GOTO's being hard to
> implement and optimise. It was in just that lecture by Tony Hoare.
> Here is the relevant part of the lecture. This was, of course, back in
> the early 60's so perhaps compiling and optimising techniques have
> developed since then, but I have the suspicion that what Hoare writes
> here still holds true.
>
> [...]
> As a result of this work on ALGOL, in August 1962, I was invited to
> serve on the new Working Group 2.1 of IFIP, charged with
> responsibility for maintenance and development of ALGOL. [...]
> Among the other proposals for the development of a new ALGOL was that
> the switch declaration of ALGOL 60 should be replaced by a more
> general feature, namely an array of label-valued variables and that a
> program should be able to change the values of these variables by
> assignment. I was very much opposed to this idea, similar to the
> assigned GO TO of FORTRAN, because I had found a surprising number of
> tricky problems in the implementation of even the simple labels and
> switches of ALGOL 60. I could see even more problems in the new
> feature including that of jumping back into a block after it had been
> exited. I was also beginning to suspect that programs that used a lot
> of labels were more difficult to understand and get correct and that
> programs that assigned new values to label variables would be even
> more difficult still.
> [...]

Lauri,

The thing that is difficult to optimize is a branch to a variable location. Because the branch destination is unknown at compile time, there is no way to determine which blocks of code are reachable from the block ending in the "goto variable location" statement, and there is no way of knowing whether a labelled block of code is reachable.

This has important consequences for induction, code motion and dead code elimination.
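To make that concrete, here is a small sketch in C. The first function branches to a fixed label; the second branches to a computed address using GCC's non-standard "labels as values" extension (the function names are just illustrative). With the fixed goto the compiler knows exactly which block is reachable from where; with the computed goto it has to assume that any label whose address was taken might be the target, which gets in the way of dead code elimination and code motion around the branch.

#include <stdio.h>

int fixed(int x)
{
    if (x > 0)
        goto positive;   /* destination known at compile time */
    return 0;            /* compiler can prove where `positive:` is reached from */
positive:
    return x * 2;
}

int variable(int x)
{
    /* GNU C "labels as values": the destination is chosen at run time,
       so the optimizer cannot prove either labelled block dead. */
    void *dest = (x > 0) ? &&a : &&b;
    goto *dest;
a:
    return x * 2;
b:
    return 0;
}

int main(void)
{
    printf("%d %d\n", fixed(3), variable(-1));
    return 0;
}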

The goto equivalent of a while loop does not use variable destinations. A potential replacement for a switch or case statement, however, would use variable destinations.

A switch or case statement compares a (variable) expression with a number of constants to decide which block of code to execute. This is much more optimizable than a variable destination. For instance, if the compiler can determine that the expression will always evaluate to the same value, it may eliminate all the other blocks as dead code. Or it may devise a simple arithmetic transformation that allows a simple array lookup for the destination address. And it limits the scope of variability: The optimizer does not need to consider the switch or case statement when optimizing external blocks of code whereas a variable goto destination might land anywhere.
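For illustration, here is a rough C sketch of both possibilities. A real compiler emits a jump table of code addresses rather than a table of values, and the function names here are made up, but the idea is the same: a dense set of case constants becomes a bounds check plus an array lookup, and a constant tested expression lets the whole statement fold down to a single arm.

#include <stdio.h>

/* Written as a switch: the tested expression is compared against a
   small set of compile-time constants. */
int price_switch(int category)
{
    switch (category) {
    case 0:  return 10;
    case 1:  return 20;
    case 2:  return 50;
    default: return 0;
    }
}

/* One lowering a compiler might choose when the case constants are
   dense: a bounds check plus a simple table lookup. */
int price_table(int category)
{
    static const int table[] = { 10, 20, 50 };
    if (category >= 0 && category < 3)
        return table[category];
    return 0;
}

int main(void)
{
    /* With a known constant argument the compiler can typically fold
       the whole switch to a single value and drop the other arms as
       dead code; price_switch(1) usually compiles down to 20. */
    printf("%d %d\n", price_switch(1), price_table(1));
    return 0;
}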

Most compilers will translate the while loop into something closely approximating the goto solution prior to optimization, so the while loop introduces neither performance benefits nor costs in the generated code.
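Here is a small C sketch of that equivalence. Both versions use only fixed, compile-time-known branch destinations, so the optimizer treats them essentially the same way (again, the names are just for illustration).

#include <stdio.h>

int sum_while(int n)
{
    int total = 0;
    int i = 0;
    while (i < n) {
        total += i;
        i++;
    }
    return total;
}

int sum_goto(int n)
{
    int total = 0;
    int i = 0;
top:
    if (!(i < n))
        goto done;   /* fixed destination */
    total += i;
    i++;
    goto top;        /* fixed destination */
done:
    return total;
}

int main(void)
{
    printf("%d %d\n", sum_while(5), sum_goto(5)); /* both print 10 */
    return 0;
}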

Dijkstra's condemnation of goto was far more sweeping and included branches to fixed locations. Dijkstra's condemnation had more to do with human comprehension than with performance optimization. I suspect Dijkstra would condemn goto even if it meant performance degradation, because it is easier to foresee improved automated optimization techniques by expanding human specialist knowledge than it is to foresee a fundamental change in human perception.

That last clause above describes a key design principle for all kinds of subject areas including programming and data management.

I have been thinking about Jan's recent criticisms of Date's articles. I don't think Jan is taking the target audience into account. Date addresses his articles to a wider audience than an elite clique of researchers.

I don't think it matters to a general audience whether an elite clique of researchers all understand what each other mean when they reuse a term in a non-literal manner. A small group of people all understand the same jargon... Big deal. I don't think it is hubris to clarify to a wider audience what that jargon really means.

With respect to your analogy that Chris quoted, I don't think it matters if an analogy is absolutely correct in every minute detail as long as the target audience understands the analogy. And, personally, I think a Dijkstra analogy is much more effective than a Hoare analogy when considering whether to drop logical identity as a requirement.

After all, the underlying principle driving Dijkstra's condemnation of goto resonates with one of the most basic principles of data management and of the relational model. We can easily improve or change the physical properties of a database or even of a dbms. For now, we are stuck with the fact that humans are humans. Genetic engineering is insufficiently developed to change the basic properties of human physiology and cognition.

However, the Hoare speech has a lot to say about what the hell went wrong with SQL. Consider the following quotes from the speech:

"When any new language design project is nearing completion, there is always a mad rush to get new features added before standardization. The rush is mad indeed, because it leads into a trap from which there is no escape. A feature which is omitted can always be added later, when its design and its implications are well understood. A feature which is included before it is fully understood can never be removed later." (Duplicate rows, NULL and references for instance.)

"At first I hoped that such a technically unsound project would collapse but I soon realized it was doomed to success. Almost anything in software can be implemented, sold, and even used given enough determination."

"But there is one quality that cannot be purchased in this way--and that is reliability. The price of reliability is the pursuit of the utmost simplicity. It is a price which the very rich find most hard to pay."

"All this happened a long time ago. Can it be regarded as relevant in a conference dedicated to a preview of the Computer Age that lies ahead? It is my gravest fear that it can. The mistakes which have made in the last twenty years are being repeated today on an even grander scale."

(Indeed, some of the events described in the speech were contemporaneous with my conception and birth. The speech was given before my involvement with computers began, and the events it describes completely predate SQL. Much I have since observed in my career justifies Hoare's fear. Another 23 years have passed since he gave the acceptance speech, and I don't see things changing any time soon.)

And finally: "The tailor is canonized as the patron saint of all consultants, because in spite of the enormous fees that he extracted, he was never able to convince his clients of his dawning realization that their clothes have no Emperor."
