Re: Why is "group by" obligatory in SQL?

From: paul c <toledobythesea_at_oohay.ac>
Date: Fri, 24 Jul 2009 15:00:24 GMT
Message-ID: <c0kam.37958$PH1.28420_at_edtnps82>


Cimode wrote:
> Snipped.

>> Who can know how the original designers arrived at their decision?

> Probably because they believed in the myth of natural programming
> languages. SQL is no exception.
>
> Dijskra clearly described the fundamental problem below...
>
> The virtue of formal texts is that their manipulations, in order to be
> legitimate, need to satisfy only a few simple rules; they are, when
> you come to think of it, an amazingly effective tool for ruling out
> all sorts of nonsense that, when we use our native tongues, are almost
> impossible to avoid.
>
> (http://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667.html)
>
>> There were certainly other languages from the early SQL era (such as
>> QUEL) that would allow a similar formulation to what you propose,
>> meaning what you intend.  (And indeed QUEL would allow other sensible
>> things with aggregates that SQL still doesn't easily allow.)
>>
>> Personally I imagine the logic of SQL aggregation got defined in haste,
>> was "good enough", and they just moved swiftly on to the next challenge.
>> A small team, internal IBM rivalries, publication demands, etc. finished
>> the job.

> When looking at the grammar of the initial BS12, it seems obvious that
> we slowly shifted from a declarative oriented language to a verbose
> driven language.
>
> In BS12 the expresssion of aggregation was by the assignment of
> variables seprating the aggregates from the operation such as
>
> T1 = SUMMARY(EMP, GROUP(DEPTNUM), EMPS=COUNT, SALSUM=SUM(SALARY))
> T2 = JOIN(T1, DEPT)
> T3 = SELECT(T2, SALSUM > BUDGET)
>
> comparing with SQL
>
> SELECT d.Deptnum, Count(*) as Emps,
> Sum(e.Salary) as Salsum, Budget
> FROM Emp as e
> JOIN Dept as d ON e.Deptnum = d.Deptnum
> GROUP BY d.Deptnum, Budget
> HAVING Sum(e.Salary) > Budget
>
> It is reasonnable to assume that the BS12 approach consisting of
> differentiating operations and presentation, while more abrupt, forced
> the programmer to understand better the underlying concepts. In other
> words, the introduction of *aesthetics* is one of the factors that
> killed SQL's chances to ever become a relational operation language.
>

Great post, Cimode, Djikstra as profound as ever, this stuff bears repeating every so often. Too bad so little remains of the history of db language development from the 1970's, I think there are still lessons to be learned from the scanty archives. The motivations of those days must be murky to anybody who didn't grow up with assembler languages, in fact I'd say they remaiin murky for most of the people who were there. Many assembler people I knew seemed to have a constant question in the back of their minds, but one they never verbalized - "what is the interface?". I think they probably never even conceived it, it was just an intermittent doubt in their subconcious mind. I don't think I heard anybody ask that question until the 1980's. The programming majority in the mainframe world then used Cobol and I'd say the language trend was mostly trying to avoid background details because those were somehow associated with the obscurity of machine language techniques (btw, the term 'exit strategy' was in fairly common use by the system programming world then, long before Pentagon historians dissected the military problems of Vietnam.) In the commercial world of that time there was a lot of emphasis on so-called 'structured programming techniques', which we know today is just a form of language mysticism, at least insofar as it disguises the real programmer's interface. Long before I even knew that SQL assumed duplicate rows, I could never get why 'project' was called 'select'.

The BS12 example emphasizes to me that the basic programming interface is the relation and it's not hard to imagine it leading to a functional language. I have only ever seen one database book that mentioned BS12 and that was a fairly brief section.

Another quote from the McJones site is from Don Chamberlin:

"I had a conversion experience, and I still remember this. Ted Codd came to visit Yorktown, I think it might have been at this symposium that Irv alluded to. He gave a seminar and a lot of us went to listen to him. This was as I say a revelation for me because Codd had a bunch of queries that were fairly complicated queries and since I’d been studying CODASYL, I could imagine how those queries would have been represented in CODASYL by programs that were five pages long that would navigate through this labyrinth of pointers and stuff. Codd would sort of write them down as one-liners. These would be queries like, “Find the employees who earn more than their managers.” [laughter] He just whacked them out and you could sort of read them, and they weren’t complicated at all, and I said, “Wow.” This was kind of a conversion experience for me, that I understood what the relational thing was about after that." Received on Fri Jul 24 2009 - 17:00:24 CEST

Original text of this message