Re: Databases as objects

From: Sampo Syreeni <decoy_at_iki.fi>
Date: Fri, 22 Dec 2006 16:44:07 +0200
Message-ID: <Pine.SOL.4.62.0612221536120.10891_at_kruuna.helsinki.fi>


On 2006-12-22, Thomas Gagne wrote:

> I could write an entire program with inlined C code and not use macros
> or functions. But functions improve modularity, readability,
> reliability, and reusability. Why wouldn't SQL benefit from the same
> organization?

Because relational algebra is rather different from your typical procedural host language: it is a high-level, declarative language, it deals with entire sets of things at a time, and it is closed under its own operations (whatever you get back from a query can be used freely in formulating new queries). When you wrap it up in a number of procedures -- in most cases a much lower-level abstraction -- you usually end up losing most of the expressivity and compositionality that made the relational model attractive for data management in the first place.
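To make the composition point concrete, here is a minimal sketch using Python's standard sqlite3 module. The customer and orders tables, their columns and their contents are all made up for illustration; the only thing that matters is that the result of one query is used as an operand of the next.

```python
import sqlite3

# Hypothetical schema and data, invented purely for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Aino', 'FI'), (2, 'Bob', 'US'), (3, 'Cia', 'FI');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 40.0), (13, 3, 10.0);
""")

# A query whose result is itself a relation...
finnish = "SELECT id, name FROM customer WHERE country = 'FI'"

# ...so it can be dropped in wherever a table is expected in another query.
totals = f"""
    SELECT f.name, SUM(o.total) AS spent
    FROM ({finnish}) AS f
    JOIN orders AS o ON o.customer_id = f.id
    GROUP BY f.name
    ORDER BY f.name
"""
rows = conn.execute(totals).fetchall()
print(rows)
```

A procedure that returns one customer's details gives you nothing comparable to join against; the inner query here stays a first-class operand.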

For example, you might package the lookup of a single customer's details as a procedure, but if you then wanted to do something to all customers, the best that a procedural host language is usually able to offer you is a cursor loop. Using something like that is obviously a bad idea, because this sort of thing is much more naturally and efficiently implemented as a set update, optimized by the DBMS. The problem is that after you've expressed that set update as a low-level construct like an explicit loop, the compiler cannot be expected to be intelligent enough to figure out that you actually wanted to quantify over a set; going from high-level abstractions to low-level details is much simpler than the converse. Of course you could then tell the compiler what it doesn't understand: just implement a new method which encapsulates the set update. But how is it cleaner or more productive to have umpteen methods to do various specialized things to your database than to have the small set of closed, general, high-level, declarative primitives that the relational algebra represents?
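The contrast between the cursor loop and the set update can be sketched as follows, again with Python's sqlite3 module. The customer table, the credit column and the function names (raise_credit, raise_all_by_loop, raise_all_by_set) are hypothetical, chosen only to illustrate the argument.

```python
import sqlite3

def make_db():
    # Hypothetical table and rows, invented for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, credit REAL);
        INSERT INTO customer VALUES (1, 'Aino', 100.0), (2, 'Bob', 200.0), (3, 'Cia', 300.0);
    """)
    return conn

def raise_credit(conn, customer_id, factor):
    # The per-customer "method": one statement per call, one customer at a time.
    conn.execute("UPDATE customer SET credit = credit * ? WHERE id = ?",
                 (factor, customer_id))

def raise_all_by_loop(conn, factor):
    # Cursor-loop style: the host language iterates, and the DBMS only ever
    # sees N unrelated single-row updates.
    ids = [row[0] for row in conn.execute("SELECT id FROM customer")]
    for customer_id in ids:
        raise_credit(conn, customer_id, factor)

def raise_all_by_set(conn, factor):
    # Set style: a single declarative statement that the DBMS can plan as a whole.
    conn.execute("UPDATE customer SET credit = credit * ?", (factor,))

def credits(conn):
    return conn.execute("SELECT id, credit FROM customer ORDER BY id").fetchall()

loop_db, set_db = make_db(), make_db()
raise_all_by_loop(loop_db, 2.0)
raise_all_by_set(set_db, 2.0)
print(credits(loop_db) == credits(set_db))
```

Both versions leave the table in the same state, but only the second tells the DBMS that the intent was to quantify over the whole set. Note also that raise_all_by_set is exactly the kind of extra specialized method the paragraph above complains about: it had to be written by hand because nothing could derive it from raise_credit.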

So, I'd argue that you have it backwards. You don't want to wrap higher-level abstractions in lower-level ones, even if it seemingly makes things more uniform, because the uniformity comes at the price of reducing the expressivity of the higher-level language to that of the lower-level one. In reality you'll want the highest-level abstraction to guide the design of the rest of your language, because powerful abstractions are where productivity, possibilities for automated program transformation and comprehensible semantics come from. In the above example, you'd ideally want to declare some operation by quantifying another one over the set of all customers, and have the compiler and the DBMS work out the precise low-level implementation of your intended semantics automagically. You can't do that sort of thing in a procedural language, while in relational algebra and, say, most functional languages, you can. What this tells us is that the procedural host language is not as powerful as the alternatives. It should be amended or traded in for something better; we *definitely* shouldn't try to force the higher-level primitives into the lower-level mold.

-- 
Sampo Syreeni, aka decoy - mailto:decoy_at_iki.fi, tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
Received on Fri Dec 22 2006 - 15:44:07 CET
