Re: Dreaming About Redesigning SQL

From: Paul Vernon <paul.vernon_at_ukk.ibmm.comm>
Date: Tue, 21 Oct 2003 11:24:59 +0100
Message-ID: <bn31nq$aiq$1_at_gazette.almaden.ibm.com>


"Anthony W. Youngman" <thewolery_at_nospam.demon.co.uk> wrote in message news:5fFJs4BJqEl$Ewqo_at_thewolery.demon.co.uk...
> In article <bn0j82$1gnm$1_at_gazette.almaden.ibm.com>, Paul Vernon
> <paul.vernon_at_ukk.ibmm.comm> writes
> >No, I think Anthony is just saying that he doesn't "believe" in science/the
> >scientific method. Or maybe he believes that engineering is not based on
> >scientific knowledge!
>
> Actually, I *DO* believe in the Scientific Method.
>
> I just fail to see the connection between Scientific Method and
> Relational. The former is Science, the latter is Maths. Please tell me
> how I can use relational theory to predict the future. Without that,
> relational is unprovable, and hence unscientific.
>
> Note I didn't say relational is *incorrect* - the ideas of
> "mathematically correct" and "scientifically provable" are orthogonal,
> and have nothing to say about each other.

Fair enough. Maths just has to be self-consistent (given some axioms); science has to be 'useful', in that it predicts what is possible, what can happen.

You'd agree that the relational model is self-consistent - or at least you've no reason to doubt that? OK. Is it useful, or more importantly is it *more* useful than any other self-consistent model that has been proposed?

Well, our answer is that it is *more* useful because it is necessary and sufficient for logically representing data. There is no data that is not representable as sets of like tuples in a set of relations. Any other known representation, such as lists, arrays, trees, networks etc., adds no extra data representation capabilities. Moreover, any other representation adds complexity (aka stuff) but no extra power.
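
To make the representability claim concrete, here is a minimal sketch in Python (an illustration, not a proof). It re-expresses a list and a tree as sets of like tuples and rebuilds the originals from them. The data, the function names and the assumption that node names are unique are all invented for the example.

```python
# Hypothetical example data; every name here is illustrative only.

# An ordered list becomes a relation of (position, value) tuples.
shopping_list = ["milk", "eggs", "bread"]
list_rel = {(pos, item) for pos, item in enumerate(shopping_list)}

def relation_to_list(rel):
    """Rebuild the list by ordering the tuples on their position attribute."""
    return [item for _, item in sorted(rel)]

# A tree (nested dicts, unique node names assumed) becomes two relations:
# nodes {(node,)} and edges {(parent, child)}. The root is simply the node
# that never appears as a child, so no null/None marker is needed.
tree = {"root": {"a": {"a1": {}, "a2": {}}, "b": {}}}

def tree_to_relations(nested):
    nodes, edges = set(), set()

    def walk(subtree, parent):
        for node, children in subtree.items():
            nodes.add((node,))
            if parent is not None:
                edges.add((parent, node))
            walk(children, node)

    walk(nested, None)
    return nodes, edges

def relations_to_tree(nodes, edges):
    children_of = {}
    for parent, child in edges:
        children_of.setdefault(parent, []).append(child)
    has_parent = {child for _, child in edges}

    def build(node):
        return {c: build(c) for c in children_of.get(node, [])}

    return {n: build(n) for (n,) in nodes if n not in has_parent}

nodes, edges = tree_to_relations(tree)
```

Note that this tree encoding does not record sibling order; if order mattered, it would have to be carried as an explicit attribute, just as position is for the list.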

What relational theory predicts is that a system that needs data persistence will be more complex if built on a non-relational foundation than on a relational one - everything else being equal.

I'll admit to not knowing whether the above has been formally proved but, well, I'm tempted to say it's obvious ;-)

Now, I might be generous and say that the above is predicated on the assumption of data independence - the idea that the physical layout used to store the data be separated from the abstract representation that users use to view, create, query and modify that data. So if you don't believe in this principle (which I suspect you don't) then I guess we will struggle to convince you of the essentiality of the relational model.
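
As a rough illustration of data independence, the sketch below uses Python's standard sqlite3 module (chosen only because it is convenient; the table, column and index names are made up). The same logical query is run before and after a physical change, namely adding an index. The query text and its result stay the same; only the engine's access plan is free to differ.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("ann", "sales"), ("bob", "it"), ("cy", "sales")],
)

# The logical query: it names no index, file or storage layout.
QUERY = "SELECT name FROM emp WHERE dept = ? ORDER BY name"

def run(connection):
    return connection.execute(QUERY, ("sales",)).fetchall()

def plan(connection):
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    cursor = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("sales",))
    return [row[-1] for row in cursor]

rows_before, plan_before = run(conn), plan(conn)

# A purely physical change: the logical model and the query are untouched.
conn.execute("CREATE INDEX idx_emp_dept ON emp (dept)")

rows_after, plan_after = run(conn), plan(conn)

print(plan_before)
print(plan_after)
```

The exact wording of the plan text depends on the SQLite version, so it is printed here rather than relied upon.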

For myself, I do happen to think that physical data independence is not quite a perfectly achievable ideal. The ideal would be that all queries, no matter how complex, would take the same time to execute, or rather that execution time would be directly proportional to query complexity - which would be knowable at compile time. I don't know that such an ideal is possible, but I do know that we can get close, and that fallbacks such as up-front query time estimates produced by relational execution engines help bridge the gap. To give up, and expose physical matters (i.e. performance) via the logical model, is in no way warranted in the general case.

A model that exposes users to the physical might have the benefit of giving users more direct knowledge of expected performance, but the cost in lack of generality is way too high (for all but the most specialised situations, or the most messed-up marketplaces).

Regards
Paul Vernon
Business Intelligence, IBM Global Services

Received on Tue Oct 21 2003 - 12:24:59 CEST