Re: A new proof of the superiority of set oriented approaches: numerical/time serie linear interpolation

From: Brian Selzer <brian_at_selzer-software.com>
Date: Mon, 30 Apr 2007 01:52:40 -0400
Message-ID: <JufZh.2145$tp5.1552_at_newssvr23.news.prodigy.net>


"David Cressey" <cressey73_at_verizon.net> wrote in message news:vs6Zh.425$YW4.10_at_trndny06...
>

[snip]
>
> All of these comments should lead to one conclusion: one engages in this
> sort of analysis when one is intent on writing a better optimizer, or
> when
> coming up with a workaround in the case of a deficient optimizer. Not in
> the case of using a good optimizer to good effect.
>

In my opinion, the OP used a poor example to illustrate his point: proof of the superiority of set oriented approaches. Cimode chose as part of his solution the one instance where it can be beneficial to use an iterative approach. There is an unfortunate tendency exhibited by many of the contributors on this group to dismiss out of hand anything that even hints at iteration--even quantification, despite the fact that it is necessary to compare two sets:

X = Y iff forall x in X forall y in Y x in Y and y in X X in Y iff forall x in X x in Y

For example, converting an update into an assignment causes the inherent existential quantification to be lost in translation:

exists t in Q not t or T(t)

is translated into

not Q or Q'

But iteration must be an underlying component in any implementation of project, join, restrict, computing aggregates, enforcing referential integrity, etc., because their definitions are quantified or employ set comparison, which as shown above involves quantification. Since a join is defined as a projection of a subset of a cross-product of two relations, it is by far the most expensive with respect to the number of iterations required. It follows then that an optimal solution should have a minimum number of joins. So if it is possible to introduce a relation valued function F(R) that eliminates the need for self-joins of R, then instead of a best case |R| * 2 iterations in the case of a single self-join, or |R| * 3 iterations when there are two self-joins, only |R| iterations would be required. I don't think it is possible or at least practical for an optimizer to generate F on the fly, given the myriad of possible intrarelational dependencies; therefore, the definition of F must be supplied. As a consequence, some mechanism for iterating over the contents of a relation must be available in order to define F.

I disagree, therefore: analysis of this sort does indeed fall into the case of using a good optimizer to good effect.

>

[snip] Received on Mon Apr 30 2007 - 07:52:40 CEST

Original text of this message