Re: Hashing for DISTINCT or GROUP BY in SQL

From: Cimode <cimode_at_hotmail.com>
Date: Mon, 18 Oct 2010 09:31:51 -0700 (PDT)
Message-ID: <3681483e-5b4b-4e6e-a4e8-342895372677_at_x42g2000yqx.googlegroups.com>


On 18 oct, 15:59, paul c <anonym..._at_not-for-mail.invalid> wrote:
> On 18/10/2010 2:31 AM, Cimode wrote:
>
> > What is the object of optimization in a relational perspective? What
> > does it aim at, physically speaking? What are the modalities to be
> > defined to achieve such an aim? Don't you have the impression that the
> > definition of *optimization* in such a broad perspective is the
> > relational theorists' way of sweeping the dust under the rug?
>
> I just mean that the optimizations typically used are ad hoc.  In
> algebra, we could write a 'reference' constraint, a more general form of
> 'foreign key' as B{F} = B{F} <AND> A{K}.  That could be implemented
> directly.
Sure, but what tells you that such an implementation is preferable to some other? You are describing a possibility, but how do you formalize this solution as part of a model? How do you evaluate the limits of such an approach?
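
To make the algebraic form concrete (purely a toy sketch, not a claim about how any engine should implement it), here is the reference constraint in Python, with relations represented as sets of attribute/value pairs; the customer/order names are invented for illustration:

    # Toy representation: a tuple is a frozenset of (attribute, value) pairs,
    # a relation is a set of such tuples.

    def project(rel, attrs):
        """Projection R{attrs}: keep only the named attributes of each tuple."""
        return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}

    def natural_join(r, s):
        """<AND> (natural join): merge tuples that agree on common attributes."""
        out = set()
        for t1 in r:
            for t2 in s:
                merged = dict(t1)
                if all(merged.get(a, v) == v for a, v in t2):
                    merged.update(dict(t2))
                    out.add(frozenset(merged.items()))
        return out

    def reference_constraint_holds(B, F, A, K):
        """B{F} = B{F} <AND> A{K}: every F-value in B occurs among the
        K-values of A.  Assumes the foreign-key attributes of B carry the
        same names as the key attributes of A, as the algebraic form requires."""
        return project(B, F) == natural_join(project(B, F), project(A, K))

    # Invented data: customers A keyed on cust_id, orders B referencing it.
    A = {frozenset([("cust_id", 1)]), frozenset([("cust_id", 2)])}
    B = {frozenset([("order_id", 10), ("cust_id", 1)]),
         frozenset([("order_id", 11), ("cust_id", 2)])}
    print(reference_constraint_holds(B, {"cust_id"}, A, {"cust_id"}))  # True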

> It must be a very desirable optimization to minimize physical work even
> if that work is only the movement of electrons, in the face of the
> equally desirable ability to 'update', say when a relvar is assigned.
> When faced with B' := B <OR> I, where the apostrophe refers to a relvar
> B after an 'insert', it's fairly obvious that the reference constraint
> can be tested against the 'before' value and only the tuples 'I' need to
> be tested (as opposed to all the tuples in B').  This strikes me as both
> a logical and physical optimization.
Actually, I believe it is the other way around. We should not start by attempting to minimize physical I/Os but rather see how the logical I/Os can be reduced, starting from the logical relational operation at hand. The physical I/Os can be reduced *only if* the logical I/Os have first been properly quantified and optimized.
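
Using the same toy representation as above (again just a sketch), the point about B' := B <OR> I is that, once B is known to satisfy the constraint, only the inserted tuples I need to be checked, and that saving is visible at the logical level before any physical consideration:

    def insert_preserves_constraint(I, F, A, K):
        """Incremental check for B' := B <OR> I (union).
        Assuming the 'before' value of B already satisfies
        B{F} = B{F} <AND> A{K}, it is enough to test the new tuples I:
        their F-projection must already be present in A{K}.
        (The subset test works because F and K carry the same attribute names.)"""
        return project(I, F) <= project(A, K)

    # Continuing the invented example: one new order for an existing customer.
    I = {frozenset([("order_id", 12), ("cust_id", 2)])}
    print(insert_preserves_constraint(I, {"cust_id"}, A, {"cust_id"}))  # True

    # Logical work: only len(I) tuples are inspected, instead of the
    # len(B) + len(I) tuples a full re-check of B' would inspect.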

> Being somewhat inept at relational calculus I don't have a ready example
> for that.  I know that Codd expected his calculus to be more easily
> optimizable but I think that would still result in a similar situation.
Exactly, but Codd somehow suggested that the chances of optimizing physical I/O when the logical I/O has not been optimized are nil. To my understanding, he considered that logical I/O should always take precedence over physical I/O in the evaluation of a computing scheme.

> This may have been written about and I just haven't seen it, but there
> seems to be some missing implementation theory, to do with axioms, if
> you will, for analyzing such expressions and recognizing when the set
> doesn't need to be tested, ie., when only one term in an algebraic
> expression needs to be tested against a constraint.
>
> (The reference constraint becomes quite convoluted when a strict foreign
> key constraint is desired and there are accordingly more ways to write
> it, but an optimization 'theory' would need to handle that too.  A
> strict algebra won't have any keywords such as 'KEY'.  A comprehensive
> optimization theory (eg. axioms?) for algebra expressions doesn't exist
> as far as I know.  Even though Date points out a number of useful
> optimizations in his Intro book, as I recall he doesn't get down to
> brass tacks with the perhaps most elementary one, candidate keys.)
Foreign KEY is already a complex case. My point is that we lack a model for representing individual relations from a computing perspective in the first place. From such a perspective, quantifying logical I/Os for operations seems premature.
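
On the 'no KEY keyword' remark in the quoted text: in the same toy representation one can at least state key-ness as a testable condition rather than a keyword (my sketch, not anything from Date or Codd). K is a superkey of A exactly when projecting A over K collapses no tuples; a candidate key is then a minimal such K.

    def is_superkey(A, K):
        """K is a superkey of A when the projection over K has the same
        cardinality as A itself, i.e. no two tuples of A agree on K."""
        return len(project(A, K)) == len(A)

    print(is_superkey(A, {"cust_id"}))  # True for the invented customers relation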
