Re: Functions and Relations

From: NENASHI, Tegiri <tnmail42_at_gmail.com>
Date: Tue, 21 Nov 2006 15:32:56 +0100 (CET)
Message-ID: <Xns9882613C04C67asdgba_at_194.177.96.26>


Sampo Syreeni <decoy_at_iki.fi> wrote in news:Pine.SOL.4.62.0611211353350.4310_at_kruuna.helsinki.fi:

> On 2006-11-21, NENASHI, Tegiri wrote:
>

>> There are relation compositions that are not associative:
>>
>> 3 -2 - 1: let const3 = 3; m2:x ->x-2; m1:x->x-1;
>>
>> 'const3 o m2 o m1'  is not associative.

>
> Could you elaborate?

It is my stupid error: I did not think straightly: the function composition is always associative and it is the consequence of the definition of the composition only. Please look at my response to 'vc'.

>

>>> SQL optimization. Consider
>>>
>>> select sal+100 from emp
>>>
>>> what exactly is "sal+100" and how to show it on the explain plan?
>>
>> What is the problem?

>
> I think you'll get a better example if the optimizer can somehow take
> explicit advantage of the exposed relational structure.
>
> Suppose you want an equijoin f(x)=y where x and y are in different
> relations. Most optimizers throw away any statistics on x because of
> the function application or use some approximate statistic derived
> from x. With high cardinality on both attributes, the plan tree might
> become something like hash_join(f(x), y). If f happens to be some
> function like mod(x,2) which is very far from identity and even
> injective, this can be the wrong choice. For example, if there are few
> ones and zeroes in y, it probably makes more sense to do
> index_nested_loops(f(x), y). The problem is how to give this knowledge
> to the optimizer in a regular form that it is able to utilize.
>
> If the function is treated as a relation, the machinery is already
> there in the form of table statistics. In this case they just need to
> be filled in at query compilation or dynamic sampling time.

I understand and I am agree that the optimiser can utilize the same algorithms with the function-like-relation like it utilizes with "normal" relations but I think that the loss of performance thanks to the interpretation of the partially evaluated function can be more that the gain from the optimiser statistics. But the proof will be the experiment of course ;)

One other objection is that the modern optimizer can take count of the function application to tuples. I have seen that MS SQL Server does it.  I do not know about Oracle but I can ask our DBA. Received on Tue Nov 21 2006 - 15:32:56 CET

Original text of this message