Re: Functions and Relations

From: Sampo Syreeni <decoy_at_iki.fi>
Date: Tue, 21 Nov 2006 15:00:51 +0200
Message-ID: <Pine.SOL.4.62.0611211353350.4310_at_kruuna.helsinki.fi>


On 2006-11-21, NENASHI, Tegiri wrote:

> There are relation compositions that are not associative:
>
> 3 -2 - 1: let const3 = 3; m2:x ->x-2; m1:x->x-1;
>
> 'const3 o m2 o m1' is not associative.

Could you elaborate? The only way I get (m1 o (m2 o const3))(x) to not equal ((m1 o m2) o const3)(x) is if I fail to include the intermediate results in the domains and ranges of the respective function. But then, if you don't include them, the sets on top of which the three functions are defined are such that the composition operator is not defined in the first place. The representation in terms of relations and joins is a bit laxer in that it automagically restricts the functions into subsets where composition *is* defined, but other than that, it seems to work the same.

>> SQL optimization. Consider
>>
>> select sal+100 from emp
>>
>> what exactly is "sal+100" and how to show it on the explain plan?
>
> What is the problem?

I think you'll get a better example if the optimizer can somehow take explicit advantage of the exposed relational structure.

Suppose you want an equijoin f(x)=y where x and y are in different relations. Most optimizers throw away any statistics on x because of the function application or use some approximate statistic derived from x. With high cardinality on both attributes, the plan tree might become something like hash_join(f(x), y). If f happens to be some function like mod(x,2) which is very far from identity and even injective, this can be the wrong choice. For example, if there are few ones and zeroes in y, it probably makes more sense to do index_nested_loops(f(x), y). The problem is how to give this knowledge to the optimizer in a regular form that it is able to utilize.

If the function is treated as a relation, the machinery is already there in the form of table statistics. In this case they just need to be filled in at query compilation or dynamic sampling time.

-- 
Sampo Syreeni, aka decoy - mailto:decoy_at_iki.fi, tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
Received on Tue Nov 21 2006 - 14:00:51 CET

Original text of this message