Re: Functions and Relations

From: vc <boston103_at_hotmail.com>
Date: 21 Nov 2006 17:43:27 -0800
Message-ID: <1164159806.949444.158300_at_f16g2000cwb.googlegroups.com>


Aloha Kakuikanu wrote:
> vc wrote:
> > In Oracle 9i:
> > The nested loops plan does 15 times more work (30617 c.g. vs. 2216
> > c.g.) than the hash join. It's hardly "logically about the same amount
> > of work".
>
> This is because hash matching is done in memory and therefore is not
> reflected in the logical IO.

In Oracle, a logical get operation may obtain a block either from the disk cache or result in a physical disk access. So, no, the "logically gotten" blocks for the purpose of a hash join should be all accounted for.

>The matching itself is logically very
> similar, or is it not?

Well, it's just a join implementation so I guess they are similar, or you mean something else ?

>
> BTW, you omited the actual execution time difference. This is becuase
> it confiurms my 10 times heuristic, isn't it?-)

In a sort of way, yes, it does if you think that a 5 times difference in the execution time is OK. Here's some real table join:

select *
from
 tu_c a, tu_p b where a.SEQ_FIN_TXN_ID = b.SEQ_FIN_TXN_ID

call count cpu elapsed disk query current

    rows
------- ------ -------- ---------- ---------- ---------- ----------


Parse        1      0.01       0.00          0          0          0
       0
Execute      1      0.00       0.00          0          0          0
       0
Fetch   334959    189.76     277.10     283438     190958          0
 5024367
------- ------ -------- ---------- ---------- ---------- ----------

total 334961 189.77 277.11 283438 190958 0  5024367

Rows Row Source Operation

-------  ---------------------------------------------------
5024367 HASH JOIN
5024367 TABLE ACCESS FULL TU_c
4628606 TABLE ACCESS FULL TU_p

select /*+ use_nl(a,b) */ *
from
 tu_c a, tu_p b where a.SEQ_FIN_TXN_ID = b.SEQ_FIN_TXN_ID

call count cpu elapsed disk query current

    rows
------- ------ -------- ---------- ---------- ---------- ----------


Parse        1      0.00       0.00          0          0          0
       0
Execute      1      0.00       0.00          0          0          0
       0
Fetch   334959    241.60    1481.30     212586   15798901          0
 5024367
------- ------ -------- ---------- ---------- ---------- ----------

total 334961 241.60 1481.31 212586 15798901 0  5024367

Rows Row Source Operation

-------  ---------------------------------------------------
5024367 NESTED LOOPS
5024367 TABLE ACCESS FULL TU_c
5024367 TABLE ACCESS BY INDEX ROWID TU_p 5024367 INDEX UNIQUE SCAN XPK_p (object id 60817)

The NL elapsed time is 1481 sec vs. the SJ time of 277 sec.

>
> > That answers "why you think it'll be more performant" but does not
> > answer "How do you suggest to materialize 'f(x)' as a relation ?". So
> > how do you suggest to do the trick ?
>
> No, I objected to the method where we materialize the `y=f(x)`
> relation. I suggested a vague idea that we access this relation via
> index -- function invokation.

Can you make this idea a little bit less vague ?

>
> > > Are you aware that selectivity/cardinality estimation related to
> > > function invokations in SQL is finicky in today's RDBMS implementations?
> >
> > I am aware of the cardinality problem with 'select * from t1 where x in
> > (select x from table(function_1(a,b,c))' and its variations, but the
> > problem can be cured with the cardinality hint. It may not be
> > esthetically pleasing, granted, but what is the alternative ?
>
> Not only that. Can you specify card(A) = 10 * card(B)? How do you
> specify cardinality after some predicates are applied? How do you
> specify cardinality for a volatile (or temporaty) table that changes
> cardinality by orders of magnitude?

The last one is implemented by Oracle as "dynamic sampling". Not very well though.

>
> > One can
> > use a temp table too which perhaps essentially the same as your
> > suggestion (materializing a function as a temp relation). However,
> > there may be unexceptable overhead when using temp relations, or such
> > temp relation may be infinite and therefore impossible.
>
> The cost of such materialization should be evaluated apriory. Given
> that there is unavoidable selectivity/cardiality estimation errors, I'd
> even suggest a heuristics that explicitly prohibits such a
> materialization.
Received on Wed Nov 22 2006 - 02:43:27 CET

Original text of this message