Optimizer wonk sought

From: Mary Elizabeth McNeely <mary_mcneely_at_yahoo.com>
Date: Wed, 23 Apr 2008 22:36:55 -0700 (PDT)
Message-ID: <551517.94126.qm@web54101.mail.re2.yahoo.com>


Can some brave optimizer wonk interested in a challenge please help me see what I'm missing here?

Terms of reading further: Only read further with the kindest of intent. No laughing at the blonde DBA. No taunting the blonde DBA about all the silly things she tried to do to fix this.

This is a long drawn out explanation; excuse typos in my psuedocode if there are any (due to the nature of my situation, I have to stay with psuedocode):

I have a query that currently works like this, which calls a view, and the view has a stored function populating one of the columns returned:

select some_tables.some_columns,
          a_view_with_stored_function.column_with_function_result,
          a_view_with_stored_function.other_columns
from  some_tables,
        a_view_with_stored_function
where some_criteria_about_the_view_are_met and
          some_joins_on_the_tables_are_true and 
          a_view_with_stored_function.function_result > sysdate;

The stored function is executing many more times (144,000) than number of rows that would be returned by the view alone (1266), and also more than the number of rows returned by the view when joined with the rest of the query (1245) (for brevity, I'll call that "too darned often"). What I want it to do instead is execute the stored function only against the candidate rows that are returned from the already-completed view-table join. (I have done my homework and am certain this is the more efficient way to do it.)

My problem is, I can't convince Oracle to follow the execution path I want.

Here's what I've tried:

TRY 2 - take the stored function out of the view and put it directly into the main query

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

select some_tables.some_columns,
          stored_function.function_result(columns_returned_from_the_view) result_column_alias,
          a_view_with_stored_function.other_columns
from  some_tables,
        a_view_without_stored_function
where some_criteria_about_the_view_are_met and
          some_joins_on_the_tables_are_true and 
          result_column_alias > sysdate;

Result: Won't parse - Oracle's mad that I have tried to use the result_column_alias in the where statement.

TRY 3 - take the stored function out of the view and put it directly into the main query and where clause of main query

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

select some_tables.some_columns,
          stored_function.function_result(columns_returned_from_the_view) result_column_alias,
          a_view_with_stored_function.other_columns
from  some_tables,
        a_view_without_stored_function
where some_criteria_about_the_view_are_met and
          some_joins_on_the_tables_are_true and 
          stored_function.function_result(columns_returned_from_the_view) > sysdate;

Result: Function still executes "too darned often".

TRY 4 - take the stored function out of the view and put it directly into the callling query and where clause of calling query, but in a subquery

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

select query_alias.result_column_alias, query_alias.other_columns from
(

select some_tables.some_columns,
          a_view_with_stored_function.column_with_function_result result_column_alias,
          a_view_with_stored_function.other_columns
from  some_tables,
        a_view_without_stored_function
where some_criteria_about_the_view_are_met and
          some_joins_on_the_tables_are_true
) query_alias
where

          query_alias.result_column_alias> sysdate;

Result: ERROR at line 67:
ORA-01843: not a valid month

I'm feeding it the same columns it was getting from the view, not sure what's up with that, but abandoning this case for now, especially based on results I saw in later cases ... this approach probably wouldn't have worked anyway.

TRY 5 - take the stored function out of the view and also (I had planned/hoped) externalize it from the table-view join activity +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

select query_alias.result_column_alias, query_alias.other_columns from
(

select some_tables.some_columns,

          stored_function.function_result(columns_returned_from_query_alias1) result_column_alias,
          query_alias2.other_columns

from

    (
   select some_tables.some_columns
   from some_tables,

           a_view_without_stored_function    where some_criteria_about_the_view_are_met and

          some_joins_on_the_tables_are_true     ) query_alias1
) query_alias2
where

          query_alias2.result_column_alias > sysdate;

Result: Function still executes "too darned often".

TRY 6 - bang head against the desk, then decompose the TRY 5 query to learn more about it, tracing during each step

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
TRY 6A:    select some_tables.some_columns
   from some_tables,

           a_view_without_stored_function    where some_criteria_about_the_view_are_met and

          some_joins_on_the_tables_are_true -- innermost query alone returns 1245 rows - the function isn't called in this query - just trying to prove the function doesn't execute during a_view_without_stored_function without my realizing it, causing part of the problem - trace files prove the function isn't executing here

TRY 6B:
select some_tables.some_columns,

          stored_function.function_result(columns_returned_from_query_alias1) result_column_alias,
          query_alias2.other_columns

from

    (
   select some_tables.some_columns
   from some_tables,

           a_view_without_stored_function    where some_criteria_about_the_view_are_met and

          some_joins_on_the_tables_are_true     ) query_alias1
)
-- add the stored function in: innermost query and next outer query alone returns 1245 rows, and executes the stored function about 1266 times - this is about the number of times I want the function to execute, so far, so good

TRY 6C:
Add the outermost query back in, but not the outermost where clause from TRY 5:
(

select some_tables.some_columns,

          stored_function.function_result(columns_returned_from_query_alias1) result_column_alias,
          query_alias2.other_columns

from

    (
   select some_tables.some_columns
   from some_tables,

           a_view_without_stored_function    where some_criteria_about_the_view_are_met and

          some_joins_on_the_tables_are_true     ) query_alias1
) query_alias2
-- still returns 1245 rows, and executes the stored function about 1266 times - so far, still so good, and I'm getting the CPU, I/O count, and run-time gains I'd hoped for. The only thing left to do is add the where clause back in.

  • put the outermost where clause back select query_alias2.result_column_alias, query_alias2.other_columns from
    (
    select some_tables.some_columns, stored_function.function_result(columns_returned_from_query_alias1) result_column_alias, query_alias2.other_columns from ( select some_tables.some_columns from some_tables, a_view_without_stored_function where some_criteria_about_the_view_are_met and some_joins_on_the_tables_are_true ) query_alias1 ) query_alias2 where query_alias2.result_column_alias > sysdate;
  • still returns 1245 rows, but the wheels fall off at this point - we're back to function still executing "too darned often" (144000 times)

SO, THE OPTIMIZER IS BEING "HELPFUL", TRYING TO APPLY THE OUTERMOST WHERE PREDICATE DURING THE EXECUTION OF THE INNER QUERIES, SOMEHOW HAVING REWRITTEN THEM IN A "SMARTER" FASHION. Not really helpful - uses twice the I/O, twice the CPU, and twice the run time of the query without the outermost where clause.

Other things I tried:

TRY 7 - externalizing the query into another view (so now there are two layers of views)

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

create view whywhywhy_dammit as
select query_alias2.result_column_alias, query_alias2.other_columns from
(

select some_tables.some_columns,

          stored_function.function_result(columns_returned_from_query_alias1) result_column_alias,
          query_alias1.other_columns

from

    (
   select some_tables.some_columns,

             query_alias2.result result_column_alias,
            query_alias2.other_columns
   from  some_tables,
           a_view_without_stored_function
   where some_criteria_about_the_view_are_met and
          some_joins_on_the_tables_are_true
    ) query_alias1
) query_alias2;

select some_columns, result_column_alias from whywhywhy_dammit
where result_column_alias > sysdate;

Result: Function still executes "too darned often" - the optimizer continues to be HELPFUL

TRY 8 - externalizing the two "internal" queries from TRY 5 into a WITH clause

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Result: Function still executes "too darned often" - the optimizer continues to be HELPFUL

TRY 9 - getting desperate, add NO_MERGE hint to TRY 5

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Result: Function executes 2525 times - not the 1266 I wanted, but much better, and I/O is lower, but bizarre execution plan takes a long time to execute and lots of CPU time, both worse than the original query

TRY 9 - getting more desperate, add PUSH_SUBQ hint to TRY 5, even though I'm not sure what it does, but it has the word subquery in its help text, so it must be all good, right?????

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Result: Function executes 2525 times - not the 1266 I wanted, but much better, and I/O is lower, but same bizarre execution plan as TRY 9 takes a long time to execute and lots of CPU time, more than the original query

TRIES 10 and 11 - now completely thrashing and having nothing to lose (pride hours ago having evaporated), tried adding UNNEST and then NO_UNNEST hints added to TRY 5

+++++++++++++++++++++++++++++++++++++++++++++++++
Result: back to function still executing "too darned often", in both cases

TYPING IN ALL CAPS BECAUSE I'M NOW YELLING: ALL I WANT IS THE OPTIMIZER TO EXECUTE THIS USING THE PLAN IT USED IN TRY 6C: select query_alias.result_column_alias, query_alias.other_columns from
(

select some_tables.some_columns,

          stored_function.function_result(columns_returned_from_query_alias1) result_column_alias,
          query_alias2.other_columns

from

    (
   select some_tables.some_columns
   from some_tables,

           a_view_without_stored_function    where some_criteria_about_the_view_are_met and

          some_joins_on_the_tables_are_true     ) query_alias1
) query_alias2

THEN DO THIS STEP, only seeing result_column_alias as an atomic value being returned from its child query, not an opportunity to perform slick optimizer techniques: where

          query_alias2.result_column_alias > sysdate;

HOW CAN I GET THERE FROM HERE? What am I missing? alter system set smart_alec_optimizer=off; ?

As always, thanks for whatever hints you can offer. Let me know if you need more information - hope this is enough for you to get the concept even with just pseudocode.

Mary Elizabeth
(not feeling very smart, with head sore from being banged on desk, client team members think she's crazy because she's taken to walking around the office muttering this afternoon)

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Apr 24 2008 - 00:36:55 CDT

Original text of this message