Re: Using bind variables for multiple values

From: Shakespeare <whatsin_at_xs4all.nl>
Date: Tue, 25 Nov 2008 10:05:22 +0100
Message-ID: <492bbfd8$0$182$e4fe514c@news.xs4all.nl>

Robert Klemme schreef:
> On 24.11.2008 17:26, Jonathan Lewis wrote:

>> "Robert Klemme" <shortcutter_at_googlemail.com> wrote in message 
>> news:6c7ef223-0934-4cf9-a44b-ad268937b86f_at_j32g2000yqn.googlegroups.com...
>>> I wondered: why is it that the join version does a full table scan but
>>> the IN ( subselect ) version does not?  I did a 10053 trace and found
>>> out that the same access strategy which leads to the INDEX UNIQUE SCAN
>>> in the case of IN ( subselect ) is costed much higher (~ factor of 30)
>>> in the JOIN case.  BASE STATISTICS are identical in both cases and it
>>> seems the culprit is the join cardinality (255 vs. 8168) which seems
>>> derived from the outer table cardinality. The big myth to me is: where
>>> does the CBO get the cardinality of 255 from all of a sudden since it
>>> originally assumed 8168?
>>
>> If you check the cardinality, you will see that it is 255.25, which
>> is EXACTLY 8168 / 32.

>
> Yes that's what the report for the join analysis said. I didn't bother
> to quote the fractional digits in the text because I did not notice this
> relationship. Thanks for the heads up!
>

>> When Oracle doesn't know what the real statistics are, it uses some
>> arbitrary estimate. In this case, because the IN approach transforms
>> into an inline 'select distinct column_value' and Oracle doesn't know
>> how many distinct values there are, I guess is simply assumes 32.
>>
>> Just as 8168 is a side effect of the default block size, this 32 may also
>> be based on the block size. So if you've got a database with a
>> different block size handy, you might like to test what happens there.

>
> Hm... I'll see whether I find the time to retest with a different non
> standard sized tablespace.
>
> Interestingly an explicit DISTINCT does not save the join and we now get
> a hash join - concluding from the rows the CBO does not assume any
> duplicate rows - which is in line with the 10053 trace of that run,
> which assumes 8162 rows for the subquery:
>
> SQL> timing start "Join and distinct"
> SQL> select length(dat) from t1 join ( select distinct column_value from
> table(iset(-1,2,3)) ) tt on tt.column_value = t1.id
> 2 /
>
> LENGTH(DAT)
> -----------
> 90
> 90
>
>
> Execution Plan (truncated bytes and other cols for readability)
> ----------------------------------------------------------
> Plan hash value: 731532569
>
> ------------------------------------------------------------------
> | Id | Operation | Name | Rows |
> ------------------------------------------------------------------
> | 0 | SELECT STATEMENT | | 8168 |
> |* 1 | HASH JOIN | | 8168 |
> | 2 | VIEW | | 8168 |
> | 3 | HASH UNIQUE | | 8168 |
> | 4 | COLLECTION ITERATOR CONSTRUCTOR FETCH| | |
> | 5 | TABLE ACCESS FULL | T1 | 1000K|
> ------------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
>
> 1 - access("TT"."COLUMN_VALUE"="T1"."ID")
>
>
> Statistics
> ----------------------------------------------------------
> 1 recursive calls
> 0 db block gets
> 14192 consistent gets
> 0 physical reads
> 0 redo size
> 363 bytes sent via SQL*Net to client
> 377 bytes received via SQL*Net from client
> 2 SQL*Net roundtrips to/from client
> 0 sorts (memory)
> 0 sorts (disk)
> 2 rows processed
>
> SQL> timing stop
> timing for: Join and distinct
> Elapsed: 00:00:00.23
>
> Basically since the IN with subselect is the version that suits my needs
> best I am quite happy that it is blessed with a good plan. Although I
> have to say that it makes me a bit wary that I do not exactly understand
> what goes on in the other cases. I guess it's time for "CBO
> Fundamentals" - if only I had enough time...
>
> Again, thanks for your help!
>
> Kind regards
>
> robert

The explain plan changes back to nested loops when the optimizer goal is set to first rows.

Shakespeare Received on Tue Nov 25 2008 - 03:05:22 CST

This message: [ Message body ]
Next message: sybrandb: "Re: RMAN restore of full backup fails completely"
Previous message: raja: "Re: AWR Sample Report"
In reply to: Robert Klemme: "Re: Using bind variables for multiple values"
Next in thread: Robert Klemme: "Re: Using bind variables for multiple values"
Reply: Robert Klemme: "Re: Using bind variables for multiple values"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message