"Charles Hooper" <hooperc2000_at_yahoo.com> wrote in message
news:e0a2ef39-df04-406b-89f8-e934e61193ce_at_e23g2000prf.googlegroups.com...
> On Nov 25, 12:26 pm, Charles Hooper <hooperc2..._at_yahoo.com> wrote:
> (snip)
>> The above uses a subquery, which may perform slow on some Oracle
>> releases compared to the use of an inline view. Assume that I have a
>> table named PART, which has columns ID, DESCRIPITION, PRODUCT_CODE,
>> and COMMODITY_CODE, with ID as the primary key. I want to find ID,
>> DESCRIPTION, and COMMODITY_CODE for all parts with the same
>> DESCRIPTION and PRODUCT_CODE, where there are at least 3 matching
>> parts in the group:
>>
>> The starting point, which looks similar to your initial query:
>> SELECT
>> DESCRIPTION,
>> PRODUCT_CODE,
>> COUNT(*) NUM_MATCHES
>> FROM
>> PART
>> GROUP BY
>> DESCRIPTION,
>> PRODUCT_CODE
>> HAVING
>> COUNT(*)>=3;
>>
>> When the original query is slid into an inline view and joined to the
>> original table, it looks like this:
>> SELECT
>> P.ID,
>> P.DESCRIPTION,
>> P.COMMODITY_CODE
>> FROM
>> (SELECT
>> DESCRIPTION,
>> PRODUCT_CODE,
>> COUNT(*) NUM_MATCHES
>> FROM
>> PART
>> GROUP BY
>> DESCRIPTION,
>> PRODUCT_CODE
>> HAVING
>> COUNT(*)>=3) IP,
>> PART P
>> WHERE
>> IP.DESCRIPTION=P.DESCRIPTION
>> AND IP.PRODUCT_CODE=P.PRODUCT_CODE;
>>
>> Here is the DBMS_XPLAN:
>> --------------------------------------------------------------------------------------------------------------------
>> | Id | Operation | Name | Starts | E-Rows | A-Rows | A-
>> Time | Buffers | OMem | 1Mem | Used-Mem |
>> --------------------------------------------------------------------------------------------------------------------
>> |* 1 | HASH JOIN | | 1 | 1768 | 11525 |
>> 00:00:00.21 | 2748 | 1048K| 1048K| 1293K (0)|
>> | 2 | VIEW | | 1 | 1768 | 1156 |
>> 00:00:00.11 | 1319 | | | |
>> |* 3 | FILTER | | 1 | | 1156 |
>> 00:00:00.11 | 1319 | | | |
>> | 4 | HASH GROUP BY | | 1 | 1768 | 23276 |
>> 00:00:00.08 | 1319 | | | |
>> | 5 | TABLE ACCESS FULL| PART | 1 | 35344 | 35344 |
>> 00:00:00.04 | 1319 | | | |
>> | 6 | TABLE ACCESS FULL | PART | 1 | 35344 | 35344 |
>> 00:00:00.04 | 1429 | | | |
>> --------------------------------------------------------------------------------------------------------------------
>>
>> Predicate Information (identified by operation id):
>> ---------------------------------------------------
>> 1 - access("IP"."DESCRIPTION"="P"."DESCRIPTION" AND
>> "IP"."PRODUCT_CODE"="P"."PRODUCT_CODE")
>> 3 - filter(COUNT(*)>=3)
>>
>> The query format using the subquery looks like this:
>> SELECT
>> P.ID,
>> P.DESCRIPTION,
>> P.COMMODITY_CODE
>> FROM
>> PART P
>> WHERE
>> (DESCRIPTION,PRODUCT_CODE) IN
>> (SELECT
>> DESCRIPTION,
>> PRODUCT_CODE
>> FROM
>> PART
>> GROUP BY
>> DESCRIPTION,
>> PRODUCT_CODE
>> HAVING
>> COUNT(*)>=3);
>>
>> The DBMS_XPLAN, note that Oracle 10.2.0.2 transformed the query above:
>> ------------------------------------------------------------------------------------------------------------------------
>> | Id | Operation | Name | Starts | E-Rows | A-Rows
>> | A-Time | Buffers | OMem | 1Mem | Used-Mem |
>> ------------------------------------------------------------------------------------------------------------------------
>> |* 1 | HASH JOIN RIGHT SEMI | | 1 | 1 | 11525 |
>> 00:00:00.21 | 2748 | 1048K| 1048K| 1214K (0)|
>> | 2 | VIEW | VW_NSO_1 | 1 | 1768 | 1156 |
>> 00:00:00.12 | 1319 | | | |
>> |* 3 | FILTER | | 1 | | 1156 |
>> 00:00:00.12 | 1319 | | | |
>> | 4 | HASH GROUP BY | | 1 | 1768 | 23276 |
>> 00:00:00.09 | 1319 | | | |
>> | 5 | TABLE ACCESS FULL| PART | 1 | 35344 | 35344 |
>> 00:00:00.04 | 1319 | | | |
>> | 6 | TABLE ACCESS FULL | PART | 1 | 35344 | 35344 |
>> 00:00:00.01 | 1429 | | | |
>> ------------------------------------------------------------------------------------------------------------------------
>>
>> Predicate Information (identified by operation id):
>> ---------------------------------------------------
>> 1 - access("DESCRIPTION"="$nso_col_1" AND
>> "PRODUCT_CODE"="$nso_col_2")
>> 3 - filter(COUNT(*)>=3)
>>
>> Without allowing the automatic transformations in Oracle 10.2.0.2, the
>> query takes _much_ longer than 0.21 seconds to complete.
>>
>> The method using analytical functions starts like this:
>> SELECT
>> P.ID,
>> P.DESCRIPTION,
>> P.COMMODITY_CODE,
>> COUNT(*) OVER (PARTITION BY DESCRIPTION, PRODUCT_CODE) NUM_MATCHES
>> FROM
>> PART P;
>>
>> Then, sliding the above into an inline view:
>> SELECT
>> ID,
>> DESCRIPTION,
>> COMMODITY_CODE
>> FROM
>> (SELECT
>> P.ID,
>> P.DESCRIPTION,
>> P.COMMODITY_CODE,
>> COUNT(*) OVER (PARTITION BY DESCRIPTION, PRODUCT_CODE) NUM_MATCHES
>> FROM
>> PART P)
>> WHERE
>> NUM_MATCHES>=3;
>>
>> The DBMS_XPLAN for the above looks like this:
>> ------------------------------------------------------------------------------------------------------------------
>> | Id | Operation | Name | Starts | E-Rows | A-Rows | A-
>> Time | Buffers | OMem | 1Mem | Used-Mem |
>> ------------------------------------------------------------------------------------------------------------------
>> |* 1 | VIEW | | 1 | 35344 | 11525 |
>> 00:00:00.31 | 1319 | | | |
>> | 2 | WINDOW SORT | | 1 | 35344 | 35344 |
>> 00:00:00.27 | 1319 | 2533K| 726K| 2251K (0)|
>> | 3 | TABLE ACCESS FULL| PART | 1 | 35344 | 35344 |
>> 00:00:00.04 | 1319 | | | |
>> ------------------------------------------------------------------------------------------------------------------
>>
>> Predicate Information (identified by operation id):
>> ---------------------------------------------------
>> 1 - filter("NUM_MATCHES">=3)
>>
>> Note that there is only one TABLE ACCESS FULL of the PART table in the
>> above. The execution time required 0.31 seconds to complete, which is
>> greater than the first two approaches, but that is because the
>> database server is concurrently still trying to resolve the query
>> method using the subquery with no permitted transformations (5+
>> minutes later).
>>
>
> Just an update, the subquery version of the query without allowing
> transformations just completed, requiring 46 minutes and 21 seconds,
> and performed 29475 full table scans of the PART table. Here is the
> DBMS_XPLAN:
> ---------------------------------------------------------------------------------------
> | Id | Operation | Name | Starts | E-Rows | A-Rows | A-
> Time | Buffers |
> ---------------------------------------------------------------------------------------
> |* 1 | FILTER | | 1 | | 11525 |
> 00:46:21.46 | 38M|
> | 2 | TABLE ACCESS FULL | PART | 1 | 35344 | 35344 |
> 00:00:00.25 | 1429 |
> |* 3 | FILTER | | 29474 | | 6143 |
> 00:46:06.52 | 38M|
> | 4 | HASH GROUP BY | | 29474 | 1 | 613M|
> 00:33:24.30 | 38M|
> | 5 | TABLE ACCESS FULL| PART | 29474 | 35344 | 1041M|
> 00:00:02.54 | 38M|
> ---------------------------------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
> 1 - filter( IS NOT NULL)
> 3 - filter(("DESCRIPTION"=:B1 AND "PRODUCT_CODE"=:B2 AND
> COUNT(*)>=3))
>
> Compare that execution time (46 minutes, 21.46 seconds) with the same
> for the analytical version (0.19 seconds):
> -----------------------------------------------------------------------------------------------------------------
> | Id | Operation | Name | Starts | E-Rows | A-Rows | A-
> Time | Buffers | OMem | 1Mem | Used-Mem |
> -----------------------------------------------------------------------------------------------------------------
> |* 1 | VIEW | | 1 | 35344 | 11525 |
> 00:00:00.19 | 1319 | | | |
> | 2 | WINDOW SORT | | 1 | 35344 | 35344 |
> 00:00:00.17 | 1319 | 2533K| 726K| 2251K (0)|
> | 3 | TABLE ACCESS FULL| PART | 1 | 35344 | 35344 |
> 00:00:00.01 | 1319 | | | |
> -----------------------------------------------------------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
> 1 - filter("NUM_MATCHES">=3)
>
> As Maxim Demenko suggests, test the performance of the solution on the
> expected Oracle database version before committing to one solution or
> another. Make certain that the data size is reasonably large when
> performing the performance test, otherwise the test results may not
> predict actual performance once the solution is deployed.
>
> Charles Hooper
> IT Manager/Oracle DBA
> K&M Machine-Fabricating, Inc.
Thanks Charles, that is some very insightful information!! I am certainly
not that advanced (I understand what you are showing me, but no clue how to
get those results) but am hoping to get there, I am learning :)
Again thanks to you and Maxim for the help and information.
Jeff
Received on Sun Nov 25 2007 - 12:39:32 CST