Re: Help with sql query speed. Explain plan included

From: Valentin Minzatu <valentinminzatu_at_yahoo.com>
Date: 20 Mar 2007 07:30:52 -0700
Message-ID: <1174401052.693791.106100@e65g2000hsc.googlegroups.com>

On Mar 20, 8:47 am, "Charles Hooper" <hooperc2..._at_yahoo.com> wrote:
> On Mar 20, 6:33 am, "Valentin Minzatu" <valentinminz..._at_yahoo.com>
> wrote:
>
> > On Mar 19, 9:00 pm, "Charles Hooper" <hooperc2..._at_yahoo.com> wrote:
> > > SELECT
> > > INDEX_NAME,
> > > LEAF_BLOCKS
> > > FROM
> > > DBA_INDEXES
> > > WHERE
> > > INDEX_NAME LIKE 'T1%';
>
> > > INDEX_NAME LEAF_BLOCKS
> > > ------------------------------ -----------
> > > T1_INDEX1 517,844
> > > T1_INDEX2 347,977
> > > T1_INDEX3 713,742
>
> > > ALTER SESSION SET TRACEFILE_IDENTIFIER = 'TEST201';
>
> > > select tx.residentid
> > > from t1 t0,t1 tx
> > > where t0.residentid!=tx.residentid
> > > and t0.questionnumber=tx.questionnumber
> > > and t0.answer=tx.answer
> > > and t0.residentid=1486674
> > > group by tx.residentid
> > > having count(*)>10;
>
> > > Plan Table
> > > ============
> > > ---------------------------------------------------
> > > +-----------------------------------+
> > > | Id | Operation | Name | Rows | Bytes |
> > > Cost | Time |
> > > ---------------------------------------------------
> > > +-----------------------------------+
> > > | 0 | SELECT STATEMENT | | | |
> > > 15K | |
> > > | 1 | FILTER | | |
> > > | | |
> > > | 2 | SORT GROUP BY | | 102K | 2244K |
> > > 15K | 00:04:48 |
> > > | 3 | NESTED LOOPS | | 2040K | 44M |
> > > 6879 | 00:02:46 |
> > > | 4 | TABLE ACCESS BY INDEX ROWID | T1 | 75 | 825
> > > | 80 | 00:00:02 |
> > > | 5 | INDEX RANGE SCAN | T1_INDEX1| 75 |
> > > | 4 | 00:00:01 |
> > > | 6 | INDEX RANGE SCAN | T1_INDEX3| 27K | 297K
> > > | 91 | 00:00:02 |
> > > ---------------------------------------------------
> > > +-----------------------------------+
> > > Predicate Information:
> > > ----------------------
> > > 1 - filter(COUNT(*)>10)
> > > 5 - access("T0"."RESIDENTID"=1486674)
> > > 6 - access("T0"."QUESTIONNUMBER"="TX"."QUESTIONNUMBER" AND
> > > "T0"."ANSWER"="TX"."ANSWER")
> > > 6 - filter(("TX"."RESIDENTID"<>1486674 AND
> > > "T0"."RESIDENTID"<>"TX"."RESIDENTID"))
>
> > > ROUP BY cardinality: 2089330.00, TABLE cardinality: 2089330.00
> > > HAVING selectivity: 0.05 -> GROUPS: 104467
> > > SORT resource Sort statistics
> > > Sort width: 88 Area size: 314368 Max Area size:
> > > 15728640
> > > Degree: 1
> > > Blocks to Sort: 8953 Row size: 35 Total
> > > Rows: 2089330
> > > Initial runs: 5 Merge passes: 1 IO Cost /
> > > pass: 7382
> > > Total IO sort cost: 7830 Total CPU sort cost: 2103620313
> > > Total Temp space used: 70493000
>
> > > ALTER SESSION SET TRACEFILE_IDENTIFIER = 'TEST202';
>
> > > SELECT residentId, count(1)
> > > FROM t1
> > > WHERE ( questionNumber, answer ) IN ( SELECT questionNumber, answer
> > > FROM t1
> > > WHERE residentId = 1486674 )
> > > AND RESIDENTID<>1486674
> > > GROUP BY residentId
> > > HAVING COUNT(1) > 10;
>
> > > Plan Table
> > > ============
> > > ----------------------------------------------------
> > > +-----------------------------------+
> > > | Id | Operation | Name | Rows | Bytes |
> > > Cost | Time |
> > > ----------------------------------------------------
> > > +-----------------------------------+
> > > | 0 | SELECT STATEMENT | | |
> > > | 11K | |
> > > | 1 | FILTER | | |
> > > | | |
> > > | 2 | SORT GROUP BY | | 102K | 2244K
> > > | 11K | 00:03:57 |
> > > | 3 | NESTED LOOPS | | 2040K | 44M |
> > > 3526 | 00:00:54 |
> > > | 4 | SORT UNIQUE | | 75 | 825
> > > | 80 | 00:00:02 |
> > > | 5 | TABLE ACCESS BY INDEX ROWID | T1 | 75 | 825
> > > | 80 | 00:00:02 |
> > > | 6 | INDEX RANGE SCAN | T1_INDEX1| 75 |
> > > | 4 | 00:00:01 |
> > > | 7 | INDEX RANGE SCAN | T1_INDEX3| 27K | 297K
> > > | 91 | 00:00:02 |
> > > ----------------------------------------------------
> > > +-----------------------------------+
> > > Predicate Information:
> > > ----------------------
> > > 1 - filter(COUNT(*)>10)
> > > 6 - access("RESIDENTID"=1486674)
> > > 7 - access("QUESTIONNUMBER"="QUESTIONNUMBER" AND "ANSWER"="ANSWER")
> > > 7 - filter("RESIDENTID"<>1486674)
>
> > > GROUP BY cardinality: 2089330.00, TABLE cardinality: 2089330.00
> > > HAVING selectivity: 0.05 -> GROUPS: 104467
> > > SORT resource Sort statistics
> > > Sort width: 88 Area size: 314368 Max Area size:
> > > 15728640
> > > Degree: 1
> > > Blocks to Sort: 8953 Row size: 35 Total
> > > Rows: 2089330
> > > Initial runs: 5 Merge passes: 1 IO Cost /
> > > pass: 7382
> > > Total IO sort cost: 7830 Total CPU sort cost: 2103620313
> > > Total Temp space used: 70493000
>
> > > Another attempt:
> > > SELECT /*+ ORDERED */
> > > T1.RESIDENTID
> > > FROM
> > > (SELECT
> > > RESIDENTID,
> > > QUESTIONNUMBER,
> > > ANSWER
> > > FROM
> > > T1
> > > WHERE
> > > RESIDENTID=1486674) MT1,
> > > T1
> > > WHERE
> > > MT1.QUESTIONNUMBER=T1.QUESTIONNUMBER
> > > AND MT1.ANSWER=T1.ANSWER
> > > AND MT1.RESIDENTID<>T1.RESIDENTID
> > > GROUP BY
> > > T1.RESIDENTID
> > > HAVING
> > > COUNT(*)>10;
>
> > > ---------------------------------------------------
> > > +-----------------------------------+
> > > | Id | Operation | Name | Rows | Bytes |
> > > Cost | Time |
> > > ---------------------------------------------------
> > > +-----------------------------------+
> > > | 0 | SELECT STATEMENT | | | |
> > > 15K | |
> > > | 1 | FILTER | | |
> > > | | |
> > > | 2 | HASH GROUP BY | | 102K | 2244K |
> > > 15K | 00:04:48 |
> > > | 3 | NESTED LOOPS | | 2040K | 44M |
> > > 6870 | 00:02:46 |
> > > | 4 | TABLE ACCESS BY INDEX ROWID | T1 | 75 | 825
> > > | 80 | 00:00:02 |
> > > | 5 | INDEX RANGE SCAN | T1_INDEX1| 75 |
> > > | 4 | 00:00:01 |
> > > | 6 | INDEX RANGE SCAN | T1_INDEX3| 27K | 297K
> > > | 91 | 00:00:02 |
> > > ---------------------------------------------------
> > > +-----------------------------------+
> > > Predicate Information:
> > > ----------------------
> > > 1 - filter(COUNT(*)>10)
> > > 5 - access("RESIDENTID"=1486674)
> > > 6 - access("QUESTIONNUMBER"="T1"."QUESTIONNUMBER" AND
> > > "ANSWER"="T1"."ANSWER")
> > > 6 - filter(("RESIDENTID"<>"T1"."RESIDENTID" AND
> > > "T1"."RESIDENTID"<>1486674))
>
> > > GROUP BY cardinality: 2089330.00, TABLE cardinality: 2089330.00
> > > HAVING selectivity: 0.05 -> GROUPS: 104467
> > > SORT resource Sort statistics
> > > Sort width: 358 Area size: 314368 Max Area size:
> > > 62914560
> > > Degree: 1
> > > Blocks to Sort: 8953 Row size: 35 Total
> > > Rows: 2089330
> > > Initial runs: 2 Merge passes: 1 IO Cost /
> > > pass: 7382
> > > Total IO sort cost: 7830 Total CPU sort cost: 2103620313
> > > Total Temp space used: 70493000
>
> > > ALTER SESSION SET SORT_AREA_SIZE=209715200;
> > > ALTER SESSION SET HASH_AREA_SIZE=209715200;
>
> > > Same results as the above.
>
> > > Charles Hooper
> > > PC Support Specialist
> > > K&M Machine-Fabricating, Inc.
>
> > This is good progress: already the cost dropped from 208k to 11k.
>
> > The plan shows that when using the 3 columns index (T1_INDEX3), there
> > is no table access at all.
> > I am not sure what are the columns are indexed by T1_INDEX1 and T1,
> > but try creating a single index with the same columns, in the
> > following order: RESIDENTID, QUESTIONNUMBER, ANSWER - please note that
> > RESIDENTID is first now.
> > That should eliminate all table accesses from the plan and reduce it
> > to a minimum. If you want, you can compress keys in both indexes
> > (T1_INDEX3 and the one described above) and see if that makes any
> > difference in terms of response time.
>
> Let's not get too carried away with indexes trying to solve a problem
> for the OP. The current index structure that I created (just looking
> at the leaf blocks) using just the T1_INDEX1 and T1_INDEX3 indexes
> requires more than twice the number of 8KB blocks as the table data:
> INDEX_NAME LEAF_BLOCKS
> T1_INDEX1 517,844
> T1_INDEX2 347,977
> T1_INDEX3 713,742
>
> SELECT
> BLOCKS
> FROM
> DBA_TABLES
> WHERE
> TABLE_NAME='T1';
>
> BLOCKS
> 551,931
>
> As new rows are added to the table, the index structures must also be
> update. The one IO on the table to insert the row turns into one IO
> on the table plus three, four (five, ten IOs) per index.
>
> By reading through the partial 10053 trace that I included, we can
> see: "Total Temp space used: 70493000" ~ 70MB. The little 7200 RPM
> hard drive in my laptop running Vista takes a couple seconds to write
> the 70MB and then read that information back. On a server with a 10+
> drive RAID 10 array with 15000 RPM drives, the 70MB write may be split
> between five+ spindles, or may be stored just in the hard drive
> controller's cache. The temp space usage during the GROUP BY likely
> accounted for a good portion of the 2 seconds consumed during the
> GROUP BY (a little less than half of the total run time). The nested
> loops operation between the questions answered by residentid 1486674
> and the index lookup results from the T1_INDEX3 accounted for 2.5
> seconds of the total run time (a little more than than half of the
> total run time). The only table access required 0.02 seconds of the
> run time.
>
> Charles Hooper
> PC Support Specialist
> K&M Machine-Fabricating, Inc.

It looks like the OP lost interest a while ago ... once it got down to 20 seconds. My point was that using the two indexes would beat the performance of the other indexing schemas and eliminate the need for sort area changes. Received on Tue Mar 20 2007 - 09:30:52 CDT