Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: Help with sql query speed. Explain plan included
On Mar 15, 3:49 pm, "fitzjarr..._at_cox.net" <fitzjarr..._at_cox.net> wrote:
> On Mar 15, 8:14 am, JL19..._at_HOTMAIL.CO.UK wrote:
>
>
>
> > Hello,
>
> > I have a query about how to improve the speed of our sql and am asking
> > for advice.
>
> > My real system is complicated and relates to a call center, automated
> > options chosen, and time on line using an automated telephone system.
> > Rather than having to describe all the complicated bits, I have made
> > up a situation using census data instead because we can all understand
> > it.
>
> > I have itemized the points of the made up situation. Here they are.
>
> > 1. The data is census information.
> > 2. There are 3 million people in the country.
> > 3. Each person has to fill out a census form.
> > 4. There are 80 questions on each census form.
> > 5. Each question on the census form has an answer that is between 1
> > and 100.
> > 6. The country residents do not have to answer every question.
> > 7. If they do not answer a question, a row in the table is not
> > created.
>
> > The information is stored in an Oracle database with the following
> > structure.
>
> > residentID questionNumber answer
> > 1 1 10
> > 1 2 45
> > 1 3 67
> > 1 4 55
> > 2 1 15
> > 2 2 56
> > 2 3 12
>
> > etc
>
> > We need a query that that reads "give us the resident ID of any person
> > that has answered more than 10 questions the same as the resident with
> > ID number 1486674"
>
> > Here is our sql.
>
> > select tx.residentid
> > from test1 t0,test1 tx
> > where t0.residentid!=tx.residentid
> > and t0.questionnumber=tx.questionnumber
> > and t0.answer=tx.answer
> > and t0.residentid=1486674
> > group by tx.residentid
> > having count(*)>10;
>
> > The problem with this query is that it only returns 177 rows yet takes
> > "Elapsed: 00:01:10.75" which is about 70 seconds.
>
> > The only thing that is important to us is query speed. If the data
> > needs restructuring to get speed then this is OK. But firstly is there
> > a better SQL to this without restructuring of data.
>
> > The plan statistics are calculated on 100% of rows in all tables with
> > cascade at 2am every morning. They are up to date.
>
> > Here is my plan. Index idxe has been created with "create unique index
> > idxe on test1(residentid,questionnumber,answer)". Just in case it
> > matters, there are no other table indexes but can easily add them if
> > it improves performance.
>
> > We use 10.2 on a Solaris V245.
>
> > Thank you for helping in advance.
> > John
>
> > --------------------------------------------------------------------------------------
> > | Id | Operation | Name | Rows | Bytes |TempSpc| Cost
> > (%CPU)| Time |
> > --------------------------------------------------------------------------------------
> > | 0 | SELECT STATEMENT | | 33322 | 715K| | 26691
> > (7)| 00:05:21 |
> > |* 1 | FILTER | | | |
> > | | |
> > | 2 | HASH GROUP BY | | 33322 | 715K| 21M| 26691
> > (7)| 00:05:21 |
> > |* 3 | HASH JOIN | | 666K| 13M| | 24919
> > (7)| 00:05:00 |
> > |* 4 | INDEX RANGE SCAN | IDXE | 22 | 242 | | 3
> > (0)| 00:00:01 |
> > |* 5 | TABLE ACCESS FULL| TEST1 | 37M| 398M| | 24428
> > (6)| 00:04:54 |
>
> > PLAN_TABLE_OUTPUT
> > ---------------------------------------------------
> > Predicate Information (identified by operation id):
> > ---------------------------------------------------
>
> > 1 - filter(COUNT(*)>10)
> > 3 - access("T0"."QUESTIONNUMBER"="TX"."QUESTIONNUMBER" AND
> > "T0"."ANSWER"="TX"."ANSWER")
> > filter("T0"."RESIDENTID"<>"TX"."RESIDENTID")
> > 4 - access("T0"."RESIDENTID"=1486674)
> > 5 - filter("TX"."RESIDENT"<>1486674)
>
> > 22 rows selected.
>
> What is causing you 'grief' is this:
>
> where t0.residentid!=tx.residentid
>
> An index on residentid won't help here as you're asking for every
> record EXCEPT one; even with an index present given that condition you
> really can't avoid a full table scan and you don't want to. But I
> think that's not really the issue here, I think this might be a
> situation where your sort area may be much too small for such a task
> and you're swapping out to disk instead of doing such sorts in memory
> (group by queries tend to order the data by the grouped columns,
> though I wouldn't recommend using this behaviour to ensure ordered
> output). I would have your DBA check if you're sorting to disk; if so
> (and I believe you are) the sort_area_size should be increased. Once
> that's done try your query again; you may find it runs in less time.
>
> David Fitzjarrell
Charles and David. Thank you for your help on this. Our DBA has reset the hash_area_size. She has forward this to me for the sort_area_size and hash_area_size.
SQL> SHOW PARAMETER sort_area_size
NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ sort_area_size integer 31457280SQL> SHOW PARAMETER hash_area_size
NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ hash_area_size integer 293601280SQL> The query time is now typically between 10 and 20seconds which is a lot better than 70 seconds. Better still would be less than 10 seconds.
Our development database is a clone of the production database taken every week. The data I was working on has now changed sadly. I have asked for my data to not be renewed so that there is no moving target. But the number of rows in the table and the data distribution are close. I have taken a 10104 trace as Charles suggested and then run tkprof on it. The output is below. I will not post the original trace file because it is too big and people may get annoyed having to download it. Despite the query executing now in 10-20 seconds, can you see anything obvious to reduce the time further and improve performance. The only thing our users are interested in is query speed.
Thank you again
John
TKPROF: Release 10.2.0.1.0 - Production on Sun Mar 18 15:51:20 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Trace file:
Sort options: prscpu
count = number of times OCI procedure was executed cpu = cpu time in seconds executing elapsed = elapsed time in seconds executing disk = number of physical reads of buffers from disk query = number of buffers gotten for consistent read current = number of buffers gotten in current mode (usually for update) rows = number of rows processed by the fetch or execute call ********************************************************************************
EXPLAIN PLAN option disabled.
Alter session set SQL_Trace = true
call count cpu elapsed disk query current rows
Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 0 0.00 0.00 0 0 0 0
total 2 0.00 0.00 0 0 0 0
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 36
Elapsed times include waiting on following events:
Event waited on Times Max. Wait TotalWaited
ALTER SESSION SET EVENTS '10104 TRACE NAME CONTEXT FOREVER, LEVEL 8'
call count cpu elapsed disk query current rows
Parse 1 0.00 0.00 0 0 0 0 Execute 2 0.00 0.00 0 0 0 0 Fetch 0 0.00 0.00 0 0 0 0
total 3 0.00 0.00 0 0 0 0
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 36
Elapsed times include waiting on following events:
Event waited on Times Max. Wait TotalWaited
select tx.residentid
from test1 t0,test1 tx
where t0.residentid!=tx.residentid
and t0.questionnumber=tx.questionnumber
and t0.answer=tx.answer
and t0.residentid=74134
group by tx.residentid
having count(*)>10;
call count cpu elapsed disk query current rows
Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 16 22.58 22.09 23874 59075 0 213
total 18 22.58 22.09 23874 59075 0 213
Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 36
Rows Row Source Operation
------- ---------------------------------------------------
213 FILTER (cr=59075 pr=23874 pw=0 time=22076561 us) 1206039 HASH GROUP BY (cr=59075 pr=23874 pw=0 time=23895072 us) 5084617 HASH JOIN (cr=59075 pr=23874 pw=0 time=30510206 us)
22 INDEX RANGE SCAN IDXE (cr=3 pr=0 pw=0 time=104 us)(object
id 13647)
26532858 TABLE ACCESS FULL TEST1 (cr=59072 pr=23874 pw=0
time=53065796 us)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait TotalWaited
Alter session set SQL_Trace = false
call count cpu elapsed disk query current rows
Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 0 0.00 0.00 0 0 0 0
total 2 0.00 0.00 0 0 0 0
Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 36
Elapsed times include waiting on following events:
Event waited on Times Max. Wait TotalWaited
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
Parse 4 0.00 0.00 0 0 0 0 Execute 5 0.00 0.00 0 0 0 0 Fetch 16 22.58 22.09 23874 59075 0 213
total 25 22.58 22.09 23874 59075 0 213
Misses in library cache during parse: 2
Elapsed times include waiting on following events:
Event waited on Times Max. Wait TotalWaited
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
Parse 0 0.00 0.00 0 0 0 0 Execute 0 0.00 0.00 0 0 0 0 Fetch 0 0.00 0.00 0 0 0 0
total 0 0.00 0.00 0 0 0 0
Misses in library cache during parse: 0
4 user SQL statements in session.
0 internal SQL statements in session.
4 SQL statements in session.
0 statements EXPLAINed in this session.
0 session in tracefile. 4 user SQL statements in trace file. 0 internal SQL statements in trace file. 4 SQL statements in trace file. 4 unique SQL statements in trace file. 617 lines in trace file. 22 elapsed seconds in trace file.Received on Sun Mar 18 2007 - 11:08:49 CDT