Re: Help with sql query speed. Explain plan included

From: <JL19348_at_HOTMAIL.CO.UK>
Date: 18 Mar 2007 09:08:49 -0700
Message-ID: <1174234129.055233.12440@o5g2000hsb.googlegroups.com>

On Mar 15, 3:49 pm, "fitzjarr..._at_cox.net" <fitzjarr..._at_cox.net> wrote:
> On Mar 15, 8:14 am, JL19..._at_HOTMAIL.CO.UK wrote:
>
>
>
> > Hello,
>
> > I have a query about how to improve the speed of our sql and am asking
> > for advice.
>
> > My real system is complicated and relates to a call center, automated
> > options chosen, and time on line using an automated telephone system.
> > Rather than having to describe all the complicated bits, I have made
> > up a situation using census data instead because we can all understand
> > it.
>
> > I have itemized the points of the made up situation. Here they are.
>
> > 1. The data is census information.
> > 2. There are 3 million people in the country.
> > 3. Each person has to fill out a census form.
> > 4. There are 80 questions on each census form.
> > 5. Each question on the census form has an answer that is between 1
> > and 100.
> > 6. The country residents do not have to answer every question.
> > 7. If they do not answer a question, a row in the table is not
> > created.
>
> > The information is stored in an Oracle database with the following
> > structure.
>
> > residentID questionNumber answer
> > 1 1 10
> > 1 2 45
> > 1 3 67
> > 1 4 55
> > 2 1 15
> > 2 2 56
> > 2 3 12
>
> > etc
>
> > We need a query that that reads "give us the resident ID of any person
> > that has answered more than 10 questions the same as the resident with
> > ID number 1486674"
>
> > Here is our sql.
>
> > select tx.residentid
> > from test1 t0,test1 tx
> > where t0.residentid!=tx.residentid
> > and t0.questionnumber=tx.questionnumber
> > and t0.answer=tx.answer
> > and t0.residentid=1486674
> > group by tx.residentid
> > having count(*)>10;
>
> > The problem with this query is that it only returns 177 rows yet takes
> > "Elapsed: 00:01:10.75" which is about 70 seconds.
>
> > The only thing that is important to us is query speed. If the data
> > needs restructuring to get speed then this is OK. But firstly is there
> > a better SQL to this without restructuring of data.
>
> > The plan statistics are calculated on 100% of rows in all tables with
> > cascade at 2am every morning. They are up to date.
>
> > Here is my plan. Index idxe has been created with "create unique index
> > idxe on test1(residentid,questionnumber,answer)". Just in case it
> > matters, there are no other table indexes but can easily add them if
> > it improves performance.
>
> > We use 10.2 on a Solaris V245.
>
> > Thank you for helping in advance.
> > John
>
> > --------------------------------------------------------------------------------------
> > | Id | Operation | Name | Rows | Bytes |TempSpc| Cost
> > (%CPU)| Time |
> > --------------------------------------------------------------------------------------
> > | 0 | SELECT STATEMENT | | 33322 | 715K| | 26691
> > (7)| 00:05:21 |
> > |* 1 | FILTER | | | |
> > | | |
> > | 2 | HASH GROUP BY | | 33322 | 715K| 21M| 26691
> > (7)| 00:05:21 |
> > |* 3 | HASH JOIN | | 666K| 13M| | 24919
> > (7)| 00:05:00 |
> > |* 4 | INDEX RANGE SCAN | IDXE | 22 | 242 | | 3
> > (0)| 00:00:01 |
> > |* 5 | TABLE ACCESS FULL| TEST1 | 37M| 398M| | 24428
> > (6)| 00:04:54 |
>
> > PLAN_TABLE_OUTPUT
> > ---------------------------------------------------
> > Predicate Information (identified by operation id):
> > ---------------------------------------------------
>
> > 1 - filter(COUNT(*)>10)
> > 3 - access("T0"."QUESTIONNUMBER"="TX"."QUESTIONNUMBER" AND
> > "T0"."ANSWER"="TX"."ANSWER")
> > filter("T0"."RESIDENTID"<>"TX"."RESIDENTID")
> > 4 - access("T0"."RESIDENTID"=1486674)
> > 5 - filter("TX"."RESIDENT"<>1486674)
>
> > 22 rows selected.
>
> What is causing you 'grief' is this:
>
> where t0.residentid!=tx.residentid
>
> An index on residentid won't help here as you're asking for every
> record EXCEPT one; even with an index present given that condition you
> really can't avoid a full table scan and you don't want to. But I
> think that's not really the issue here, I think this might be a
> situation where your sort area may be much too small for such a task
> and you're swapping out to disk instead of doing such sorts in memory
> (group by queries tend to order the data by the grouped columns,
> though I wouldn't recommend using this behaviour to ensure ordered
> output). I would have your DBA check if you're sorting to disk; if so
> (and I believe you are) the sort_area_size should be increased. Once
> that's done try your query again; you may find it runs in less time.
>
> David Fitzjarrell

Charles and David. Thank you for your help on this. Our DBA has reset the hash_area_size. She has forward this to me for the sort_area_size and hash_area_size.

SQL> SHOW PARAMETER sort_area_size

NAME                                 TYPE        VALUE
------------------------------------ -----------
------------------------------
sort_area_size                       integer     31457280

SQL> SHOW PARAMETER hash_area_size

NAME                                 TYPE        VALUE
------------------------------------ -----------
------------------------------
hash_area_size                       integer     293601280

SQL> The query time is now typically between 10 and 20seconds which is a lot better than 70 seconds. Better still would be less than 10 seconds.

Our development database is a clone of the production database taken every week. The data I was working on has now changed sadly. I have asked for my data to not be renewed so that there is no moving target. But the number of rows in the table and the data distribution are close. I have taken a 10104 trace as Charles suggested and then run tkprof on it. The output is below. I will not post the original trace file because it is too big and people may get annoyed having to download it. Despite the query executing now in 10-20 seconds, can you see anything obvious to reduce the time further and improve performance. The only thing our users are interested in is query speed.

Thank you again
John

TKPROF: Release 10.2.0.1.0 - Production on Sun Mar 18 15:51:20 2007

Trace file:
Sort options: prscpu

count    = number of times OCI procedure was executed
cpu      = cpu time in seconds executing
elapsed  = elapsed time in seconds executing
disk     = number of physical reads of buffers from disk
query    = number of buffers gotten for consistent read
current  = number of buffers gotten in current mode (usually for
update)
rows     = number of rows processed by the fetch or execute call
********************************************************************************

EXPLAIN PLAN option disabled.

Alter session set SQL_Trace = true

call     count       cpu    elapsed       disk      query
current        rows

------- ------ -------- ---------- ---------- ---------- ----------

Parse        1      0.00       0.00          0          0
0           0
Execute      1      0.00       0.00          0          0
0           0
Fetch        0      0.00       0.00          0          0
0           0

------- ------ -------- ---------- ---------- ---------- ----------

total        2      0.00       0.00          0          0
0           0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 36

Elapsed times include waiting on following events:

  Event waited on                             Times   Max. Wait  Total

Waited

Waited ----------
SQL*Net message to client 1 0.00 0.00 SQL*Net message from client 1 0.00 0.00

ALTER SESSION SET EVENTS '10104 TRACE NAME CONTEXT FOREVER, LEVEL 8'

call     count       cpu    elapsed       disk      query
current        rows

------- ------ -------- ---------- ---------- ---------- ----------

Parse        1      0.00       0.00          0          0
0           0
Execute      2      0.00       0.00          0          0
0           0
Fetch        0      0.00       0.00          0          0
0           0

------- ------ -------- ---------- ---------- ---------- ----------

total        3      0.00       0.00          0          0
0           0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 36

Elapsed times include waiting on following events:

  Event waited on                             Times   Max. Wait  Total

Waited

Waited ----------
SQL*Net message to client 2 0.00 0.00 SQL*Net message from client 2 0.00 0.00

select tx.residentid
from test1 t0,test1 tx
  where t0.residentid!=tx.residentid
   and t0.questionnumber=tx.questionnumber    and t0.answer=tx.answer
   and t0.residentid=74134
    group by tx.residentid
     having count(*)>10;

call     count       cpu    elapsed       disk      query
current        rows

------- ------ -------- ---------- ---------- ---------- ----------

Parse        1      0.00       0.00          0          0
0           0
Execute      1      0.00       0.00          0          0
0           0
Fetch       16     22.58      22.09      23874      59075
0         213

------- ------ -------- ---------- ---------- ---------- ----------

total       18     22.58      22.09      23874      59075
0         213

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 36

Rows Row Source Operation

-------  ---------------------------------------------------

213 FILTER (cr=59075 pr=23874 pw=0 time=22076561 us) 1206039 HASH GROUP BY (cr=59075 pr=23874 pw=0 time=23895072 us) 5084617 HASH JOIN (cr=59075 pr=23874 pw=0 time=30510206 us)

22 INDEX RANGE SCAN IDXE (cr=3 pr=0 pw=0 time=104 us)(object id 13647)
26532858 TABLE ACCESS FULL TEST1 (cr=59072 pr=23874 pw=0 time=53065796 us)

Elapsed times include waiting on following events:

  Event waited on                             Times   Max. Wait  Total

Waited

Waited ----------
SQL*Net message to client 16 0.00 0.00 db file sequential read 218 0.00 0.00 db file scattered read 188 0.00 0.29 SQL*Net message from client 16 0.00 0.01

Alter session set SQL_Trace = false

call     count       cpu    elapsed       disk      query
current        rows

------- ------ -------- ---------- ---------- ---------- ----------

Parse        1      0.00       0.00          0          0
0           0
Execute      1      0.00       0.00          0          0
0           0
Fetch        0      0.00       0.00          0          0
0           0

------- ------ -------- ---------- ---------- ---------- ----------

total        2      0.00       0.00          0          0
0           0

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 36

Elapsed times include waiting on following events:

  Event waited on                             Times   Max. Wait  Total

Waited

Waited ----------
SQL*Net message to client 1 0.00 0.00 SQL*Net message from client 1 0.00 0.00

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query
current        rows

------- ------ -------- ---------- ---------- ---------- ----------

Parse        4      0.00       0.00          0          0
0           0
Execute      5      0.00       0.00          0          0
0           0
Fetch       16     22.58      22.09      23874      59075
0         213

------- ------ -------- ---------- ---------- ---------- ----------

total       25     22.58      22.09      23874      59075
0         213

Misses in library cache during parse: 2
Elapsed times include waiting on following events:

  Event waited on                             Times   Max. Wait  Total

Waited

Waited ----------
SQL*Net message to client 25 0.00 0.00 SQL*Net message from client 24 0.00 0.01 db file sequential read 218 0.00 0.00 db file scattered read 188 0.00 0.29

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query
current        rows

------- ------ -------- ---------- ---------- ---------- ----------

Parse        0      0.00       0.00          0          0
0           0
Execute      0      0.00       0.00          0          0
0           0
Fetch        0      0.00       0.00          0          0
0           0

------- ------ -------- ---------- ---------- ---------- ----------

total        0      0.00       0.00          0          0
0           0

Misses in library cache during parse: 0

    4 user SQL statements in session.
    0 internal SQL statements in session.     4 SQL statements in session.
    0 statements EXPLAINed in this session.

Trace file:
Trace file compatibility: 10.01.00
Sort options: prscpu

       0  session in tracefile.
       4  user  SQL statements in trace file.
       0  internal SQL statements in trace file.
       4  SQL statements in trace file.
       4  unique SQL statements in trace file.
     617  lines in trace file.
      22  elapsed seconds in trace file.

Received on Sun Mar 18 2007 - 11:08:49 CDT