Why is this query slow?

From: vsevolod afanassiev <vsevolod.afanassiev_at_gmail.com>
Date: Fri, 12 Mar 2010 02:34:02 -0800 (PST)
Message-ID: <87251f3a-e0c4-4c43-a8a9-5446e216e41b_at_l24g2000prh.googlegroups.com>



I have been doing SQL query tuning for a while and in most cases this is relatively straightforward excercise. For example: if a query selects only a few rows from large table then it should be accessed through index, if a query selects more than 10% of the rows the table should be accessed through full table scan. In many cases I was able to reduce number of buffer gets from something like 200,000,000 to less than 1000 per execution by creating indexes and adding a few hints.

The difficulty I have now is that the query is very simple and as far as I can see there is nothing wrong
with plan. Still it is slow.

The query is

select * from transactions
where tx_date between (date1 and date2) and status = 'AAA' and
ID not in
(select ID from transaction where tx_date between (date3 and date4)
and status = 'AAA')

So we have a query and a subquery. Subquery is not correlated. Table has 11 million rows.
Nothing fancy: no LOBS, LONG, etc, all columns are either NUMBER or VARCHAR2.
Column ID has type VARCHAR2(32)

Executed separately both query and subquery run in less than a second,
use index on tx_date. Main query returns 250 records while subquery returns 2500.

I am running it on lightly loaded UNIX server with plenty of free memory, SGA is 4 GB, PGA is 500 MB, there are no other heavy queries. Still it takes 2 minutes.

Autotrace shows 2.2 million buffer gets and no disk reads.

There is nothing in wait events: no latch waits, no buffer busy, etc.

My main problem is: there is nothing wrong with execution plan for complete query.
It shows use of the same index on tx_date by both main query and subquery and FILTER.

If I save results of the subquery in a temporary table then such modified query
completes in less than a second:

select * from transactions
where tx_date between (date1 and date2) and status = 'AAA'
ID not in
(select ID from temp_table)

Surely Oracle should be able to hold 2,500 records from the subquery in
memory and access it more efficiently than my temporary table? But it doesn't do it.

I can re-write the query using MINUS and it will run much faster. I am just puzzled why is it so slow.

This is plan from AUTOTRACE (9.2.0.8 on AIX):

Execution Plan


   0 SELECT STATEMENT Optimizer=CHOOSE (Cost=45545 Card=81 Bytes=3888)

   1 0 FILTER

   2    1     FILTER
   3    2       ABLE ACCESS (BY INDEX ROWID) OF
'TRANSACTIONS' (Cost=1076 Card=81 Bytes=3888)
   4    3         NDEX (RANGE SCAN) OF 'TRANSACTIONS_IND01' (NON-
UNIQUE) (Cost=821 Card=105815)
   5    1     FILTER
   6    5      TABLE ACCESS (BY INDEX ROWID) OF
'TRANSACTIONS' (Cost=549 Card=4 Bytes=192)
   7    6        INDEX (RANGE SCAN) OF 'TRANSACTIONS_IND01' (NON-
UNIQUE) (Cost=419Card=53853)

Statistics


          0  recursive calls
          0  db block gets
    2326009  consistent gets
          0  physical reads
          0  redo size
       7609  bytes sent via SQL*Net to client
        832  bytes received via SQL*Net from client
         18  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        253  rows processed
Received on Fri Mar 12 2010 - 04:34:02 CST

Original text of this message