Re: Why is this query slow?

From: The Boss <usenet_at_No.Spam.Please.invalid>
Date: Fri, 12 Mar 2010 23:29:08 +0100
Message-ID: <4b9ac034$0$10921$e4fe514c_at_dreader11.news.xs4all.nl>



Maxim Demenko wrote:
> On 12.03.2010 14:46, Charles Hooper wrote:
>> On Mar 12, 5:34 am, vsevolod afanassiev
>> <vsevolod.afanass..._at_gmail.com> wrote:
>
>> I would be inclined to re-write the SQL statement to look something
>> like this:
>> select
>> T.*
>> from
>> transactions T,
>> (select DISTINCT
>> ID
>> from
>> temp_table) TT
>> where
>> T.tx_date between date1 and date2
>> and T.status = 'AAA'
>> AND t.ID=TT.ID(+)
>> AND TT.ID IS NULL;
>>
>>
>> Charles Hooper
>> Co-author of "Expert Oracle Practices: Oracle Database Administration
>> from the Oak Table"
>> http://hoopercharles.wordpress.com/
>> IT Manager/Oracle DBA
>> K&M Machine-Fabricating, Inc.
>
> It would be however not equivalent to the original query ( in terms of
> returned resultset, that execution plan will differ - it is obvious).
> By anti join you would return the rows from TRANSACTIONS table, which
> *may* contain ID=NULL. By semijoin (as provided by OP) no NULLs can be
> returned. That is why, if both, TRANSACTION.ID and TRANSACTIONS.ID are
> known to optimizer as NOT NULL, it may be able to rewrite semijoin to
> antijoin on its own (as shown by Lothar). Here is small testcase on
> 10.2.0.4, but, i think, on 9.2.0.8 it would look similar.
> Of course, not null constraints can be used by optimizer equally as
> specifying not null conditions...
>
>
>
>> set echo on autot off feed off
>> create table transactions(id number,tx_date date,status varchar2(6));
>> create table transaction(id number,tx_date date,status varchar2(6));
>> insert into transaction values(1,trunc(sysdate),'AAA');
>> insert into transaction values(2,trunc(sysdate),'AAA');
>> insert into transaction values(null,trunc(sysdate),'AAA');
>> insert into transactions values(1,trunc(sysdate),'AAA');
>> insert into transactions values(3,trunc(sysdate),'AAA');
>> insert into transactions values(null,trunc(sysdate),'AAA');
>> set feed on
>> -- no rows are returned because NULL's in NOT IN()
>> select *
> 2 from transactions
> 3 where tx_date between sysdate -1 and sysdate +1
> 4 and status = 'AAA'
> 5 --and id is not null
> 6 and id not in (select id
> 7 from transaction
> 8 where tx_date between sysdate -1 and sysdate +1
> 9 --and id is not null
> 10 and status = 'AAA');
>
> no rows selected
>
>>
>> select *
> 2 from transactions
> 3 where tx_date between sysdate -1 and sysdate +1
> 4 and status = 'AAA'
> 5 --and id is not null --! nulls are not excluded in the outer
> query, but no nulls are returned
> 6 and id not in (select id
> 7 from transaction
> 8 where tx_date between sysdate -1 and sysdate +1
> 9 and id is not null
> 10 and status = 'AAA');
>
> ID TX_DATE STATUS
> ---------- ------------------ ------
> 3 12-MAR-10 AAA
>
> 1 row selected.
>
>>
>> select
> 2 t.*
> 3 from
> 4 transactions t,
> 5 (select distinct
> 6 id
> 7 from
> 8 transaction) tt
> 9 where
> 10 t.tx_date between sysdate -1 and sysdate +1
> 11 and t.status = 'AAA'
> 12 and t.id=tt.id(+) --! here nulls from t are returned!!!
> 13 and tt.id is null;
>
> ID TX_DATE STATUS
> ---------- ------------------ ------
> 12-MAR-10 AAA
> 3 12-MAR-10 AAA
>
> 2 rows selected.
>
>>
>> set autotrace traceonly explain
>> select *
> 2 from transactions
> 3 where tx_date between sysdate -1 and sysdate +1
> 4 and status = 'AAA'
> 5 and id is not null
> 6 and id not in (select id
> 7 from transaction
> 8 where tx_date between sysdate -1 and sysdate +1
> 9 and id is not null
> 10 and status = 'AAA');
>
> Execution Plan
> ----------------------------------------------------------
> Plan hash value: 2554144648
>
> ------------------------------------------------------------------------------------
>> Id | Operation | Name | Rows | Bytes | Cost
> (%CPU)| Time |
> ------------------------------------------------------------------------------------
>> 0 | SELECT STATEMENT | | 1 | 54 | 7
> (15)| 00:00:01 |
>> * 1 | FILTER | | | |
>> |
>> * 2 | HASH JOIN ANTI | | 1 | 54 | 7
> (15)| 00:00:01 |
>> * 3 | TABLE ACCESS FULL| TRANSACTIONS | 2 | 54 | 3
> (0)| 00:00:01 |
>> * 4 | TABLE ACCESS FULL| TRANSACTION | 2 | 54 | 3
> (0)| 00:00:01 |
> ------------------------------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
>
> 1 - filter(SYSDATE_at_!-1<=SYSDATE@!+1)
> 2 - access("ID"="ID")
> 3 - filter("STATUS"='AAA' AND "ID" IS NOT NULL AND
> "TX_DATE">=SYSDATE_at_!-1 AND "TX_DATE"<=SYSDATE@!+1)
> 4 - filter("STATUS"='AAA' AND "ID" IS NOT NULL AND
> "TX_DATE">=SYSDATE_at_!-1 AND "TX_DATE"<=SYSDATE@!+1)
>
>
>
> Best regards
>
> Maxim

I'm a little confused here.
Although the OP references 2 tables in his original query ('transactions' and 'transaction'), I guess he actually only has 1 table ('transactions'): 1. OP stated: "Table has 11 million rows" which doesn't make much sense if he has 2 tables.
2. The original Execution Plan only references table 'transactions', not 'transaction'.

Cheers!

-- 
Jeroen 
Received on Fri Mar 12 2010 - 16:29:08 CST

Original text of this message