Re: Why is this query slow?
From: Charles Hooper <hooperc2000_at_yahoo.com>
Date: Fri, 12 Mar 2010 12:54:41 -0800 (PST)
Message-ID: <bcf21f0e-359f-401a-994c-6da9c1c70d87_at_c16g2000yqd.googlegroups.com>
On Mar 12, 2:26 pm, Maxim Demenko <mdeme..._at_gmail.com> wrote:
> On 12.03.2010 14:46, Charles Hooper wrote:
> > On Mar 12, 5:34 am, vsevolod afanassiev
> > <vsevolod.afanass..._at_gmail.com> wrote:
> > I would be inclined to re-write the SQL statement to look something
> > like this:
> > select
> > T.*
> > from
> > transactions T,
> > (select DISTINCT
> > ID
> > from
> > temp_table) TT
> > where
> > T.tx_date between date1 and date2
> > and T.status = 'AAA'
> > AND t.ID=TT.ID(+)
> > AND TT.ID IS NULL;
>
> > Charles Hooper
> > Co-author of "Expert Oracle Practices: Oracle Database Administration
> > from the Oak Table"
> >http://hoopercharles.wordpress.com/
> > IT Manager/Oracle DBA
> > K&M Machine-Fabricating, Inc.
>
> It would be however not equivalent to the original query ( in terms of
> returned resultset, that execution plan will differ - it is obvious). By
> anti join you would return the rows from TRANSACTIONS table, which *may*
> contain ID=NULL. By semijoin (as provided by OP) no NULLs can be
> returned. That is why, if both, TRANSACTION.ID and TRANSACTIONS.ID are
> known to optimizer as NOT NULL, it may be able to rewrite semijoin to
> antijoin on its own (as shown by Lothar). Here is small testcase on
> 10.2.0.4, but, i think, on 9.2.0.8 it would look similar.
> Of course, not null constraints can be used by optimizer equally as
> specifying not null conditions...
>
> SQL> set echo on autot off feed off
> SQL> create table transactions(id number,tx_date date,status varchar2(6));
> SQL> create table transaction(id number,tx_date date,status varchar2(6));
> SQL> insert into transaction values(1,trunc(sysdate),'AAA');
> SQL> insert into transaction values(2,trunc(sysdate),'AAA');
> SQL> insert into transaction values(null,trunc(sysdate),'AAA');
> SQL> insert into transactions values(1,trunc(sysdate),'AAA');
> SQL> insert into transactions values(3,trunc(sysdate),'AAA');
> SQL> insert into transactions values(null,trunc(sysdate),'AAA');
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
Date: Fri, 12 Mar 2010 12:54:41 -0800 (PST)
Message-ID: <bcf21f0e-359f-401a-994c-6da9c1c70d87_at_c16g2000yqd.googlegroups.com>
On Mar 12, 2:26 pm, Maxim Demenko <mdeme..._at_gmail.com> wrote:
> On 12.03.2010 14:46, Charles Hooper wrote:
> > On Mar 12, 5:34 am, vsevolod afanassiev
> > <vsevolod.afanass..._at_gmail.com> wrote:
> > I would be inclined to re-write the SQL statement to look something
> > like this:
> > select
> > T.*
> > from
> > transactions T,
> > (select DISTINCT
> > ID
> > from
> > temp_table) TT
> > where
> > T.tx_date between date1 and date2
> > and T.status = 'AAA'
> > AND t.ID=TT.ID(+)
> > AND TT.ID IS NULL;
>
> > Charles Hooper
> > Co-author of "Expert Oracle Practices: Oracle Database Administration
> > from the Oak Table"
> >http://hoopercharles.wordpress.com/
> > IT Manager/Oracle DBA
> > K&M Machine-Fabricating, Inc.
>
> It would be however not equivalent to the original query ( in terms of
> returned resultset, that execution plan will differ - it is obvious). By
> anti join you would return the rows from TRANSACTIONS table, which *may*
> contain ID=NULL. By semijoin (as provided by OP) no NULLs can be
> returned. That is why, if both, TRANSACTION.ID and TRANSACTIONS.ID are
> known to optimizer as NOT NULL, it may be able to rewrite semijoin to
> antijoin on its own (as shown by Lothar). Here is small testcase on
> 10.2.0.4, but, i think, on 9.2.0.8 it would look similar.
> Of course, not null constraints can be used by optimizer equally as
> specifying not null conditions...
>
> SQL> set echo on autot off feed off
> SQL> create table transactions(id number,tx_date date,status varchar2(6));
> SQL> create table transaction(id number,tx_date date,status varchar2(6));
> SQL> insert into transaction values(1,trunc(sysdate),'AAA');
> SQL> insert into transaction values(2,trunc(sysdate),'AAA');
> SQL> insert into transaction values(null,trunc(sysdate),'AAA');
> SQL> insert into transactions values(1,trunc(sysdate),'AAA');
> SQL> insert into transactions values(3,trunc(sysdate),'AAA');
> SQL> insert into transactions values(null,trunc(sysdate),'AAA');
Maxim,
As usual, nice demonstration.
This is from 11.2.0.1:
select *
from transactions
where tx_date between sysdate -1 and sysdate +1
and status = 'AAA'
and id is not null
and id not in (select id
from transaction
where tx_date between sysdate -1 and sysdate +1
and id is not null
and status = 'AAA')
SQL_ID curjz9ggzrgvt, child number 0
select * from transactions where tx_date between sysdate -1 and sysdate +1 and status = 'AAA' and id is not null and id not in (select id from transaction where tx_date between sysdate -1 and sysdate +1 and id is not null andstatus = 'AAA');
Plan hash value: 2554144648
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
| 0 | SELECT STATEMENT | | 1 | | 1 | 00:00:00.01 | 14 | | | | |* 1 | FILTER | | 1 | | 1 | 00:00:00.01 | 14 | | | | |* 2 | HASH JOIN ANTI | | 1 | 1 | 1 | 00:00:00.01 | 14 | 1063K| 1063K| 542K (0)| |* 3 | TABLE ACCESS FULL| TRANSACTIONS | 1 | 2 | 2 | 00:00:00.01 | 7 | | | | |* 4 | TABLE ACCESS FULL| TRANSACTION | 1 | 2 | 2 | 00:00:00.01 | 7 | | | | -------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
1 - filter(SYSDATE_at_!-1<=SYSDATE@!+1) 2 - access("ID"="ID") 3 - filter(("ID" IS NOT NULL AND "TX_DATE">=SYSDATE_at_!-1 AND"TX_DATE"<=SYSDATE_at_!+1 AND "STATUS"='AAA'))
4 - filter(("ID" IS NOT NULL AND "TX_DATE">=SYSDATE_at_!-1 AND "TX_DATE"<=SYSDATE_at_!+1 AND "STATUS"='AAA'))
-- SELECT * FROM TRANSACTIONS TS, (SELECT DISTINCT ID FROM TRANSACTION) T WHERE TS.ID=T.ID(+) AND T.ID IS NULL; SQL_ID 1tpkf3uawhgbb, child number 0 ------------------------------------- SELECT * FROM TRANSACTIONS TS, (SELECT DISTINCT ID FROM TRANSACTION) T WHERE TS.ID=T.ID(+) AND T.ID IS NULL Plan hash value: 4011681805 --------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A- Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | --------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 2 |00:00:00.01 | 14 | | | | | 1 | VIEW | VM_NWVW_1 | 1 | 1 | 2 |00:00:00.01 | 14 | | | | | 2 | HASH UNIQUE | | 1 | 1 | 2 |00:00:00.01 | 14 | 915K| 915K| 611K (0)| |* 3 | FILTER | | 1 | | 2 |00:00:00.01 | 14 | | | | |* 4 | HASH JOIN OUTER | | 1 | 1 | 3 |00:00:00.01 | 14 | 947K| 947K| 548K (0)| | 5 | TABLE ACCESS FULL| TRANSACTIONS | 1 | 3 | 3 |00:00:00.01 | 7 | | | | | 6 | TABLE ACCESS FULL| TRANSACTION | 1 | 3 | 3 |00:00:00.01 | 7 | | | | --------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("ID" IS NULL) 4 - access("TS"."ID"="ID") -- select * from transactions where tx_date between sysdate -1 and sysdate +1 and status = 'AAA' and id not in (select id from transaction where tx_date between sysdate -1 and sysdate +1 and status = 'AAA'); SQL_ID 2ydut2u6dspa0, child number 0 ------------------------------------- select * from transactions where tx_date between sysdate -1 and sysdate +1 and status = 'AAA' and id not in (select id from transaction where tx_date between sysdate -1 and sysdate +1 and status = 'AAA') Plan hash value: 1245405710 ------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 0 | 00:00:00.01 | 12 | | | | |* 1 | FILTER | | 1 | | 0 | 00:00:00.01 | 12 | | | | |* 2 | HASH JOIN ANTI NA | | 1 | 2 | 0 | 00:00:00.01 | 12 | 1079K| 1079K| 553K (0)| |* 3 | TABLE ACCESS FULL| TRANSACTIONS | 1 | 3 | 3 | 00:00:00.01 | 7 | | | | |* 4 | TABLE ACCESS FULL| TRANSACTION | 1 | 3 | 3 | 00:00:00.01 | 5 | | | | ------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(SYSDATE_at_!-1<=SYSDATE@!+1) 2 - access("ID"="ID") 3 - filter(("TX_DATE">=SYSDATE_at_!-1 AND "TX_DATE"<=SYSDATE@!+1 AND "STATUS"='AAA')) 4 - filter(("TX_DATE">=SYSDATE_at_!-1 AND "TX_DATE"<=SYSDATE@!+1 AND "STATUS"='AAA')) True, if NULL values are permitted, the OP could see different results with the different SQL statements. One item that stands out in the last of the above execution plans is the null aware hash join anti operation, which was introduced in Oracle 11.1.0.6. This makes should make a high performance increase when for large data sets that permit NULL values. Joel, I started using the outer join and NULL syntax back when I started encountering performance problems with NOT IN type queries in Oracle 8.0.5. I usually do not give much thought to potential NULLs permitted in the join columns on the left side of a left outer join - I know that those rows that are NULL will be excluded automatically, as they would in a typical inner join. I saw an example of this technique somewhere, but do not recall where (it might have been on AskTom). Prior to Oracle 9i, Oracle would not automatically transform the NOT IN syntax into a more efficient form, so the outer join syntax was almost always 10, 100, possibly 1000 times faster than the NOT IN syntax when I tested both query forms back then. Since that time the optimizer has gained the ability to automatically rewrite SQL statements (as seen in a 10053 trace file) like these into more efficient equivalent forms, but the potential presence of NULL values limits the optimization possibilities. Charles Hooper Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table" http://hoopercharles.wordpress.com/ IT Manager/Oracle DBA K&M Machine-Fabricating, Inc.Received on Fri Mar 12 2010 - 14:54:41 CST