Re: Why is this query slow?

From: Charles Hooper <hooperc2000_at_yahoo.com>
Date: Fri, 12 Mar 2010 12:54:41 -0800 (PST)
Message-ID: <bcf21f0e-359f-401a-994c-6da9c1c70d87_at_c16g2000yqd.googlegroups.com>



On Mar 12, 2:26 pm, Maxim Demenko <mdeme..._at_gmail.com> wrote:
> On 12.03.2010 14:46, Charles Hooper wrote:
> > On Mar 12, 5:34 am, vsevolod afanassiev
> > <vsevolod.afanass..._at_gmail.com>  wrote:
> > I would be inclined to re-write the SQL statement to look something
> > like this:
> > select
> >    T.*
> > from
> >    transactions T,
> >    (select DISTINCT
> >      ID
> >    from
> >      temp_table) TT
> > where
> >    T.tx_date between date1 and date2
> >    and T.status = 'AAA'
> >    AND t.ID=TT.ID(+)
> >    AND TT.ID IS NULL;
>
> > Charles Hooper
> > Co-author of "Expert Oracle Practices: Oracle Database Administration
> > from the Oak Table"
> >http://hoopercharles.wordpress.com/
> > IT Manager/Oracle DBA
> > K&M Machine-Fabricating, Inc.
>
> It would be however not equivalent to the original query ( in terms of
> returned resultset, that execution plan will differ - it is obvious). By
> anti join you would return the rows from TRANSACTIONS table, which *may*
> contain ID=NULL. By semijoin (as provided by OP) no NULLs can be
> returned. That is why, if both, TRANSACTION.ID and TRANSACTIONS.ID are
> known to optimizer as NOT NULL, it may be able to rewrite semijoin to
> antijoin on its own (as shown by Lothar). Here is small testcase on
> 10.2.0.4, but, i think, on 9.2.0.8 it would look similar.
> Of course, not null constraints can be used by optimizer equally as
> specifying not null conditions...
>
> SQL> set echo on autot off feed off
> SQL> create table transactions(id number,tx_date date,status varchar2(6));
> SQL> create table transaction(id number,tx_date date,status varchar2(6));
> SQL> insert into transaction values(1,trunc(sysdate),'AAA');
> SQL> insert into transaction values(2,trunc(sysdate),'AAA');
> SQL> insert into transaction values(null,trunc(sysdate),'AAA');
> SQL> insert into transactions values(1,trunc(sysdate),'AAA');
> SQL> insert into transactions values(3,trunc(sysdate),'AAA');
> SQL> insert into transactions values(null,trunc(sysdate),'AAA');

Maxim,
As usual, nice demonstration.

This is from 11.2.0.1:
select *

   from transactions
  where tx_date between sysdate -1 and sysdate +1     and status = 'AAA'
  and id is not null
    and id not in (select id
     from transaction
  where tx_date between sysdate -1 and sysdate +1   and id is not null

      and status = 'AAA')

SQL_ID curjz9ggzrgvt, child number 0


select *      from transactions     where tx_date between sysdate -1
and sysdate +1       and status = 'AAA'     and id is not null
and id not in (select id        from transaction     where tx_date
between sysdate -1 and sysdate +1     and id is not null         and
status = 'AAA');

Plan hash value: 2554144648



| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
|   0 | SELECT STATEMENT    |              |      1 |        |      1 |
00:00:00.01 |      14 |       |       |          |
|*  1 |  FILTER             |              |      1 |        |      1 |
00:00:00.01 |      14 |       |       |          |
|*  2 |   HASH JOIN ANTI    |              |      1 |      1 |      1 |
00:00:00.01 |      14 |  1063K|  1063K|  542K (0)|
|*  3 |    TABLE ACCESS FULL| TRANSACTIONS |      1 |      2 |      2 |
00:00:00.01 |       7 |       |       |          |
|*  4 |    TABLE ACCESS FULL| TRANSACTION  |      1 |      2 |      2 |
00:00:00.01 |       7 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


   1 - filter(SYSDATE_at_!-1<=SYSDATE@!+1)
   2 - access("ID"="ID")
   3 - filter(("ID" IS NOT NULL AND "TX_DATE">=SYSDATE_at_!-1 AND
"TX_DATE"<=SYSDATE_at_!+1 AND "STATUS"='AAA'))

   4 - filter(("ID" IS NOT NULL AND "TX_DATE">=SYSDATE_at_!-1 AND "TX_DATE"<=SYSDATE_at_!+1 AND "STATUS"='AAA'))

--

SELECT
  *
FROM
  TRANSACTIONS TS,
  (SELECT DISTINCT
    ID
  FROM
    TRANSACTION) T
WHERE
  TS.ID=T.ID(+)
  AND T.ID IS NULL;

SQL_ID  1tpkf3uawhgbb, child number 0
-------------------------------------
SELECT    *  FROM    TRANSACTIONS TS,    (SELECT DISTINCT      ID
FROM      TRANSACTION) T  WHERE    TS.ID=T.ID(+)    AND T.ID IS NULL

Plan hash value: 4011681805

---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name         | Starts | E-Rows | A-
Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |              |      1 |        |
2 |00:00:00.01 |      14 |       |       |          |
|   1 |  VIEW                 | VM_NWVW_1    |      1 |      1 |
2 |00:00:00.01 |      14 |       |       |          |
|   2 |   HASH UNIQUE         |              |      1 |      1 |
2 |00:00:00.01 |      14 |   915K|   915K|  611K (0)|
|*  3 |    FILTER             |              |      1 |        |
2 |00:00:00.01 |      14 |       |       |          |
|*  4 |     HASH JOIN OUTER   |              |      1 |      1 |
3 |00:00:00.01 |      14 |   947K|   947K|  548K (0)|
|   5 |      TABLE ACCESS FULL| TRANSACTIONS |      1 |      3 |
3 |00:00:00.01 |       7 |       |       |          |
|   6 |      TABLE ACCESS FULL| TRANSACTION  |      1 |      3 |
3 |00:00:00.01 |       7 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("ID" IS NULL)
   4 - access("TS"."ID"="ID")

--

select *
   from transactions
  where tx_date between sysdate -1 and sysdate +1
    and status = 'AAA'
    and id not in (select id
     from transaction
  where tx_date between sysdate -1 and sysdate +1
      and status = 'AAA');

SQL_ID  2ydut2u6dspa0, child number 0
-------------------------------------
select *      from transactions     where tx_date between sysdate -1
and sysdate +1       and status = 'AAA'       and id not in (select
id
      from transaction     where tx_date between sysdate -1 and
sysdate
+1         and status = 'AAA')

Plan hash value: 1245405710

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name         | Starts | E-Rows | A-Rows
|   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |              |      1 |        |      0 |
00:00:00.01 |      12 |       |       |          |
|*  1 |  FILTER             |              |      1 |        |      0 |
00:00:00.01 |      12 |       |       |          |
|*  2 |   HASH JOIN ANTI NA |              |      1 |      2 |      0 |
00:00:00.01 |      12 |  1079K|  1079K|  553K (0)|
|*  3 |    TABLE ACCESS FULL| TRANSACTIONS |      1 |      3 |      3 |
00:00:00.01 |       7 |       |       |          |
|*  4 |    TABLE ACCESS FULL| TRANSACTION  |      1 |      3 |      3 |
00:00:00.01 |       5 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(SYSDATE_at_!-1<=SYSDATE@!+1)
   2 - access("ID"="ID")
   3 - filter(("TX_DATE">=SYSDATE_at_!-1 AND "TX_DATE"<=SYSDATE@!+1 AND
"STATUS"='AAA'))
   4 - filter(("TX_DATE">=SYSDATE_at_!-1 AND "TX_DATE"<=SYSDATE@!+1 AND
"STATUS"='AAA'))

True, if NULL values are permitted, the OP could see different results
with the different SQL statements.

One item that stands out in the last of the above execution plans is
the null aware hash join anti operation, which was introduced in
Oracle 11.1.0.6.  This makes should make a high performance increase
when for large data sets that permit NULL values.

Joel, I started using the outer join and NULL syntax back when I
started encountering performance problems with NOT IN type queries in
Oracle 8.0.5.  I usually do not give much thought to potential NULLs
permitted in the join columns on the left side of a left outer join -
I know that those rows that are NULL will be excluded automatically,
as they would in a typical inner join.  I saw an example of this
technique somewhere, but do not recall where (it might have been on
AskTom).  Prior to Oracle 9i, Oracle would not automatically transform
the NOT IN syntax into a more efficient form, so the outer join syntax
was almost always 10, 100, possibly 1000 times faster than the NOT IN
syntax when I tested both query forms back then.  Since that time the
optimizer has gained the ability to automatically rewrite SQL
statements (as seen in a 10053 trace file) like these into more
efficient equivalent forms, but the potential presence of NULL values
limits the optimization possibilities.

Charles Hooper
Co-author of "Expert Oracle Practices: Oracle Database Administration
from the Oak Table"
http://hoopercharles.wordpress.com/
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.
Received on Fri Mar 12 2010 - 14:54:41 CST

Original text of this message