Re: Avoid multiple scan of the same table
Date: Mon, 25 Aug 2008 11:08:33 -0700 (PDT)
Message-ID: <f54c8daa-5295-4c50-b5be-6192ea77cad4@y38g2000hsy.googlegroups.com>
On Aug 22, 4:11 pm, mak..._at_gmail.com wrote:
> On Aug 22, 2:08 pm, t..._at_panix.com (Dan Blum) wrote:
>
>
>
>
>
> > DA Morgan <damor..._at_psoug.org> wrote:
> > > Dan Blum wrote:
> > > > mak..._at_gmail.com wrote:
> > > >> On Aug 21, 7:08?pm, DA Morgan <damor..._at_psoug.org> wrote:
> > > >>> mak..._at_gmail.com wrote:
> > > >>>> Is there way to avoid multiple scan of the same table in following?
> > > >>>> Select c1 as v1, c2 as v2, c3 as v3
> > > >>>> From t1
> > > >>>> Union
> > > >>>> Select c1 as v1, c2 as v2, c4 as v3
> > > >>>> From t1
> > > >>>> Where ?c5 < 5000
> > > >>>> The output should look like..
> > > >>>> v1, v2, v3
> > > >>>> v11, v21, v31
> > > >>>> v12, v22, v32
> > > >>>> and so on.
> > > >>>> Rows from the both queries can be same so UNION is required to avoid
> > > >>>> duplicates in current construct.
> > > >>> One way would be to use a WITH CLAUSE query.http://www.psoug.org/reference/with.html
> > > >>> --
> > > >>> Daniel A. Morgan
> > > >>> Oracle Ace Director & Instructor
> > > >>> University of Washington
> > > >>> damor..._at_x.washington.edu (replace x with u to respond)
> > > >>> Puget Sound Oracle Users Groupwww.psoug.org-Hidequoted text -
>
> > > >>> - Show quoted text -
>
> > > >> Pls. explain with above example.. I have given the problem query with
> > > >> expected output..
>
> > > > At some point reading the relevant documentation is a good idea... however:
>
> > > > with x as (select c1, c2, c3, c4, c5 from t1)
> > > > select c1 v1, c2 v2 c3 v3 from x
> > > > union
> > > > select c1 v1, c2 v2, c4 v3 from x where c5 < 5000;
>
> > > > is one way of doing it.
>
> > > > If the table is large this is still not going to perform very well because
> > > > of the work to remove duplicate rows. Given more information about the table,
> > > > a better statement could probably be written.
> > > One way to avoid (not remove) duplicate rows would be to put a WHERE
> > > clause on the first statement. Making each set exclusive.
> > > with x as (select c1, c2, c3, c4, c5 from t1)
> > > select c1 v1, c2 v2 c3 v3 from x
> > > where c5 >= 5000;
> > > union
> > > select c1 v1, c2 v2, c4 v3 from x
> > > where c5 < 5000;
>
> > Except that if c3 != c4 then we always want the c3 row, regardless of the value of
> > c5. A better option would be
>
> > with x as (select c1, c2, c3, c4, c5 from t1)
> > select c1 v1, c2 v2, c3 v3 from x
> > union
> > select c1 v1, c2 v2, c4 v3 from x
> > where c5 < 5000
> > and c3 != c4;
>
> > That ensures that nothing in the bottom set matches anything in the top set,
> > although of course each set could have duplicates within it as far as we know.
> > If anything there (c1 or c2 or the combination) is a unique key then in this
> > version we can replace UNION with UNION ALL. If not, it might still be
> > faster to use UNION ALL and put a DISTINCT (or GROUP BY) on each query - unless
> > one set is much larger than the other, it will almost certainly be faster.
>
> > --
> > _______________________________________________________________________
> > Dan Blum t..._at_panix.com
> > "I wouldn't have believed it myself if I hadn't just made it up."- Hide quoted text -
>
> > - Show quoted text -
>
> Folks,
>
> I'm still not avoiding multiple scan of the table.. With "With
> Clause" , I believe Oracle creates the temp table and then use that
> table later in the query.. I'm now scanning temp table multiple time
> than the original table.. Aint' I? The table in question is not that
> wide so oracle may also decide not to materilialize to with query as
> creating temp table may be more expensive...
>
> Thanks- Hide quoted text -
>
> - Show quoted text -
Plans for the submitted queries, including hinted versions of the 'with' offerings:
SQL> create table t1(
2 c1 number, 3 c2 number, 4 c3 number, 5 c4 number, 6 c5 number
7 );
Table created.
SQL>
SQL> begin
2 for i in 1..10000 loop 3 insert into t1 4 values(i, mod(i,2),mod(i,3), mod(i,4), mod(i,7)*5000); 5 end loop;
6 end;
7 /
PL/SQL procedure successfully completed.
SQL>
SQL> commit;
Commit complete.
SQL>
SQL> exec dbms_stats.gather_table_stats(ownname=>'BING',
tabname=>'T1', cascade=>true, estimate_percent=>100);
PL/SQL procedure successfully completed.
SQL> SQL> set autotrace on linesize 200 trimspool on SQL> SQL> Select c1 as v1, c2 as v2, c3 as v32 From t1
3 Union
4 Select c1 as v1, c2 as v2, c4 as v3 5 From t1
6 Where c5 < 5000;
V1 V2 V3
---------- ---------- ----------
1 1 1 2 0 2 3 1 0 4 0 1 5 1 2 6 0 0 7 1 1 7 1 3 8 0 2 9 1 0 10 0 1 ... V1 V2 V3 ---------- ---------- ---------- 9996 0 0 9997 1 1 9998 0 2 9999 1 0 10000 0 1
11071 rows selected.
Execution Plan
Plan hash value: 3979283944
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 11667 | 120K| | 65 (24)| 00:00:01 | | 1 | SORT UNIQUE | | 11667 | 120K| 408K| 65 (24)| 00:00:01 | | 2 | UNION-ALL | | | | | | | | 3 | TABLE ACCESS FULL| T1 | 10000 | 97K| | 9 (12)| 00:00:01 | |* 4 | TABLE ACCESS FULL| T1 | 1667 | 23338 | | 9(12)| 00:00:01 |
Predicate Information (identified by operation id):
4 - filter("C5"<5000)
Statistics
1 recursive calls 0 db block gets 62 consistent gets 0 physical reads 0 redo size 152667 bytes sent via SQL*Net to client 5412 bytes received via SQL*Net from client 740 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 11071 rows processed
SQL>
SQL> with x as (select c1, c2, c3, c4, c5 from t1)
2 select c1 v1, c2 v2, c3 v3 from x
3 union
4 select c1 v1, c2 v2, c4 v3 from x where c5 < 5000;
V1 V2 V3
---------- ---------- ----------
1 1 1 2 0 2 3 1 0 4 0 1 5 1 2 6 0 0 7 1 1 7 1 3 8 0 2 9 1 0 10 0 1 ... V1 V2 V3 ---------- ---------- ---------- 9996 0 0 9997 1 1 9998 0 2 9999 1 0 10000 0 1
11071 rows selected.
Execution Plan
Plan hash value: 3979283944
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 11667 | 120K| | 65 (24)| 00:00:01 | | 1 | SORT UNIQUE | | 11667 | 120K| 408K| 65 (24)| 00:00:01 | | 2 | UNION-ALL | | | | | | | | 3 | TABLE ACCESS FULL| T1 | 10000 | 97K| | 9 (12)| 00:00:01 | |* 4 | TABLE ACCESS FULL| T1 | 1667 | 23338 | | 9(12)| 00:00:01 |
Predicate Information (identified by operation id):
4 - filter("C5"<5000)
Statistics
1 recursive calls 0 db block gets 62 consistent gets 0 physical reads 0 redo size 152668 bytes sent via SQL*Net to client 5412 bytes received via SQL*Net from client 740 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 11071 rows processed
SQL>
SQL> with x as (select c1, c2, c3, c4, c5 from t1)
2 select c1 v1, c2 v2, c3 v3 from x
3 where c5 >= 5000
4 union
5 select c1 v1, c2 v2, c4 v3 from x
6 where c5 < 5000;
V1 V2 V3
---------- ---------- ----------
1 1 1 2 0 2 3 1 0 4 0 1 5 1 2 6 0 0 7 1 3 8 0 2 9 1 0 10 0 1 11 1 2 ... V1 V2 V3 ---------- ---------- ---------- 10000 0 1
10000 rows selected.
Execution Plan
Plan hash value: 3979283944
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 11429 | 156K| | 73 (21)| 00:00:01 | | 1 | SORT UNIQUE | | 11429 | 156K| 552K| 73 (21)| 00:00:01 | | 2 | UNION-ALL | | | | | | | |* 3 | TABLE ACCESS FULL| T1 | 9762 | 133K| | 9 (12)| 00:00:01 | |* 4 | TABLE ACCESS FULL| T1 | 1667 | 23338 | | 9(12)| 00:00:01 |
Predicate Information (identified by operation id):
3 - filter("C5">=5000)
4 - filter("C5"<5000)
Statistics
1 recursive calls 0 db block gets 62 consistent gets 0 physical reads 0 redo size 137868 bytes sent via SQL*Net to client 4908 bytes received via SQL*Net from client 668 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 10000 rows processed
SQL>
SQL> with x as (select /*+ materialize */ c1, c2, c3, c4, c5 from t1)
2 select c1 v1, c2 v2, c3 v3 from x
3 union
4 select c1 v1, c2 v2, c4 v3 from x where c5 < 5000;
V1 V2 V3
---------- ---------- ----------
1 1 1 2 0 2 3 1 0 4 0 1 5 1 2 6 0 0 7 1 1 7 1 3 8 0 2 9 1 0 10 0 1 ... V1 V2 V3 ---------- ---------- ---------- 9996 0 0 9997 1 1 9998 0 2 9999 1 0 10000 0 1
11071 rows selected.
Execution Plan
Plan hash value: 3457872322
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 20000 | 888K| | 255 (59)| 00:00:04 | | 1 | TEMP TABLE TRANSFORMATION | | | | | | | | 2 | LOAD AS SELECT | | | | | | | | 3 | TABLE ACCESS FULL | T1 | 10000 | 146K| | 9 (12)| 00:00:01 | | 4 | SORT UNIQUE | | 20000 | 888K| 2304K| 255 (59)| 00:00:04 | | 5 | UNION-ALL | | | | | | | | 6 | VIEW | | 10000 | 380K| | 7 (15)| 00:00:01 | | 7 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6611_1FB4B03D | 10000 | 146K| | 7 (15)| 00:00:01 | |* 8 | VIEW | | 10000 | 507K| | 7 (15)| 00:00:01 | | 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6611_1FB4B03D | 10000 | 146K| | 7 (15)| 00:00:01 | ------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
8 - filter("C5"<5000)
Statistics
160 recursive calls 37 db block gets 110 consistent gets 28 physical reads 1456 redo size 146747 bytes sent via SQL*Net to client 5412 bytes received via SQL*Net from client 740 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 11071 rows processed
SQL>
SQL> with x as (select /*+ materialize */ c1, c2, c3, c4, c5 from t1)
2 select c1 v1, c2 v2, c3 v3 from x
3 where c5 >= 5000
4 union
5 select c1 v1, c2 v2, c4 v3 from x
6 where c5 < 5000;
V1 V2 V3
---------- ---------- ----------
1 1 1 2 0 2 3 1 0 4 0 1 5 1 2 6 0 0 7 1 3 8 0 2 9 1 0 10 0 1 11 1 2 ... V1 V2 V3 ---------- ---------- ---------- 10000 0 1
10000 rows selected.
Execution Plan
Plan hash value: 2974447107
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 20000 | 1015K| | 286 (52)| 00:00:04 | | 1 | TEMP TABLE TRANSFORMATION | | | | | | | | 2 | LOAD AS SELECT | | | | | | | | 3 | TABLE ACCESS FULL | T1 | 10000 | 146K| | 9 (12)| 00:00:01 | | 4 | SORT UNIQUE | | 20000 | 1015K| 2544K| 286 (52)| 00:00:04 | | 5 | UNION-ALL | | | | | | | |* 6 | VIEW | | 10000 | 507K| | 7 (15)| 00:00:01 | | 7 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6613_1FB4B03D | 10000 | 146K| | 7 (15)| 00:00:01 | |* 8 | VIEW | | 10000 | 507K| | 7 (15)| 00:00:01 | | 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6613_1FB4B03D | 10000 | 146K| | 7 (15)| 00:00:01 | ------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
6 - filter("C5">=5000)
8 - filter("C5"<5000)
Statistics
160 recursive calls 37 db block gets 110 consistent gets 28 physical reads 1456 redo size 132523 bytes sent via SQL*Net to client 4908 bytes received via SQL*Net from client 668 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 10000 rows processed
SQL>
SQL> select
2 c1 as v1,
3 c2 as v2,
4 case when id = 1 then c3 else c4 end as v3
5 from t1, 6 (select 1 id from dual union all select 2 id from dual) dupl 7 where dupl.id = 1 or 8 (dupl.id = 2 and t1.c5 < 5000 and t1.c3 <> t1.c4); V1 V2 V3 ---------- ---------- ---------- 730 0 1 731 1 2 732 0 0 733 1 1 734 0 2 735 1 0 736 0 1 737 1 2 738 0 0 739 1 1 740 0 2 ... V1 V2 V3 ---------- ---------- ---------- 8288 0 0 8295 1 3 8302 0 2 8309 1 1 8323 1 3
11071 rows selected.
Execution Plan
Plan hash value: 2909109448
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 225 | 4050 | 20 (10)| 00:00:01 | | 1 | NESTED LOOPS | | 225 | 4050 | 20 (10)| 00:00:01 | | 2 | VIEW | | 2 | 6 | 4 (0)| 00:00:01 | | 3 | UNION-ALL | | | | | | | 4 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | |* 6 | TABLE ACCESS FULL| T1 | 112 | 1680 | 8 (13)|00:00:01 |
Predicate Information (identified by operation id):
6 - filter("DUPL"."ID"=1 OR "DUPL"."ID"=2 AND "T1"."C5"<5000 AND
"T1"."C3"<>"T1"."C4")
Statistics
1 recursive calls 0 db block gets 800 consistent gets 0 physical reads 0 redo size 152676 bytes sent via SQL*Net to client 5412 bytes received via SQL*Net from client 740 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 11071 rows processed
SQL> Draw your own conclusions.
David Fitzjarrell Received on Mon Aug 25 2008 - 13:08:33 CDT