Re: Avoid multiple scan of the same table

From: Dan Blum <tool_at_panix.com>
Date: Fri, 22 Aug 2008 18:08:57 +0000 (UTC)
Message-ID: <g8mvbp$g13$1@reader1.panix.com>


DA Morgan <damorgan_at_psoug.org> wrote:
> Dan Blum wrote:
> > maks71_at_gmail.com wrote:
> >> On Aug 21, 7:08?pm, DA Morgan <damor..._at_psoug.org> wrote:
> >>> mak..._at_gmail.com wrote:
> >>>> Is there way to avoid multiple scan of the same table in following?
> >>>> Select c1 as v1, c2 as v2, c3 as v3
> >>>> From t1
> >>>> Union
> >>>> Select c1 as v1, c2 as v2, c4 as v3
> >>>> From t1
> >>>> Where ?c5 < 5000
> >>>> The output should look like..
> >>>> v1, v2, v3
> >>>> v11, v21, v31
> >>>> v12, v22, v32
> >>>> and so on.
> >>>> Rows from the both queries can be same so UNION is required to avoid
> >>>> duplicates in current construct.
> >>> One way would be to use a WITH CLAUSE query.http://www.psoug.org/reference/with.html
> >>> --
> >>> Daniel A. Morgan
> >>> Oracle Ace Director & Instructor
> >>> University of Washington
> >>> damor..._at_x.washington.edu (replace x with u to respond)
> >>> Puget Sound Oracle Users Groupwww.psoug.org- Hide quoted text -
> >>>
> >>> - Show quoted text -
> >
> >> Pls. explain with above example.. I have given the problem query with
> >> expected output..
> >
> > At some point reading the relevant documentation is a good idea... however:
> >
> > with x as (select c1, c2, c3, c4, c5 from t1)
> > select c1 v1, c2 v2 c3 v3 from x
> > union
> > select c1 v1, c2 v2, c4 v3 from x where c5 < 5000;
> >
> > is one way of doing it.
> >
> > If the table is large this is still not going to perform very well because
> > of the work to remove duplicate rows. Given more information about the table,
> > a better statement could probably be written.

> One way to avoid (not remove) duplicate rows would be to put a WHERE
> clause on the first statement. Making each set exclusive.

> with x as (select c1, c2, c3, c4, c5 from t1)
> select c1 v1, c2 v2 c3 v3 from x
> where c5 >= 5000;
> union
> select c1 v1, c2 v2, c4 v3 from x
> where c5 < 5000;

Except that if c3 != c4 then we always want the c3 row, regardless of the value of c5. A better option would be

with x as (select c1, c2, c3, c4, c5 from t1) select c1 v1, c2 v2, c3 v3 from x
union
select c1 v1, c2 v2, c4 v3 from x
where c5 < 5000
  and c3 != c4;

That ensures that nothing in the bottom set matches anything in the top set, although of course each set could have duplicates within it as far as we know. If anything there (c1 or c2 or the combination) is a unique key then in this version we can replace UNION with UNION ALL. If not, it might still be faster to use UNION ALL and put a DISTINCT (or GROUP BY) on each query - unless one set is much larger than the other, it will almost certainly be faster.

-- 
_______________________________________________________________________
Dan Blum					         tool_at_panix.com	
"I wouldn't have believed it myself if I hadn't just made it up."
Received on Fri Aug 22 2008 - 13:08:57 CDT

Original text of this message