Re: performance difference between UNION operator and OR in where clause Options

From: Mike <gongweigang_at_gmail.com>
Date: Wed, 11 Jul 2007 07:49:33 -0700
Message-ID: <1184165373.980289.256250_at_d55g2000hsg.googlegroups.com>


On Jul 11, 10:20 am, "Roy Hann" <specia..._at_processed.almost.meat> wrote:
> "Mike" <gongweig..._at_gmail.com> wrote in message
>
> news:1184160105.785774.35920_at_w3g2000hsg.googlegroups.com...
>
>
>
>
>
> > tables:
> > 1. publication(pubid, title)
> > 2. book(pubid, date)
> > 3. journal(pubid, date)
>
> > purpose: find all publication titles for books or journals.
>
> > precondition: both book and journal tables are not empty. (if either
> > is empty, the following solution 2 will be wrong. See my old post
> > "difference between UNION operator and OR in where clause Options" for
> > details).
>
> > Solution 1: use set "union"
>
> > select title
> > from publication, ( (select pubid from book) UNION (select pubid from
> > journal)) bj
> > where publication.pubid = bj.pubid
>
> > solution 2: use "or" operator in the where clause
>
> > select DISTINCE title
> > from publication, book, journal
> > where publication.pubid = book.pubid or publication.pubid =
> > journal.pubid
>
> > Both solutions are correct under the precondition.
>
> > My question is which one should use for better performance ?
>
> > My intuitive guess will be solution 2.
>
> > Any comments ?
>
> In general there is no way to know. Any reasonable SQL DBMS should be free
> to do as it pleases as long as the answer comes out right. Intuition is a
> really bad guide. When it comes to SQL, intuitions are very frequently
> wrong. (For instance, are you confident about when you will break-even
> doing a table scan with readahead, or how parallel execution will play out?)
>
> The obvious corollories are: (1) if you are using an SQL DBMS which requires
> you to care/know about which of several equivalent formulations is "optimal"
> you should start looking for a better one, and (2) if your DBMS is too
> stupid to recognize equivalent formulations today so that *you* have to
> choose the best one, there is no reason to suppose a future release won't
> change so that another formulation is more optimal.
>
> Roy- Hide quoted text -
>
> - Show quoted text -

You are right that there is no way for sure to know which solution performs better during execution.

Since I am new to database programming, what I would like to see is any guide lines (or lessons, or experience) when picking between a "UNION" operator and an "OR" operator. For example, suppose you face such situation, which one you will choose ?

My experience told me that a good programmer will always try to use a better algorithm or solution, instead of relying on the optimization during runtime.

One simple example is :

loop1:



boolean a;
for(int i = 0; i < 1000; i++) {

   a = false;
   printf("%d\n",i);
}


loop2:



boolean a;
a = false;
for(int i = 0; i < 1000; i++) {

   printf("%d\n",i);
}

A good programmer should always pick loop2, instead of loop1. Though in the runtime, they might be same if the optimization does the right thing. Received on Wed Jul 11 2007 - 16:49:33 CEST

Original text of this message