Re: Oracle Optimizer versus the Logical OR
Date: Fri, 18 Sep 1992 05:10:23 GMT
Message-ID: <1992Sep18.051023.26964_at_RedBrick.COM>
In article <1992Sep17.004345.12002_at_nb.rockwell.com> usenet%dny.rockwell.com (Usenet Account) writes:
>We are at the point where we have loaded real data into the Oracle database I
>built -- 40,000 rows in one table and 80,000 in another. It is at this
>juncture that indexing is essential for speedy retrieval.
>
>However, it has become apparent that Oracle V6.0's optimizer *BYPASSES* the
>indexes and does a table scan (!) when it encounters a logical OR combined
>with any of the following items in the WHERE clause: Outer join, subselect,
>and possibly others. CONNECT BY and != and NOT xxx also seem to induce
>the optimizer to eschew the indexes which I so carefully put in place for this
>purpose! This of course applies to the "IN (item,item,item)" clause too, since
>this is equivalent to a string of OR's.
This is truly interesting. While it is true that IN is equivalent to a string of ORs, the converse doesn't hold. Therefore, it is important to understand, in query optimization, whether the list of ORed expressions create the equivalent of an IN, or are ORs between mismatched fields or expressions. An OR between two expressions on two different columns is very different than an OR between two expressions on the same column.
In most query systems, IN is handled like a bunch of direct lookups on the column in question using its index (assuming it is a simple IN with no other complex expressions involved). This is also subject to whether the optimizer decides whether to just do a relation scan because it is faster than using the index (you would be surprised how often this is true). A *good* optimizer will also detect a series of ORs on a single column looking for "equal" comparisons to single values and convert it into an IN lookup. A *poor* optimizer will not notice this and just brute force the OR. IBM's DB2 originally handled IN very well but even simple ORs poorly (X IN(3,7) would be fast while X=3 OR X=7 would be horribly slow). DB2 V2R1 and beyond fixed most of these problems and turns "reasonable" ORs into "sargable" predicates which can use indexes directly (if appropriate).
Note that X=3 OR Y=7 is quite different. This results in the union
of the two result sets and there can be rows that match either one or
both of the predicates (in X=3 OR X=7, a row can match only one of
the predicates). Most systems don't do this very well, and in fact,m
a relation scan is often the best thing to do.
>
>My solution was to UNION together SELECT statements. Example:
>
>select eq.make, eq.serialno, emp.lastname
>from equipment eq, employee_roster emp
>where eq.ssn = emp.ssn(+)
>and (eq.make = 'HEWLETT PACKARD' or eq.make = 'BECHTEL');
>
>The foregoing results in a table scan, although "ssn" and "make" are indexed.
>
>select eq.make, eq.serialno, emp.lastname
>from equipment eq, employee_roster emp
>where eq.ssn = emp.ssn(+)
>and (eq.make = 'HEWLETT PACKARD')
>UNION
>select eq.make, eq.serialno, emp.lastname
>from equipment eq, employee_roster emp
>where eq.ssn = emp.ssn(+)
>and (eq.make = 'BECHTEL');
>
>The foregoing USES the indexes.
>
>In conclusion:
>
>"OR" is an awfully common thing to put in a SQL query -- any suggestions
>on other workarounds -- or better yet, a non-kludge way to make Oracle
>use the indexes, even against its will ?!?!?
I believe that Oracle 7 will do a better job on this (at least Oracle claims to have improved its optimization techniques). If indeed Oracle 6 does NOT use the index on an IN, I would be very surprised (this would make their optimizer much poorer than even I thought it was). And remember, using indexes isn't always the best thing anyway, although in the above cases, I would bet it was as long as the tables are reasonably big.
Jon Rosen Received on Fri Sep 18 1992 - 07:10:23 CEST