Re: Is Richard Niemiec Right

From: Daniel Morgan <damorgan_at_x.washington.edu>
Date: Sat, 20 Sep 2003 21:52:09 -0700
Message-ID: <1064119922.65845@yasure>

Ryan wrote:

>"Daniel Morgan" <damorgan_at_x.washington.edu> wrote in message
>news:1064114777.554698_at_yasure...
>
>
>>joe bayer wrote:
>>
>>
>>
>>>In Richard Niemiec's book "Oracle 9i Performance Tuning, Tips &
>>>Techniques", Chapter 9, advanced SQL tuning, he mentioned about selecting
>>>the driving table for table joins, he mentioned
>>>1) we should put most selective condtion in the where clause to eliminate
>>>the most unqualified rows
>>>2) in the three table join situation, we should put the table which join
>>>with the other two tables as the driving table, and put the driving table
>>>
>>>
>as
>
>
>>>first table after the from clause.
>>>
>>>Is this statement true?
>>>
>>>By testing, I found that changing the table position in the from clause
>>>
>>>
>or
>
>
>>>changing the position of join conditions at where clause does not affect
>>>optimizer at all, if you do not put ORDERED, or RULE hint.
>>>
>>>My understanding is, optimizer will calculate all the possibility of all
>>>possible join conditions up to 80,000 or something like that, so it
>>>
>>>
>should
>
>
>>>not matter where you put your table, or which join conditions first,
>>>
>>>
>unless
>
>
>>>it has more than 10 table joins.
>>>
>>>You imput is highly appreciated.
>>>
>>>
>>>
>>>
>>>
>>Not true with the single caveat that your statement doesn't look
>>something like this:
>>
>>SELECT /*+ RULE */ ...
>>FROM
>>WHERE
>>
>>We are lucky that Oracle has been around for a long time and is a mature
>>product. But like all good things there is a negative
>>side too. We are steeped in mythology that based on what may have been
>>valid in some murky and distant past but are no
>>longer relevant.
>>
>>Among my favorite examples of this are:
>>
>>"When you use a group by clause in your query, all the nongroup
>>expressions in the column clause of the query must
>>appear before the grouped expression in the column clause."
>>
>>and
>>
>>"Other types of views that you will learn about in Chapter 7 do not
>>support the use of the order by clause."
>>
>>Source: OCP Introduction to Oracle 9i: SQL Exam Guide, pages 128 and 163
>>respectively
>>
>>Though this, from the same book (page 322), isn't far behind:
>>
>>In Oracle, indexes can be created on any column in a table except for
>>columns of the LONG datatype.
>>Apparently the author and reviewers never heard of ORA-02327.
>>
>>My point here is not to claim superiority over the author and reviewers.
>>Likely they know a lot more than I do. But rather
>>to point out that you don't read something in a book and just implement
>>it. You read it in a book ... and test it extensively.
>>
>>And if you think EXISTS is always faster than IN or MINUS ... I've got
>>some SQL statements I'd like to sell you. ;-)
>>
>>--
>>Daniel Morgan
>>http://www.outreach.washington.edu/ext/certificates/oad/oad_crs.asp
>>http://www.outreach.washington.edu/ext/certificates/aoa/aoa_crs.asp
>>damorgan_at_x.washington.edu
>>(replace 'x' with a 'u' to reply)
>>
>>
>>
>when is it faster to use minus? I have found that in is faster if the
>sub-query is less 'costly' than the outer query by a considerable margin. I
>dont mean costly by the 'cost' indicator though. Less rows, doesnt have a
>group by, etc... less work to do.
>
>what about minus?
>
>
>

I'm not trying to be arrogant here ... but it is faster when it is faster. There are no, well almost no, rule that can be blindly followed: Here are a few SQL statements to chew on. Assume a record ratio of 1:10 between the tables servers and serv_inst.

-T1
SELECT srvr_id
FROM servers
INTERSECT
SELECT srvr_id
FROM serv_inst;

SELECT STATEMENT Cost = 8

-T2
SELECT srvr_id
FROM servers
WHERE srvr_id IN (

SELECT srvr_id
FROM serv_inst);

SELECT STATEMENT Cost = 4

-T3
SELECT srvr_id
FROM servers
WHERE srvr_id IN (

   SELECT i.srvr_id
   FROM serv_inst i, servers s
   WHERE i.srvr_id = s.srvr_id);

SELECT STATEMENT Cost = 5

-T4
SELECT DISTINCT s.srvr_id
FROM servers s, serv_inst i
WHERE s.srvr_id = i.srvr_id;

SELECT STATEMENT Cost = 5

-T5
SELECT DISTINCT srvr_id
FROM servers
WHERE srvr_id NOT IN (

   SELECT srvr_id
   FROM servers
   MINUS
   SELECT srvr_id
   FROM serv_inst);

SELECT STATEMENT Cost = 3

-T6
SELECT srvr_id
FROM servers s
WHERE EXISTS (
   SELECT srvr_id
   FROM serv_inst i
   WHERE s.srvr_id = i.srvr_id);

SELECT STATEMENT Cost = 4

This is part of the classroom demo for explain plan in my class and each of these six statements returns the exact same result set. And yes I know enough about EXPLAIN PLAN and 'Cost' to understand its strengths and weaknesses so lets not rehash that here and stray from the point.

The above was run on 9.2.0.4 with W2K. The results are not the same if performed on the same machine, with the same data, also with a freshly run DBMS_STATS using 8.1.7.4. With 8.1.7.4 the ordering is different.

Thus my statement ... that no one can make a blanket statement and expect it to be right all of the time. There are places to start that are more likely to be right than others. But that is all. Just like in quantum mechanics ... it is a question of probabilities.

-- 
Daniel Morgan
http://www.outreach.washington.edu/ext/certificates/oad/oad_crs.asp
http://www.outreach.washington.edu/ext/certificates/aoa/aoa_crs.asp
damorgan_at_x.washington.edu
(replace 'x' with a 'u' to reply)

Received on Sat Sep 20 2003 - 23:52:09 CDT