Re: order of tables in FROM ?

From: <holderm_at_mailexcite.com>
Date: Fri, 01 May 1998 21:58:12 -0600
Message-ID: <6ie244$ci7$1@nnrp1.dejanews.com>

In article <6i8gke$rra$1_at_pebble.ml.org>#1/1, joelga_at_pebble.ml.org (Joel Garry) wrote:
>
> In article <6i7nql$163a_at_fidoii.cc.Lehigh.EDU>,
> Phil R Lawrence <prl2_at_lehigh.edu> wrote:
> >Hey, anybody can write SQL (I do), but I want to move on to a higher plain.
> >
> >1. Is the ordering of tables listed in the FROM statement important for
> >performance?
> 3.1 Ordering of the FROM clause.
>
> This rule is for when the oracle optimiser is
> stuck for a good idea. The Oracle optimiser works
> in the following manner. It looks at each of the
> WHERE clauses and assigns the tables concerned a
> number based on the type of predicate e.g. field =
> 'const' or field(+) = field. It then chooses the
> table with the lowest score as the driving table.
> But, and its a big one, if more than one table has
> the same lowest score then it chooses the last
> table in the FROM list to be the driving table.
>
> >
> >2. Is there a FAQ where I can start to learn what things *do* impact
> >performance?
> >
>
> www.orafaq.org. The above is from selection SQL, then going to "Tina
> Londons'SQL Guide.
>
> Some informative surfing off of
> http://ourworld.compuserve.com/homepages/joel_garry/oracleln.htm
>
> >~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >Phil R Lawrence phone: 610-758-3051
> >Programmer / Analyst e-mail: prl2_at_lehigh.edu
> >194 Lehigh University Computing Center
> >E.W. Fairchild - Martindale, Bldg. 8B
> >Bethlehem, PA 18018
> >~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> >
> >
>
> --
> These opinions are my own and not necessarily those of Information Quest
> jgarry@eiq.com http://www.informationquest.com
> http://ourworld.compuserve.com/homepages/joel_garry
> "See your DBA?" I AM the @#%*& DBA!
>

Mike Holder wrote...

Close but... it depends on a lot of things.

Is you database set up to use RULE based optimization or COST based? Has your friendly DBA run statistics? Does one of the tables have enough information in the WHERE clause to cause Oracle to perform a unique scan on an index or a range scan?

In cost based it is impossible to tell how the optimizer will work because Oracle looks at the statistics and tries to find the least costly access path.

For example, if you have state_cd = 'CO' on a sales_order table with thousands of different customers, and sex = 'F' on the customer table which you are joining to, you would probably get fewer rows on sales_order for Colorado because about half of your customers should be femail, while about 1/50 orders will be from Colorado. So Oracle should return rows from sales_order first, then join to customer and eliminate rows where sex <> 'F'.

And then one day two years later, sales_order has become huge while the customers stay about the same. Say on average each customer has about 50 orders. Now, looking at Colorado will return a result set about as big as the customer table itself, whereas sex='F' should return about half that. So you would want to return all femail customers, then join to sales_order and eliminate rows with state_cd <> 'CO'. Cost based optimization might do this if/when statistics are run. I say might because cost based optimization is kind of a magical relm (i.e. unpredictable).

Rule based is more understandable.

In the case above we were compareing two different range scans (scan an index but get more than one row). Say instead that you are joining three tables. In the WHERE clause you have enough information to uniquely specify a single row, in other words you give values for every column in the primary key. (A primary key is the columns that give uniqueness to a row and always has an index because Oracle needs the index to enforce the unique constraint). The other tables are range scans. Say the tables are as follows:

customer

cust_no <--- primary key cust_name etc...

sales_order

order_no <--- primary key order_dt <--- indexed column cust_no <--- foreign key to customer table etc...

sales_line

order_no <--- primary key
line_no <--- primary key item_id <--- foreign key to item table qty price etc...

And your query is:

SELECT (whatever)
FROM sales_line c

, sales_order b
, customer a
WHERE a.cust_no = 12345
AND b.cust_no = a.cust_no
AND order_dt BETWEEN '01-JAN-98' AND '30-APR-98' AND c.order_no = b.order_no

Note that the tables are aliased in the order that you want to access them, a, b then c.

This is how Oracle will access the data under rules based optimization. In this case you have a unique scan on pk_customer the primary key to the customer table (a.cust_no = 12345) which returns one row. Then with the result (a.cust_no) you perform a range scan on the foreign key index ifk_sales_order_customer (b.cust_no = a.cust_no). Oracle might try to do a range scan on the order_dt column (order_dt BETWEEN ...), but since the cust_no range scan is closest to the WHERE it should do that one first. It might be better to use (order_dt + 0 BETWEEN...) to break the index which guarantees you use the cust_no FK index. Finally you access sales_line with a range scan on pk_sales_line, the primary key to the sales_line table.

In this example you order the tables a, b, c, then you list the tables going away from the WHERE, and you list the WHERE clause statements also away from the WHERE. Note that the result from a feeds the next table, and the results from b feeds c. This is because once a has been accessed it has a known value, and once b is known (several rows), they are known values for table c.

You can find more about how to optimize a query by visiting http://www.tusc.com/ click on Documents and download EXPLAIN.ZIP, I highly recommend it.

-----== Posted via Deja News, The Leader in Internet Discussion ==----- http://www.dejanews.com/ Now offering spam-free web-based newsreading Received on Fri May 01 1998 - 22:58:12 CDT