Multiple Table Joins

NP-Hard Problems

Oracle's Cost Based Optimiser works by analyzing several of the possible execution paths for a SQL and choosing the one that it considers best. For instance, a two table join could drive off table A and lookup table B for each row returned, or it could drive off table B. By adding in the possibilities of join methods and index selection, the number of possible execution paths increases.

A three table join has three times as many alternatives, a four table join has four times the alternatives of a three table join. In general, the number of possible execution paths for a join statement is proportional to n! (ie. n x n-1 x n-2 x ... x 2 x 1), where n is the number of tables in the join.

The problem of choosing the absolute best execution path becomes near impossible as n increases. Mathematicians call this an np-hard - or non-polynomial - problem. The theory of np-hard problems states that there are no shortcuts to finding the absolute best solution. Often we must settle for a solution that just looks OK.

Oracle does not attempt to evaluate every possible execution path. As the number of tables in a join increases, the percentage of possible paths evaluated plummets. Since bad execution paths far outnumber good execution paths, the likelihood that Oracle does not stumble on even a halfway good execution path increases with the number of tables in the join.

What to look for

If you have a table join (ie. a FROM clause) with five or more tables, and you have not included a hint for join order (eg. ORDERED or LEADING ), then Oracle may be joining the tables in the wrong order.

Run your SQL through Explain Plan and check the order that the tables are processed. Say we have a SQL with five tables (A-E) in the FROM clause, and the following joins:

A <-> B
A <-> C
B <-> D
D <-> E

If the Explain Plan showed the tables in the order A, B, C, D, E, then this would be OK. However if it showed A, B, C, E, D, then there would be a problem because we need the columns from D to join to E, but D is joined last.

This is a simplistic example that would never happen if all of the join predicates were equals joins. But what if the SQL had clauses like this:

WHERE B.k1 = D.k1
AND   D.k2 = E.K3
AND   E.eff_date >= A.eff_date

In this case, table E joins to both table D and table A, however the join to table A is a range operator (>=). It may seem obvious that table E should be joined after table D, but this is exactly the sort of case where Oracle can mess up.

How to fix it

If the tables are being joined in the wrong order, you can supply a hint to suggest a better order.

If Oracle is just starting with the wrong table, try a LEADING hint to suggest the best table to start with. SQLs with equi-joins will often get the rest of the joins right if only they know where to start.
For ultimate control, update the FROM clause to list the tables in the exact order that they should be joined, and specify the ORDERED hint.

Is it now joining in the correct order? If not, then Oracle is ignoring the hint; check with your DBA.

If it is joining in the desired order, is it faster? If not, try elsewhere in this guide; the problem may be related to index usage or join methods.