Range Keyed Tables

A more insidious Range Scan problem occurs when a table contains two columns that represent lower and upper bounds of a range. Consider a table of employee grades defined by lower and upper salary bounds.

The SQL above selects the position description for employees earning $34,500. Even though we expect only one row returned, the primary key index on min_salary and max_salary will not perform a unique scan because we are not using the key with equals only joins. In fact, since the first column of the index - min_salary - is used with a range predicate - the rest of the index (max_salary) will not take part in the scan at all. This means that the index scan will return all four rows with min_salary <= 34500 before using max_salary to filter out the three non-matching rows. Whilst this example is trivial, it becomes very non-trivial when the table contains thousands of ranges.

For a single table select such as this, the solution is simple. Create an index with the two columns swapped around. ie. max_salary, min_salary. This will ensure that the first matching row in the index is the row that we want. Add the predicate AND ROWNUM = 1 to the query. Oracle will range scan the index, find that the first row encounted is a match for min_salary as well, and then abort the SQL because of the ROWNUM = 1 predicate. The range scan is not given the opportunity to complete.

Note that this technique is useless if the query is a table join where we want to use the range keyed table as a loopup for rows in another table.

Consider the SQL:

If the lookup of the range table is part of a table join, then there is simply no good way to perform the join using an index without a horrendous range scan. Say position is keyed on (max_salary, min_salary). For each row in employee, the index on position is range scanned on max_salary and (on average) half the rows are read from that index before being filtered down to one.

If the range keyed table contains thousands of rows, the cost can be catastrophic.

If the range-keyed table has been designed so that the ranges never overlap, then we can alter the SQL slightly to short-circuit the range scan. In the above example, if we assume that the ranges never overlap then the following SQL is functionally equivalent:

This SQL exploits a feature of the Cost Based Optimizer, whereby the MIN or MAX of an Index Range Scan can be returned without scanning the entire range - Oracle just goes to either the start or end of the range and returns the first or last row. Use Explain Plan to determine whether Oracle is using the FIRST ROW feature:

Since this is a feature of the CBO, it is not available under the Rule Based Optimizer (RBO). If the Optimizer Mode/Goal is set to RULE, or the SQL contains the /*+ RULE*/ hint, or the Index has not been analyzed, the FIRST ROW identifier will not appear in the Explain Plan and there will be no performance improvement. Analyze the index and ensure that CBO is being used, then try again. It may be necessary to add a PUSH_SUBQ hint to promote the eveluation of the sub-query ahead of the BETWEEN join.

If you cannot guarantee non-overlapping ranges, the best option is to perform a Sort-Merge join between the tables, and ignore the index on the Range Keyed Table - you may find that Oracle chooses this method anyway if the table is analyzed. Unlike a Hash Join, a Sort-Merge join is able to join on range predicates (>[=], <[=], BETWEEN). To find out if Oracle is using a Merge join, run the SQL through Explain Plan and look for a MERGE step covering the two tables. If Oracle is performing a Nested Loops join, add a USE_MERGE hint to the SQL.

Note that this technique involves a Full Table Scan on the range keyed table. This makes it efficient if we are joining the range keyed table to many rows in the other table (say >50). If the SQL is retrieving only a few rows in the main table, then the full table scan on the range keyed table might be too expensive. In this situation, break the join into two separate SQLs: create a cursor selecting from the main table, and a nested lookup querying the range key table using the ROWNUM=1 technique desribed earlier.

©Copyright 2003