Jonathan Lewis

Subscribe to Jonathan Lewis feed Jonathan Lewis
Just another Oracle weblog
Updated: 4 hours 5 min ago

Index Speculation

Sun, 2016-03-20 17:32

There’s a current question on the OTN database forum as follows (with a little cosmetic adjustment):

I have a request for 3 indices as shown below. Does the 1st index suffice for 2 and 3?  Do I need all 3?

  • create index idx_atm_em_eff_ver_current_del on atm_xxx_salary (employee_key, effective_dt, salary_version_number, is_current, is_deleted);
  • create index idx_atm_em_ver_current on atm_xxx_salary (employee_key, salary_version_number, is_current);
  • create index .idx_atm_sal_current_del on atm_xxx_salary (employee_key, is_deleted, is_current);

In the absence of any information about the data and the application the correct answer is: “How could we possibly tell?”

On the other hand there’s plenty of scope for intelligent speculation, and that’s an important skill to practise because when we’re faced with a large number of options and very little information we need to be able to make best-guess choices about which ones are most likely to be worth the effort of pursuing in detail. So if we have to make some guesses about this table and the set of indexes shown, are there any reasonable guesses we might make.

I’ve highlighted the table name and leading column for the first index. The table seems to about salary and the leading column seems to identify an employee. In fact we see that all three indexes start with the employee_key and that may be what prompted the original question. Previous (“real-world”) experience tells me that employees are, generally, paid a salary and that salaries are likely to change (usually upwards) over time, and I note that another column in one of these indexes is effective_dt (date ?), and a third column (appearing in two of the indexes) is is_current.

This looks like a table of employee salaries recording their current and historic salaries, engineered with a little redundant information to make it easy to find the current salary. (Perhaps there’s a view of current_salary defined as is_current = ‘Y’ and is_deleted = ‘N’.)

It’s harder to speculate with any confidence on the columns is_deleted and salary_version_number;  why would a salary row be marked as deleted – is this something that happens when an employee leaves or an employee is deleted (or, following the pattern, has their is_deleted flag set to ‘Y’); why does a salary have a version number – does the table contain it’s own audit trail of errors and corrections, perhaps a correction is effected by marking the incorrect entry as deleted and incrementing its version number to generate the version number for the correct entry. Possibly the notional primary key of the table is (employee_key, effective_dt, is_deleted, salary_version_number).

The level of complexity surrounding these two columns could send further speculation in completely the wrong direction, but let’s follow the line that these two columns see very little action – let’s assume that most of the data is not “deleted” and virtually none of the data needs “versioning”. How does this assumption help us with the original question.

The largest employer in the world is the America Department of Defence with 3.2 million employees (following by the People’s Liberation Army of China with only 2.3 million employees), so an “employees” table is not really likely to be very big. How often does an employee have a salary review and change ? Would once per year be a reasonable figure to pluck from the air ? How many employess stay at the same company for 40 years – how many rows per employee would you end up with, and how scattered would they be through the salary table ?

Under any reasonable estimate it seems likely that if you created the first index (5 columns) then all the salary rows for a given employee are likely to be contained in a single leaf block, so if all the searches were driven by employee then that single index would allow exactly the correct set of table rows to be identified from one index leaf block access plus a little extra CPU.

Of course it’s possible that, with different circumstances, the size and clustering factor of the first index would be so much greater than the size and clustering factors of the other two that a query that would use one of the smaller indexes won’t use the larger index -but in this case the most significant contributor to the optimizer’s cost is likely to be the clustering_factor and given our assumption of the slow appearance over time of the new salaries for an employee the clustering factor of all three indexes is likely to be the same (probably very similar to the number of rows in the salary table).

Having got this far, it’s worth considering whether or not the salary table should actually be an index-organized table – it looks like an obvious candidate; how many other columns are there likely to be in a salary table ? Of course it’s worth thinking about other queries that might access a salary table without reference to the employees table at that point, perhaps a secondary index on (is_current, employee_key) might be appropriate, but in the absence of any other information we’ve reached the point where speculation needs to be backed up by some hard facts.

Bottom Line:

I wouldn’t guarantee that the first index makes the other two indexes redundant but it seems highly likely that it should and it’s probably worth spending some time looking at the requirements and numbers a little more closely – especially if you’re the US DoD or the Chinese People’s Liberation Army.

 

 

 

 


Hinting

Thu, 2016-03-17 08:10

A posting on the OTN database forum a few days ago demonstrated an important problem with hinting – especially (though it didn’t come up in the thread)  in the face of upgrades. A simple query needed a couple of hints to produce the correct plan, but a slight change to the query seemed to result in Oracle ignoring the hints. The optimizer doesn’t ignore hints, of course, but there are many reasons why it might have appeared to so I created a little demonstration of the problem – starting with the following data set:

rem
rem     Script:  OTN_DAG.sql
rem     Author:  J.P.Lewis
rem     Dated:   March 2016
rem

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        mod(rownum,200)         n1,
        mod(rownum,200)         n2,
        rpad(rownum,180)        v1
from
        generator       g1,
        generator       g2
where
        rownum <= 24000
;

create table t2
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc((rownum-1)/15)    n1,
        trunc((rownum-1)/15)    n2,
        rpad(rownum,180)        v1
from    generator
where
        rownum <= 3000
;
begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

(Ignore the silliness of the way I’ve created the data, it’s a consequence of using my standard template).

For every row in t2 there are 8 rows in t1, so when I join t1 to t2 on n2 it would obviously be sensible for the resulting hash join to use the t2 (smaller) data set as the build table and the t1 data set as the probe table, but I’m going to pretend that the optimizer is making an error and needs to be hinted to use t1 as the build table and t2 as the probe. Here’s a query, and execution plan, from 11.2.0.4:

explain plan for
select
        /*+ leading(t1) use_hash(t2) no_swap_join_inputs(t2) */
        count(t1.n2)
from
        t1, t2
where
        t2.n2 = t1.n2
and     t1.n1 = 15
and     t2.n1 = 15
;

select * from table(dbms_xplan.display(null,null,'outline alias'));

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    16 |    97   (3)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |    16 |            |          |
|*  2 |   HASH JOIN         |      |    20 |   320 |    97   (3)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T1   |   120 |   960 |    85   (3)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T2   |    15 |   120 |    12   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."N2"="T1"."N2")
   3 - filter("T1"."N1"=15)
   4 - filter("T2"."N1"=15)

As you can see, the optimizer has obeyed my hinting – the join order is t1 -> t2, I’ve used a hash join to join t2, and Oracle hasn’t swapped the join inputs despite the fact that the t1 data set is larger than the t2 data set (960 bytes vs. 120 bytes) which should have persuaded it to swap. (Technically, the leading() hint seems to block the swap of the first two tables anyway – see the “Special Case” section at this URL, but I’ve included it the no_swap_join_inputs() anyway to make the point explicit.)

So now, instead of just count n2, we’ll modify the query to count the number of distinct values for n2:


explain plan for
select
        /*+ leading(t1) use_hash(t2) no_swap_join_inputs(t2) */
        count(distinct t1.n2) 
from
        t1, t2
where
        t2.n2 = t1.n2
and     t1.n1 = 15
and     t2.n1 = 15
;

select * from table(dbms_xplan.display(null,null,'outline alias'));

----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |     1 |    13 |    98   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE       |          |     1 |    13 |            |          |
|   2 |   VIEW                | VW_DAG_0 |    20 |   260 |    98   (4)| 00:00:01 |
|   3 |    HASH GROUP BY      |          |    20 |   320 |    98   (4)| 00:00:01 |
|*  4 |     HASH JOIN         |          |    20 |   320 |    97   (3)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL| T2       |    15 |   120 |    12   (0)| 00:00:01 |
|*  6 |      TABLE ACCESS FULL| T1       |   120 |   960 |    85   (3)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."N2"="T1"."N2")
   5 - filter("T2"."N1"=15)
   6 - filter("T1"."N1"=15)

Check operations 5 and 6 – Oracle has swapped the join inputs: t2 (the obvious choice) is now the build table. Has Oracle ignored the hint ? (Answer: No).
If you look at operation 2 you can see that Oracle has generated an internal view called VW_DAG_0 – this is an example of the “Distinct Aggregate” transformation taking place. It seems to be a pointless exercise in this case and the 10053 trace file seems to indicate that it’s a heuristic transformation rather than cost-based transformation (i.e. the optimizer does it because it can, not because it’s cheaper). Oracle has transformed the SQL to the following (to which I have applied a little cosmetic tidying):


SELECT  /*+ LEADING (T1) */
        COUNT(VW_DAG_0.ITEM_1) "COUNT(DISTINCTT1.N2)"
FROM    (
        SELECT  T1.N2 ITEM_1
        FROM    TEST_USER.T2 T2,TEST_USER.T1 T1
        WHERE   T2.N2=T1.N2
        AND     T1.N1=15
        AND     T2.N1=15
        GROUP BY
                T1.N2
        ) VW_DAG_0

Notice how the use_hash() and no_swap_join_input() hints have disappeared. I am slightly surprised that the leading() hint is still visible, I would have expected all three to stay or all three to disappear; regardless of that, though, the single remaining hint references an object that does not exist in the query block where the hint has been placed. The original hint has not been “ignored”, it has become irrelevant. (I’ll be coming back to an odd little detail about this transformed query a little later on but for the moment I’m going to pursue the problem of making the optimizer do what we want.)

We have three strategies we could pursue at this point. We could tell the optimizer that we don’t want it to do the transformation; we could work out the query block name of the query block that holds t1 and t2 after the transformation and direct the hints into that query block; or we could tell Oracle to pretend it was using an older version of the optimizer because that Distinct Aggregate transformation only appeared in 11.2.0.1.

You’ll notice that I used the ‘alias’ formatting command in my call to dbms_xplan.display() – this is the queryblock / alias section of the output:


Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$C33C846D
   2 - SEL$5771D262 / VW_DAG_0@SEL$C33C846D
   3 - SEL$5771D262
   5 - SEL$5771D262 / T1@SEL$1
   6 - SEL$5771D262 / T2@SEL$1

Strategy A says try adding the hint: /*+ no_transform_distinct_agg(@sel$1) */
Strategy B says try using the hints: /*+ leading(@SEL$5771D262 t1@sel$1 t2@sel$1) use_hash(@SEL$5771D262 t2@sel$1 no_swap_join_inputs(@SEL$5771D262 t2@sel$1) */
Strategy C says try adding the hint: /*+ optimizer_features_enable(‘11.1.0.7’) */

Strategies A and C (stopping the transformation) produce the following plan:


----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    16 |    98   (4)| 00:00:01 |
|   1 |  SORT GROUP BY      |      |     1 |    16 |            |          |
|*  2 |   HASH JOIN         |      |    20 |   320 |    98   (4)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T1   |   120 |   960 |    85   (3)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T2   |    15 |   120 |    12   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."N2"="T1"."N2")
   3 - filter("T1"."N1"=15)
   4 - filter("T2"."N1"=15)

Strategy B (allowing the transformation, but addressing the hints to the generated query block) produces this plan:


----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |     1 |    13 |    98   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE       |          |     1 |    13 |            |          |
|   2 |   VIEW                | VW_DAG_0 |    20 |   260 |    98   (4)| 00:00:01 |
|   3 |    HASH GROUP BY      |          |    20 |   320 |    98   (4)| 00:00:01 |
|*  4 |     HASH JOIN         |          |    20 |   320 |    97   (3)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL| T1       |   120 |   960 |    85   (3)| 00:00:01 |
|*  6 |      TABLE ACCESS FULL| T2       |    15 |   120 |    12   (0)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."N2"="T1"."N2")
   5 - filter("T1"."N1"=15)
   6 - filter("T2"."N1"=15)

All three Strategies have produced plans that use t1, the larger data set, as the build table. It’s hard to resist asking if it’s possible to claim that one of the three strategies is the best strategy; it’s hard to say, but I think I’d favour using the no_transform_distinct_agg() hint because it’s precisely targetted – so avoids the brute force thuggish nature of the reverting back to an old version, and avoids the (possble) fragility of needing to know a very precise query block name which (possibly) might change for some reason if the query were to be modified very slightly. The argument, of course, comes from the perspective of a friendly consultant who visits for a couple of days, gets a bit clever with your SQL, then walks away leaving you to worry about whether you understand why your SQL now works the way it does.

Upgrades

My opening comment was about the difficulty of hinting across upgrades. Imagine you had been running this count(distinct) query in 10.2.0.5, and after some experimention had found that you got the path you needed by adding the hints: /*+ leading(t1 t2)  full(t1) use_hash(t2) no_swap_join_inputs(t2) full(t2) */. This is a careful and thorough piece of hinting (and it does work, of course, in 10.2.0.5).

When the big day for upgrading to 11.2 arrives (just in time for Oracle to ends extended support, possibly) you find that this query changes its execution plan. And this is NOT a rare occurrence. I’ve said it before, and I’ll keep saying it: hinting – especially with “micro-management” hints – is undesirable in a production system. You probably haven’t done it right, and even if the hints are (broadly speaking) perfect in the current version they may be pushed out of context by a new feature in the next version.  If you’ve hinted your code you have to check every single hinted statement to make sure the hints still have the same effect on the upgrade.

This is why I produce the sound-bite (which Maria Colgan nicked): “if you can hint it, baseline it”.  If you had generated a baseline (or outline) from a query with these hints in 10g Oracle would have included the /*+ optimizer_features_enable(‘10.2.0.5’) */ hint with the functional hints, and the upgrade wouldn’t have produced a different plan.

Technically, of course, you could have remembered to add the hint to your production code – but in many cases Oracle introduces far more hints in an SQL Baseline than you might want to put into your code; and by using the SQL Baseline approach you’ve given yourself the option to get rid of the “hidden hinting” in a future version of Oracle by dropping the baseline rather than rewriting the code and (perhaps) recompiling the application.

Inevitably there are cases where setting the optimizer_features_enable backwards doesn’t rescue new from a new plan – there are probably a few cases where the internal code forgets to check the value and bypass some subroutines; more significantly there are cases where one version of Oracle will give you an efficient plan because of an optimizer bug and setting the version backwards won’t re-introduce that bug.

Footnote

I said I’d come back to the “unparsed” query that the optimizer generated from the original count(distinct) statement and the way it left the leading(t1) hint in place but lost the use_hash(t2) and no_swap_join_inputs(t2). I got curious about how Oracle would optimize that query if I supplied it from SQL*Plus – and this is the plan I got:


explain plan for
SELECT  /*+ LEADING (T1) */
        COUNT(VW_DAG_0.ITEM_1) "COUNT(DISTINCTT1.N2)"
FROM    (
        SELECT  T1.N2 ITEM_1
        FROM    TEST_USER.T2 T2,TEST_USER.T1 T1
        WHERE   T2.N2=T1.N2
        AND     T1.N1=15
        AND     T2.N1=15
        GROUP BY
                T1.N2
        ) VW_DAG_0
;

-----------------------------------------------------------------------------------
| Id  | Operation             | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |           |     1 |    13 |    98   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE       |           |     1 |    13 |            |          |
|   2 |   VIEW                | VM_NWVW_0 |    20 |   260 |    98   (4)| 00:00:01 |
|   3 |    HASH GROUP BY      |           |    20 |   320 |    98   (4)| 00:00:01 |
|*  4 |     HASH JOIN         |           |    20 |   320 |    97   (3)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL| T1        |   120 |   960 |    85   (3)| 00:00:01 |
|*  6 |      TABLE ACCESS FULL| T2        |    15 |   120 |    12   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."N2"="T1"."N2")
   5 - filter("T1"."N1"=15)
   6 - filter("T2"."N1"=15)

Oracle has managed to do a transformation to this statement that it didn’t do when it first generated the statement – too much recursion, perhaps – and that floating leading(t1) hint has been squeezed back into action by a view-merging step in the optimization that got the hint back into a query block that actually contained t1 and t2!  At this point I feel like quoting cod-philosophy from the Dune trilogy: “Just when you think you understand …”

 


Quiz

Mon, 2016-03-14 16:38

Can you spot anything that might appear to be a little surprising about this (continuous) extract from a trace file ? The example is from 10.2.0.5, but 11g and 12c could produce similar results (or direct path read equivalents):


PARSING IN CURSOR #3 len=23 dep=0 uid=30 oct=3 lid=30 tim=112607775392 hv=4235652837 ad='2f647320'
select count(*) from t1
END OF STMT
PARSE #3:c=0,e=53,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=112607775385
EXEC #3:c=0,e=99,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=112607787370
WAIT #3: nam='SQL*Net message to client' ela= 9 driver id=1111838976 #bytes=1 p3=0 obj#=10443 tim=112607789931
WAIT #3: nam='db file sequential read' ela= 415 file#=5 block#=9 blocks=1 obj#=21580 tim=112607795682
WAIT #3: nam='db file scattered read' ela= 2785 file#=5 block#=905 blocks=98 obj#=21580 tim=112607801263
WAIT #3: nam='db file scattered read' ela= 2919 file#=5 block#=777 blocks=128 obj#=21580 tim=112607808280
WAIT #3: nam='db file scattered read' ela= 2066 file#=5 block#=649 blocks=128 obj#=21580 tim=112607813300
WAIT #3: nam='db file scattered read' ela= 1817 file#=5 block#=521 blocks=128 obj#=21580 tim=112607817243
WAIT #3: nam='db file scattered read' ela= 1563 file#=5 block#=393 blocks=128 obj#=21580 tim=112607820899
WAIT #3: nam='db file scattered read' ela= 1605 file#=5 block#=265 blocks=128 obj#=21580 tim=112607824710
WAIT #3: nam='db file scattered read' ela= 1529 file#=5 block#=137 blocks=128 obj#=21580 tim=112607828296
WAIT #3: nam='db file scattered read' ela= 1652 file#=5 block#=10 blocks=127 obj#=21580 tim=112607831946
FETCH #3:c=15625,e=41568,p=994,cr=996,cu=0,mis=0,r=1,dep=0,og=1,tim=112607834004
WAIT #3: nam='SQL*Net message from client' ela= 254 driver id=1111838976 #bytes=1 p3=0 obj#=21580 tim=112607835527
FETCH #3:c=0,e=3,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,tim=112607836780
WAIT #3: nam='SQL*Net message to client' ela= 2 driver id=1111838976 #bytes=1 p3=0 obj#=21580 tim=112607837935
WAIT #3: nam='SQL*Net message from client' ela= 14371 driver id=1111838976 #bytes=1 p3=0 obj#=21580 tim=112607853526
=====================
PARSING IN CURSOR #2 len=52 dep=0 uid=30 oct=47 lid=30 tim=112607855239 hv=1029988163 ad='2f6c5ec0'
BEGIN DBMS_OUTPUT.GET_LINES(:LINES, :NUMLINES); END;
END OF STMT
PARSE #2:c=0,e=51,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=112607855228
WAIT #2: nam='SQL*Net message to client' ela= 4 driver id=1111838976 #bytes=1 p3=0 obj#=21580 tim=112607861803
EXEC #2:c=0,e=1271,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=1,tim=112607862976
WAIT #2: nam='SQL*Net message from client' ela= 1093883 driver id=1111838976 #bytes=1 p3=0 obj#=21580 tim=112608958078
STAT #3 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=996 pr=994 pw=0 time=41588 us)'
STAT #3 id=2 cnt=8332 pid=1 pos=1 obj=21580 op='TABLE ACCESS FULL T1 (cr=996 pr=994 pw=0 time=144816 us)'

Update

If you look at the values for block# in the “db file scattered read” waits you’ll notice that they appear in descending order. This looks like a tablescan running backwards – and that’s not a coincidence, because that’s what it is.

It’s obviously a good strategy to have because if you do a big tablescan it’s the blocks at the END of the table which are mostly likely to be subject to change by other sessions [unless, see comment 4, you’ve done a purge of historic data] and the longer it takes you to get there the more work you’ll have to do to get consistent read versions of the last blocks in the table, so reading the last blocks first should, generally, reduce the workload – and the risk of ORA-01555: snapshot too old. Strangely it’s not documented – but it’s been around for years – at least since 10.2.0.5, if not earlier releases of 10g, through event 10460.

The topic came up in a conversation on the Oracle-L list server a few years ago, with Tanel Poder supplying the event number, but I had forgotten about it until I rediscovered the thread by accident a little while ago.

It’s not a supported feature, of course – but if you run into serious performance problems with tablescans doing lots of work with the undo tablespace (physical reads, lots of undo records applied for consistent read, etc.) while a lot of update activity is going on, then have a chat with Oracle support to see if it’s an allowed workaround.

 

 


Wrong Results ?

Fri, 2016-03-11 03:18

I gather that journalistic style dictates that if the headline is a question then the answer is no. So, following on from a discussion of possible side effects of partition exchange, let’s look at an example which doesn’t involve partitions.  I’ve got a schema that holds nothing by two small, simple heap tables, parent and child, (with declared primary keys and the obvious referential integrity constraint) and I run a couple of very similar queries that produce remarkably different results:


select
        par.id      parent_id,
        chi.id      child_id,
        chi.name    child_name
from
        parent  par,
        child   chi
where
        chi.id_p = par.id
order by
        par.id, chi.id
;

 PARENT_ID   CHILD_ID CHILD_NAME
---------- ---------- ----------
         1          1 Simon
         1          2 Sally
         2          1 Janet
         2          2 John
         3          1 Orphan

5 rows selected.

Having got this far with my first query I’ve decided to add the parent name to the report:


select
        par.id      parent_id,
        par.name    parent_name,
        chi.id      child_id,
        chi.name    child_name
from
        parent  par,
        child   chi
where
        chi.id_p = par.id
order by
        par.id, chi.id
;

 PARENT_ID PARENT_NAM   CHILD_ID CHILD_NAME
---------- ---------- ---------- ----------
         1 Smith2              1 Simon
         1 Smith               1 Simon
         1 Smith2              2 Sally
         1 Smith               2 Sally
         2 Jones               1 Janet
         2 Jones               2 John

6 rows selected.

How could adding a column to the select list result in one child row disappearing and two child rows being duplicated; and is this a bug ?

To avoid any confusion, here’s the complete script I used for creating the schema owner, in 11.2.0.4, with no extra privileges granted to PUBLIC:


create user u1
        identified by u1
        default tablespace test_8k
        quota unlimited on test_8k
;

grant
        create session,
        create table
to
        u1
;


Update

It didn’t take long for a couple of people to suggest that the oddity was the consequence of constraints that had not been enabled and validated 100% of the time, but the suggestions offered were a little more convoluted than necessary. Here’s the code I ran from my brand new account before running the two select statements:


create table parent (
        id      number(4),
        name    varchar2(10),
        constraint par_pk primary key (id)
        rely disable novalidate
)
;

create table child(
        id_p    number(4)
                constraint chi_fk_par
                references parent
                on delete cascade
                rely disable novalidate,
        id      number(4),
        name    varchar2(10),
        constraint chi_pk primary key (id_p, id)
                rely disable novalidate
)
;

insert into parent values (1,'Smith');
insert into parent values (1,'Smith2');
insert into parent values (2,'Jones');

insert into child values(1,1,'Simon');
insert into child values(1,2,'Sally');

insert into child values(2,1,'Janet');
insert into child values(2,2,'John');

insert into child values(3,1,'Orphan');

commit;

begin
        dbms_stats.gather_table_stats(user,'child');
        dbms_stats.gather_table_stats(user,'parent');
end;
/


In a typical data warehouse frame of mind I’ve added plenty of constraints, but left them all disabled and novalidated, but told Oracle to rely on them for optimisation strategies. This means all sorts of incorrect data could get into the tables, with all sorts of unexpected side effects on reporting. The example above shows duplicates on primary keys (and if you checked the table definition you’d find that the primary key columns were nullable as well), child rows with no parent key.

In fact 11g and 12c behave differently – the appearance of the Orphan row in the first sample query is due, as Chris_cc pointed out in the first comment, to the optimizer deciding that it could use join elimination because it was joining to a single-column primary key without selecting any other columns from the referenced table. In 12c the optimizer doesn’t use join elimination for this query, so both queries have the same (duplicated) output.

Update:

Make sure you read the articles linked to by Dani Schneider’s comment below, and note especially the impact on the query_rewrite_integrity parameter.


Wrong Results

Tue, 2016-03-08 12:57

Just in – a post on the Oracle-L mailing lists asks: “Is it a bug if a query returns one answer if you hint a full tablescan and another if you hint an indexed access path?” And my answer is, I think: “Not necessarily”:


SQL> select /*+ full(pt_range)  */ n2 from pt_range where n1 = 1 and n2 = 1;

        N2
----------
         1
SQL> select /*+ index(pt_range pt_i1) */ n2 from pt_range where n1 = 1 and n2 = 1;

        N2
----------
         1
         1

The index is NOT corrupt.

The reason why I’m not sure you should call this a bug is that it is a side effect of putting the database into an incorrect state. You might have guessed from the name that the table is a (range) partitioned table, and I’ve managed to get this effect by doing a partition exchange with the “without validation” option.


create table t1 (
        n1      number(4),
        n2      number(4)
);

insert into t1
select  rownum, rownum
from    all_objects
where   rownum <= 5
;

create table pt_range (
        n1      number(4),
        n2      number(4)
)
partition by range(n1) (
        partition p10 values less than (10),
        partition p20 values less than (20)
)
;

insert into pt_range
select
        rownum, rownum
from
        all_objects
where
        rownum <= 15
;
create index pt_i1 on pt_range(n1,n2);

begin
        dbms_stats.gather_table_stats(
                ownname    => user,
                tabname    => 'T1',
                method_opt => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname    => user,
                tabname    => 'PT_RANGE',
                method_opt => 'for all columns size 1'
        );
end;
/

alter table pt_range
exchange partition p20 with table t1
including indexes
without validation
update indexes
;

The key feature (in this case) is that the query can be answered from the index without reference to the table. When I force a full tablescan Oracle does partition elimination and looks at just one partition; when I force the indexed access path Oracle doesn’t eliminate rows that belong to the wrong partition – though technically it could (because it could identify the target partition by the partition’s data_object_id which is part of the extended rowid stored in global indexes).

Here are the two execution plans (from 11.2.0.4) – notice how the index operation has no partition elimination while the table operation prunes partitions:


select /*+ full(pt_range)  */ n2 from pt_range where n1 = 1 and n2 = 1

---------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |       |       |     2 (100)|          |       |       |
|   1 |  PARTITION RANGE SINGLE|          |     1 |     6 |     2   (0)| 00:00:01 |     1 |     1 |
|*  2 |   TABLE ACCESS FULL    | PT_RANGE |     1 |     6 |     2   (0)| 00:00:01 |     1 |     1 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("N1"=1 AND "N2"=1))


select /*+ index(pt_range pt_i1) */ n2 from pt_range where n1 = 1 and n2 = 1

--------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |       |       |       |     1 (100)|          |
|*  1 |  INDEX RANGE SCAN| PT_I1 |     1 |     6 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("N1"=1 AND "N2"=1)


Note: If I had a query that did a table access by (global) index rowid after the index range scan it WOULD do partition elimination and visit just the one partition – never seeing the data in the wrong partition.

So is it a bug ? You told Oracle not to worry about bad data – so how can you complain if it reports bad data.

Harder question – which answer is the “right” one – the answer which shows you all the data matching the query, or the answer which shows you only the data that is in the partition it is supposed to be in ?


Pages