Jonathan Lewis

Subscribe to Jonathan Lewis feed Jonathan Lewis
Just another Oracle weblog
Updated: 10 hours 42 min ago

DDL logging

Mon, 2016-04-25 07:05

I was presenting at the UKOUG event in Manchester on Thursday last week (21st April 2016), and one of the sessions I attended was Carl Dudley’s presentation of some New Features in 12c. The one that caught my eye in particular was “DDL Logging” because it’s a feature that has come up fairly frequently in the past on OTN and other Oracle forums.

So today I decided to write a brief note about DDL Logging – and did a quick search of my blog to see if I had mentioned it before: and I found this note that I wrote in January last year but never got around to publishing – DDL Logging is convenient, but doesn’t do the one thing that I really want it to do:

DDL Logging – 12c

One of the little new features that should be most welcome in 12c is the ability to capture all DDL executed against the database. All it takes is a simple command (if you haven’t set the relevant parameter in the parameter file):

alter system set enable_ddl_logging = true;

All subsequent DDL will be logged to two different places (in two formats)

  • $ORACLE_BASE/diag/rdbms/{database}/{instance}/log/ddl/log.xml
  • $ORACLE_BASE/diag/rdbms/{database}/{instance}/log/ddl_{instance}.log

Unfortunately the one thing I really wanted to see doesn’t appear – probably because it doesn’t really count as DDL –  it’s the implicit DDL due to inserting into not-yet-existing partitions of an interval partitioned table.

Note: If you’re using a container database with pluggable databases then the DDL for all the pluggable databases goes into the same log file.

Update – Licensing

The following text in the Oracle 12c Database Licensing document has just been brought to my attention:

Licensed Parameters
The init.ora parameter ENABLE_DDL_LOGGING is licensed as part of the Database Lifecycle Management Pack when set to TRUE. When set to TRUE, the database reports schema changes in real time into the database alert log under the message group schema_ddl. The default setting is FALSE.

The licensing document is also linked to from the 12c online html page for the parameter.

The 11g parameter definition makes no mention of licensing, and the 11g “New Features” manual don’t mention the feature at all, but the parameter does get a special mention in the 11g licensing document where it is described as being part of the Change Management Pack.

init.ora Parameters
The use of the following init.ora parameter is licensed under Oracle Change Management Pack:
■ ENABLE_DDL_LOGGING: when set to TRUE (default: FALSE)

 


Column Groups

Wed, 2016-04-20 03:07

Patrick Jolliffe alerted the Oracle-L list to a problem that appears when you combine fixed length character columns (i.e. char() or nchar())  with column group statistics. The underlying cause of the problem is the “blank padding” semantics that Oracle uses by default to compare varchar2 with char, so I’ll start with a little demo of that. First some sample data:


rem     Script:         col_group_char_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Apr 2016

execute dbms_random.seed(0)

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        cast(chr(trunc(dbms_random.value(1,6))+64) as char(1))  c1,
        cast(chr(trunc(dbms_random.value(1,6))+64) as char(2))  c2,
        cast('X' as varchar2(2))                                v2
from
        generator       v1
where
        rownum <= 5 * 5 * 10
;

insert into t1(c1, c2, v2)
select  'X', 'X', 'X'
from    t1
;

update t1 set v2 = c2;
commit;


The little demos I’m going to report here don’t use all the data in this table – there are several other tests in the script that I won’t be reporting – so I’ll just point out that there are 500 rows in the table, half of them have ‘X’ in all three columns, and half of them have a uniform distribution of the letters ‘A’ to ‘E’ in every column.

  • Column c1 is declared as char(1) – so it will hold the data exactly as it was inserted by the script.
  • Column c2 is declared as char(2) – so even though the script apparently inserts a character string of length 1, this will be padded with a space to two characters before being stored.

Now we can create some stats – in particular a frequency histogram on the c2 column – and check the cardinality estimates for a couple of queries:

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T1',
                method_opt       => 'for all columns size 254'
        );
end;
/

set autotrace traceonly explain

prompt  ==================
prompt  c2 without padding
prompt  ==================

select  *
from    t1
where   c2 = 'X'
;

prompt  ================
prompt  c2 with padding
prompt  ================

select  *
from    t1
where   c2 = 'X '
;

set autotrace off

The first query compares c2 with the single character ‘X’, the second compares it with the two-character string ‘X ‘. But since the comparison is with a char(2) column the optimizer pads the first constant with spaces, and both queries end up predicting the same cardinality:


==================
c2 without padding
==================

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   250 |  2000 |    17   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   250 |  2000 |    17   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("C2"='X')

================
c2 with padding
================

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   250 |  2000 |    17   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   250 |  2000 |    17   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("C2"='X ')


Note that both queries predict the 250 rows where (we know) c2 = ‘X ‘; even though the predicate sections suggest the queries are looking for different data sets. This IS the expected behaviour.

Now let’s make things more complex – we’ll add the predicate “and c1 = ‘X'” to both queries but we’ll create a column group with histogram on (c1, c2) before checking the plans. Again we expect both versions of the new query to predict the same volume of data and (in fact) to produce a perfect prediction because we have so few rows and so few distinct combinations that we should get a perfect frequency histogram:


begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1 for columns (c1, c2) size 254'
        );
end;
/

prompt  ========================
prompt  (c1, c2) without padding
prompt  ========================

select  *
from    t1
where   c1 = 'X' and c2 = 'X'
;

prompt  =====================
prompt  (c1, c2) with padding
prompt  =====================

select  *
from    t1
where   c1 = 'X' and c2 = 'X '
;

And here are the execution plans:

========================
(c1, c2) without padding
========================

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     2 |    16 |    17   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    16 |    17   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("C1"='X' AND "C2"='X')

=====================
(c1, c2) with padding
=====================

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   250 |  2000 |    17   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   250 |  2000 |    17   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("C1"='X' AND "C2"='X ')


If we run the query where the literal is padded with spaces to the correct length (2nd query) then the prediction is correct. But if we haven’t padded the literal the prediction is wrong; the estimate is the one the optimizer would have used for “value not found in histogram”.

I think what’s happening is that the optimizer doesn’t “remember” that the literal is being compared with a char() when making the call to sys_op_combined_hash() that it uses for calculating column group stats so it doesn’t pad the column with spaces before calling the function and, as a consequence, the hashed value isn’t the one it should be using.

I’ve run this test on 11.2.0.4 and 12.1.0.2 – the effects are the same on both versions.

Bottom Line:

Be careful about how you use char() data types in your code, and be especially careful if you think you’re going to be creating column group stats involving char() columns – and then remember that 12c may generate column group stats automatically for you. If you use char() columns you will have to ensure that predicates using literal values should have those values padded to the correct number of spaces if you want to have the best possible chance of getting the correct execution plans.

 


Set Operations

Fri, 2016-04-01 08:20

A recent post on the OTN database forum highlights a couple of important points ideas for optimising SQL. There are: (a) is there a logically equivalent way of stating the SQL and (b) is there a different “natural language” way of posing the problem.

The posting starts with a query, part of an execution plan, and a request to “get rid of the tablescan”. I guessed originally that the query came from an 11g instance, and the OP gave us some code to create the tables and indexes, so I’ve modelled the tables to get the indicated plan (then filled in the original numbers). This is the query, and my cosmetically adjusted version of the plan output that the OP probably got:


SELECT a.hotel_code
  FROM lf_hotel_temp a
WHERE a.service_id = : p_service_id
       AND (NOT EXISTS (SELECT *
          FROM lf_ts_roomtype_properties b
         WHERE a.hotel_code = b.hotel_code)
        or NOT EXISTS (SELECT *
          FROM lf_gta_roomtype_properties b
         WHERE a.hotel_code = b.hotel_code) 
       or  NOT EXISTS (SELECT *
          FROM lf_hb_roomtype_properties b
         WHERE a.hotel_code = b.hotel_code))

-------------------------------------------------------------------------------
| Id  | Operation          | Name                     | Rows  |  Bytes | Cost |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                          | 12613 | 113517 |  135 |
|*  1 |  FILTER            |                          |       |        |      |
|*  2 |   TABLE ACCESS FULL| LF_HOTEL_TEMP            | 88433 | 795897 |  135 |
|*  3 |   INDEX RANGE SCAN | LF_TS_ROOMTYPE_PROP_IDX  |     1 |      7 |    1 |
|*  4 |   INDEX RANGE SCAN | LF_GTA_ROOMTYPE_PROP_IDX |     1 |      9 |    1 |
|*  5 |   INDEX RANGE SCAN | LF_HB_ROOMTYPE_PROP_IDX  |     2 |     14 |    3 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( NOT EXISTS (SELECT 0 FROM "LF_TS_ROOMTYPE_PROPERTIES" "B" WHERE
              "B"."HOTEL_CODE"=:B1) OR  NOT EXISTS (SELECT 0 FROM "LF_GTA_ROOMTYPE_PROPERTIES" "B"
              WHERE "B"."HOTEL_CODE"=:B2) OR  NOT EXISTS (SELECT 0 FROM "LF_HB_ROOMTYPE_PROPERTIES"
              "B" WHERE "B"."HOTEL_CODE"=:B3))
   2 - filter("A"."SERVICE_ID"=TO_NUMBER(:P_SERVICE_ID))
   3 - access("B"."HOTEL_CODE"=:B1)
   4 - access("B"."HOTEL_CODE"=:B1)
   5 - access("B"."HOTEL_CODE"=:B1)

We were told in the original posting that there’s a primary key on lf_hotel_temp declared on (hotel_code, service_id), and we were given the definitions, sizes, and index declarations of all the table in a follow-up posting. It turns out that lf_hotel_temp consists of just those two columns and holds 278,000 rows: the optimizer’s estimate for the number of rows identified by a single service_id is over 88,000, and the nature of the query tells us that the optimizer would have to examine every one of those rows to check if it satisfied any of the three subqueries.

So how might Oracle access the rows ?  Given that the only columns used will all be in the primary key index (which implies not null constraints) there are four basic options: tablescan, index fast full scan, index full scan, and index skip scan. Given the most likely data content (i.e. lots of different hotel_codes), we can assume the skip scan would be a very bad idea. We can be sure that an index fast full scan will be lower cost than an index full scan – for anything except tiny indexes. Ultimately the question is really “why a tablescan instead of an index fast full scan?”. As I pointed out, though, the table consists of just those two columns – which means it’s perfectly reasonable for the index to be larger than the table as each entry of the index will consist of the two columns AND a rowid.

The first interesting bit

The question of why the access to lf_hotel_temp was by tablescan rather than some indexed method isn’t really interesting. The interesting bit is how (in principle) we might make the plan more efficient (if it really needs it); and this leads to two key, and general purpose, observations. As Andrew Sayer pointed out on the thread, we have a compound predicate:

    (not exists A OR not exists B OR not exists C)

and this is logically equivalent to

   not (exists A AND exists B AND exists C)

If we rewrite the query to reflect this equivalence could the optimizer find a different, better way of executing it:


select  /*+ dynamic_sampling(0) */
        a.hotel_code
from    lf_hotel_temp a
where
        a.service_id = :p_service_id
and     not(
                exists (
                        select  null
                        from    lf_ts_roomtype_properties ts
                        where   ts.hotel_code = a.hotel_code
                )
            and exists (
                        select  null
                        from    lf_gta_roomtype_properties gta
                        where   gta.hotel_code = a.hotel_code
                )
            and exists (
                        select  null
                        from    lf_hb_roomtype_properties hb
                        where   hb.hotel_code = a.hotel_code
                )
        )
;

Of course, I didn’t have the original data; so I copied the DDL supplied in the OTN thread and added a little DML to insert a few rows in the tables. The data I used looked like this:


insert into lf_hotel_temp (hotel_code, service_id) values ('A',1);
insert into lf_hotel_temp (hotel_code, service_id) values ('B',1);
insert into lf_hotel_temp (hotel_code, service_id) values ('C',1);
insert into lf_hotel_temp (hotel_code, service_id) values ('D',1);

-- insert into lf_ts_roomtype_properties values ( 'A','x','x',0,1,'x');
insert into lf_ts_roomtype_properties values ( 'B','x','x',0,1,'x');
insert into lf_ts_roomtype_properties values ( 'C','x','x',0,1,'x');
insert into lf_ts_roomtype_properties values ( 'D','x','x',0,1,'x');

-- insert into lf_gta_roomtype_properties values ( 'A','x','x',0,1,'x');
-- insert into lf_gta_roomtype_properties values ( 'B','x','x',0,1,'x');
insert into lf_gta_roomtype_properties values ( 'C','x','x',0,1,'x');
insert into lf_gta_roomtype_properties values ( 'D','x','x',0,1,'x');

-- insert into lf_hb_roomtype_properties values ( 'A','x','x',0,1,'x');
-- insert into lf_hb_roomtype_properties values ( 'B','x','x',0,1,'x');
-- insert into lf_hb_roomtype_properties values ( 'C','x','x',0,1,'x');
insert into lf_hb_roomtype_properties values ( 'D','x','x',0,1,'x');
commit;

It’s possible that with different data volumes you’d get different execution plans, but in 11g the optimizer transformed my query back into the original form – in other words it recognised the equivalence of “not (A and B and C)” and rewrote it as “(not A or not B or not C)” !

However, I also have 12c available, and I had created a script to build a model, so I ran the test on 12c. Both versions of the query produced the following plan:


----------------------------------------------------------------------------------------------------
| Id  | Operation             | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                            |     1 |  2027 |     8  (13)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT ANTI |                            |     1 |  2027 |     8  (13)| 00:00:01 |
|   2 |   VIEW                | VW_SQ_1                    |    82 |   984 |     6   (0)| 00:00:01 |
|*  3 |    HASH JOIN SEMI     |                            |    82 |  2952 |     6   (0)| 00:00:01 |
|*  4 |     HASH JOIN         |                            |    82 |  1968 |     4   (0)| 00:00:01 |
|   5 |      TABLE ACCESS FULL| LF_GTA_ROOMTYPE_PROPERTIES |    82 |   984 |     2   (0)| 00:00:01 |
|   6 |      TABLE ACCESS FULL| LF_HB_ROOMTYPE_PROPERTIES  |    82 |   984 |     2   (0)| 00:00:01 |
|   7 |     TABLE ACCESS FULL | LF_TS_ROOMTYPE_PROPERTIES  |    82 |   984 |     2   (0)| 00:00:01 |
|*  8 |   INDEX FULL SCAN     | LF_HOTEL_TEMP_PK           |   101 |   198K|     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("VW_COL_1"="A"."HOTEL_CODE")
   3 - access(SYS_OP_MAP_NONNULL("HB"."HOTEL_CODE")=SYS_OP_MAP_NONNULL("TS"."HOTEL_CODE"))
   4 - access(SYS_OP_MAP_NONNULL("HB"."HOTEL_CODE")=SYS_OP_MAP_NONNULL("GTA"."HOTEL_CODE"))
   8 - access("A"."SERVICE_ID"=TO_NUMBER(:P_SERVICE_ID))
       filter("A"."SERVICE_ID"=TO_NUMBER(:P_SERVICE_ID))

Ignore the numbers (I hadn’t collected stats, which is why I added the /*+ dynamic_sampling(0) */ hint – with stats in place 12c produced the FILTER plan that 11g had produced) the key feature is that Oracle has managed to transform my three filter subqueries into a single join subquery and then transformed the resulting subquery into an anti-join. It’s a pretty amazing transformation – the optimizer did it automatically in 12c, but if you are aware of the logical equivalence then you may find cases where you can turn “OR’s” into “AND’s” and help the optimizer to find transformations that it can’t find automatically.

The second interesting bit

If you think about the meaning behind the query (prompted, perhaps, by the logical equivalence described above) you might rephrase the question as “find me the hotel codes that fail to appear in all three related tables” – in English this is ambigious and open to catastrophic mis-interpretation so you might have another go and say “find me the hotel codes that appear in every one of the three related tables – those are the hotel codes I don’t want”. This latter expression, of course, is exactly what Oracle is doing by joining the three tables and then doing the “not exists”/anti-join against the result. Obviously you could translate the new English form into SQL by hand, with a three table join in a “not exists” subquery.

I actually took a different approach (which might, or might not, be efficient – depending on the actual data and indexes).  I translated the new English statement into the following:


select  /*+ dynamic_sampling(0) */
        hotel_code
from    lf_hotel_temp
where   service_id = :p_service_id
minus   (
        select  hotel_code
        from    lf_ts_roomtype_properties
        where   hotel_code is not null
        intersect
        select  hotel_code
        from    lf_gta_roomtype_properties
        where   hotel_code is not null
        intersect
        select  hotel_code
        from    lf_hb_roomtype_properties
        where   hotel_code is not null
        )
;

The three way intersection gets me the list of hotels that appear in all three tables; the minus operator takes the list of hotel with the correct service_id and eliminates from it the hotels that appear in the intersection – giving me the result I want.

For my tiny data set, this is the plan I got:

--------------------------------------------------------------------------------------------------
| Id  | Operation             | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                          |     1 |  2159 |     8  (50)| 00:00:01 |
|   1 |  MINUS                |                          |       |       |            |          |
|   2 |   SORT UNIQUE NOSORT  |                          |     1 |  2015 |     2  (50)| 00:00:01 |
|*  3 |    INDEX FULL SCAN    | LF_HOTEL_TEMP_PK         |     1 |  2015 |     1   (0)| 00:00:01 |
|   4 |   INTERSECTION        |                          |       |       |            |          |
|   5 |    INTERSECTION       |                          |       |       |            |          |
|   6 |     SORT UNIQUE NOSORT|                          |     4 |    48 |     2  (50)| 00:00:01 |
|*  7 |      INDEX FULL SCAN  | LF_TS_ROOMTYPE_PROP_IDX  |     4 |    48 |     1   (0)| 00:00:01 |
|   8 |     SORT UNIQUE NOSORT|                          |     4 |    48 |     2  (50)| 00:00:01 |
|*  9 |      INDEX FULL SCAN  | LF_GTA_ROOMTYPE_PROP_IDX |     4 |    48 |     1   (0)| 00:00:01 |
|  10 |    SORT UNIQUE NOSORT |                          |     4 |    48 |     2  (50)| 00:00:01 |
|* 11 |     INDEX FULL SCAN   | LF_HB_ROOMTYPE_PROP_IDX  |     4 |    48 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("SERVICE_ID"=TO_NUMBER(:P_SERVICE_ID))
       filter("SERVICE_ID"=TO_NUMBER(:P_SERVICE_ID))
   7 - filter("HOTEL_CODE" IS NOT NULL)
   9 - filter("HOTEL_CODE" IS NOT NULL)
  11 - filter("HOTEL_CODE" IS NOT NULL)

Important note: I am not claiming that this use of set operators will be more efficient than a filter subquery or anti-join/semi-join approach, performance ultimately depends on the volume and patterns in the data combined with the available indexing. In this case you can almost see the classic performance compromise that we often see in Oracle – even in the trade-off between something as simple as choosing between a hash join and a nested loop join – should we operate this query as a tiny number of “bulk” operations, or as a (potentially) large number of tiny, high-precision operations.

If the original query was spending all it’s time on CPU running lots of subqueries, or doing lots of single block random I/Os because of the random ordering of the subqueries, then perhaps a couple of brute force “db file parallel read” index full scans would be a friendlier use of the available resources, run more quickly, and have less impact on every other user.

 


Index Usage

Tue, 2016-03-29 04:53

There are some questions about Oracle that are like the mythical Hydra – you think you’ve killed it, but for every head you cut off another two grow. The claim that “the optimizer will switch between using an index and doing a tablescan when you access more than X% of the data” re-appeared on the OTN database forum a little while ago – it doesn’t really matter what the specific value of X was – and it’s a statement that needs to be refuted very firmly because it’s more likely to cause problems than it is to help anyone understand what’s going on.

At a very informal level we may have an intuitive feeling that for a “precise” query accessing a “small” amount of data an indexed access path should make sense while for a “big” query accessing a “large” amount of data we might expect to see a tablescan, but any attempt to give a meaning to “small” and “large” that is both general purpose and strictly quantified will be wrong: there are too many variables involved.

Just as a quick demonstration of how badly we can be misled by a simple numeric claim here’s a quick test I created on a newly installed instance of 11.2.0.4, which I happened to set up with a locally defined tablespace using uniform extents of 1MB using the default 8KB blocksize but with manual (freelist) space management:


rem
rem     Script:   index_usage_pct.sql
rem     Dated:    March 2016
rem     Author:   J P Lewis
rem

drop table t1;

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id 
        from dual 
        connect by 
                level <= 1e4
)
select
        cast(rownum as number(8,0))                              id,
        cast(trunc(dbms_random.value(0,1e6)) as number(8,0))     n1,
        lpad(rownum,6,'0')              v1,
        rpad('x',10,'x')                small_vc,
        rpad('x',151,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;
begin dbms_stats.gather_table_stats( ownname => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
end;
/

create index t1_i1 on t1(id);

spool index_usage_pct.lst

select  num_rows, blocks, avg_row_len, round(num_rows/blocks) rows_per_block
from    user_tables
where   table_name = 'T1'
;

set autotrace on explain
select count(v1) from t1 where id between 1 and 245000;
set autotrace off

spool off

I’ve created a table with 1 million rows; the rows are about 180 bytes long (you’ll see the sizes a few lines further down the page), so it’s not an unreasonable model for lots of tables in typical systems – if you want to experiment further you can adjust the rpad() in the padding column; and I’ve created an index on a sequentially  (rownum) generated column. My call to autotrace will produce a truthful execution plan for the query supplied – there’s no risk of unexpected type conversion and no problems from bind variable peeking. As you can easilky infer, my query will access 245,000 rows in the table of 1,000,000 – nearly a quarter of the table. Would you expect to see Oracle use the index ?

Here’s the output from the script on MY brand new database, instance, and tablespace:


  NUM_ROWS     BLOCKS AVG_ROW_LEN ROWS_PER_BLOCK
---------- ---------- ----------- --------------
   1000000      25642         180             39

1 row selected.


 COUNT(N1)
----------
    245000

1 row selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 269862921

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |     1 |    10 |  6843   (1)| 00:01:23 |
|   1 |  SORT AGGREGATE              |       |     1 |    10 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |   245K|  2392K|  6843   (1)| 00:01:23 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |   245K|       |   553   (1)| 00:00:07 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID">=1 AND "ID"<=245000)


There are no tricks involved here, no cunning fiddling with data structures or parameters – this is just a simple, straightforward, test.

Of course the result is probably a little counter-intuitive; 24.5% of the data seems a lot for the optimizer to pick an index. There are many reasons for this, the first being that the data is very well clustered relative to the index – the index’s clustering_factor is the smallest it could be for a B-tree indexing every row in this table.

Another important feature, though, is that I haven’t done anything with the system statistics so the optimizer was using various default values which tell it that a multiblock read will be quite small (eight blocks) and a lot slower than a single block read (26 ms vs. 12 ms). One simple change that many people might have made during or shortly after installation (though it shouldn’t really be done in any modern version of Oracle) is to set the db_file_multiblock_read_count parameter to 128 – with just this change the optimizer would assume that a multiblock read really would be 128 blocks, but that it would now take 266 ms. That means the optimizer will assume that the read will be ten times slower than it was, but will read 32 times as much data – a fairly significant relative improvement thanks to which the access path for my initial query will switch to a full tablescan and won’t switch back to an index range scan until I reduce the range from 245,000 to something like 160,000.

I can go further, of course. With a few background queries running to exercise the database I executed the dbms_stats.gather_system_stats() procedure with the ‘start’ and ‘stop’ options to collect some figures about the hardware and expected workload. This gave me the following results,  interpreted from the sys.aux_stats$ table:


MBRC       :126
MREADTIM   :0.902
SREADTIM   :0.386
CPUSPEED   :976

With the optmizer using these figures to compare the relative speed and size of single and multiblock reads I had to reduce my selected range to roughly 51,000 before the optimizer would choose the index range scan.

I could go on to demonstrate the effects of the dbms_resource_manager.calibrate_io procedure and the effects of allowing different degrees of parallelism with different system stats, but I think I’ve probably made the point that there’s a lot of variation in the break point between index range scans and tablescans EVEN when you don’t change the data. With this very well ordered (perfect clustering_factor) data I’ve seen the break point vary between 51,000 rows and 245,000 rows (5% and 25%).

And finally …

Let’s just finish with a last (and probably the most important) variation:  changing the pattern in the data we want from perfectly clustered to extremely scattered. If you check the query that generated the data you’ll see that we can do this by creating an index on column n1 instead of column id, and changing the where clause in the test query to n1 between 1 and 4500 (which, in my case, returned slightly more that 4,500 rows thanks to a small amount of duplication generated by the call to dbms_random.value()). At slightly under 0.5% of the data (and with my most recent settings for the system statistics) the optimizer chose to use a tablescan.

Remember, there are many factors involved in the optimizer choosing between a tablescan and an index range scan and one of the most significant factors in the choice is the (apparent) clustering of the data so, if you haven’t come across it before, you should examine the “table_cached_blocks” option that appeared in 11.2.0.4 for the procedure dbms_stats.set_table_prefs() as this allows you to give the optimizer a better idea of how well your data really is clustered.

Addendum (April 2016)

Following on from the demonstration of how changes in parameters, patterns and statistics can make a difference in what we (or the optimizer) might consider a “small” amount of data and whether an indexed access path would be appropriate, it’s worth mentioning that the Exadata technologies of smart scans and hybrid columnar compression and Oracle’s latest technology of In-Memory Colum Store do not change the way you think about indexes – they only change the (unspecifiable) volume at which an index ceases to be the better option to use.

 


Index Speculation

Sun, 2016-03-20 17:32

There’s a current question on the OTN database forum as follows (with a little cosmetic adjustment):

I have a request for 3 indices as shown below. Does the 1st index suffice for 2 and 3?  Do I need all 3?

  • create index idx_atm_em_eff_ver_current_del on atm_xxx_salary (employee_key, effective_dt, salary_version_number, is_current, is_deleted);
  • create index idx_atm_em_ver_current on atm_xxx_salary (employee_key, salary_version_number, is_current);
  • create index .idx_atm_sal_current_del on atm_xxx_salary (employee_key, is_deleted, is_current);

In the absence of any information about the data and the application the correct answer is: “How could we possibly tell?”

On the other hand there’s plenty of scope for intelligent speculation, and that’s an important skill to practise because when we’re faced with a large number of options and very little information we need to be able to make best-guess choices about which ones are most likely to be worth the effort of pursuing in detail. So if we have to make some guesses about this table and the set of indexes shown, are there any reasonable guesses we might make.

I’ve highlighted the table name and leading column for the first index. The table seems to about salary and the leading column seems to identify an employee. In fact we see that all three indexes start with the employee_key and that may be what prompted the original question. Previous (“real-world”) experience tells me that employees are, generally, paid a salary and that salaries are likely to change (usually upwards) over time, and I note that another column in one of these indexes is effective_dt (date ?), and a third column (appearing in two of the indexes) is is_current.

This looks like a table of employee salaries recording their current and historic salaries, engineered with a little redundant information to make it easy to find the current salary. (Perhaps there’s a view of current_salary defined as is_current = ‘Y’ and is_deleted = ‘N’.)

It’s harder to speculate with any confidence on the columns is_deleted and salary_version_number;  why would a salary row be marked as deleted – is this something that happens when an employee leaves or an employee is deleted (or, following the pattern, has their is_deleted flag set to ‘Y’); why does a salary have a version number – does the table contain it’s own audit trail of errors and corrections, perhaps a correction is effected by marking the incorrect entry as deleted and incrementing its version number to generate the version number for the correct entry. Possibly the notional primary key of the table is (employee_key, effective_dt, is_deleted, salary_version_number).

The level of complexity surrounding these two columns could send further speculation in completely the wrong direction, but let’s follow the line that these two columns see very little action – let’s assume that most of the data is not “deleted” and virtually none of the data needs “versioning”. How does this assumption help us with the original question.

The largest employer in the world is the America Department of Defence with 3.2 million employees (following by the People’s Liberation Army of China with only 2.3 million employees), so an “employees” table is not really likely to be very big. How often does an employee have a salary review and change ? Would once per year be a reasonable figure to pluck from the air ? How many employess stay at the same company for 40 years – how many rows per employee would you end up with, and how scattered would they be through the salary table ?

Under any reasonable estimate it seems likely that if you created the first index (5 columns) then all the salary rows for a given employee are likely to be contained in a single leaf block, so if all the searches were driven by employee then that single index would allow exactly the correct set of table rows to be identified from one index leaf block access plus a little extra CPU.

Of course it’s possible that, with different circumstances, the size and clustering factor of the first index would be so much greater than the size and clustering factors of the other two that a query that would use one of the smaller indexes won’t use the larger index -but in this case the most significant contributor to the optimizer’s cost is likely to be the clustering_factor and given our assumption of the slow appearance over time of the new salaries for an employee the clustering factor of all three indexes is likely to be the same (probably very similar to the number of rows in the salary table).

Having got this far, it’s worth considering whether or not the salary table should actually be an index-organized table – it looks like an obvious candidate; how many other columns are there likely to be in a salary table ? Of course it’s worth thinking about other queries that might access a salary table without reference to the employees table at that point, perhaps a secondary index on (is_current, employee_key) might be appropriate, but in the absence of any other information we’ve reached the point where speculation needs to be backed up by some hard facts.

Bottom Line:

I wouldn’t guarantee that the first index makes the other two indexes redundant but it seems highly likely that it should and it’s probably worth spending some time looking at the requirements and numbers a little more closely – especially if you’re the US DoD or the Chinese People’s Liberation Army.

 

 

 

 


Hinting

Thu, 2016-03-17 08:10

A posting on the OTN database forum a few days ago demonstrated an important problem with hinting – especially (though it didn’t come up in the thread)  in the face of upgrades. A simple query needed a couple of hints to produce the correct plan, but a slight change to the query seemed to result in Oracle ignoring the hints. The optimizer doesn’t ignore hints, of course, but there are many reasons why it might have appeared to so I created a little demonstration of the problem – starting with the following data set:

rem
rem     Script:  OTN_DAG.sql
rem     Author:  J.P.Lewis
rem     Dated:   March 2016
rem

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        mod(rownum,200)         n1,
        mod(rownum,200)         n2,
        rpad(rownum,180)        v1
from
        generator       g1,
        generator       g2
where
        rownum <= 24000
;

create table t2
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc((rownum-1)/15)    n1,
        trunc((rownum-1)/15)    n2,
        rpad(rownum,180)        v1
from    generator
where
        rownum <= 3000
;
begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

(Ignore the silliness of the way I’ve created the data, it’s a consequence of using my standard template).

For every row in t2 there are 8 rows in t1, so when I join t1 to t2 on n2 it would obviously be sensible for the resulting hash join to use the t2 (smaller) data set as the build table and the t1 data set as the probe table, but I’m going to pretend that the optimizer is making an error and needs to be hinted to use t1 as the build table and t2 as the probe. Here’s a query, and execution plan, from 11.2.0.4:

explain plan for
select
        /*+ leading(t1) use_hash(t2) no_swap_join_inputs(t2) */
        count(t1.n2)
from
        t1, t2
where
        t2.n2 = t1.n2
and     t1.n1 = 15
and     t2.n1 = 15
;

select * from table(dbms_xplan.display(null,null,'outline alias'));

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    16 |    97   (3)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |    16 |            |          |
|*  2 |   HASH JOIN         |      |    20 |   320 |    97   (3)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T1   |   120 |   960 |    85   (3)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T2   |    15 |   120 |    12   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."N2"="T1"."N2")
   3 - filter("T1"."N1"=15)
   4 - filter("T2"."N1"=15)

As you can see, the optimizer has obeyed my hinting – the join order is t1 -> t2, I’ve used a hash join to join t2, and Oracle hasn’t swapped the join inputs despite the fact that the t1 data set is larger than the t2 data set (960 bytes vs. 120 bytes) which should have persuaded it to swap. (Technically, the leading() hint seems to block the swap of the first two tables anyway – see the “Special Case” section at this URL, but I’ve included it the no_swap_join_inputs() anyway to make the point explicit.)

So now, instead of just count n2, we’ll modify the query to count the number of distinct values for n2:


explain plan for
select
        /*+ leading(t1) use_hash(t2) no_swap_join_inputs(t2) */
        count(distinct t1.n2) 
from
        t1, t2
where
        t2.n2 = t1.n2
and     t1.n1 = 15
and     t2.n1 = 15
;

select * from table(dbms_xplan.display(null,null,'outline alias'));

----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |     1 |    13 |    98   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE       |          |     1 |    13 |            |          |
|   2 |   VIEW                | VW_DAG_0 |    20 |   260 |    98   (4)| 00:00:01 |
|   3 |    HASH GROUP BY      |          |    20 |   320 |    98   (4)| 00:00:01 |
|*  4 |     HASH JOIN         |          |    20 |   320 |    97   (3)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL| T2       |    15 |   120 |    12   (0)| 00:00:01 |
|*  6 |      TABLE ACCESS FULL| T1       |   120 |   960 |    85   (3)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."N2"="T1"."N2")
   5 - filter("T2"."N1"=15)
   6 - filter("T1"."N1"=15)

Check operations 5 and 6 – Oracle has swapped the join inputs: t2 (the obvious choice) is now the build table. Has Oracle ignored the hint ? (Answer: No).
If you look at operation 2 you can see that Oracle has generated an internal view called VW_DAG_0 – this is an example of the “Distinct Aggregate” transformation taking place. It seems to be a pointless exercise in this case and the 10053 trace file seems to indicate that it’s a heuristic transformation rather than cost-based transformation (i.e. the optimizer does it because it can, not because it’s cheaper). Oracle has transformed the SQL to the following (to which I have applied a little cosmetic tidying):


SELECT  /*+ LEADING (T1) */
        COUNT(VW_DAG_0.ITEM_1) "COUNT(DISTINCTT1.N2)"
FROM    (
        SELECT  T1.N2 ITEM_1
        FROM    TEST_USER.T2 T2,TEST_USER.T1 T1
        WHERE   T2.N2=T1.N2
        AND     T1.N1=15
        AND     T2.N1=15
        GROUP BY
                T1.N2
        ) VW_DAG_0

Notice how the use_hash() and no_swap_join_input() hints have disappeared. I am slightly surprised that the leading() hint is still visible, I would have expected all three to stay or all three to disappear; regardless of that, though, the single remaining hint references an object that does not exist in the query block where the hint has been placed. The original hint has not been “ignored”, it has become irrelevant. (I’ll be coming back to an odd little detail about this transformed query a little later on but for the moment I’m going to pursue the problem of making the optimizer do what we want.)

We have three strategies we could pursue at this point. We could tell the optimizer that we don’t want it to do the transformation; we could work out the query block name of the query block that holds t1 and t2 after the transformation and direct the hints into that query block; or we could tell Oracle to pretend it was using an older version of the optimizer because that Distinct Aggregate transformation only appeared in 11.2.0.1.

You’ll notice that I used the ‘alias’ formatting command in my call to dbms_xplan.display() – this is the queryblock / alias section of the output:


Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$C33C846D
   2 - SEL$5771D262 / VW_DAG_0@SEL$C33C846D
   3 - SEL$5771D262
   5 - SEL$5771D262 / T1@SEL$1
   6 - SEL$5771D262 / T2@SEL$1

Strategy A says try adding the hint: /*+ no_transform_distinct_agg(@sel$1) */
Strategy B says try using the hints: /*+ leading(@SEL$5771D262 t1@sel$1 t2@sel$1) use_hash(@SEL$5771D262 t2@sel$1 no_swap_join_inputs(@SEL$5771D262 t2@sel$1) */
Strategy C says try adding the hint: /*+ optimizer_features_enable(‘11.1.0.7’) */

Strategies A and C (stopping the transformation) produce the following plan:


----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    16 |    98   (4)| 00:00:01 |
|   1 |  SORT GROUP BY      |      |     1 |    16 |            |          |
|*  2 |   HASH JOIN         |      |    20 |   320 |    98   (4)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T1   |   120 |   960 |    85   (3)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T2   |    15 |   120 |    12   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."N2"="T1"."N2")
   3 - filter("T1"."N1"=15)
   4 - filter("T2"."N1"=15)

Strategy B (allowing the transformation, but addressing the hints to the generated query block) produces this plan:


----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |     1 |    13 |    98   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE       |          |     1 |    13 |            |          |
|   2 |   VIEW                | VW_DAG_0 |    20 |   260 |    98   (4)| 00:00:01 |
|   3 |    HASH GROUP BY      |          |    20 |   320 |    98   (4)| 00:00:01 |
|*  4 |     HASH JOIN         |          |    20 |   320 |    97   (3)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL| T1       |   120 |   960 |    85   (3)| 00:00:01 |
|*  6 |      TABLE ACCESS FULL| T2       |    15 |   120 |    12   (0)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."N2"="T1"."N2")
   5 - filter("T1"."N1"=15)
   6 - filter("T2"."N1"=15)

All three Strategies have produced plans that use t1, the larger data set, as the build table. It’s hard to resist asking if it’s possible to claim that one of the three strategies is the best strategy; it’s hard to say, but I think I’d favour using the no_transform_distinct_agg() hint because it’s precisely targetted – so avoids the brute force thuggish nature of the reverting back to an old version, and avoids the (possble) fragility of needing to know a very precise query block name which (possibly) might change for some reason if the query were to be modified very slightly. The argument, of course, comes from the perspective of a friendly consultant who visits for a couple of days, gets a bit clever with your SQL, then walks away leaving you to worry about whether you understand why your SQL now works the way it does.

Upgrades

My opening comment was about the difficulty of hinting across upgrades. Imagine you had been running this count(distinct) query in 10.2.0.5, and after some experimention had found that you got the path you needed by adding the hints: /*+ leading(t1 t2)  full(t1) use_hash(t2) no_swap_join_inputs(t2) full(t2) */. This is a careful and thorough piece of hinting (and it does work, of course, in 10.2.0.5).

When the big day for upgrading to 11.2 arrives (just in time for Oracle to ends extended support, possibly) you find that this query changes its execution plan. And this is NOT a rare occurrence. I’ve said it before, and I’ll keep saying it: hinting – especially with “micro-management” hints – is undesirable in a production system. You probably haven’t done it right, and even if the hints are (broadly speaking) perfect in the current version they may be pushed out of context by a new feature in the next version.  If you’ve hinted your code you have to check every single hinted statement to make sure the hints still have the same effect on the upgrade.

This is why I produce the sound-bite (which Maria Colgan nicked): “if you can hint it, baseline it”.  If you had generated a baseline (or outline) from a query with these hints in 10g Oracle would have included the /*+ optimizer_features_enable(‘10.2.0.5’) */ hint with the functional hints, and the upgrade wouldn’t have produced a different plan.

Technically, of course, you could have remembered to add the hint to your production code – but in many cases Oracle introduces far more hints in an SQL Baseline than you might want to put into your code; and by using the SQL Baseline approach you’ve given yourself the option to get rid of the “hidden hinting” in a future version of Oracle by dropping the baseline rather than rewriting the code and (perhaps) recompiling the application.

Inevitably there are cases where setting the optimizer_features_enable backwards doesn’t rescue new from a new plan – there are probably a few cases where the internal code forgets to check the value and bypass some subroutines; more significantly there are cases where one version of Oracle will give you an efficient plan because of an optimizer bug and setting the version backwards won’t re-introduce that bug.

Footnote

I said I’d come back to the “unparsed” query that the optimizer generated from the original count(distinct) statement and the way it left the leading(t1) hint in place but lost the use_hash(t2) and no_swap_join_inputs(t2). I got curious about how Oracle would optimize that query if I supplied it from SQL*Plus – and this is the plan I got:


explain plan for
SELECT  /*+ LEADING (T1) */
        COUNT(VW_DAG_0.ITEM_1) "COUNT(DISTINCTT1.N2)"
FROM    (
        SELECT  T1.N2 ITEM_1
        FROM    TEST_USER.T2 T2,TEST_USER.T1 T1
        WHERE   T2.N2=T1.N2
        AND     T1.N1=15
        AND     T2.N1=15
        GROUP BY
                T1.N2
        ) VW_DAG_0
;

-----------------------------------------------------------------------------------
| Id  | Operation             | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |           |     1 |    13 |    98   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE       |           |     1 |    13 |            |          |
|   2 |   VIEW                | VM_NWVW_0 |    20 |   260 |    98   (4)| 00:00:01 |
|   3 |    HASH GROUP BY      |           |    20 |   320 |    98   (4)| 00:00:01 |
|*  4 |     HASH JOIN         |           |    20 |   320 |    97   (3)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL| T1        |   120 |   960 |    85   (3)| 00:00:01 |
|*  6 |      TABLE ACCESS FULL| T2        |    15 |   120 |    12   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."N2"="T1"."N2")
   5 - filter("T1"."N1"=15)
   6 - filter("T2"."N1"=15)

Oracle has managed to do a transformation to this statement that it didn’t do when it first generated the statement – too much recursion, perhaps – and that floating leading(t1) hint has been squeezed back into action by a view-merging step in the optimization that got the hint back into a query block that actually contained t1 and t2!  At this point I feel like quoting cod-philosophy from the Dune trilogy: “Just when you think you understand …”

 


Quiz

Mon, 2016-03-14 16:38

Can you spot anything that might appear to be a little surprising about this (continuous) extract from a trace file ? The example is from 10.2.0.5, but 11g and 12c could produce similar results (or direct path read equivalents):


PARSING IN CURSOR #3 len=23 dep=0 uid=30 oct=3 lid=30 tim=112607775392 hv=4235652837 ad='2f647320'
select count(*) from t1
END OF STMT
PARSE #3:c=0,e=53,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=112607775385
EXEC #3:c=0,e=99,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=112607787370
WAIT #3: nam='SQL*Net message to client' ela= 9 driver id=1111838976 #bytes=1 p3=0 obj#=10443 tim=112607789931
WAIT #3: nam='db file sequential read' ela= 415 file#=5 block#=9 blocks=1 obj#=21580 tim=112607795682
WAIT #3: nam='db file scattered read' ela= 2785 file#=5 block#=905 blocks=98 obj#=21580 tim=112607801263
WAIT #3: nam='db file scattered read' ela= 2919 file#=5 block#=777 blocks=128 obj#=21580 tim=112607808280
WAIT #3: nam='db file scattered read' ela= 2066 file#=5 block#=649 blocks=128 obj#=21580 tim=112607813300
WAIT #3: nam='db file scattered read' ela= 1817 file#=5 block#=521 blocks=128 obj#=21580 tim=112607817243
WAIT #3: nam='db file scattered read' ela= 1563 file#=5 block#=393 blocks=128 obj#=21580 tim=112607820899
WAIT #3: nam='db file scattered read' ela= 1605 file#=5 block#=265 blocks=128 obj#=21580 tim=112607824710
WAIT #3: nam='db file scattered read' ela= 1529 file#=5 block#=137 blocks=128 obj#=21580 tim=112607828296
WAIT #3: nam='db file scattered read' ela= 1652 file#=5 block#=10 blocks=127 obj#=21580 tim=112607831946
FETCH #3:c=15625,e=41568,p=994,cr=996,cu=0,mis=0,r=1,dep=0,og=1,tim=112607834004
WAIT #3: nam='SQL*Net message from client' ela= 254 driver id=1111838976 #bytes=1 p3=0 obj#=21580 tim=112607835527
FETCH #3:c=0,e=3,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,tim=112607836780
WAIT #3: nam='SQL*Net message to client' ela= 2 driver id=1111838976 #bytes=1 p3=0 obj#=21580 tim=112607837935
WAIT #3: nam='SQL*Net message from client' ela= 14371 driver id=1111838976 #bytes=1 p3=0 obj#=21580 tim=112607853526
=====================
PARSING IN CURSOR #2 len=52 dep=0 uid=30 oct=47 lid=30 tim=112607855239 hv=1029988163 ad='2f6c5ec0'
BEGIN DBMS_OUTPUT.GET_LINES(:LINES, :NUMLINES); END;
END OF STMT
PARSE #2:c=0,e=51,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=112607855228
WAIT #2: nam='SQL*Net message to client' ela= 4 driver id=1111838976 #bytes=1 p3=0 obj#=21580 tim=112607861803
EXEC #2:c=0,e=1271,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=1,tim=112607862976
WAIT #2: nam='SQL*Net message from client' ela= 1093883 driver id=1111838976 #bytes=1 p3=0 obj#=21580 tim=112608958078
STAT #3 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=996 pr=994 pw=0 time=41588 us)'
STAT #3 id=2 cnt=8332 pid=1 pos=1 obj=21580 op='TABLE ACCESS FULL T1 (cr=996 pr=994 pw=0 time=144816 us)'

Update

If you look at the values for block# in the “db file scattered read” waits you’ll notice that they appear in descending order. This looks like a tablescan running backwards – and that’s not a coincidence, because that’s what it is.

It’s obviously a good strategy to have because if you do a big tablescan it’s the blocks at the END of the table which are mostly likely to be subject to change by other sessions [unless, see comment 4, you’ve done a purge of historic data] and the longer it takes you to get there the more work you’ll have to do to get consistent read versions of the last blocks in the table, so reading the last blocks first should, generally, reduce the workload – and the risk of ORA-01555: snapshot too old. Strangely it’s not documented – but it’s been around for years – at least since 10.2.0.5, if not earlier releases of 10g, through event 10460.

The topic came up in a conversation on the Oracle-L list server a few years ago, with Tanel Poder supplying the event number, but I had forgotten about it until I rediscovered the thread by accident a little while ago.

It’s not a supported feature, of course – but if you run into serious performance problems with tablescans doing lots of work with the undo tablespace (physical reads, lots of undo records applied for consistent read, etc.) while a lot of update activity is going on, then have a chat with Oracle support to see if it’s an allowed workaround.

 

 


Wrong Results ?

Fri, 2016-03-11 03:18

I gather that journalistic style dictates that if the headline is a question then the answer is no. So, following on from a discussion of possible side effects of partition exchange, let’s look at an example which doesn’t involve partitions.  I’ve got a schema that holds nothing by two small, simple heap tables, parent and child, (with declared primary keys and the obvious referential integrity constraint) and I run a couple of very similar queries that produce remarkably different results:


select
        par.id      parent_id,
        chi.id      child_id,
        chi.name    child_name
from
        parent  par,
        child   chi
where
        chi.id_p = par.id
order by
        par.id, chi.id
;

 PARENT_ID   CHILD_ID CHILD_NAME
---------- ---------- ----------
         1          1 Simon
         1          2 Sally
         2          1 Janet
         2          2 John
         3          1 Orphan

5 rows selected.

Having got this far with my first query I’ve decided to add the parent name to the report:


select
        par.id      parent_id,
        par.name    parent_name,
        chi.id      child_id,
        chi.name    child_name
from
        parent  par,
        child   chi
where
        chi.id_p = par.id
order by
        par.id, chi.id
;

 PARENT_ID PARENT_NAM   CHILD_ID CHILD_NAME
---------- ---------- ---------- ----------
         1 Smith2              1 Simon
         1 Smith               1 Simon
         1 Smith2              2 Sally
         1 Smith               2 Sally
         2 Jones               1 Janet
         2 Jones               2 John

6 rows selected.

How could adding a column to the select list result in one child row disappearing and two child rows being duplicated; and is this a bug ?

To avoid any confusion, here’s the complete script I used for creating the schema owner, in 11.2.0.4, with no extra privileges granted to PUBLIC:


create user u1
        identified by u1
        default tablespace test_8k
        quota unlimited on test_8k
;

grant
        create session,
        create table
to
        u1
;


Update

It didn’t take long for a couple of people to suggest that the oddity was the consequence of constraints that had not been enabled and validated 100% of the time, but the suggestions offered were a little more convoluted than necessary. Here’s the code I ran from my brand new account before running the two select statements:


create table parent (
        id      number(4),
        name    varchar2(10),
        constraint par_pk primary key (id)
        rely disable novalidate
)
;

create table child(
        id_p    number(4)
                constraint chi_fk_par
                references parent
                on delete cascade
                rely disable novalidate,
        id      number(4),
        name    varchar2(10),
        constraint chi_pk primary key (id_p, id)
                rely disable novalidate
)
;

insert into parent values (1,'Smith');
insert into parent values (1,'Smith2');
insert into parent values (2,'Jones');

insert into child values(1,1,'Simon');
insert into child values(1,2,'Sally');

insert into child values(2,1,'Janet');
insert into child values(2,2,'John');

insert into child values(3,1,'Orphan');

commit;

begin
        dbms_stats.gather_table_stats(user,'child');
        dbms_stats.gather_table_stats(user,'parent');
end;
/


In a typical data warehouse frame of mind I’ve added plenty of constraints, but left them all disabled and novalidated, but told Oracle to rely on them for optimisation strategies. This means all sorts of incorrect data could get into the tables, with all sorts of unexpected side effects on reporting. The example above shows duplicates on primary keys (and if you checked the table definition you’d find that the primary key columns were nullable as well), child rows with no parent key.

In fact 11g and 12c behave differently – the appearance of the Orphan row in the first sample query is due, as Chris_cc pointed out in the first comment, to the optimizer deciding that it could use join elimination because it was joining to a single-column primary key without selecting any other columns from the referenced table. In 12c the optimizer doesn’t use join elimination for this query, so both queries have the same (duplicated) output.

Update:

Make sure you read the articles linked to by Dani Schneider’s comment below, and note especially the impact on the query_rewrite_integrity parameter.


Wrong Results

Tue, 2016-03-08 12:57

Just in – a post on the Oracle-L mailing lists asks: “Is it a bug if a query returns one answer if you hint a full tablescan and another if you hint an indexed access path?” And my answer is, I think: “Not necessarily”:


SQL> select /*+ full(pt_range)  */ n2 from pt_range where n1 = 1 and n2 = 1;

        N2
----------
         1
SQL> select /*+ index(pt_range pt_i1) */ n2 from pt_range where n1 = 1 and n2 = 1;

        N2
----------
         1
         1

The index is NOT corrupt.

The reason why I’m not sure you should call this a bug is that it is a side effect of putting the database into an incorrect state. You might have guessed from the name that the table is a (range) partitioned table, and I’ve managed to get this effect by doing a partition exchange with the “without validation” option.


create table t1 (
        n1      number(4),
        n2      number(4)
);

insert into t1
select  rownum, rownum
from    all_objects
where   rownum <= 5
;

create table pt_range (
        n1      number(4),
        n2      number(4)
)
partition by range(n1) (
        partition p10 values less than (10),
        partition p20 values less than (20)
)
;

insert into pt_range
select
        rownum, rownum
from
        all_objects
where
        rownum <= 15
;
create index pt_i1 on pt_range(n1,n2);

begin
        dbms_stats.gather_table_stats(
                ownname    => user,
                tabname    => 'T1',
                method_opt => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname    => user,
                tabname    => 'PT_RANGE',
                method_opt => 'for all columns size 1'
        );
end;
/

alter table pt_range
exchange partition p20 with table t1
including indexes
without validation
update indexes
;

The key feature (in this case) is that the query can be answered from the index without reference to the table. When I force a full tablescan Oracle does partition elimination and looks at just one partition; when I force the indexed access path Oracle doesn’t eliminate rows that belong to the wrong partition – though technically it could (because it could identify the target partition by the partition’s data_object_id which is part of the extended rowid stored in global indexes).

Here are the two execution plans (from 11.2.0.4) – notice how the index operation has no partition elimination while the table operation prunes partitions:


select /*+ full(pt_range)  */ n2 from pt_range where n1 = 1 and n2 = 1

---------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |       |       |     2 (100)|          |       |       |
|   1 |  PARTITION RANGE SINGLE|          |     1 |     6 |     2   (0)| 00:00:01 |     1 |     1 |
|*  2 |   TABLE ACCESS FULL    | PT_RANGE |     1 |     6 |     2   (0)| 00:00:01 |     1 |     1 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("N1"=1 AND "N2"=1))


select /*+ index(pt_range pt_i1) */ n2 from pt_range where n1 = 1 and n2 = 1

--------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |       |       |       |     1 (100)|          |
|*  1 |  INDEX RANGE SCAN| PT_I1 |     1 |     6 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("N1"=1 AND "N2"=1)


Note: If I had a query that did a table access by (global) index rowid after the index range scan it WOULD do partition elimination and visit just the one partition – never seeing the data in the wrong partition.

So is it a bug ? You told Oracle not to worry about bad data – so how can you complain if it reports bad data.

Harder question – which answer is the “right” one – the answer which shows you all the data matching the query, or the answer which shows you only the data that is in the partition it is supposed to be in ?


Pages