Jonathan Lewis

Oracle Scratchpad
Just another Oracle weblog

Shrink Tablespace

Fri, 2014-09-19 05:10

A recent question on the OTN database forum raised the topic of returning free space in a tablespace to the operating system by rebuilding objects to fill the gaps near the start of files and leave the empty space at the ends of files so that the files could be resized downwards.

This isn’t a process that you’re likely to need frequently, but I have written a couple of notes about it, including a sample query to produce a map of the free and used space in a tablespace. While reading the thread, though, it crossed my mind that recent versions of Oracle introduced a feature that can reduce the amount of work needed to get the job done, so I thought I’d demonstrate the point here.
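For reference, the sample query is essentially a union of dba_extents and dba_free_space, ordered by file and block – here’s a sketch of the idea (not the published script; the tablespace name is just a placeholder):

select
        file_id, block_id, block_id + blocks - 1    end_block,
        owner, segment_name, segment_type
from
        dba_extents
where
        tablespace_name = 'TEST_TS'
union all
select
        file_id, block_id, block_id + blocks - 1,
        'free', 'free', null
from
        dba_free_space
where
        tablespace_name = 'TEST_TS'
order by
        file_id, block_id
;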

When you move a table its indexes become unusable and have to be rebuilt; but when an index becomes unusable, the more recent versions of Oracle will drop its segment. Here’s the key point – if the index becomes unusable because the table has been moved, the segment is dropped only AFTER the move has completed. Pause a minute for thought and you realise that the smart thing to do is to mark a table’s indexes unusable so that they release their space BEFORE you move the table. (This strategy is only relevant if you’ve mixed tables and indexes in the same tablespace and if you’re planning to do all your rebuilds into the same tablespace rather than moving everything into a new tablespace.)

Here are some outputs demonstrating the effect in a 12.1.0.2 database. I have created (and loaded) two tables in a tablespace of 1MB uniform extents, 8KB block size; then I’ve created indexes on the two tables. Running my ts_hwm.sql script I get the following results for that tablespace:


FILE_ID    BLOCK_ID   END_BLOCK OWNER      SEGMENT_NAME    SEGMENT_TYPE
------- ----------- ----------- ---------- --------------- ------------------
      5         128         255 TEST_USER  T1              TABLE
                256         383 TEST_USER  T2              TABLE
                384         511 TEST_USER  T1_I1           INDEX
                512         639 TEST_USER  T2_I1           INDEX
                640      65,535 free       free

Notice that it was a nice new tablespace, so I can see the two tables followed by the two indexes at the start of the tablespace. If I now move table t1 and re-run the script this is what happens:


alter table t1 move;

FILE_ID    BLOCK_ID   END_BLOCK OWNER      SEGMENT_NAME    SEGMENT_TYPE
------- ----------- ----------- ---------- --------------- ------------------
      5         128         255 free       free
                256         383 TEST_USER  T2              TABLE
                384         511 free       free
                512         639 TEST_USER  T2_I1           INDEX
                640         767 TEST_USER  T1              TABLE
                768      65,535 free       free

Table t1 is now situated past the previous tablespace highwater mark and I have two gaps in the tablespace where t1 and the index t1_i1 used to be.

Repeat the experiment from scratch (drop the tables, purge, etc. to empty the tablespace), but this time mark the index unusable before moving the table.
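The key steps for this second run are, roughly, as follows (a sketch – the object names match the earlier listings; the index has not yet been rebuilt at the point the space map is taken, which is why t1_i1 doesn’t appear in it):

alter index t1_i1 unusable;     -- recent versions drop the segment, releasing its space immediately
alter table t1 move;            -- the table can now drop into the gap left by t1_i1

This is what happens: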


FILE_ID    BLOCK_ID   END_BLOCK OWNER      SEGMENT_NAME    SEGMENT_TYPE
------- ----------- ----------- ---------- --------------- ------------------
      5         128         255 free       free
                256         383 TEST_USER  T2              TABLE
                384         511 TEST_USER  T1              TABLE
                512         639 TEST_USER  T2_I1           INDEX
                640      65,535 free       free

Table t1 has moved into the space vacated by index t1_i1, so the tablespace highwater mark has not moved up.

If you do feel the need to reclaim space from a tablespace by rebuilding objects, it can be quite hard to decide the order in which the objects should be moved or rebuilt to minimise the work you (or rather, Oracle) have to do. If you remember that any table you move will release its index space anyway, and insert a step to mark those indexes unusable before you move the table, you may find it much easier to work out a good order for moving the tables.

Footnote: I appreciate that some readers may already take advantage of the necessity of rebuilding indexes by dropping indexes before moving tables – but I think it’s a nice feature that we can now make them unusable and get the same benefit without introducing a risk of error when using a script to recreate an index we’ve dropped.

 


Delphix

Sat, 2014-09-13 15:29

I’ve often found in my travels that I’ve come up with a (potential) solution to a problem and wanted to test it “right now” – only to run onto the horns of a dilemma. A typical client offers me one of two options:

  • Option 1: test it on the production system – which is generally frowned on, and sometimes I can’t even get access to the production system anyway.
  • Option 2: test it on something that looks nothing like the production system – and hope that that’s in some way a valid test.

The first option – when it’s offered – is quite stressful, especially if the original performance problem had involved updates; the second option – which is the commoner offer – is also quite stressful because it’s quite hard to prove that the fix is doing what it’s supposed to do and that it will do it at the expected speed when it gets promoted to production. What I’d really like to do is clone the production system and run tests and demonstrations on the clone. In fact, that’s exactly what does happen in some rare cases – but then it usually takes several hours (or a couple of days) rather than the few minutes I’d prefer to wait.

This is why I like Delphix and why I’m prepared to say so whenever I see a need for it on a client site. It’s also why I’ve agreed to do a little on-line webinar about the product on Thursday 18th Sept at 10:00 am Pacific Time (6:00 pm UK time). The event is free, but you need to register to attend.

 

Disclosure:  The company paid me to visit their premises in Menlo Park CA last year so that I could experiment with their product and talk to their technical staff, without requesting any right to edit or limit any review I might subsequently publish about their product. The company has also paid me for my time for Thursday’s webinar but, again, has made no attempt to exercise editorial control over the content.

 

 


Unusual Deadlock

Fri, 2014-09-12 06:08

Prompted by a question on OTN I came up with a strategy for producing an ORA-00060 deadlock that DIDN’T produce a deadlock graph (because there isn’t one) and didn’t get reported in the alert log (at least, not when tested on 11.2.0.4). It’s a situation that shouldn’t arise in a production system because it’s doing the sorts of things that you shouldn’t do in a production system: but possibly if you’re trying to do some maintenance or upgrades while keeping the system live it could happen. Here’s the starting code:


drop procedure p2;
drop procedure p1;

drop table t1 purge;
create table t1 (n1 number);

insert into t1 values(1);
insert into t1 values(3);

create or replace procedure p1
as
begin
        update t1 set n1 = 2 where n1 = 1;
        dbms_lock.sleep(10);
        update t1 set n1 = 4 where n1 = 3;
end;
/

create or replace procedure p2
as

        procedure q
        is
                pragma autonomous_transaction;
        begin
                execute immediate 'drop procedure p1';
        end;

begin

        update t1 set n1 = 4 where n1 = 3;
        q;

end;
/

Of course you’re asking for trouble if you start doing DDL as part of your production code; and you’re asking for trouble if you start playing around with autonomous transactions; and if you use one to do the other you’re almost guaranteed to hit a problem eventually. All it takes in this case is a simple sequence of actions followed by a short wait (ca. 10 seconds):

Session 1: execute p1
Session 2: wait a couple of seconds, then execute p2

I got the following result from session 2:


BEGIN p2; END;

*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "TEST_USER.P2", line 8
ORA-06512: at "TEST_USER.P2", line 14
ORA-06512: at line 1

While this doesn’t show up in the alert log, I do get a trace file dumped for the session; in fact I got a trace file from both processes. The trace from the process that reported the deadlock started like this:


DEADLOCK DETECTED

  Performing diagnostic dump and signaling ORA-00060

  Complete deadlock information is located in the trace file of process (pid: 8, osid: 17861, DIA0)
    trace file: /u01/app/oracle/diag/rdbms/test/test/trace/test_dia0_17861.trc

-------------------------------------------------------------------------------

*** 2014-09-09 12:44:13.427
-------------------------------------------------------------------------------
HUNG PROCESS DIAGNOSTIC DUMP BEGIN:

    dump requested by process (pid: 8, osid: 17861, DIA0)
    trace file: /u01/app/oracle/diag/rdbms/test/test/trace/test_dia0_17861.trc
-------------------------------------------------------------------------------

The other process didn’t mention a deadlock, but started at the “HUNG PROCESS DIAGNOSTIC” line – the trace file had been triggered by the DIAG process. Of course, the DIAG process also dumped a trace file, summarising the situation; and this started with the following important note:


*** 2014-09-09 12:44:13.426
-------------------------------------------------------------------------------

DEADLOCK DETECTED (id=0xf678cfe3)

Chain Signature: 'enq: TX - row lock contention'<='library cache pin' (cycle)
Chain Signature Hash: 0x39f9868d

The rest of the trace file told me what the other two processes had been doing when the ORA-00060 was initiated, but the point I want to pick up here is that we have a very brief summary in the “Chain Signature” that tells us we’ve had a collision between some DML (enq: TX) and some sort of cursor-like problem (library cache pin) and not a simple data cross-over.

If you’ve looked at the original OTN posting you’ll see that the Chain Signature in that case is “‘library cache lock’<=’library cache pin’ (cycle)”, which indicates a collision restricted entirely to the library cache: the lock suggests someone is using a package/cursor while the pin suggests that another session is trying to destroy or (re)compile it – and they’re each trying to do it to each other’s package! (In the past, when I’ve seen a deadlock of this type it’s been reported as ORA-04020 rather than ORA-00060.) I still have to work out exactly how the OP got into their deadlock (especially in view of their latest comment) – but since I don’t have the right version of Oracle to hand, and it might be a version-specific bug, I don’t think I’m going to try.

 


Quiz Night

Tue, 2014-09-09 11:46

I have a table with several indexes on it, and I have two versions of a query that I might run against that table. Examine them carefully, then come up with some plausible reason why it’s possible (with no intervening DDL, DML, stats collection, parameter fiddling etc., etc., etc.) for the second form of the query to be inherently more efficient than the first.


select
        bit_1, id, small_vc, rowid
from
        bit_tab
where
        bit_1 between 1 and 3
;

prompt  ===========
prompt  Split query
prompt  ===========

select
        bit_1, id, small_vc, rowid
from
        bit_tab
where
        bit_1 = 1
or      bit_1 > 1 and bit_1 <= 3
;

Update / Answers

I avoided giving any details about the data and indexes in this example as I wanted to allow free rein to readers’ imagination – and I haven’t been disappointed with the resulting suggestions. The general principles of allowing more options to the optimizer, effects of partitioning, and effects of skew are all worth considering when the optimizer CAN’T use an execution path that you think makes sense. (Note: I didn’t make it clear in my original question, but I wasn’t looking for cases where you could get a better path by hinting (or profiling); I was after cases where Oracle literally could not do what you wanted.)

The specific strategy I was thinking of when I posed the question was based on a follow-up to some experiments I had done with the cluster_by_rowid() hint, and (there was a little hint in the “several indexes” and, more particularly, the column name “bit_1”) I was looking at a data warehouse table with a number of bitmap indexes. So here’s the execution plan for the first version of the query when there’s a simple bitmap index on bit_1.


------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost  |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   600 | 18000 |    96 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIT_TAB |   600 | 18000 |    96 |
|   2 |   BITMAP CONVERSION TO ROWIDS|         |       |       |       |
|*  3 |    BITMAP INDEX RANGE SCAN   | BT1     |       |       |       |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("BIT_1">=1 AND "BIT_1"<=3)

And here’s the plan for the second query:


------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost  |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   560 | 16800 |    91 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIT_TAB |   560 | 16800 |    91 |
|   2 |   BITMAP CONVERSION TO ROWIDS|         |       |       |       |
|   3 |    BITMAP OR                 |         |       |       |       |
|*  4 |     BITMAP INDEX SINGLE VALUE| BT1     |       |       |       |
|   5 |     BITMAP MERGE             |         |       |       |       |
|*  6 |      BITMAP INDEX RANGE SCAN | BT1     |       |       |       |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("BIT_1"=1)
   6 - access("BIT_1">1 AND "BIT_1"<=3)

Clearly the second plan is more complex than the first – moreover the added complexity had resulted in the optimizer getting a different cardinality estimate – but, with my data set, there’s a potential efficiency gain. Notice how lines 5 and 6 show a bitmap range scan followed by a bitmap merge: to do the merge Oracle has to “superimpose” the bitmaps for the different key values in the range scan to produce a single bitmap that it can then OR with the bitmap for bit_1 = 1 (“bitmap merge” is effectively the same as “bitmap or” except all the bitmaps come from the same index). The result of this is that when we convert to rowids the rowids are in table order. You can see the consequences in the ordering of the result set or, more importantly for my demo, in the autotrace statistics:


For the original query:
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        604  consistent gets
          0  physical reads
          0  redo size
      27153  bytes sent via SQL*Net to client
        777  bytes received via SQL*Net from client
         25  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        600  rows processed


For the modified query
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        218  consistent gets
          0  physical reads
          0  redo size
      26714  bytes sent via SQL*Net to client
        777  bytes received via SQL*Net from client
         25  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        600  rows processed

Note, particularly, the change in the number of consistent gets. Each table block I visited held two or three rows that I needed: in the first query I visit the data in order of (bit_1, rowid) and get each table block three times; in the second case I visit the data in order of rowid and get each table block only once (with a “buffer is pinned count” for subsequent rows from the same block).

Here’s the starting output from each query, I’ve added the rowid to the original select statements so that you can see the block ordering:


Original query
     BIT_1         ID SMALL_VC   ROWID
---------- ---------- ---------- ------------------
         1          2 2          AAAmeCAAFAAAAEBAAB
         1         12 12         AAAmeCAAFAAAAECAAB
         1         22 22         AAAmeCAAFAAAAEDAAB
         1         32 32         AAAmeCAAFAAAAEEAAB
         1         42 42         AAAmeCAAFAAAAEFAAB
         1         52 52         AAAmeCAAFAAAAEGAAB

Modified query
     BIT_1         ID SMALL_VC   ROWID
---------- ---------- ---------- ------------------
         1          2 2          AAAmeCAAFAAAAEBAAB
         2          3 3          AAAmeCAAFAAAAEBAAC
         3          4 4          AAAmeCAAFAAAAEBAAD
         1         12 12         AAAmeCAAFAAAAECAAB
         2         13 13         AAAmeCAAFAAAAECAAC
         3         14 14         AAAmeCAAFAAAAECAAD
         1         22 22         AAAmeCAAFAAAAEDAAB
         2         23 23         AAAmeCAAFAAAAEDAAC
         3         24 24         AAAmeCAAFAAAAEDAAD

By rewriting the query I’ve managed to force a “cluster by rowid” on the data access. Of course, the simpler solution would be to add the /*+ cluster_by_rowid() */ hint to the original query – but it doesn’t work for bitmap indexes, and when I found that it worked for B-tree indexes the next test I did was to try a single bitmap index, which resulted in my writing this note.
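For reference, this is roughly how the hint would be applied to the original query – a sketch only, since (as just noted) it has no effect on this bitmap access path:

select
        /*+ cluster_by_rowid(bit_tab) */
        bit_1, id, small_vc, rowid
from
        bit_tab
where
        bit_1 between 1 and 3
;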

Footnote: I don’t really expect Oracle Corp. to modify their code to make the hint work with bitmaps; after all, it’s only relevant in the special case of using a bitmap index with a range scan and no subsequent bitmap AND/OR/MINUS operations, which is the only situation where it would be needed – and you’re not really expected to use a single bitmap index to access a table; we engineer bitmaps to take advantage of combinations.


ASSM Truncate.

Mon, 2014-09-08 04:34

Here’s one that started off with a tweet from Kevin Closson, heading towards a finish that shows some interesting effects when you truncate large objects that are using ASSM. To demonstrate the problem I’ve set up a tablespace using system allocation of extents and automatic segment space management (ASSM).  It’s the ASSM that causes the problem, but it requires a mixture of circumstances to create a little surprise.


create
	tablespace test_8k_auto_assm
	datafile	-- OMF
	SIZE 1030M
	autoextend off
	blocksize 8k
	extent management local
	autoallocate
	segment space management auto
;

create table t1 (v1 varchar2(100)) pctfree 99 tablespace test_8k_auto_assm storage(initial 1G);

insert into t1 select user from dual;
commit;

alter system flush buffer_cache;

truncate table t1;

I’ve created a table with an initial definition of 1GB, which means that (in a clean tablespace) the autoallocate option will jump straight to extents of 64MB, with 256 table blocks mapped per bitmap block for a total of 32 bitmap blocks in each 64MB extent. Since I’m running this on 11.2.0.4 and haven’t included “segment creation immediate” in the definition I won’t actually see any extents until I insert the first row.

So here’s the big question – when I truncate this table (using the given command) how much work will Oracle have to do ?

Exchanging notes over twitter (140 char at a time) and working from a model of the initial state, it took a little time to get to understand what was (probably) happening and then produce this silly example – but here’s the output from a snapshot of v$session_event for the session across the truncate:


Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
local write wait                                    490           0          83.26        .170          13
enq: RO - fast object reuse                           2           0         104.90      52.451         105
db file sequential read                              47           0           0.05        .001           0
db file parallel read                                 8           0           0.90        .112           0
SQL*Net message to client                            10           0           0.00        .000           0
SQL*Net message from client                          10           0           0.67        .067         153
events in waitclass Other                             2           0           0.04        .018         109

The statistic I want to highlight is the number recorded against “local write wait”: truncating a table of one row we wait for 490 blocks to be written! We also have 8 “db file parallel read”  waits which, according to a 10046 trace file, were reading hundreds of blocks. (I think the most significant time in this test – the RO enqueue wait – may have been waiting for the database writer to complete the work needed for an object checkpoint, but I’m not sure of that.)

The blocks written were the space management bitmap blocks for the extent(s) that remained after the truncate – even the ones that referenced extents above the high water mark for the table. Since we had set the table’s initial storage to 1GB, we had a lot of bitmap blocks. At 32 per extent and 16 extents (64MB * 16 = 1GB) we might actually expect something closer to 512 blocks, but Oracle had formatted the last extent with only 8 space management blocks, and the first extent had an extra 2 to cater for the level 2 bitmap block and the segment header block, giving: 32 * 15 + 8 + 2 = 490.

As you may have seen above, the impact on the test that Kevin was doing was quite dramatic – he had set the initial storage to 128GB (lots of bitmap blocks), partitioned the table (more bitmap blocks) and was running RAC (so the reads were running into waits for global cache grants).

I had assumed that this type of behaviour happened only with the “reuse storage” option of the truncate command, and I hadn’t noticed before that it also appeared even if you didn’t reuse storage – but that’s probably because the effect applies only to the bit you keep, which may typically mean a relatively small first extent. It’s possible, then, that in most cases this is an effect that isn’t going to be particularly visible in production systems – but if it is, can you work around it? Fortunately another tweeter asked the question “What happens if you ‘drop all storage’?”
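The only change to the test is the extra clause on the truncate statement (a sketch of the syntax):

truncate table t1 drop all storage;

Here’s the result from adding that clause to my test case: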


Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
enq: RO - fast object reuse                           1           0           0.08        .079           0
log file sync                                         1           0           0.03        .031           0
db file sequential read                              51           0           0.06        .001           0
SQL*Net message to client                            10           0           0.00        .000           0
SQL*Net message from client                          10           0           0.56        .056         123
events in waitclass Other                             3           0           0.87        .289         186


Looking good – if you don’t keep any extents you don’t need to make sure that their bitmaps are clean. (The “db file sequential read” waits are almost all about the data dictionary, following on from my “flush buffer cache”).

Footnote 1: the same effect appears in 12.1.0.2
Footnote 2: it’s interesting to note that the RO enqueue wait time seems to parallel the local write wait time: perhaps a hint that there’s some double counting going on. (To be investigated, one day).


Group By Bug

Thu, 2014-09-04 10:11

This just in from OTN Database Forum – a surprising little bug with “group by elimination” exclusive to 12c.


alter session set nls_date_format='dd-Mon-yyyy hh24:mi:ss';

select
       /* optimizer_features_enable('12.1.0.1')*/
       trunc (ts,'DD') ts1, sum(fieldb) fieldb
from (
  select
        ts, max(fieldb) fieldb
  from (
  select trunc(sysdate) - 1/24 ts, 1 fieldb from dual
  union all
  select trunc(sysdate) - 2/24 ts, 2 fieldb from dual
  union all
  select trunc(sysdate) - 3/24 ts, 3 fieldb from dual
  union all
  select trunc(sysdate) - 4/24 ts, 4 fieldb from dual
  union all
  select trunc(sysdate) - 5/24 ts, 5 fieldb from dual
  )
  group by ts
)
group by trunc (ts,'DD')
/

You might expect to get one row as the answer – but this is the result I got, with the execution plan pulled from memory:


TS1                      FIELDB
-------------------- ----------
03-Sep-2014 00:00:00          1
03-Sep-2014 00:00:00          5
03-Sep-2014 00:00:00          4
03-Sep-2014 00:00:00          2
03-Sep-2014 00:00:00          3

-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    11 (100)|          |
|   1 |  HASH GROUP BY   |      |     5 |    60 |    11  (10)| 00:00:01 |
|   2 |   VIEW           |      |     5 |    60 |    10   (0)| 00:00:01 |
|   3 |    UNION-ALL     |      |       |       |            |          |
|   4 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
|   5 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
|   6 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
|   7 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
|   8 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------

You’ll notice that I’ve got an “optimizer_features_enable()” comment in the code: if I change it into a hint I get the following (correct) result and plan:


TS1                      FIELDB
-------------------- ----------
03-Sep-2014 00:00:00         15

-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    12 (100)|          |
|   1 |  HASH GROUP BY   |      |     5 |    60 |    12  (17)| 00:00:01 |
|   2 |   VIEW           |      |     5 |    60 |    11  (10)| 00:00:01 |
|   3 |    HASH GROUP BY |      |     5 |    60 |    11  (10)| 00:00:01 |
|   4 |     VIEW         |      |     5 |    60 |    10   (0)| 00:00:01 |
|   5 |      UNION-ALL   |      |       |       |            |          |
|   6 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   7 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   8 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   9 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|  10 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------

Somehow 12.1.0.2 has managed to get confused by the combination of “group by ts” and “group by trunc(ts,’DD’)” and has performed “group-by elimination” when it shouldn’t have. If you use the ‘outline’ option for dbms_xplan.display_cursor() you’ll find that the bad result reports the hint elim_groupby(@sel$1), which leads to an alternative solution to hinting the optimizer_features level. Start the code like this:


select
       /*+ qb_name(main) no_elim_groupby(@main) */
       trunc (ts,'DD') ts1, sum(fieldb) fieldb
from  ...

The (no_)elim_groupby hint appeared in v$sql_hints only in 12.1.0.2.
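If you want to check whether your version knows about the hint, a query against v$sql_hints along these lines (a sketch) will do it:

select  name, inverse, version
from    v$sql_hints
where   name like '%ELIM_GROUPBY'
;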


Order of Operation

Wed, 2014-09-03 02:42

One response to my series on reading execution plans was an email request asking me to clarify what I meant by the “order of operation” of the lines of an execution plan. Looking through the set of articles I’d written I realised that I hadn’t made any sort of formal declaration of what I meant, all I had was a passing reference in the introduction to part 4; so here’s the explanation.

 

By “order of operation” I mean the order in which the lines of an execution plan start to produce a rowsource. It’s worth stating this a little formally as any other interpretation could lead to confusion; consider the following simple hash join:


-------------------------------------
| Id  | Operation           | Name  |
-------------------------------------
|   0 | SELECT STATEMENT    |       |
|*  1 |  HASH JOIN          |       |
|   2 |   TABLE ACCESS FULL | T1    |
|   3 |   TABLE ACCESS FULL | T2    |
-------------------------------------

The tablescan at line 2 is the first operation to start producing a rowsource; the hash join at line 1 consumes the output and builds a hash table – so it definitely has to do some work before line 3 starts to run – but it doesn’t start generating a rowsource at this point. It’s only after line 3 starts its tablescan and starts to generate its rowsource that line 1 can produce a rowsource by consuming the rows coming from line 3 and probing the in-memory hash table. So line 1 starts to produce its rowsource only after line 3 starts producing its rowsource. The “order of operation” is 2, 3, 1, 0. Perhaps, for the purposes of avoiding confusion, it would be better in future if I remembered to say: “the order of rowsource generation”.
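If you want to reproduce a plan of this shape, a query like the following sketch would typically do it (the table and column names are invented; the hints simply fix the join order and method):

select
        /*+ leading(t1) use_hash(t2) no_swap_join_inputs(t2) */
        t1.small_vc, t2.small_vc
from
        t1, t2
where
        t2.join_col = t1.join_col
;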


Execution Plans

Mon, 2014-09-01 01:40

This is the index to a series of articles I’ve been writing for redgate, published on their AllThingsOracle site, about generating and reading execution plans. I’ve completed a few articles that haven’t yet been published, but I’ll add their URLs when they’re available.

I don’t really know how many parts it’s going to end up as – there’s an awful lot that you could say about reading execution plans, even when you’re trying to cover just the basics; every time I’ve started writing an episode in the series it’s turned into two episodes. I’ve delivered 10 parts to redgate so far; the active URLs below are the ones that are currently online.

Chapter 9 has just been published, so I’ve popped this catalogue to the top of the stack.  Episode 10 is written, but waiting for its final proof read.

It’s only going to take a couple more installments and I’ll have finished the basic introduction to execution plans – so I’m looking for some ideas of what people really need to know about reading execution plans. If you have any suggestions about what features, or functions, or patterns of execution plans need a more detailed explanation, please put them into writing in the comments and I’ll start work on addressing the commonest requirements.

 


In-memory Consistency

Wed, 2014-08-27 12:00

A comment on one of my early blogs about the 12c in-memory database option asked how Oracle would deal with read-consistency. I came up with a couple of comments outlining the sort of thing I would look for in a solution, and this note is an outline of how I started to tackle the question – with a couple of subsequent observations. The data is (nearly) the same as the data I generated for my previous article on the in-memory database (and I’m running 12.1.0.2, of course):


create table t1 nologging
as
select  *
from    all_objects
where   rownum <= 50000
;

insert /*+ append */ into t1 select * from t1;
commit;

insert /*+ append */ into t1 select * from t1;
commit;

insert /*+ append */ into t1 select * from t1;
commit;

begin
        dbms_stats.gather_table_stats(user, 't1', method_opt=>'for all columns size 1');
end;
/

alter table t1
        inmemory priority high memcompress for query low
        inmemory memcompress for query high (object_type)
;

In this case I’ve made the inmemory priority high and I haven’t set any column to “no inmemory”, although I have made one column different from the rest (v$im_column_level doesn’t get populated unless there is some variation across columns). I have to say I couldn’t get very consistent behaviour in terms of when the data finally got into memory with this table creation – possibly something to do with using “alter table” rather than “create table” – but a second “alter table t1 inmemory;” seemed to do the trick if Oracle was playing hard to get.
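One quick way to check that the table really has been populated is to query v$im_segments – something like this sketch:

select
        segment_name, inmemory_size, bytes_not_populated, populate_status
from
        v$im_segments
where
        segment_name = 'T1'
;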

Once I’d checked that the table was in memory I collected performance figures from v$mystat and v$session_event for the following query:


select
        /* Test Run */
        last_ddl_time
from
        t1
where   t1.created > trunc(sysdate)
and     t1.object_type = 'TABLE'
and     t1.subobject_name is not null
;

Once I was satisfied that the in-memory option was working correctly, I went through the following steps:

  • Session 1: set transaction read only;
  • Session 1: run the query and collect performance figures
  • Session 2: do several small, committed, updates, modifying a total of 30 or 40 random rows
  • Session 2: Flush the buffer cache – so that we can see future block acquisition
  • Session 1: re-run the query and collect performance figures – compare and contrast (the sequence is sketched below)
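Expressed as commands, the sequence looks roughly like this (a sketch – the session 2 update is just a hypothetical stand-in for the handful of small committed changes):

-- Session 1
set transaction read only;

select  /* Test Run */ last_ddl_time
from    t1
where   t1.created > trunc(sysdate)
and     t1.object_type = 'TABLE'
and     t1.subobject_name is not null
;

-- Session 2: a few small committed changes (hypothetical example), then flush the cache
update t1 set last_ddl_time = sysdate where rownum <= 10;
commit;

alter system flush buffer_cache;

-- Session 1: re-run the query (still inside the read-only transaction) and compare the stats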

The effect of the “set transaction read only;” was to force the session to do some extra work in the second execution of the query to make the data read-consistent back to the start of the “transaction”. The results were as follows (don’t forget that some of the numbers will relate to the action of collecting the performance figures):


First execution
===============
Name                                                                     Value
----                                                                     -----
Requests to/from client                                                      4
opened cursors cumulative                                                    5
user calls                                                                   6
recursive calls                                                              3
session logical reads                                                    6,680
DB time                                                                      1
non-idle wait count                                                          4
consistent gets                                                              3
consistent gets from cache                                                   3
consistent gets pin                                                          3
consistent gets pin (fastpath)                                               3
logical read bytes from cache                                           24,576
calls to kcmgcs                                                              7
calls to get snapshot scn: kcmgss                                            1
table scans (long tables)                                                    1
table scans (IM)                                                             1
IM scan CUs memcompress for query low                                        1
session logical reads - IM                                               6,677
IM scan bytes in-memory                                              5,155,309
IM scan bytes uncompressed                                          45,896,824
IM scan CUs columns theoretical max                                         18
IM scan rows                                                           399,984
IM scan rows optimized                                                 399,984
IM scan CUs split pieces                                                     1
IM scan CUs predicates received                                              3
IM scan CUs predicates applied                                               3
IM scan CUs predicates optimized                                             1
IM scan CUs pruned                                                           1
IM scan segments minmax eligible                                             1
session cursor cache hits                                                    5
workarea executions - optimal                                                1
parse count (total)                                                          4
execute count                                                                5
bytes sent via SQL*Net to client                                         1,150

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message to client                             9           0           0.00        .000           0
SQL*Net message from client                           9           0           0.44        .049       8,408

Second Execution
================
Name                                                                     Value
----                                                                     -----
Requests to/from client                                                      4
opened cursors cumulative                                                    5
user calls                                                                   6
recursive calls                                                              3
session logical reads                                                    6,728
DB time                                                                      1
non-idle wait count                                                         35
enqueue requests                                                             2
enqueue releases                                                             2
physical read total IO requests                                             29
physical read total multi block requests                                    24
physical read total bytes                                            6,987,776
cell physical IO interconnect bytes                                  6,987,776
consistent gets                                                             92
consistent gets from cache                                                  92
consistent gets pin                                                         44
consistent gets pin (fastpath)                                               5
consistent gets examination                                                 48
logical read bytes from cache                                          753,664
physical reads                                                             853
physical reads cache                                                       853
physical read IO requests                                                   29
physical read bytes                                                  6,987,776
consistent changes                                                          48
free buffer requested                                                      894
CR blocks created                                                           41
physical reads cache prefetch                                              824
physical reads prefetch warmup                                             713
shared hash latch upgrades - no wait                                        43
calls to kcmgcs                                                              7
calls to get snapshot scn: kcmgss                                            1
file io wait time                                                        3,861
data blocks consistent reads - undo records applied                         48
rollbacks only - consistent read gets                                       41
table scans (long tables)                                                    1
table scans (IM)                                                             1
table scan rows gotten                                                   2,803
table scan blocks gotten                                                    41
IM scan CUs memcompress for query low                                        1
session logical reads - IM                                               6,636
IM scan bytes in-memory                                              5,155,309
IM scan bytes uncompressed                                          45,896,824
IM scan CUs columns theoretical max                                         18
IM scan rows                                                           399,984
IM scan rows optimized                                                 399,984
IM scan rows cache                                                          48
IM scan blocks cache                                                        41
IM scan CUs split pieces                                                     1
IM scan CUs predicates received                                              3
IM scan CUs predicates applied                                               3
IM scan CUs predicates optimized                                             1
IM scan CUs pruned                                                           1
IM scan segments minmax eligible                                             1
session cursor cache hits                                                    5
workarea executions - optimal                                                1
parse count (total)                                                          4
execute count                                                                5
bytes sent via SQL*Net to client                                         1,150
bytes received via SQL*Net from client                                   1,772
SQL*Net roundtrips to/from client                                            4

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
Disk file operations I/O                              2           0           0.01        .003           0
db file sequential read                               5           0           0.01        .001           0
db file scattered read                               24           0           0.38        .016           0
SQL*Net message to client                            10           0           0.01        .001           0
SQL*Net message from client                          10           0           0.76        .076       8,408

There’s quite a lot of stats which probably aren’t interesting – and there’s one detail that is important but doesn’t appear (at least not clearly) and that’s the fact that the table in question had about 6,800 blocks below its highwater mark.

So, what do the stats tell us? The most obvious change, of course, is that we had to do some physical reads to get a result set: 24 multiblock reads and 5 single block reads (the latter from the undo tablespace). This is echoed in the session stats as 853 “physical reads cache” from 29 “physical read IO requests”. We can then see the specific read-consistency work (in two ways – with a third close approximation):

consistent changes                                                          48
CR blocks created                                                           41

data blocks consistent reads - undo records applied                         48
rollbacks only - consistent read gets                                       41

IM scan rows cache                                                          48
IM scan blocks cache                                                        41

We applied 48 undo change vectors to fix up 41 blocks to the correct point in time and used them to read 48 rows – the last pair of figures won’t necessarily match the first two pairs, but they do give us a measure of how much data we had to acquire from the cache when trying to do an in-memory scan.

The number 41 actually appears a couple more times: it’s “table scan blocks gotten” (which might seem a little odd since we got far more than 41 blocks by multiblock reads – but we only really wanted 41), and it’s also the change (downwards) in “session logical reads – IM”. Even when Oracle does a pure in-memory query it calculates the number of blocks it would have been reading and reports that number as “session logical reads” and “session logical reads – IM” – so there’s another way to get confused about buffer visits and another statistic to cross-check when you’re trying to work out how to calculate “the buffer hit ratio” ;)

After the first read the scattered reads all seemed to be 32 blocks of “intermittent” tablescan – perhaps this is a measure of the number of blocks that are compressed into a single in-memory chunk (for query low), but perhaps it’s a side effect of the “physical reads prefetch warmup” that Oracle may do when the cache has a lot of empty space. I’ll leave it as an exercise to the reader to refine the test (or think of a different test) to determine whether it’s the former or latter; it’s quite important to find this out because if Oracle is tracking change at the “in-memory chunk” rather than at the block level then a small amount of high-precision change to an in-memory table could result in a relatively large amount of “redundant” I/O as a long-running query tried to stay read-consistent.


In-memory Aggregation

Sun, 2014-08-24 13:05

The title of this piece is the name given to a new feature in 12.1.0.2, and since I’ve recently blogged about a limitation of the in-memory option I thought I’d pick this feature as the next obvious thing to blog about. This is a bit of a non sequitur, though, as the feature seems to have nothing whatsoever to do with the in-memory option; instead it’s a cunning mechanism combining aspects of the star-transformation (but without the bitmap indexes), Bloom filters, and “group-by” placement to minimise the cost of aggregation over high-volume joins.

Here’s a small data set I’ll use to demonstrate the feature:

create table towns
as
select
        rownum                                          id,
        trunc(dbms_random.value(1,51))                  id_state,
        rpad(dbms_random.string('U',3),12)              name,
        cast (rpad('x',trunc(dbms_random.value(50,60)),'x') as varchar2(60))    padding
from
        all_objects
where
        rownum <= 2000
;

alter table towns add constraint to_pk primary key(id);
create index to_i1 on towns(name);

create table people(
        id_town_work    number(6,0)     not null
                constraint pe_fk_wo references towns,
        id_town_home    number(6,0)     not null
                constraint pe_fk_ho references towns,
        dummy1          varchar2(10),
        dummy2          varchar2(10),
        padding         varchar2(110)
);

insert /*+ append */  into people
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(1,2001)),
        trunc(dbms_random.value(1,2001)),
        lpad(rownum,10),
        lpad(rownum,10),
        cast (rpad('x',trunc(dbms_random.value(50,60)),'x') as varchar2(60))    padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5
;

commit;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'Towns',
                method_opt       => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'People',
                method_opt       => 'for all columns size 1'
        );
end;
/

I have a “large” table of people, and people can live in one town and work in another. Towns are in states and I’m interested in a report about people who live in one specific state but work in another (e.g. New Hampshire vs. Massachusetts). There are a couple of “padding” columns to represent the data associated with each town and person that I might want in a report. To keep things simple I haven’t extended the query out to select the name of the state. Here’s the query I might use to get the report I want:

select
        wt.padding,
        ht.padding,
        max(pe.padding)
from
        towns   wt,
        towns   ht,
        people  pe
where
        wt.id_state     = 1
and     pe.id_town_work = wt.id
and     ht.id_state     = 2
and     pe.id_town_home = ht.id
group by
        wt.padding,
        ht.padding
;

You might expect something like the following as the execution plan:

-------------------------------------------------------------------------------
| Id  | Operation            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |    40 |  7600 |   179   (6)| 00:00:01 |
|   1 |  HASH GROUP BY       |        |    40 |  7600 |   179   (6)| 00:00:01 |
|*  2 |   HASH JOIN          |        |    40 |  7600 |   178   (6)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL | TOWNS  |    40 |  2520 |     5   (0)| 00:00:01 |
|*  4 |    HASH JOIN         |        |  2000 |   248K|   173   (6)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL| TOWNS  |    40 |  2520 |     5   (0)| 00:00:01 |
|   6 |     TABLE ACCESS FULL| PEOPLE |   100K|  6250K|   165   (4)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("PE"."ID_TOWN_HOME"="HT"."ID")
   3 - filter("HT"."ID_STATE"=2)
   4 - access("PE"."ID_TOWN_WORK"="WT"."ID")
   5 - filter("WT"."ID_STATE"=1)

The order of operation (row source generation) is: 3,5,6,4,2,1 – we build a hash table from the towns in state 2; build a hash table from the towns in state 1; scan the people table and probe the state 1 hash table, any row that survives is used to probe the state 2 hash table, and the rows that survive the second probe are aggregated to produce the answer.

When you do this type of thing with very large data sets one of the potential performance threats comes from the volume of data you have to aggregate. As we’ve joined the three tables the row length grows significantly before we finally aggregate (admittedly my data set is small, and the number of rows we’re going to aggregate also appears to be very small according to the predictions). There’s also (in the early stages at least) the potential for passing a very large number of rows from the fact table through the first (and possibly subsequent) hash join, doing a lot of work to eliminate the rows you don’t need.

In 12c the optimizer can choose to minimise both these threat points using “vector transformation”. (The name may also reflect the possibility that the code path will take advantage of vector processing (SIMD) operations if they’re available in the CPU.) Here’s the execution path I got when I added the /*+ vector_transform(@sel$1) */ hint to my query – it’s not sensible for this tiny data set, of course, but the hint is a way of learning what Oracle can do:

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                           |    71 | 15975 |   184   (6)| 00:00:01 |

|   1 |  TEMP TABLE TRANSFORMATION    |                           |       |       |            |          |

|   2 |   LOAD AS SELECT              | SYS_TEMP_0FD9D6661_31399B |       |       |            |          |
|   3 |    VECTOR GROUP BY            |                           |    10 |   670 |     6  (17)| 00:00:01 |
|   4 |     KEY VECTOR CREATE BUFFERED| :KV0000                   |    40 |  2680 |     6  (17)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL        | TOWNS                     |    40 |  2520 |     5   (0)| 00:00:01 |

|   6 |   LOAD AS SELECT              | SYS_TEMP_0FD9D6662_31399B |       |       |            |          |
|   7 |    VECTOR GROUP BY            |                           |    10 |   670 |     6  (17)| 00:00:01 |
|   8 |     KEY VECTOR CREATE BUFFERED| :KV0001                   |    40 |  2680 |     6  (17)| 00:00:01 |
|*  9 |      TABLE ACCESS FULL        | TOWNS                     |    40 |  2520 |     5   (0)| 00:00:01 |

|  10 |   HASH GROUP BY               |                           |    71 | 15975 |   172   (6)| 00:00:01 |
|* 11 |    HASH JOIN                  |                           |    71 | 15975 |   171   (5)| 00:00:01 |
|  12 |     TABLE ACCESS FULL         | SYS_TEMP_0FD9D6662_31399B |    10 |   670 |     2   (0)| 00:00:01 |
|* 13 |     HASH JOIN                 |                           |    71 | 11218 |   169   (5)| 00:00:01 |
|  14 |      TABLE ACCESS FULL        | SYS_TEMP_0FD9D6661_31399B |    10 |   670 |     2   (0)| 00:00:01 |
|  15 |      VIEW                     | VW_VT_C444E4CB            |    71 |  6461 |   167   (5)| 00:00:01 |
|  16 |       HASH GROUP BY           |                           |    71 |  5112 |   167   (5)| 00:00:01 |
|  17 |        KEY VECTOR USE         | :KV0000                   |    71 |  5112 |   167   (5)| 00:00:01 |
|  18 |         KEY VECTOR USE        | :KV0001                   |  2000 |   132K|   167   (5)| 00:00:01 |
|* 19 |          TABLE ACCESS FULL    | PEOPLE                    |   100K|  6250K|   165   (4)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - filter("WT"."ID_STATE"=1)
   9 - filter("HT"."ID_STATE"=2)
  11 - access("ITEM_10"=INTERNAL_FUNCTION("C0") AND "ITEM_11"="C2")
  13 - access("ITEM_8"=INTERNAL_FUNCTION("C0") AND "ITEM_9"="C2")
  19 - filter(SYS_OP_KEY_VECTOR_FILTER("PE"."ID_TOWN_HOME",:KV0001) AND
              SYS_OP_KEY_VECTOR_FILTER("PE"."ID_TOWN_WORK",:KV0000))

There are three critical components to this plan: first, we create a couple of “Key Vectors” from the towns table, then we use those key vectors while scanning the people table and aggregate a minimal data set, finally we join back to the data associated with the key vectors. Reprising my introductory paragraph: the creation and use of the key vectors is similar to the Bloom filter approach; the final join-back is similar to the strategy used in Star Transformations (especially the ones where temp tables appear), and the key vector allows the high-volume fact data to be aggregated as much as possible before adding extra row-length from the dimensions.

In outline Oracle does the following:

  • scan the towns table to extract the id, and padding columns for id_state = 1 / work town – this produced 50 rows with my data set
  • manipulate the result to extract the distinct values of padding, and give each value a unique numeric identifier – this is the information that goes into the temp table (with one extra column) – this produced 10 rows
  • manipulate the result again to produce an in-memory array of (town.id, temp_table.identifier) – this is the key vector, containing 50 elements.

The second temp table and key vector (for id_state = 2 / home town) will be created in the same way.

As the fact table is scanned Oracle can apply the key vectors very efficiently (we hope) to pick out the people rows that would be involved in the final aggregate and associate with each relevant row the two padding identifiers that belong to that row (this step is a bit like doing 2 hash joins – but presumably much more efficient; Bloom filtering does something very similar). After selecting the minimum number of rows we can aggregate them on the  two padding identifiers (an example of the “aggregate early”/”place group by” principle – aggregate before joining); finally we join back to the two temporary tables to translate the short padding identifiers into the long padding values (just as we do in star transformations with temporary table transformation).

Strangely we aggregate again after the join-back. I don’t think it’s necessary in this case because I’m fairly sure that the join back is on a unique set of columns – but perhaps this is a generic strategy allowing for variations in the mechanism, including such things as cases where the vector transform is only applied to a subset of the dimension tables.

Technically you could almost emulate this strategy in any version of Oracle (and I probably have at various times over the last few years) with the major limitation that the “KEY VECTOR USE” operations at lines 17 and 18 would have to be replaced with hash joins; no doubt, though, the major CPU saving of this approach is the difference between consecutive hash joins and what appears to be (from the execution stats) concurrent vector filtering. At some point – if a client needs the extra performance edge before they get to 12c – I’ll have to see if I can engineer an example in 11g that emulates the whole plan but uses Bloom filtering to approximate the key vector filtering.
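For what it’s worth, here’s a minimal sketch – not the original test script, and the column names (id, padding, id_state, id_town_home, id_town_work) are assumptions based on the description above – of what that manual emulation might look like: restrict and aggregate the people data early on the two town keys, then join back to the two subsets of the towns table to pick up the long padding values, with ordinary subqueries and joins standing in for the key vector steps:

with wt as (
        select id, padding from towns where id_state = 1       -- work towns
),
ht as (
        select id, padding from towns where id_state = 2       -- home towns
),
agg as (
        select
                pe.id_town_work, pe.id_town_home, count(*) cnt
        from    people pe
        where   pe.id_town_work in (select id from wt)
        and     pe.id_town_home in (select id from ht)
        group by
                pe.id_town_work, pe.id_town_home                -- aggregate early, before joining
)
select  wt.padding, ht.padding, sum(agg.cnt)
from    agg, wt, ht
where   wt.id = agg.id_town_work
and     ht.id = agg.id_town_home
group by
        wt.padding, ht.padding                                  -- final aggregate after the join-back
;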

 


Quiz night

Thu, 2014-08-21 11:05

Here’s a script to create a table, with index, and collect stats on it. Once I’ve collected stats I’ve checked the execution plan to discover that a hint has been ignored (for a well-known reason):

create table t2
as
select
        mod(rownum,200)         n1,
        mod(rownum,200)         n2,
        rpad(rownum,180)        v1
from
        all_objects
where
        rownum <= 3000
;

create index t2_i1 on t2(n1);

begin
        dbms_stats.gather_table_stats(
                user,
                't2',
                method_opt => 'for all columns size 1'
        );
end;
/

explain plan for
select  /*+ index(t2) */
        n1
from    t2
where   n2 = 45
;

select * from table(dbms_xplan.display);

----------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost  |
----------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    15 |   120 |    15 |
|*  1 |  TABLE ACCESS FULL| T2   |    15 |   120 |    15 |
----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N2"=45)

Of course we don’t expect the optimizer to use the index because we didn’t declare n1 to be not null, so there may be rows in the table which do not appear in the index. The only option the optimizer has for getting the right answer is to use a full tablescan. So the question is this – how come Oracle will obey the hint in the following SQL statement:


explain plan for
select
        /*+
                leading (t2 t1)
                index(t2) index(t1)
                use_nl(t1)
        */
        t2.n1, t1.n2
from
        t2      t2,
        t2      t1
where
        t2.n2 = 45
and     t2.n1 = t1.n1
;

select * from table(dbms_xplan.display);

-------------------------------------------------------------------------------
| Id  | Operation                             | Name  | Rows  | Bytes | Cost  |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |       |   225 |  3600 |  3248 |
|   1 |  NESTED LOOPS                         |       |   225 |  3600 |  3248 |
|   2 |   NESTED LOOPS                        |       |   225 |  3600 |  3248 |
|*  3 |    TABLE ACCESS BY INDEX ROWID BATCHED| T2    |    15 |   120 |  3008 |
|   4 |     INDEX FULL SCAN                   | T2_I1 |  3000 |       |     8 |
|*  5 |    INDEX RANGE SCAN                   | T2_I1 |    15 |       |     1 |
|   6 |   TABLE ACCESS BY INDEX ROWID         | T2    |    15 |   120 |    16 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("T2"."N2"=45)
   5 - access("T2"."N1"="T1"."N1")

I ran this on 11.2.0.4, but it does the same on earlier versions.

Update:

This was clearly too easy – posted at 18:04, answered correctly at 18:21. At some point in its evolution the optimizer acquired a rule that allowed it to infer unwritten “is not null” predicates from the join predicate.
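Just to close the loop, here’s a minimal sketch (not part of the original quiz) showing that if you write the inferred predicate explicitly the single-table index() hint can then be obeyed, because the index is guaranteed to give the right answer:

explain plan for
select  /*+ index(t2) */
        n1
from    t2
where   n2 = 45
and     n1 is not null
;

select * from table(dbms_xplan.display);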

 

 

 


LOB Length

Tue, 2014-08-19 11:06

It’s funny how you can make little savings in work all over the place in Oracle if you’re prepared to look a little closely at what’s going on. Here’s a quirky little example with LOBs and function calls that might just have some greater relevance in other situations. Here’s a little data set, and two queries that I might run against it:


create table tbl(
	c1      clob
)
lob (c1) store as c_lob(
	disable storage in row
	nocache nologging
)
;

begin
	for i in 1..128 loop
		insert into tbl values(rpad('x',4000));
		commit;
	end loop;
end;
/

commit;

-- collect stats, prime dictionary cache and library cache
-- run both queries twice and check stats on second run

select
	round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from
	(
	select
		/*+ no_merge */
		dbms_lob.getlength(c1) len
	from
		tbl
	)
where
	len > 3960
;

select
	round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from
	(
	select
		rownum rn, dbms_lob.getlength(c1) len
	from
		tbl
	)
where
	len > 3960
;

The question you might ask yourself when you see these queries is: will they do similar amounts of work? Of course, I wouldn’t be asking the question if the answer were yes. Despite the no_merge() hint, which you might think would have the same effect as the rownum approach, Oracle seems to execute the call to dbms_lob.getlength() twice for each row in the first query, but only once per row for the second query. Here are the autotrace statistics from the second run of each query:


Statistics (for no_merge)
----------------------------------------------------------
         40  recursive calls
          0  db block gets
        271  consistent gets
          0  physical reads
          0  redo size
        541  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Statistics (for rownum)
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        131  consistent gets
          0  physical reads
          0  redo size
        541  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

As you can see, the consistent gets for the no_merge() approach are roughly double those for the rownum approach – and since we have 128 rows/LOBs in the table that looks suspiciously like 2 gets vs. 1 get per LOB depending on the approach – which suggests two calls to the function. This is further corroborated by the execution plans, and especially by the predicate sections (how often have I said “always check the predicates”), which show that the predicate has been pushed inside the view that’s been hinted to be non-mergeable but has not been pushed inside the view that uses the rownum instantiation trick:


Execution Plan for no_merge()
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |    13 |            |          |
|   2 |   VIEW              |      |     6 |    78 |     2   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| TBL  |     6 |   522 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("DBMS_LOB"."GETLENGTH"("C1")>3960)

Execution Plan for rownum
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE      |      |     1 |    13 |            |          |
|*  2 |   VIEW               |      |   128 |  1664 |     2   (0)| 00:00:01 |
|   3 |    COUNT             |      |       |       |            |          |
|   4 |     TABLE ACCESS FULL| TBL  |   128 | 11136 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("LEN">3960)

My first thought on seeing this difference was to apply the /*+ no_push_pred */ hint to block predicate pushing – but then I remembered that the hint is about join predicate pushing and this is a simple filter push. A quick search of the hidden parameters, though, revealed this:

_optimizer_filter_pushdown : enable/disable filter predicate pushdown

Setting this parameter to false – either through a call to ‘alter session’ or through an /*+ opt_param('_optimizer_filter_pushdown','false') */ hint – allowed the no_merge approach to produce the same plan and resource usage as the rownum approach. Of course, for a production system, I’d probably use the rownum approach rather than mess around with hidden parameters.
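For reference, here’s a minimal sketch of the two options (assuming the hidden parameter behaves as described – and remembering that fiddling with hidden parameters isn’t something to do casually on a production system):

alter session set "_optimizer_filter_pushdown" = false;

-- or, per statement, leaving the session setting alone:
select
        /*+ opt_param('_optimizer_filter_pushdown','false') */
        round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from
        (
        select
                /*+ no_merge */
                dbms_lob.getlength(c1) len
        from
                tbl
        )
where
        len > 3960
;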

Footnote:

I don’t know why the code with the no_merge() approach reported 40 recursive calls (on its first execution with autotrace). A couple of variations on the experiment suggested that it had something to do with the number of rows (or consequential buffer visits) that survived the predicate call – for a sufficiently small number of rows the recursive call count happened to drop to zero; but the phenomenon needs further investigation.


In-memory limitation

Fri, 2014-08-15 13:51

I’ve been struggling to find time to have any interaction with the Oracle community for the last couple of months – partly due to workload, partly due to family matters and (okay, I’ll admit it) I really did have a few days’ holiday this month. So making my comeback with a bang – here’s a quick comment about the 12.1.0.2 in-memory feature, and how it didn’t quite live up to my expectation; but it’s also a comment about assumptions, tests, and inventiveness.

One of the 12.1.0.2 manuals tells us that the optimizer can combine the in-memory columnar storage mechanism with the “traditional” row store mechanisms – unfortunately it turned out that this didn’t mean quite what I had hoped; I had expected too much of the first release. Here’s a quick demo of what doesn’t happen, what I wanted to happen, and how I made it happen, starting with a simple definition (note – this is running 12.1.0.2 and the inmemory_size parameter has been set to enable the feature):


create table t1 nologging
as
select	*
from	all_objects
where	rownum <= 50000
;

alter table t1 inmemory
no inmemory (object_id, object_name)
inmemory memcompress for query low (object_type)
-- all other columns implicitly inmemory default
;

insert into t1 select * from t1;
commit;

insert into t1 select * from t1;
commit;

insert into t1 select * from t1;
commit;

begin
	dbms_stats.gather_table_stats(user, 't1', method_opt=>'for all columns size 1');
end;
/

rem
rem	Needs select on v_$im_column_level granted
rem

select
	table_name,
	column_name,
	inmemory_compression
from
	v$im_column_level
where	owner = user
and	table_name = 'T1'
order by
	segment_column_id
;

explain plan for
select
	last_ddl_time, created
from
	t1
where	t1.created > trunc(sysdate)
and	t1.object_type = 'TABLE'
and	t1.subobject_name is not null
;

select * from table(dbms_xplan.display);

All I’ve done at this point is create a table with most of its columns in-memory and a couple excluded from the columnar store. This is modelling a table with a very large number of columns where most queries are targeted at a relatively small subset of the data; I don’t want to have to store EVERY column in-memory in order to get the benefit of the feature, so I’m prepared to trade lower memory usage in general against slower performance for some queries. The query against v$im_column_level shows me which columns are in-memory, and how they are stored. The call to explain plan and dbms_xplan then shows that a query involving only columns that are declared in-memory could take advantage of the feature. Here’s the resulting execution plan:

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |    27 |    73   (9)| 00:00:01 |
|*  1 |  TABLE ACCESS INMEMORY FULL| T1   |     1 |    27 |    73   (9)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - inmemory("T1"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1"."OBJECT_TYPE"='TABLE' AND "T1"."CREATED">TRUNC(SYSDATE@!))
       filter("T1"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1"."OBJECT_TYPE"='TABLE' AND "T1"."CREATED">TRUNC(SYSDATE@!))

Note that the table access full includes the inmemory keyword; and the predicate section shows the predicates that have taken advantage of in-memory columns. The question is – what happens if I add the object_id column (which I’ve declared as no inmemory) to the select list.  Here’s the resulting plan:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    32 |  1818   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    32 |  1818   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1"."OBJECT_TYPE"='TABLE' AND "T1"."CREATED">TRUNC(SYSDATE@!))

There’s simply no sign of an in-memory strategy – it’s just a normal full tablescan (and I didn’t stop with execution plans, of course, I ran other tests with tracing, snapshots of dynamic performance views etc. to check what was actually happening at run-time).

In principle there’s no reason why Oracle couldn’t use the in-memory columns that appear in the where clause to determine the rowids of the rows that I need to select and then visit the rows by rowid but (at present) the optimizer doesn’t generate a plan to do that. There’s no reason, though, why we couldn’t try to manipulate the SQL to produce exactly that effect:


explain plan for
select
        /*+ no_eliminate_join(t1b) no_eliminate_join(t1a) */
        t1b.object_id, t1b.last_ddl_time, t1b.created
from
        t1 t1a, t1 t1b
where   t1a.created > trunc(sysdate)
and     t1a.object_type = 'TABLE'
and     t1a.subobject_name is not null
and     t1b.rowid = t1a.rowid
;

select * from table(dbms_xplan.display);

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |    64 |    74   (9)| 00:00:01 |
|   1 |  NESTED LOOPS               |      |     1 |    64 |    74   (9)| 00:00:01 |
|*  2 |   TABLE ACCESS INMEMORY FULL| T1   |     1 |    31 |    73   (9)| 00:00:01 |
|   3 |   TABLE ACCESS BY USER ROWID| T1   |     1 |    33 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - inmemory("T1A"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1A"."OBJECT_TYPE"='TABLE' AND "T1A"."CREATED">TRUNC(SYSDATE@!))
       filter("T1A"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1A"."OBJECT_TYPE"='TABLE' AND "T1A"."CREATED">TRUNC(SYSDATE@!))

I’ve joined the table to itself by rowid, hinting to stop the optimizer from getting too clever and eliminating the join. In the join I’ve ensured that one reference to the table can be met completely from the in-memory columns, isolating the no inmemory columns to the second reference to the table. It is significant that the in-memory tablescan is vastly lower in cost than the traditional tablescan – and there will be occasions when this difference (combined with the knowledge that the target is a relatively small number of rows) means that this is a very sensible strategy. Note – the hints I’ve used happen to be sufficient to demonstrate the method, but I’d be much more thorough in a production system (possibly using an SQL baseline to fix the execution plan).

Of course, this method is just another example of the “visit a table twice to improve the efficiency” strategy that I wrote about a long time ago; and it’s this particular variant of the strategy that allows you to think of the in-memory columnar option as an implementation of OLTP bitmap indexes.


Viewing Figures

Sun, 2014-08-03 17:05

The day has just started in Singapore – though it’s just coming up to midnight back home – and the view counter has reached 4,000,009 despite the fact that I’ve not been able to contribute much to the community for the last couple of months. Despite the temporary dearth of writing it’s time to have a little review to see what’s been popular and how things have changed in the 10 months it took to accumulate the last 500,000 views, so here are some of the latest WordPress stats.

All time ratings

    AWR / Statspack         80,997    Updated from time to time
    NOT IN                  51,673    February 2007
    Cartesian Merge Join    39,265    December 2006
    dbms_xplan in 10g       37,725    November 2006
    Shrink Tablespace       31,184    November 2006

Ratings over the last year

    AWR / Statspack         13,905    Updated from time to time
    AWR Reports              9,494    February 2011
    Shrink Tablespace        8,402    February 2010
    Lock Modes               8,221    June 2010
    NOT IN                   6,388    February 2007

The figures for the previous half million views (opens in a new window) are very similar for most of the top 5, although “Analysing Statspack (1)” has been pushed from 5th place to 6th place in the all-time greats, and “Lock Modes” has swapped places with “NOT IN” in the annual ratings. As the annual WordPress summary says: “… your posts have staying power, why not write more about …”.

The number of followers has gone up from about 2,500 to just over 3,900 but, as I said last time, I suspect that there’s a lot of double counting related to twitter.

 


Analogy – 2

Sun, 2014-08-03 06:41

I suggested a little while ago that thinking about the new in-memory columnar store as a variation on the principle of bitmap indexes was quite a good idea. I’ve had a couple of emails since then asking me to expand on the idea because “it’s wrong” – I will follow that one up as soon as I can, but in the meantime here’s another angle for connecting old technology with new technology:

It is a feature of in-memory column storage that the default strategy is to store all columns in memory. But it’s quite likely that you’ve got some tables where a subset of the columns are frequently accessed and other columns are rarely accessed and it might seem a waste of resources to keep all the columns in memory just for the few occasional queries. So the feature allows you to de-select columns with the “no inmemory({list of columns})” option – it’s also possible to use different degrees of compression for different columns, of course, which adds another dimension to design and planning – but that’s a thought for another day.

So where else do you see an example of being selective about where you put columns? Index Organized Tables (IOTs) – where you can choose to put popular columns in the index (IOT_TOP) segment, and the rest in the overflow segment, knowing that this can give you good performance for critical queries, but less desirable performance for the less important or less frequent queries. IOTs allow you to specify the (typically short) list of columns you want “in”; it might be quite nice if the same were true for the in-memory option – I can imagine cases where I would want to include a small set of columns and exclude a very large number of them (for reasons that bring me back to the bitmap index analogy).
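As a reminder of the syntax, here’s a minimal sketch of the IOT idea (the table and column names are made up): everything up to and including the column named in the “including” clause is stored in the index segment, and the remaining, rarely used, columns are pushed out to the overflow segment:

create table orders_iot (
        order_id        number
                        constraint ord_iot_pk primary key,
        status          varchar2(10),
        order_date      date,
        long_notes      varchar2(4000),
        audit_text      varchar2(4000)
)
organization index
including order_date
overflow
;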

 


Parallel Plans

Sun, 2014-07-27 13:39

I’ve popped this note to the top of the stack because I’ve added an index to Randolf Geist’s series on parallel execution skew, and included a reference to his recent update to the XPLAN_ASH utility.

This is the directory for a short series I wrote discussing how to interpret parallel execution plans in newer versions of Oracle.

For other aspects of parallel execution, here are links to several articles by Randolf Geist, published on his own blog or on Oracle’s Technet:

One of the awkward problems you can encounter with parallel execution is data skew – which has the potential to make just one slave in a set do (almost) all the work, hence reducing performance to something close to serial execution times. Randolf has written a series of articles on Parallel Skew that has been published by AllthingsOracle over the last few months.

There is also a summary posting from Randolf listing the five articles in the series, which additionally includes a set of short videos on the topic.

 


Analogy

Sun, 2014-07-27 01:02

So 12.1.0.2 is out with a number of interesting new features, of which the most noisily touted is the “in-memory columnar storage” feature. As ever the key to making best use of a feature is to have an intuitive grasp of what it gives you, and it’s often the case that a good analogy helps you reach that level of understanding; so here’s the first thought I had about the feature during one of the briefing days run by Maria Colgan.

“In-memory columnar storage gives you bitmap indexes on OLTP systems without the usual disastrous locking side effects.”

Obviously the analogy isn’t perfect … but I think it’s very close: for each column stored you use a compression technique to pack the values for a large number of rows into a very small space, and for each stored row you can derive the rowid by arithmetic. In highly concurrent OLTP systems there’s still room for some contention as the session journals are applied to the globally stored compressed columns (but then, private redo introduces some effects of that sort anyway); and the “indexes” have to be created dynamically as tables are initially accessed (but that’s a startup cost, its timing can be controlled, and it’s basically limited to a tablescan).

Whatever the technical hand-waving it introduces – thinking of the in-memory thing as enabling real-time bitmaps ought to help you think of ways to make good use of the feature.

 

 


Deferrable RI – 2

Sun, 2014-07-13 12:46

A question came up on Oracle-L recently about possible locking anomalies with deferrable referential integrity constraints.

An update by primary key is taking a long time; the update sets several columns, one of which is the child end of a referential integrity constraint. A check of v$active_session_history shows lots of waits for “enq: TX – row lock contention” in mode 4 (share), and many of these waits also identify the current object as the index that has been created to avoid the “foreign key locking” problem on this constraint (though many of the waits show the current_obj# as -1). A possibly significant feature of the issue is that the foreign key constraint is defined as “deferrable initially deferred”. The question is: could such a constraint result in TX/4 waits?

My initial thought was that if the constraint was deferrable this was unlikely; there would have to be other features coming into play.

Of course, when the foreign key is NOT deferrable it’s easy to set up cases where a TX/4 appears: for example, you insert a new parent value without issuing a commit, then I insert a new matching child; at that point my session will wait for your session to commit or rollback. If you commit, my insert succeeds; if you rollback, my session raises an error (ORA-02291: integrity constraint (schema_name.constraint_name) violated – parent key not found). But if the foreign key is deferred the non-existence (or potential existence) of the parent shouldn’t matter until commit time – so the first guess would be that, with a deferrable constraint, you could get away with things like this so long as you fixed up the data in time for the commit.

I was wrong. Here’s a little example:


create table parent (
	id	number(4),
	name	varchar2(10),
	constraint par_pk primary key (id)
)
;

create table child(
	id_p	number(4)
		constraint chi_fk_par
		references parent
		deferrable initially deferred,
	id	number(4),
	name	varchar2(10),
	constraint chi_pk primary key (id_p, id)
)
;

insert into parent values (1,'Smith');
insert into parent values (2,'Jones');

insert into child values(1,1,'Simon');
insert into child values(1,2,'Sally');

insert into child values(2,1,'Jack');
insert into child values(2,2,'Jill');

commit;

begin
	dbms_stats.gather_table_stats(user,'parent');
	dbms_stats.gather_table_stats(user,'child');
end;
/

pause Press return

update child set id_p = 3 where id_p = 2 and id = 2;

If you don’t do anything after the pause and before the update then the update will succeed – but it will fail on a subsequent commit unless you insert parent 3 before committing. But if you take advantage of the pause to use another session to insert parent 3 first, the update will hang, waiting for the parent insert to commit or rollback – and what happens next may surprise you. Basically the deferrability doesn’t protect you from the side effects of conflicting transactions.
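For example – a sketch of the conflicting transaction described above, where the name ‘Brown’ is just a made-up value – while the first session sits at the pause a second session does this:

insert into parent values (3, 'Brown');
-- no commit yet

Back in the first session the update of the child row now hangs, waiting on “enq: TX – row lock contention” (mode 4) until the second session commits or rolls back.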

The variations on what can happen next (insert the parent elsewhere, commit or rollback) are interesting and left as an exercise.

I was slightly surprised to find that I had had a conversation about this sort of thing some time ago, triggered by a comment to an earlier post. If you want to read a more thorough investigation of the things that can happen and how deferrable RI works then there’s a good article at this URL.

 


SQL Plan Baselines

Sun, 2014-07-06 11:34

Here’s a thread from Oracle-L that reminded me of an important reason why you still have to hint SQL sometimes (rather than following the mantra “if you can hint it, baseline it”).

I have a query that takes 77 seconds to optimize (it’s not a production query, fortunately, but one I engineered to make a point). I can enable sql plan baseline capture and create a baseline for it, and given the nature of the query I can be confident that the resulting plan will always be exactly the plan I want. If I have to re-optimize the query at any time (because it runs once per hour, say, and is constantly being flushed from the library cache), how much time will the SQL plan baseline save for me?

The answer is NONE.

The first thing that the optimizer does for a query with a stored sql plan baseline is to optimize it as if the baseline did not exist.

If I want to get rid of that 77 seconds I’ll have to extract (most of) the hints from the SQL Plan Baseline and write them into the query.  (Or, maybe, create a Stored Outline – except that they’re deprecated in the latest version of Oracle, and I’d have to check whether the optimizer used the same strategy with stored outlines or whether it applied the outline before doing any optimisation). Maybe we could do with a hint which forces the optimizer to attempt to use an existing, accepted SQL Baseline without attempting the initial optimisation pass.
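If you do go down the “extract the hints” route, here’s a minimal sketch of the sort of thing I mean – the search text and sql_handle below are hypothetical, you’d look up the real values in dba_sql_plan_baselines:

select  sql_handle, plan_name, enabled, accepted
from    dba_sql_plan_baselines
where   sql_text like '%some identifying text%'
;

select  *
from    table(
                dbms_xplan.display_sql_plan_baseline(
                        sql_handle => 'SQL_abcdef0123456789',
                        format     => 'basic +outline'
                )
        )
;

The “Outline Data” section of the output gives you the hint set you could paste into the SQL as a starting point.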

 


Adjusting Histograms

Fri, 2014-07-04 13:32

This is a quick response to a question on an old blog post asking how you can adjust the high value if you’ve already got a height-balanced histogram in place. It’s possible that someone will come up with a tidier method, but this was just a quick sample I created and tested on 11.2.0.4 in a few minutes. (Note – this is specifically for height-balanced histograms, and it’s not appropriate for 12c, which has introduced hybrid histograms that will require me to modify my “histogram faking” code a little.)

rem
rem	Script:		adjust_histogram.sql
rem	Author:		Jonathan Lewis
rem	Dated:		Jun 2014
rem	Purpose:
rem
rem	Last tested
rem		11.2.0.4
rem	Not tested
rem		12.1.0.1
rem		11.1.0.7
rem		10.2.0.5
rem	Outdated
rem		 9.2.0.8
rem		 8.1.7.4	no WITH subquery
rem
rem	Notes:
rem	Follow-on from a query on my blog about setting the high value
rem	when you have a histogram.  We could do this by hacking, or by
rem	reading the user_tab_histogram values and doing a proper prepare
rem

start setenv
set timing off

execute dbms_random.seed(0)

drop table t1;

begin
	begin		execute immediate 'purge recyclebin';
	exception	when others then null;
	end;

	begin
		dbms_stats.set_system_stats('MBRC',16);
		dbms_stats.set_system_stats('MREADTIM',10);
		dbms_stats.set_system_stats('SREADTIM',5);
		dbms_stats.set_system_stats('CPUSPEED',1000);
	exception
		when others then null;
	end;
/*
	begin		execute immediate 'begin dbms_stats.delete_system_stats; end;';
	exception	when others then null;
	end;

	begin		execute immediate 'alter session set "_optimizer_cost_model"=io';
	exception	when others then null;
	end;

	begin		execute immediate 'alter session set "_optimizer_gather_stats_on_load" = false';
	exception	when others then null;
	end;
*/

	begin		execute immediate  'begin dbms_space_admin.materialize_deferred_segments(''TEST_USER''); end;';
	exception	when others then null;
	end;

end;
/

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	trunc(sysdate,'YYYY') + trunc(dbms_random.normal * 100,1)	d1
from
	generator	v1,
	generator	v2
where
	rownum <= 1e4
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt 	 => 'for all columns size 32'
	);

end;
/

spool adjust_histogram.lst

prompt	==================
prompt	Current High Value
prompt	==================

select to_char(max(d1),'dd-Mon-yyyy hh24:mi:ss') from t1;

prompt	==============================
prompt	Initial Histogram distribution
prompt	==============================

select
	endpoint_number,
	to_date(to_char(trunc(endpoint_value)),'J') + mod(endpoint_value,1) d_val,
	endpoint_value,
	lag(endpoint_value,1) over(order by endpoint_number) lagged_epv,
	endpoint_value -
		lag(endpoint_value,1) over(order by endpoint_number)  delta
from	user_tab_histograms
where
	table_name = 'T1'
and	column_name = 'D1'
;

rem
rem	Note - we can't simply overwrite the last srec.novals
rem	because that doesn't adjust the stored high_value.
rem	We have to make a call to prepare_column_values,
rem	which means we have to turn the stored histogram
rem	endpoint values into their equivalent date types.
rem

prompt	==================
prompt	Hacking the values
prompt	==================

declare

	m_distcnt		number;
	m_density		number;
	m_nullcnt		number;
	srec			dbms_stats.statrec;
	m_avgclen		number;

	d_array			dbms_stats.datearray := dbms_stats.datearray();
	ct			number;

begin

	dbms_stats.get_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'd1',
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec		=> srec,
		avgclen		=> m_avgclen
	); 

	ct := 0;
	for r in (
		select	to_date(to_char(trunc(endpoint_value)),'J') + mod(endpoint_value,1) d_val
		from	user_tab_histograms
		where	table_name = 'T1'
		and	column_name = 'D1'
		order by endpoint_number
	) loop

		ct := ct + 1;
		d_array.extend;
		d_array(ct) := r.d_val;
		if ct = 1 then
			srec.bkvals(ct) := 0;
		else
			srec.bkvals(ct) := 1;
		end if;

	end loop;

	d_array(ct) := to_date('30-Jun-2015','dd-mon-yyyy');

	dbms_stats.prepare_column_values(srec, d_array);

	dbms_stats.set_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'd1',
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec		=> srec,
		avgclen		=> m_avgclen
	);
end;
/

prompt	============================
prompt	Final Histogram distribution
prompt	============================

select
	endpoint_number,
	to_date(to_char(trunc(endpoint_value)),'J') + mod(endpoint_value,1) d_val,
	endpoint_value,
	lag(endpoint_value,1) over(order by endpoint_number) lagged_epv,
	endpoint_value -
		lag(endpoint_value,1) over(order by endpoint_number)  delta
from	user_tab_histograms
where
	table_name = 'T1'
and	column_name = 'D1'
;

spool off

doc

#