Randolf Geist
New Version Of XPLAN_ASH Tool - Video Tutorial
A new major release (version 3.0) of my XPLAN_ASH tool is available for download.
You can download the latest version here.
In addition to many changes to the way the information is presented and many other smaller changes to functionality there is one major new feature: XPLAN_ASH now also supports S-ASH, the free ASH implementation.
If you run XPLAN_ASH in a S-ASH repository owner schema, it will automatically detect that and adjust accordingly.
XPLAN_ASH was tested against the latest stable version of S-ASH (2.3). There are some minor changes required to that S-ASH release in order to function properly with XPLAN_ASH. Most of them will be included in the next S-ASH release as they really are only minor and don't influence the general S-ASH functionality at all.
If you're interested in using XPLAN_ASH with an existing S-ASH installation get in touch with me so I can provide the necessary scripts that apply the necessary changes.
Rather than writing another lengthy blog post about the changes and new features introduced I thought I start a multi-part video tutorial where I explain the purpose of the tool and how to use it based on the new version - some parts of the tutorial will focus on specific functionality of the tool and are therefore probably also quite useful as some kind of general tutorial on that Oracle feature and SQL execution troubleshooting guide in general.
The tutorial will consist of six parts initially, the first two are already available - the next ones to follow over time.
Part 1: Introduction, Overview
Part 2: Usage, Parameters, Invocation
Part 3: Rowsource Statistics: TBD
Part 4: Active Session History: TBD
Part 5: Systematic Parallel Execution Skew Analysis & Troubleshooting: Coming Soon
Part 6: Experimental Stuff, Script Configuration And Internals: TBD
Feel free to post questions/requests for clarification that are not covered in the tutorials in the comments section - if there are topics of general interest I might publish a seventh part addressing those questions.
In future I might use that video style more often since it's a nicer way of conveying certain kind of information.
You can download the latest version here.
In addition to many changes to the way the information is presented and many other smaller changes to functionality there is one major new feature: XPLAN_ASH now also supports S-ASH, the free ASH implementation.
If you run XPLAN_ASH in a S-ASH repository owner schema, it will automatically detect that and adjust accordingly.
XPLAN_ASH was tested against the latest stable version of S-ASH (2.3). There are some minor changes required to that S-ASH release in order to function properly with XPLAN_ASH. Most of them will be included in the next S-ASH release as they really are only minor and don't influence the general S-ASH functionality at all.
If you're interested in using XPLAN_ASH with an existing S-ASH installation get in touch with me so I can provide the necessary scripts that apply the necessary changes.
Rather than writing another lengthy blog post about the changes and new features introduced I thought I start a multi-part video tutorial where I explain the purpose of the tool and how to use it based on the new version - some parts of the tutorial will focus on specific functionality of the tool and are therefore probably also quite useful as some kind of general tutorial on that Oracle feature and SQL execution troubleshooting guide in general.
The tutorial will consist of six parts initially, the first two are already available - the next ones to follow over time.
Part 1: Introduction, Overview
Part 2: Usage, Parameters, Invocation
Part 3: Rowsource Statistics: TBD
Part 4: Active Session History: TBD
Part 5: Systematic Parallel Execution Skew Analysis & Troubleshooting: Coming Soon
Part 6: Experimental Stuff, Script Configuration And Internals: TBD
Feel free to post questions/requests for clarification that are not covered in the tutorials in the comments section - if there are topics of general interest I might publish a seventh part addressing those questions.
In future I might use that video style more often since it's a nicer way of conveying certain kind of information.
ASM AU Size And LMT AUTOALLOCATE
When using Locally Managed Tablespaces (LMT) with variable, system managed extent sizes (AUTOALLOCATE) and data files residing in ASM the Allocation Unit (AU) size can make a significant difference to the algorithm that searches for free extents.The corresponding free extent search algorithm when searching for free extents >= the AU size seems to only search for free extents on AU boundaries in order to avoid I/O splitting.Furthermore the algorithm seems to use two extent sizes when searching for free extents: A "desired" (for example 8MB) and a "minimum acceptable" (for example 1MB) extent size - however when performing the search the "desired" size seems to be relevant when limiting the search to free extents on AU boundaries.This can lead to some surprising side effects, in particular when using 4MB AUs.It effectively means that although you might have plenty of space (I've seen cases with more than 90% free of a several hundred GB tablespace) processes inserting data might fail with "unable to extend segment" error messages when the "desired" extent size is >= 4MB and the AU size is 4MB, although there might be for example hundreds of (fragmented) free extents available of < 8MB size.This is particularly relevant to Exadata systems because I believe the default ASM AU size is 4MB there.This behaviour can be influenced by using event 60060, which disables the extent alignment to AU boundaries.Here is a simple test case demonstrating the issue. It deliberately interleaves extents of two tables and then drops one of them to leave non-contiguous free space behind:
If you run this test case on a tablespace using 4MB AU ASM data files, it will fail to create the last table, although it only requires approx. 127MB and there is approx. 200MB of (non-contiguous) free space (200 times 1MB extents), simply because the "desired" extent size is 8MB and no suitable free extents on "AU boundary" can be found.The relevance of the "AU boundary" condition can easily be checked by changing the test case to drop the other table. This table's extents allocate all the "AU boundaries" and hence the table creation will succeed as a sufficient number of (1MB) extents on "AU boundaries" could be found.If you repeat the same test case with the event set or using data files with a 1MB AU (or simply standard file system data files) the table will be created successfully (even with no extents on "AU boundaries" available), re-using the available 1MB extents. Although the "desired" extent size is still 8MB, the "minimum acceptable" extent size of 1MB allows the re-usage since the free extents don't need to be aligned on AU boundaries.Performance Vs. Space UsageSo with the event you can choose between performance (larger extents, no I/O splitting for desired larger extents) and space usage (non-contiguous, smaller extents).Another option in such cases is the usage of tablespaces with UNIFORM extent allocation, however it might be hard to find a good extent size when dealing with many segments of vastly differing sizes.FootnoteThis should only be relevant if the processes inserting into the tablespace manage to run at a high concurrency, therefore interleaving the extents (causing non-contiguous free extents when moving / dropping / truncating segments) in conjunction with extent sizes of vastly different sizes.Interestingly I've observed this behaviour on an Exadata system running 11.2.0.3 BP14 and using a tablespace where mostly partitioned objects resided. Since partitioned objects are using an initial extent size of 8MB in recent versions you wouldn't expect this problem in such a scenario, but it looks like that mixing regular load processes with ALTER TABLE ... MOVE COMPRESS PARALLEL can lead to much smaller extent sizes than 8MB and therefore allowing the issue to occur.
-----------------------------------------------------------------------
-- Exadata / ASM 4MB AUs LMT AUTOALLOCATE extent fragmentation issue --
-----------------------------------------------------------------------
--
-- Tablespaces using Locally Managed Extents and the AUTOALLOCATE option
-- seem to behave differently on Exadata storage / ASM 4MB AUs than on regular storage / ASM 1MB AUs
--
-- On regular storage / ASM 1MB AUs the algorithm has more flexibility when allocating a next extent
-- For example, if according to the rules an 8MB extent would be required but only 1MB extents
-- are available, it gracefully re-uses the smaller extents, at least to some degree,
-- if they are not too small (for example it won't start re-using 64K free extents if an 8MB extent should be allocated)
--
-- On Exadata / ASM 4MB AUs the same test case fails with an ORA-01652 since it obviously can't re-use the existing free extents
-- Since in the below example there is no suitable extent on AU boundary available the table creation fails
--
-- Given a corresponding usage pattern this can lead to huge space wastage or "unable to extend" errors with plenty of (fragmented) free space
-- in Exadata / ASM 4MB AUs tablespaces
set echo on timing on time on
-- Enable this EVENT to disable the extent alignment to AU boundaries
-- This will allow below table to get created also in a tablespace using 4MB AUs in ASM
-- alter session set events '60060 trace name context forever, level 1';
drop tablespace auto_alloc_test including contents and datafiles;
-- Create tablespace either on ASM / Exadata storage or outside on regular file system
-- by uncommenting the datafile name clause
create tablespace auto_alloc_test
datafile
--'auto_alloc.dbf'
size 400M
extent management local
autoallocate
segment space management auto;
-- Create two tables
begin
for i in 1..2 loop
execute immediate 'create table table'||i||'(col1 number,col2 number) /*segment creation immediate*/ tablespace auto_alloc_test';
end loop;
end;
/
-- Interleave the extents until no space left (the ORA-01653 is expected)
-- This will generate lots of 1MB extents
begin
for i in 1..1000 loop
execute immediate 'alter table table'||(mod(i,2)+1)||' allocate extent';
end loop;
end;
/
-- Free half of the tablespace, but free extents are fragmented, and max. free contiguous space is 1MB
-- If you drop TABLE2, no extents on 4MB AU boundaries will be available
drop table table2;
-- If you drop TABLE1, extents on 4MB AU boundaries will be available and below CREATE TABLE will be successful
--drop table table1;
select sum(bytes)/1024/1024 sum_free_mb, max(bytes)/1024/1024 max_free_mb from dba_free_space where tablespace_name='AUTO_ALLOC_TEST';
-- Create a table that fits into the free space (less than 200MB) but usually will request 8MB extents
--
-- When you drop TABLE2 above, this fails on Exadata storage/ ASM 4MB AUs because it attempts to find extents on AU boundaries that are not available
--
-- When you drop TABLE2 above, this succeeds on regular storage / ASM 1MB AUs because it gracefully re-uses the existing 1MB extents if no 8MB extents can be found among the free extents
--
-- This leads to ORA-01652 error messages on Exadata storage / ASM 4MB AUs with a suitable extent usage pattern although there is plenty of (fragmented) free space
create table test
tablespace auto_alloc_test
as
select rpad('x', 100) as col1 from
(select /*+ cardinality(1e5) */ * from dual connect by level <= 1e5),
(select /*+ cardinality(11) */ * from dual connect by level <= 11)
;
-- In case of success check segment size and extent layout
select bytes/1024/1024 as MB from dba_segments where segment_name = 'TEST' and owner = USER;
select extent_id, bytes, blocks from user_extents where segment_name = 'TEST';
-- In case of failure check AU boundaries of free extents
with au_size
as
(
select allocation_unit_size from v$asm_diskgroup where name = (select substr(file_name, 2, instr(file_name, '/') - 2) from dba_data_files where tablespace_name = 'AUTO_ALLOC_TEST')
)
select count(*) from (
select block_id * 16384 / allocation_unit_size as AU_info, a.* from dba_free_space a, au_size where tablespace_name = 'AUTO_ALLOC_TEST'
)
where au_info = trunc(au_info);
Free Webinar
It's webinar time again.
Join me on Wednesday, May 8th at AllThingsOracle.com for an overview session on the specifics of Oracle Parallel Execution.
The session starts at 16:00 UK (17:00 Central European) time. The webinar is totally free and the recording will made available afterwards.
Here's the link to the official landing page where you can register and below is the official abstract:
Abstract
Oracle Parallel Execution, a feature of the Enterprise Edition, allows you to automatically distribute the processing of a SQL statement execution among multiple worker processes. This requires additional effort when analysing and troubleshooting such parallel executions, since it adds complexity that is simply not there with normal serial execution where only a single process is involved.
In this webinar I'll provide an overview of what these additional challenges are and how these can be approached. A live Q&A session will follow the presentation.
Join me on Wednesday, May 8th at AllThingsOracle.com for an overview session on the specifics of Oracle Parallel Execution.
The session starts at 16:00 UK (17:00 Central European) time. The webinar is totally free and the recording will made available afterwards.
Here's the link to the official landing page where you can register and below is the official abstract:
Abstract
Oracle Parallel Execution, a feature of the Enterprise Edition, allows you to automatically distribute the processing of a SQL statement execution among multiple worker processes. This requires additional effort when analysing and troubleshooting such parallel executions, since it adds complexity that is simply not there with normal serial execution where only a single process is involved.
In this webinar I'll provide an overview of what these additional challenges are and how these can be approached. A live Q&A session will follow the presentation.
"Cost-free" joins - 2
In the previous post I've demonstrated an unexpected Nested Loop Join caused by an extreme data distribution. Although unexpected at first sight, the performance of the execution plan selected by the optimizer is decent - provided the estimates are in the right ballpark.Here is another case of an unexpected execution plan, this time about Merge Joins.Merge JoinsIn order to appreciate why the execution plan encountered is unexpected, first a quick summary about how Merge Joins work:A Merge Join is essentially a Nested Loop operation from one sorted row source into another sorted row source. In contrast to a Nested Loop the join condition is not used for a possible index-driven lookup from the driving, outer row source into the inner row source, simply because Oracle usually first needs to run separate operations on each rowsource for sorting.This means that in most cases the Merge Join requires to sort both row sources and therefore a Hash Join is usually preferred where possible (for example, Hash Joins are only suitable for Equi-Joins, whereas a Merge Join also supports non-Equi Joins), because it only needs to "prepare" one row source for building the hash table, and can then process the second row source as it is without any further start-up cost / preparation steps.Let's have a look at some common execution plans using Merge Joins. Consider this simple setup:
So this is what a Merge Join usually looks like:
create table t1
as
select
rownum as id
, rpad('x', 100) as filler
from
dual
connect by
level <= 1e3
;
exec dbms_stats.gather_table_stats(null, 't1')
create unique index t1_idx on t1 (id);
create table t2
as
select
rownum as id
, 1 as fk
, rpad('x', 100) as filler
from
dual
connect by
level <= 1e6
;
exec dbms_stats.gather_table_stats(null, 't2')
As usual I had to force the Merge Join via a hint, since in my (default 11.2.0.1) setup a Hash Join would be preferred. Notice the two SORT JOIN operations that first create two (ideally in-memory) sorted/indexed tables from the two row sources to be joined and how the SORT JOIN on the larger row source basically determines the overall cost of this MERGE JOIN.A corresponding Hash Join could use the smaller row source as hash table and therefore very likely would be much more efficient.Since the MERGE JOIN usually needs to SORT both row sources it doesn't make such a big difference which of the two row sources comes first, but it is interesting to note that the MERGE JOIN is not able to "swap" the join inputs as the HASH JOIN is able to, which, in particular for outer joins, makes the MERGE JOIN less flexible.Here is a variation of a MERGE JOIN that avoids a SORT JOIN operation. This is only supported for the "driving" row source:
select /*+ use_merge(t1 t2) */
t1.filler as t1_filler
, t2.filler as t2_filler
from
t1
, t2
where
t1.id (+) = t2.fk
;
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 199M| | 28075 (1)| 00:05:37 |
| 1 | MERGE JOIN OUTER | | 1000K| 199M| | 28075 (1)| 00:05:37 |
| 2 | SORT JOIN | | 1000K| 99M| 217M| 28067 (1)| 00:05:37 |
| 3 | TABLE ACCESS FULL| T2 | 1000K| 99M| | 4333 (1)| 00:00:52 |
|* 4 | SORT JOIN | | 1000 | 102K| | 7 (15)| 00:00:01 |
| 5 | TABLE ACCESS FULL| T1 | 1000 | 102K| | 6 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."ID"(+)="T2"."FK")
filter("T1"."ID"(+)="T2"."FK")
The MERGE JOIN knows that the driving row source will be accessed in sorted order due to the suitable INDEX RANGE SCAN operation and therefore doesn't add a SORT operation on top.If we now run the same statement using Parallel Execution (note that the statement level PARALLEL hint used in the example is only supported from 11g on), we'll see the following:
select /*+ use_merge(t1 t2) */
t1.filler as t1_filler
, t2.filler as t2_filler
from
t1
, t2
where
t1.id = t2.fk
and t1.id between 1 and 10
;
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 909K| 181M| | 28081 (1)| 00:05:37 |
| 1 | MERGE JOIN | | 909K| 181M| | 28081 (1)| 00:05:37 |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 10 | 1050 | | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | T1_IDX | 10 | | | 2 (0)| 00:00:01 |
|* 4 | SORT JOIN | | 1000K| 99M| 217M| 28078 (1)| 00:05:37 |
|* 5 | TABLE ACCESS FULL | T2 | 1000K| 99M| | 4344 (2)| 00:00:53 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("T1"."ID">=1 AND "T1"."ID"<=10)
4 - access("T1"."ID"="T2"."FK")
filter("T1"."ID"="T2"."FK")
5 - filter("T2"."FK">=1 AND "T2"."FK"<=10)
So usually, due to the way things run in parallel, Oracle assumes it cannot guarantee the order of the row source and includes a SORT operation for both row sources joined.Although there are special cases where this could be avoided even for Parallel Execution, it looks like the code adds this SORT operation unconditionally in case of Parallel Execution. We'll see how this can become a threat in a moment.The Special CaseNow back to the special case I want to demonstrate here. Let's have a look at the following query:
select /*+ use_merge(t1 t2) parallel */
t1.filler as t1_filler
, t2.filler as t2_filler
from
t1
, t2
where
t1.id = t2.fk
and t1.id between 1 and 10
;
------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 909K| 181M| | 15594 (1)| 00:03:08 | | | |
| 1 | PX COORDINATOR | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 909K| 181M| | 15594 (1)| 00:03:08 | Q1,01 | P->S | QC (RAND) |
| 3 | MERGE JOIN | | 909K| 181M| | 15594 (1)| 00:03:08 | Q1,01 | PCWP | |
| 4 | SORT JOIN | | 10 | 1050 | | 3 (0)| 00:00:01 | Q1,01 | PCWP | |
| 5 | BUFFER SORT | | | | | | | Q1,01 | PCWC | |
| 6 | PX RECEIVE | | 10 | 1050 | | 3 (0)| 00:00:01 | Q1,01 | PCWP | |
| 7 | PX SEND BROADCAST | :TQ10000 | 10 | 1050 | | 3 (0)| 00:00:01 | | S->P | BROADCAST |
| 8 | TABLE ACCESS BY INDEX ROWID| T1 | 10 | 1050 | | 3 (0)| 00:00:01 | | | |
|* 9 | INDEX RANGE SCAN | T1_IDX | 10 | | | 2 (0)| 00:00:01 | | | |
|* 10 | SORT JOIN | | 1000K| 99M| 217M| 15591 (1)| 00:03:08 | Q1,01 | PCWP | |
| 11 | PX BLOCK ITERATOR | | 1000K| 99M| | 2407 (1)| 00:00:29 | Q1,01 | PCWC | |
|* 12 | TABLE ACCESS FULL | T2 | 1000K| 99M| | 2407 (1)| 00:00:29 | Q1,01 | PCWP | |
------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
9 - access("T1"."ID">=1 AND "T1"."ID"<=10)
10 - access("T1"."ID"="T2"."FK")
filter("T1"."ID"="T2"."FK")
12 - filter("T2"."FK">=1 AND "T2"."FK"<=10)
Notice that I now got a MERGE JOIN although I haven't provided any hints to do so, so this execution plan was automatically favored by optimizer. Why?This is a special case, because the optimizer understands that the join key is actually a single value, due to the predicate on T2.FK. So for a serial execution it doesn't bother to SORT the large row source (since it knows there will only be the value "1") and hence the MERGE JOIN comes out with a (slightly) lower cost estimate than a corresponding HASH JOIN.It's interesting to note that in this particular case here it could even be avoided to SORT the second row source, since it, too, can only return a single value. But obviously the MERGE JOIN always runs a SORT JOIN operation on the second row source, as already outlined above.Due to the way the data is designed and the direction of the outer join a NESTED LOOP join isn't a reasonable alternative either here.Note that at runtime a HASH JOIN seems to be slightly more efficient in this particular case here, so this is already an indication that the cost estimates do not reflect the efficiency at runtime very well, in particular the CPU overhead of the actual join operation seems to be underestimated for the MERGE JOIN.Now let's see what happens if we run this query using Parallel Execution:
select
t1.filler as t1_filler
, t2.filler as t2_filler
from
t1
, t2
where
t1.id (+) = t2.fk
and t2.fk = 1
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 199M| 4342 (1)| 00:00:53 |
| 1 | MERGE JOIN OUTER | | 1000K| 199M| 4342 (1)| 00:00:53 |
|* 2 | TABLE ACCESS FULL | T2 | 1000K| 99M| 4339 (1)| 00:00:53 |
|* 3 | SORT JOIN | | 1 | 105 | 3 (34)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 105 | 2 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | T1_IDX | 1 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("T2"."FK"=1)
3 - access("T1"."ID"(+)="T2"."FK")
filter("T1"."ID"(+)="T2"."FK")
5 - access("T1"."ID"(+)=1)
Look very carefully at the order of the operations, and what part of the execution plan runs in parallel and what is executed serially.This is where things become pretty weird and threatening: The TABLE ACCESS to the large row source T2 runs in parallel (with the corresponding lower cost), but the data is then handed over to the Query Coordinator for a SORT JOIN operation - which wasn't there in serial execution and is in fact unnecessary since we still have a single value in the join key.After sorting the large row source, the MERGE JOIN operation itself is performed by the Query Coordinator, so no Parallel Execution is involved here either.Both the serial SORT JOIN of the large row source and the MERGE JOIN operation itself are literally free of cost here, which is clearly unreasonable, in particular if the row source is very large.Although the SORT JOIN will basically turn into a simple "BUFFER SORT" operation, since there is effectively nothing to sort, it still means that a potentially very big volume of data will have to be handed over from the Parallel Worker processes scanning the row source to the Query Coordinator - in this particular case by definition an inefficient operation, because a large data volume has to be passed from multiple Parallel Processes to the single Query Coordinator - and then this potentially very big volume of data will have to be SORTED by the Query Coordinator, which very likely means that this operation won't fit into PGA memory of that single process, hence spill to TEMP causing potentially significant additional (and unnecessary) read and write I/O, all to be done serially by the Query Coordinator.This is a textbook example of a Parallel Execution plan that is deemed to take longer than the corresponding serial execution plan, and it is the execution plan that is preferred by the optimizer when left unhinted.Let's have a look at the Parallel Execution plan when using a HASH JOIN:
select /*+ parallel */
t1.filler as t1_filler
, t2.filler as t2_filler
from
t1
, t2
where
t1.id (+) = t2.fk
and t2.fk = 1
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 199M| 2406 (1)| 00:00:29 | | | |
| 1 | MERGE JOIN OUTER | | 1000K| 199M| 2406 (1)| 00:00:29 | | | |
| 2 | SORT JOIN | | 1000K| 99M| 2403 (1)| 00:00:29 | | | |
| 3 | PX COORDINATOR | | | | | | | | |
| 4 | PX SEND QC (RANDOM) | :TQ10000 | 1000K| 99M| 2403 (1)| 00:00:29 | Q1,00 | P->S | QC (RAND) |
| 5 | PX BLOCK ITERATOR | | 1000K| 99M| 2403 (1)| 00:00:29 | Q1,00 | PCWC | |
|* 6 | TABLE ACCESS FULL | T2 | 1000K| 99M| 2403 (1)| 00:00:29 | Q1,00 | PCWP | |
|* 7 | SORT JOIN | | 1 | 105 | 3 (34)| 00:00:01 | | | |
| 8 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 105 | 2 (0)| 00:00:01 | | | |
|* 9 | INDEX UNIQUE SCAN | T1_IDX | 1 | | 1 (0)| 00:00:01 | | | |
-----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - filter("T2"."FK"=1)
7 - access("T1"."ID"(+)="T2"."FK")
filter("T1"."ID"(+)="T2"."FK")
9 - access("T1"."ID"(+)=1)
Looking at the child operations' cost estimates of the HASH JOIN it becomes obvious that it is the costing of the HASH JOIN itself that makes the whole operation more costly than the MERGE JOIN, which is clearly questionable.So the strange thing about the MERGE JOIN Parallel Execution plan is that the join operation itself is done serially, whereas the HASH JOIN execution plan, although it uses the same access to the row sources (INDEX UNIQUE SCAN and FULL TABLE SCAN), happily runs in parallel.What causes this strange execution plan shape? Obviously it is the UNIQUE index on the other, smaller row source. Somehow the MERGE JOIN code is mislead by the UNIQUE index scan, which causes the join operation to run serially.Replacing the UNIQUE index with a NON-UNIQUE index (and using a UNIQUE constraint on top to achieve the same uniqueness) gives this execution plan:
select /*+ parallel use_hash(t1 t2) */
t1.filler as t1_filler
, t2.filler as t2_filler
from
t1
, t2
where
t1.id (+) = t2.fk
and t2.fk = 1
---------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 199M| 2411 (1)| 00:00:29 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 1000K| 199M| 2411 (1)| 00:00:29 | Q1,01 | P->S | QC (RAND) |
|* 3 | HASH JOIN RIGHT OUTER | | 1000K| 199M| 2411 (1)| 00:00:29 | Q1,01 | PCWP | |
| 4 | BUFFER SORT | | | | | | Q1,01 | PCWC | |
| 5 | PX RECEIVE | | 1 | 105 | 2 (0)| 00:00:01 | Q1,01 | PCWP | |
| 6 | PX SEND BROADCAST | :TQ10000 | 1 | 105 | 2 (0)| 00:00:01 | | S->P | BROADCAST |
| 7 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 105 | 2 (0)| 00:00:01 | | | |
|* 8 | INDEX UNIQUE SCAN | T1_IDX | 1 | | 1 (0)| 00:00:01 | | | |
| 9 | PX BLOCK ITERATOR | | 1000K| 99M| 2403 (1)| 00:00:29 | Q1,01 | PCWC | |
|* 10 | TABLE ACCESS FULL | T2 | 1000K| 99M| 2403 (1)| 00:00:29 | Q1,01 | PCWP | |
---------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("T1"."ID"(+)="T2"."FK")
8 - access("T1"."ID"(+)=1)
10 - filter("T2"."FK"=1)
So now we still have the unnecessary SORT JOIN operation of the large row source, but at least the SORT JOIN and MERGE JOIN operations are now executed in parallel, which should make it far less threatening.Of course, a corresponding HASH JOIN will still be much more efficient for larger row sources, but needs to be hinted in this special case here.SummaryFor MERGE JOINs there are some special cases where the current costing model doesn't properly reflect the actual work - together with some strange behaviour of the MERGE JOIN code when using Parallel Execution this can lead to questionable execution plans preferred by the optimizer.Carefully check the resulting execution plans when using Parallel Execution and MERGE JOINs get preferred by the optimizer.
drop index t1_idx;
create index t1_idx on t1 (id);
alter table t1 add constraint t1_uq unique (id) using index t1_idx;
select /*+ parallel */
t1.filler as t1_filler
, t2.filler as t2_filler
from
t1
, t2
where
t1.id (+) = t2.fk
and t2.fk = 1
----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 199M| 2406 (1)| 00:00:29 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 1000K| 199M| 2406 (1)| 00:00:29 | Q1,01 | P->S | QC (RAND) |
| 3 | MERGE JOIN OUTER | | 1000K| 199M| 2406 (1)| 00:00:29 | Q1,01 | PCWP | |
| 4 | SORT JOIN | | 1000K| 99M| 2403 (1)| 00:00:29 | Q1,01 | PCWP | |
| 5 | PX BLOCK ITERATOR | | 1000K| 99M| 2403 (1)| 00:00:29 | Q1,01 | PCWC | |
|* 6 | TABLE ACCESS FULL | T2 | 1000K| 99M| 2403 (1)| 00:00:29 | Q1,01 | PCWP | |
|* 7 | SORT JOIN | | 1 | 105 | 3 (34)| 00:00:01 | Q1,01 | PCWP | |
| 8 | BUFFER SORT | | | | | | Q1,01 | PCWC | |
| 9 | PX RECEIVE | | 1 | 105 | 2 (0)| 00:00:01 | Q1,01 | PCWP | |
| 10 | PX SEND BROADCAST | :TQ10000 | 1 | 105 | 2 (0)| 00:00:01 | | S->P | BROADCAST |
| 11 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 105 | 2 (0)| 00:00:01 | | | |
|* 12 | INDEX RANGE SCAN | T1_IDX | 1 | | 1 (0)| 00:00:01 | | | |
----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - filter("T2"."FK"=1)
7 - access("T1"."ID"(+)="T2"."FK")
filter("T1"."ID"(+)="T2"."FK")
12 - access("T1"."ID"(+)=1)
"Cost-free" joins - 1
Recently I came across some interesting edge cases regarding the costing of joins. They all have in common that they result in (at first sight) unexpected execution plans, but only some of them are actual threats to performance.Outer JoinsThe first one is about outer joins with an extreme data distribution. Consider the following data setup:
The following query is a simple outer join between T1 and T2 and the default, unhinted execution plan that I get from 11.2.0.1 (11.1.0.7 and 10.2.0.4 show similar results):
create table t1
as
select
rownum as id
, rpad('x', 100) as filler
, case when rownum > 1e6 then rownum end as null_fk
from
dual
connect by
level <= 1e6
;
exec dbms_stats.gather_table_stats(null, 't1')
create table t2
as
select
rownum as id
, rpad('x', 100) as filler
from
dual
connect by
level <= 1e6
;
exec dbms_stats.gather_table_stats(null, 't2')
create /*unique*/ index t2_idx on t2 (id);
The optimizer preferred a Nested Loop join albeit the fact that the number of estimated loop iterations is pretty large. Notice in particular the cost column: Although the inner rowsource is estimated to be started 1000K times, the cost of doing so corresponds to just a single execution.For reference here is a cost estimate for a similar operation that corresponds to the expected costing model:
select
t1.filler as t1_filler
, t2.filler as t2_filler
from
t1
, t2
where
t1.null_fk = t2.id (+)
;
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 202M| 4204 (1)| 00:00:51 |
| 1 | NESTED LOOPS OUTER | | 1000K| 202M| 4204 (1)| 00:00:51 |
| 2 | TABLE ACCESS FULL | T1 | 1000K| 101M| 4202 (1)| 00:00:51 |
| 3 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 106 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | T2_IDX | 1 | | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."NULL_FK"="T2"."ID"(+))
I now had to force the Nested Loop join via a hint, because by default other join methods were preferred by the optimizer. The cost of a single iteration of the loop has now increased to 3, although the access is exactly the same as before (T2 random table access via index lookup of T2.ID), and the cost of the Nested Loop join corresponds to the known formula: Estimated starts times the cost of a single iteration, which is 3000K in this case here, plus the 4200 of the Full Table Scan for accessing the outer row source, plus CPU costing overhead.So what makes the difference between the two? It's the data. The column name chosen for the column in T1 already reveals what's so special: The join column used (NULL_FK) is actually NULL for all rows.The optimizer takes this into account and assumes that none of those rows from the driving row source will actually match a row of the inner row source - in fact the lookup to the inner row source could be short-circuited in some way, since a NULL value by definition isn't supposed to find a match for this join. I haven't investigated to what extent the runtime engine does this, however in the Rowsource Statistics the inner row source is started the expected number of times, although no logical I/O is recorded for it, but some CPU time, so at least some work seems to be done there.Modifying the test case so that more of the FKs are actually non-null shows that the cost calculation is scaled accordingly. In fact the cost calculation is more or less the same than that of a corresponding inner join that could filter out those driving rows with NULL values in the join column.The overall performance of the execution plan is quite decent, so although it looks quite unusual it performs pretty well.In the second part I'll show another interesting, unexpected join execution plan that however can cause real performance problems.
select /*+ use_nl(t1 t2) */
t1.filler as t1_filler
, t2.filler as t2_filler
from
t1
, t2
where
t1.id = t2.id (+)
;
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 202M| 3006K (1)| 10:01:21 |
| 1 | NESTED LOOPS OUTER | | 1000K| 202M| 3006K (1)| 10:01:21 |
| 2 | TABLE ACCESS FULL | T1 | 1000K| 101M| 4200 (1)| 00:00:51 |
| 3 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 106 | 3 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | T2_IDX | 1 | | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."ID"="T2"."ID"(+))
QB_NAME hint query block name length limitation
Oracle 10g introduced the QB_NAME hint that can come handy in case hints need to be applied to more complex statements, in particular when possibly multiple layers of views / subqueries are involved.Jonathan Lewis has a older blog post that describes more details.Just in case you wonder why sometimes apparently the QB_NAME hint - along with all other hints that refer to the assigned query block name - seems to be ignored: One possible reason is that it looks like there is an undocumented length limitation of the query block names that can be assigned - 20 characters seem to be the maximum possible (I haven't checked the effect of multi-byte database character sets).Consider this simple example:
By default Oracle 11.2.0.1 unnests the correlated subquery and runs an anti-join for the following simple query:
drop table t1;
purge table t1;
create table t1
as
select
rownum as id
, rpad('x', 100) as filler
from
dual
connect by
level <= 100
;
exec dbms_stats.gather_table_stats(null, 't1')
Notice the aliases and query block names assigned, like SEL$1, SEL$2 etc.Using the QB_NAME hint the NO_UNNEST hint can be applied to the correlated subquery from the outer query block:
select
*
from
t1
where
not exists
(
select
null
from
t1 t1_inner
where
t1_inner.id = t1.id
);
-----------------------------------
| Id | Operation | Name |
-----------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | HASH JOIN ANTI | |
| 2 | TABLE ACCESS FULL| T1 |
| 3 | TABLE ACCESS FULL| T1 |
-----------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$5DA710D3
2 - SEL$5DA710D3 / T1@SEL$1
3 - SEL$5DA710D3 / T1_INNER@SEL$2
The correlated subquery is now executed using a FILTER operator as requested. Notice the aliases and query block names, in particular for the instance of T1 at operation id = 3Now if I repeat the same query, but use a longer query block name, the hints are effectively ignored and the default unnesting takes place again:
select /*+ no_unnest(@nested_query) */
*
from
t1
where
not exists
(
select /*+ qb_name(nested_query) */
null
from
t1 t1_inner
where
t1_inner.id = t1.id
);
-----------------------------------
| Id | Operation | Name |
-----------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | FILTER | |
| 2 | TABLE ACCESS FULL| T1 |
|* 3 | TABLE ACCESS FULL| T1 |
-----------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
2 - SEL$1 / T1@SEL$1
3 - NESTED_QUERY / T1_INNER@NESTED_QUERY
Notice how the default aliases and query block names get used although explicitly hinted.It is probably rather uncommon to use such lengthy query block names, nevertheless it can be puzzling when hitting such an undocumented limitation.
select /*+ no_unnest(@nested_query_longer_name) */
*
from
t1
where
not exists
(
select /*+ qb_name(nested_query_longer_name) */
null
from
t1 t1_inner
where
t1_inner.id = t1.id
);
-----------------------------------
| Id | Operation | Name |
-----------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | HASH JOIN ANTI | |
| 2 | TABLE ACCESS FULL| T1 |
| 3 | TABLE ACCESS FULL| T1 |
-----------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$5DA710D3
2 - SEL$5DA710D3 / T1@SEL$1
3 - SEL$5DA710D3 / T1_INNER@SEL$2
Exadata Smart Scan Projection Limitation
Here is an interesting limitation to Exadata Smart Scans - if more than 254 columns from a table (not HCC compressed, more on that in moment) need to be projected, Smart Scans for that particular segment will be disabled and Exadata will fall back to conventional I/O. This means that the number of columns in the projection clause can make a significant difference to performance, since only Smart Scans allow taking advantage of offloading and particularly avoiding I/O via Storage Indexes.Now the expression "254 columns" might ring a bell, since it is the maximum number of columns that Oracle can store in a single row piece - tables consisting of more than 254 columns will have to be stored in multiple row pieces.However, what I'm talking about here is not related to such columns residing in different row pieces of a row - Smart Scans still happily work even if columns from different row pieces are projected (which was subject to several bugs in the past), although you might end up with additional "cell single block reads" in case of truly chained rows rather than just additional logical I/O for picking up the different row pieces from the same block, also sometimes called "intra-block" chaining.No, the limitation simply seems to be that Smart Scans - broadly speaking and ignoring edge cases - can only transport a maximum of 254 columns from a single (non-HCC) segment. Requesting more columns will simply disable Smart Scans for that segment.Now you might say, offloading and in particular offloading column projection isn't that much relevant if you select that many columns from a table anyway, but the point is that you loose the ability to benefit from Storage Indexes and only transporting the relevant rows back to the compute nodes.Both features can speed up the processing significantly, in particular if the number of rows selected is only a fraction of the total number of rows, and/or the cells could avoid a significant amount of I/O via Storage Indexes.To demonstrate the point I've put together a simple test case that generates a test table with more than 254 columns - the script below generates a table of approx. 40GB uncompressed size so that a significant difference in performance could be measured.
Increasing the number of columns projected from the segment above 254 (and as outlined above it doesn't matter from which row pieces these columns come from) disables the Smart Scan, and it takes more than 9 seconds to run essentially the same query, pumping all 40GB through the compute nodes to filter all rows.
Funnily, in this deliberately crafted, extreme case here, it is much faster to access the segment twice and get the remaining columns via a self-join in order to benefit from the offloading features - it only takes 4.7 seconds to run the self-join, and the session statistics confirm that both segment scans could leverage offloading and in particular avoid I/O via Storage Indexes:
Assuming a Storage Index was generated on the ID column a query like the following (using Parallel Query at a DOP of 32 in the test runs here) can benefit from offloading, since in principle all I/O could be avoided via the Storage Index, and virtually no data needs to be transported from the cells to the compute nodes.Note that it projects exactly 254 columns from different row pieces.
set echo on timing on time on
-- MAX is 999, there is a ID column as first col
define num_cols = 300
define compression = nocompress
--define compression = "compress for query low"
drop table many_cols_rg;
purge table many_cols_rg;
declare
s_sql varchar2(32767);
begin
s_sql := q'!
create table many_cols_rg pctfree 0
&compression
parallel nologging
as
with generator1 as
(
select /*+ materialize cardinality(1000) */
rownum as id
-- this makes the rowsource wider otherwise PX BLOCK ITERATOR has problems properly spreading work among the slaves
, rpad('x', 4000) as padding
from
dual
connect by
level <= 1e3
),
generator2 as
(
select /*+ materialize cardinality(10000) */
rownum as id
-- this makes the rowsource wider otherwise PX BLOCK ITERATOR has problems properly spreading work among the slaves
, rpad('x', 4000) as padding
from
dual
connect by
level <= 1e4
)
select
num_id as id
!';
for i in 1..&num_cols loop
s_sql := s_sql || ', char_id as col' || to_char(i, 'FM000');
end loop;
s_sql := s_sql || q'!
from
(
select /*+ no_merge */
a.id + (b.id - 1) * 1e3 as num_id
, cast(to_char(a.id + (b.id - 1) * 1e3, 'FM0000000000') as varchar2(10)) as char_id
from
generator1 a
, generator2 b
)
!';
execute immediate s_sql;
end;
/
exec dbms_stats.gather_table_stats(null, 'many_cols_rg')
As you can see from the snippet, it took less than 2.5 seconds to run the query on the 40GB segment, and the session statistics report 20GB avoided via the Storage Index (which seems to be an instrumentation bug as it always reports only 50% of the total segment size as a maximum, this output was taken from 11.2.0.2 Exadata BP14). Furthermore only a couple of MB were exchanged between the cells and the compute nodes.The corresponding Real-Time SQL Monitoring report confirms the "Smart Scan":
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
11:28:22 SQL>
11:28:22 SQL> select a.name, b.value from v$mystat b, v$statname a where a.name in ('cell physical IO bytes saved by storage index', 'cell physical IO interconnect bytes returned by smart scan') and a.statistic# = b.statistic#;
NAME VALUE
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index 0
cell physical IO interconnect bytes returned by smart scan 0
Elapsed: 00:00:00.00
11:28:22 SQL>
11:28:22 SQL> select
11:28:22 2 col001,
11:28:22 3 col002,
.
.
.
11:28:23 254 col253,
11:28:23 255 col300/*,
11:28:23 256 col254*/
11:28:23 257 from many_cols_rg where id between -2 and -1;
no rows selected
Elapsed: 00:00:02.40
11:28:25 SQL>
11:28:25 SQL> select a.name, b.value from v$mystat b, v$statname a where a.name in ('cell physical IO bytes saved by storage index', 'cell physical IO interconnect bytes returned by smart scan') and a.statistic# = b.statistic#;
NAME VALUE
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index 2,1195E+10
cell physical IO interconnect bytes returned by smart scan 3000784
Elapsed: 00:00:00.01
11:28:25 SQL>
Increasing the number of columns projected from the segment above 254 (and as outlined above it doesn't matter from which row pieces these columns come from) disables the Smart Scan, and it takes more than 9 seconds to run essentially the same query, pumping all 40GB through the compute nodes to filter all rows.The corresponding Real-Time SQL Monitoring report confirms the fallback to "direct path reads":
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
11:29:14 SQL>
11:29:14 SQL> select a.name, b.value from v$mystat b, v$statname a where a.name in ('cell physical IO bytes saved by storage index', 'cell physical IO interconnect bytes returned by smart scan') and a.statistic# = b.statistic#;
NAME VALUE
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index 0
cell physical IO interconnect bytes returned by smart scan 0
Elapsed: 00:00:00.00
11:29:14 SQL>
11:29:14 SQL> select
11:29:14 2 col001,
11:29:14 3 col002,
.
.
.
11:29:15 254 col253,
11:29:15 255 col300,
11:29:15 256 col254
11:29:15 257 from many_cols_rg where id between -2 and -1;
no rows selected
Elapsed: 00:00:09.22
11:29:24 SQL>
11:29:24 SQL> select a.name, b.value from v$mystat b, v$statname a where a.name in ('cell physical IO bytes saved by storage index', 'cell physical IO interconnect bytes returned by smart scan') and a.statistic# = b.statistic#;
NAME VALUE
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index 0
cell physical IO interconnect bytes returned by smart scan 0
Elapsed: 00:00:00.01
11:29:24 SQL>
Funnily, in this deliberately crafted, extreme case here, it is much faster to access the segment twice and get the remaining columns via a self-join in order to benefit from the offloading features - it only takes 4.7 seconds to run the self-join, and the session statistics confirm that both segment scans could leverage offloading and in particular avoid I/O via Storage Indexes:Note: This is one of the cases where you don't want the optimizer to eliminate a self-join via a query transformation based on corresponding constraints on the join columns :-)HCC CompressionNow the funny thing is that if you repeat the table creation script but uncomment the HCC compression, the Smart Scan happily works with up to 1,000 columns of such a compressed segment.So obviously the general code implementation supports transporting rows with more than 254 columns from the cell to the compute nodes, but the question is why does it only do so with HCC compressed segments. It's probably a question that my client will raise with Oracle Support to find out the answer.FootnoteAt least only the number of "raw" columns projected count towards the limitation - any expressions based on columns don't count, therefore you can project actually more than 254 expressions from a segment and still benefit from Smart Scans as long as the expressions refer to a maximum of 254 "base" columns.The same limitation could also be reproduced when using (almost) the latest available Exadata version as of writing this (11.2.0.3 BP12, almost because BP14 just came out as far as I know).
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
11:29:37 SQL>
11:29:37 SQL> select a.name, b.value from v$mystat b, v$statname a where a.name in ('cell physical IO bytes saved by storage index', 'cell physical IO interconnect bytes returned by smart scan') and a.statistic# = b.statistic#;
NAME VALUE
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index 0
cell physical IO interconnect bytes returned by smart scan 0
Elapsed: 00:00:00.00
11:29:37 SQL>
11:29:37 SQL> select
11:29:37 2 a.col001,
11:29:37 3 a.col002,
.
.
.
11:29:37 254 a.col253,
11:29:37 255 a.col300,
11:29:37 256 b.col254
11:29:37 257 from many_cols_rg a, many_cols_rg b
11:29:37 258 where a.id between -2 and -1 and b.id between -2 and -1
11:29:37 259 and a.id = b.id;
no rows selected
Elapsed: 00:00:04.77
11:29:42 SQL>
11:29:42 SQL> select a.name, b.value from v$mystat b, v$statname a where a.name in ('cell physical IO bytes saved by storage index', 'cell physical IO interconnect bytes returned by smart scan') and a.statistic# = b.statistic#;
NAME VALUE
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index 4,2390E+10
cell physical IO interconnect bytes returned by smart scan 6001568
Elapsed: 00:00:00.01
11:29:42 SQL>
11:29:42 SQL>


