
How to Patch an Exadata (Part 6) – Timing

Tue, 2017-03-28 15:25

Quick links to Part 1 / Part 2 / Part 3 / Part 4 / Part 5 / Part 6
 

 

6: Timing

Now that we know how to patch every component and the different options available to do so (rolling, non-rolling), which one is the best? How much time does it take?

The answer is obviously “it depends”, but I will try to bring a few insights so you have a solid answer ready when your manager inevitably asks “How long will that patch take? I need to negotiate the maintenance window with the business… they aren’t happy…” ;)
 
 
Here is a summary of how long the patch application takes in a rolling fashion and in a non-rolling fashion (as well as the downtime for each method). Please note that I have marked my recommendations in green.
 

Cells

  • Rolling: 1h30 x number of cells
  • Rolling downtime: 0 minutes
  • Non-rolling: 2h (1h30 to patch a cell + 30 minutes to stop and start everything before and after the patch)
  • Non-rolling downtime: 2h

Note : Refer to my notes at the end of this page about this choice
 
IB Switches

  • Rolling: 45 minutes per switch, so 1h30 in total
  • Rolling downtime: 0 minutes
  • Non-rolling: not available
  • Non-rolling downtime: not available

Note: There is no non-rolling method for the IB Switches, so here the choice is an easy one!
 
Database Servers

Note: Refer to my notes at the end of this page about this choice
 
Grid

Note: No green color here? To patch the Grid, I recommend a mixed approach (a sketch of the service relocation commands follows this list):

  • Rebalance the services away from node 1
  • Patch node 1
  • Verify that everything has restarted properly on node 1
  • Move all the services to node 1 (if a single node can handle the whole workload – but we usually patch during a quiet period anyway)
  • Apply the patch in a non-rolling manner (for the Grid this means launching the patch manually in parallel on the remaining nodes)
  • Once the Grid has been patched on all the nodes, restore all the services as they were before the patch
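As an illustration of the rebalancing step, here is a minimal sketch using srvctl; the database, service and instance names (MYDB, MYAPP_SVC, MYDB1, MYDB2) are hypothetical and the exact syntax may vary with your Grid Infrastructure version.

# Check where the service currently runs (names are examples)
[oracle@myclusterdb01 ~]$ srvctl status service -d MYDB -s MYAPP_SVC

# Relocate it from instance 1 to instance 2 before patching node 1
[oracle@myclusterdb01 ~]$ srvctl relocate service -d MYDB -s MYAPP_SVC -i MYDB1 -t MYDB2

# Once node 1 is patched and verified, relocate it back
[oracle@myclusterdb01 ~]$ srvctl relocate service -d MYDB -s MYAPP_SVC -i MYDB2 -t MYDB1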

 
Databases Oracle homes

  • Rolling: 20 – 30 minutes per node + ~ 20 minutes per database for the post installation steps
  • Rolling downtime:

    – Can be 0 minutes if you rebalance the services away before patching a node (as described above for the Grid patching; the same concept applies to database patching) + ~ 20 minutes per database for the post installation steps.

    Please note that if you have 30 databases sharing the same ORACLE_HOME, you won’t easily be able to apply 30 post-install steps at the same time, so the 30th database will suffer a bigger outage than the first one you restart on the patched ORACLE_HOME. This is why I strongly recommend the quicker method described below.

    – Or ~ 20 minutes of downtime per database, at a time you can choose, when using the quicker method!

  • Non-rolling: 20 – 30 minutes
  • Non-rolling downtime: 20 – 30 minutes for all the databases running on the patched Oracle home + ~ 20 minutes per database for the post installation steps. Note that if you have 30 databases sharing the same ORACLE_HOME, you won’t be able to apply 30 post-install steps at the same time, so the 30th database will suffer a bigger outage than the first one you restart on the patched ORACLE_HOME.

Note: In this case, I definitely go for the quicker way: clone the Oracle home you want to patch to a new home, apply the patch to the clone, and move the databases one by one to the newly patched Oracle home.
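Here is a minimal sketch of that out-of-place approach, assuming a hypothetical new home /u01/app/oracle/product/12.1.0.2/dbhome_2 and a database named MYDB; check the clone.pl and datapatch documentation for your exact release before relying on it.

# Copy the existing home to a new location and register the clone (paths are examples)
[oracle@myclusterdb01 ~]$ cp -rp /u01/app/oracle/product/12.1.0.2/dbhome_1 /u01/app/oracle/product/12.1.0.2/dbhome_2
[oracle@myclusterdb01 ~]$ perl /u01/app/oracle/product/12.1.0.2/dbhome_2/clone/bin/clone.pl \
     ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_2 \
     ORACLE_HOME_NAME=OraDB12Home2 ORACLE_BASE=/u01/app/oracle

# Apply the Bundle to dbhome_2 with opatch while the databases keep running on dbhome_1

# Then move the databases one by one and run the post-install steps
[oracle@myclusterdb01 ~]$ srvctl modify database -d MYDB -o /u01/app/oracle/product/12.1.0.2/dbhome_2
[oracle@myclusterdb01 ~]$ srvctl stop database -d MYDB
[oracle@myclusterdb01 ~]$ srvctl start database -d MYDB
[oracle@myclusterdb01 ~]$ /u01/app/oracle/product/12.1.0.2/dbhome_2/OPatch/datapatch -verbose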
 
 

Notes on my recommendations

Yes, I always prefer the rolling method for the infrastructure components (Grid and Database Servers). This is because it mitigates the outage, and it also protects me from any outage created by the patch itself or by anything that prevents, for example, a reboot (we do not reboot those servers frequently, so reboot problems tend to show up exactly then).

Imagine you go for a rolling cell upgrade and one cell does not reboot after the patch. You have no real issue here: the patch will stop automatically, everything will keep working as before with one cell down, no one will notice anything, and you are still supported since running different versions across servers is supported. You can then quietly check the troubleshooting section of this blog, or go to the pool while Oracle finds a solution for you.

It happened to us in production (it had not happened on the DEV or QA Exadatas before…); we warned the client and it took Oracle a few days to provide an action plan. Everything ran perfectly for a week with one cell down, we applied the Oracle action plan the next weekend and could properly finish the patch. The end result: we applied the patch successfully, we had an issue that caused no outage nor performance degradation, and we still fit in the maintenance window – a very good job from a client and process point of view!

But if you go for non-rolling cell patching and all your cells (or a few of them) do not reboot after the patch, then you are in trouble and you will lose ten times the time you thought you had saved by going non-rolling. You will most likely have a failed patch outside of the maintenance window, a Root Cause Analysis to provide to the process guys, and you probably won’t patch this Exadata again for a while, as the client will be… hmmm… a bit chilly about that question in the future.

And this risk is the same for the database servers.
 
I am not saying that the Bundle will fail and create a big outage; it is all about risk mitigation. But remember: “highest level of patch = highest level of bug” :)
 
 
 
If you’ve reached this point, I hope that you enjoyed this Odyssey into the Exadata patching world as much as I enjoy working with it on a daily basis!
 
 

Quick links to Part 1 / Part 2 / Part 3 / Part 4 / Part 5 / Part 6
 

Categories: DBA Blogs

How to Patch an Exadata (Part 1) – Introduction and Prerequisites

Tue, 2017-03-28 12:30

Once you have installed your new Exadata machine, a time will come when you’ll be asked:

“shouldn’t we patch the Exadata?”

And the answer is “yes, definitely”.

 
Indeed, Oracle releases huge (~ 10 GB) “Quarterly Full Stack” patches (aka Bundles) every quarter (for example: Patch 24436624 – Quarterly Full Stack Download For Oracle Exadata (Oct 2016 – 12.1.0.2)); these Bundles contain all the patches for all the components that make up an Exadata. You will need (almost :)) nothing else to be able to patch your whole Exadata.
 
Even if it looks like a tough operation at first sight, it is not that hard. This blog’s aim is to clearly describe every step to make it easier for all of us. Let’s start with a preview of this patching, with the order in which we will proceed and the tools we will be using:
 
 

 
 
As it is quite a long odyssey, I will split this blog into different parts, which also follow the logical order for patching all the components:
 
0/ An advice

1/ General Information

2/ Some prerequisites it is worth doing before the maintenance

3/ The patching procedure

3.1/ Patching the cells (aka Storage servers)
3.2/ Patching the IB switches
3.3/ Patching the Database servers (aka Compute Nodes)
3.4/ Patching the Grid Infrastructure
3.5/ Patching the databases ORACLE_HOMEs

4/ The Rollback procedure

4.1/ Cell Rollback
4.2/ DB nodes Rollback
4.3/ IB Switches Rollback

5/ Troubleshooting

5.1/ Cell patching issue
5.2/ CRS does not restart issue
5.3/ A procedure to add instances to a database
5.4/ OPatch resume

6/ Timing

 

 
 
 

0/ An advice

First of all, please keep this piece of advice firmly in mind:

Do NOT continue to the next step before a failed step is properly resolved.

Indeed, everything that needs to be redundant is redundant, and it is supported to run different versions between servers. In the MOS note “Exadata Patching Overview and Patch Testing Guidelines (Doc ID 1262380.1)“, we can read that:
 

It is supported to run different Exadata versions between servers. For example, some storage servers may run 11.2.2.4.2 while others run 11.2.3.1.1, or all storage servers may run 11.2.3.1.1 while database servers run 11.2.2.4.2. However, it is highly recommended that this be only a temporary configuration that exists for the purpose and duration of rolling upgrade.

 
So if, when patching your cells, one cell does not reboot, stop here; do not continue and do not force-patch the next one. Everything will still work fine and in a supported manner with one cell down (I have been through it in production and no user noticed anything), but that will most likely not be the case with 2 cells down. If this kind of issue happens, have a look at the troubleshooting section of this blog and open a MOS Sev 1.
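As an example, a quick way to see which image version every server is currently running before deciding whether to carry on (a sketch relying on the standard dcli and imageinfo tools; adapt the group files to your environment):

# Image version on every cell
[root@myclusterdb01 ~]# dcli -g ~/cell_group -l root 'imageinfo -ver'

# Image version on every DB node
[root@myclusterdb01 ~]# dcli -g ~/dbs_group -l root 'imageinfo -ver'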

 
 
 

1/ General Information

Some information you need to know before starting to patch your Exadata :

  • It is better to have a basic understanding of what an Exadata is before jumping into this patching procedure
  • This procedure does not apply to an ODA (Oracle Database Appliance)
  • I will use the /patches/OCT2016_bundle_patch FS to save the Bundle in the examples of this blog
  • I use the term “DB node” for “database node“, aka “Compute node“: the nodes where the Grid Infrastructure and the databases run. I will also use db01 for database node number 1, which is usually named db01
  • I use the word “cell” for the “storage servers“, the servers that manage your storage. I will also use cel01 for storage server number 1, usually named cel01
  • It is good to have the screen utility installed; if not, use nohup (see the sketch after this list)
  • Almost all the procedure will be executed as root
  • I will patch the IB Switches from the DB node 1 server
  • I will patch the cells from the DB node 1 server
  • I will patch the DB nodes from the cel01 server

 
 
 

2/ Some prerequisites it is worth doing before the maintenance

2.1/ Download and unzip the Bundle

Review the Exadata general note (Exadata Database Machine and Exadata Storage Server Supported Versions (Doc ID 888828.1)) to find the latest Bundle, then download and unzip it; make sure that every directory is owned by oracle:dba to avoid any issue later on:

/patches/OCT2016_bundle_patch >
oracle@myclusterdb01) ls -ltr
total 9609228
-rw-r--r-- 1 oracle oinstall 560430690 Nov 16 18:24 p24436624_121020_Linux-x86-64_10of10.zip
-rw-r--r-- 1 oracle oinstall 1030496554 Nov 16 18:26 p24436624_121020_Linux-x86-64_1of10.zip
-rw-r--r-- 1 oracle oinstall 1032681260 Nov 16 18:27 p24436624_121020_Linux-x86-64_2of10.zip
-rw-r--r-- 1 oracle oinstall 1037111138 Nov 16 18:29 p24436624_121020_Linux-x86-64_3of10.zip
-rw-r--r-- 1 oracle oinstall 1037009057 Nov 16 18:31 p24436624_121020_Linux-x86-64_4of10.zip
-rw-r--r-- 1 oracle oinstall 1037185003 Nov 16 18:33 p24436624_121020_Linux-x86-64_5of10.zip
-rw-r--r-- 1 oracle oinstall 1026218494 Nov 16 18:35 p24436624_121020_Linux-x86-64_6of10.zip
-rw-r--r-- 1 oracle oinstall 1026514887 Nov 16 18:36 p24436624_121020_Linux-x86-64_7of10.zip
-rw-r--r-- 1 oracle oinstall 1026523343 Nov 16 18:39 p24436624_121020_Linux-x86-64_8of10.zip
-rw-r--r-- 1 oracle oinstall 1025677014 Nov 16 18:41 p24436624_121020_Linux-x86-64_9of10.zip

/patches/OCT2016_bundle_patch >
oracle@myclusterdb01) for I in `ls p24436624_121020_Linux-x86-64*f10.zip`
do
unzip $I
done
Archive: p24436624_121020_Linux-x86-64_10of10.zip
 inflating: 24436624.tar.splitaj
...
Archive: p24436624_121020_Linux-x86-64_9of10.zip
 inflating: 24436624.tar.splitai

/patches/OCT2016_bundle_patch >
oracle@myclusterdb01) cat *.tar.* | tar -xvf -
24436624/
24436624/automation/
24436624/automation/bp1-out-of-place-switchback.xml
24436624/automation/bp1-auto-inplace-rolling-automation.xml

...

 
 

2.2/ SSH keys

For this step, if you are not familiar with the dbs_group, cell_group, etc. files, here is how to create them, as I have also described in this post (look for “dbs_group” in the post):

[root@myclusterdb01 ~]# ibhosts | sed s'/"//' | grep db | awk '{print $6}' | sort > /root/dbs_group
[root@myclusterdb01 ~]# ibhosts | sed s'/"//' | grep cel | awk '{print $6}' | sort > /root/cell_group
[root@myclusterdb01 ~]# cat /root/dbs_group ~/cell_group > /root/all_group
[root@myclusterdb01 ~]# ibswitches | awk '{print $10}' | sort > /root/ib_group
[root@myclusterdb01 ~]#

We need a few SSH keys deployed in order to ease the patch application:

  • root ssh keys deployed from the db01 server to the IB Switches (you will have to enter the root password once for each IB Switch)
[root@myclusterdb01 ~]# cat ~/ib_group
myclustersw-ib2
myclustersw-ib3
[root@myclusterdb01 ~]# dcli -g ~/ib_group -l root -k -s '-o StrictHostKeyChecking=no'
root@myclustersw-ib3's password:
root@myclustersw-ib2's password:
myclustersw-ib2: ssh key added
myclustersw-ib3: ssh key added
[root@myclusterdb01 ~]#
  • root ssh keys deployed from the cel01 server to all the database nodes (you will have to enter the root password once for each database server)
[root@myclustercel01 ~]# cat ~/dbs_group
myclusterdb01
myclusterdb02
myclusterdb03
myclusterdb04
[root@myclustercel01 ~]# dcli -g ~/dbs_group -l root -k -s '-o StrictHostKeyChecking=no'
root@myclusterdb01's password:
root@myclusterdb03's password:
root@myclusterdb04's password:
root@myclusterdb02's password:
myclusterdb01: ssh key added
myclusterdb02: ssh key added
myclusterdb03: ssh key added
myclusterdb04: ssh key added
[root@myclustercel01 ~]#
  • root ssh keys deployed from the db01 server to all the cells (you will have to enter the root password once for each cell)
[root@myclusterdb01 ~]# dcli -g ~/cell_group -l root hostname
myclustercel01: myclustercel01.mydomain.com
myclustercel02: myclustercel02.mydomain.com
myclustercel03: myclustercel03.mydomain.com
myclustercel04: myclustercel04.mydomain.com
[root@myclusterdb01 ~]# dcli -g ~/cell_group -l root -k -s '-o StrictHostKeyChecking=no'
root@myclustercel04's password:
...
root@myclustercel03's password:
myclustercel01: ssh key added
...
myclustercel06: ssh key added
[root@myclusterdb01 ~]#
2.3/ Upgrade opatch

It is highly recommended to upgrade opatch before any patching activity, and this Bundle is no exception. Please find a procedure to quickly upgrade opatch with dcli in this post.

Please note that upgrading opatch will also allow you to be ocm.rsp-free !
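A minimal sketch of such a dcli-based OPatch refresh, assuming the latest OPatch zip (patch 6880880) has been staged in /tmp on db01 and that oracle user equivalence exists between the DB nodes; the file name and home path are examples only.

# Push the new OPatch zip to every DB node
[oracle@myclusterdb01 ~]$ dcli -g ~/dbs_group -l oracle -f /tmp/p6880880_121010_Linux-x86-64.zip -d /tmp

# Replace the OPatch directory of the home you are about to patch on every node
[oracle@myclusterdb01 ~]$ dcli -g ~/dbs_group -l oracle "mv /u01/app/12.1.0.2/grid/OPatch /u01/app/12.1.0.2/grid/OPatch.old && unzip -oq /tmp/p6880880_121010_Linux-x86-64.zip -d /u01/app/12.1.0.2/grid"

# Verify the new version everywhere
[oracle@myclusterdb01 ~]$ dcli -g ~/dbs_group -l oracle "/u01/app/12.1.0.2/grid/OPatch/opatch version"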

 
 

2.4/ Run the prechecks

It is very important to run these prechecks and to review their output carefully. They have to be 100% successful to ensure a smooth application of the patches.

  • Cell patching prechecks (launch them from the DB Node 1 as you will patch them from here)
[root@myclusterdb01 ~]# cd /patches/OCT2016_bundle_patch/24436624/Infrastructure/12.1.2.3.3/ExadataStorageServer_InfiniBandSwitch/patch_12.1.2.3.3.161013/
[root@myclusterdb01 ~]# ./patchmgr -cells ~/cell_group -patch_check_prereq -rolling

 
 

  • DB Nodes prechecks (launch them from the cel01 server as you will patch them from here)

As we will use the cell node 1 server to patch the database servers, we first need to copy patchmgr and the ISO file to this server:

[root@myclusterdb01 ~]#  scp /patches/OCT2016_bundle_patch/24436624/Infrastructure/SoftwareMaintenanceTools/DBServerPatch/5.161014/p21634633_121233_Linux-x86-64.zip root@myclustercel01:/tmp/.                    # This is patchmgr
[root@myclusterdb01 ~]#  scp /patches/OCT2016_bundle_patch/24436624/Infrastructure/12.1.2.3.3/ExadataDatabaseServer_OL6/p24669306_121233_Linux-x86-64.zip root@myclustercel01:/tmp/.                               # This is the ISO file, do NOT unzip it
[root@myclusterdb01 ~]#  ssh root@myclustercel01
[root@myclustercel01 ~]#  cd /tmp
[root@myclustercel01 ~]#  nohup unzip p21634633_121233_Linux-x86-64.zip &
[root@myclustercel01 ~]# cd /tmp/dbserver_patch_5.161014
[root@myclustercel01 dbserver_patch_5.161014]# ./patchmgr -dbnodes ~/dbs_group -precheck -iso_repo /tmp/p24669306_121233_Linux-x86-64.zip -target_version 12.1.2.3.3.161013
[root@myclustercel01 dbserver_patch_5.161014]#

Note: if you have NFS filesystems mounted, you will see some error messages; you can ignore them at this stage as we will unmount the NFS filesystems before patching the DB nodes (see the sketch below).
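A hedged sketch of how those NFS mounts can be listed and unmounted across the DB nodes right before the DB node patch; the mount point is a made-up example.

# List NFS mounts on every DB node
[root@myclustercel01 ~]# dcli -g ~/dbs_group -l root "mount -t nfs"

# Unmount them just before patching (example mount point)
[root@myclustercel01 ~]# dcli -g ~/dbs_group -l root "umount /mnt/my_nfs_share"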
 
 

  • IB Switches prechecks (launch them from the DB Node 1 as you will patch them from here)
[root@myclusterdb01]# cd /patches/OCT2016_bundle_patch/24436624/Infrastructure/12.1.2.3.3/ExadataStorageServer_InfiniBandSwitch/patch_12.1.2.3.3.161013/
[root@myclusterdb01 patch_12.1.2.3.3.161013]# ./patchmgr -ibswitches ~/ib_group -upgrade -ibswitch_precheck

 
 

  • Grid Infrastructure prechecks
[root@myclusterdb01]# . oraenv <<< +ASM1
[root@myclusterdb01]# $ORACLE_HOME/OPatch/opatchauto apply /patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103 -oh /u01/app/12.1.0.2/grid -analyze
[root@myclusterdb01]#

Notes :

  • You will most likely see some warnings here; check the logfiles – the warnings are usually due to patches that will be rolled back because they are no longer needed.

 
 

[root@myclusterdb01]# $ORACLE_HOME/OPatch/opatch prereq CheckSystemSpace -phBaseFile /tmp/patch_list_gihome.txt

With the following /tmp/patch_list_gihome.txt file (check the README as the patch numbers will change with the versions)

[root@myclusterdb01]# cat /tmp/patch_list_gihome.txt
/patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/21436941
/patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24007012
/patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24846605
/patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24340679
[root@myclusterdb01]#

 
 

  • Database patch prechecks
[oracle@myclusterdb01]$ . oraenv <<< A_DATABASE_WITH_THE_ORACLE_HOME_YOU_WANT_TO_PATCH
[oracle@myclusterdb01]$ $ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir /patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24340679
[oracle@myclusterdb01]$ $ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir /patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24846605
[oracle@myclusterdb01]$ $ORACLE_HOME/OPatch/opatch prereq CheckSystemSpace -phBaseFile /tmp/patch_list_dbhome.txt

With the file /tmp/patch_list_dbhome.txt containing (check the README; the patch numbers will change depending on the versions):

/patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24340679
/patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24846605

 
 

  • OJVM prechecks
[oracle@myclusterdb01]$ cd /patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018OJVMPSU/24315824
[oracle@myclusterdb01]$ $ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -ph ./
[oracle@myclusterdb01]$

  • Do an lsinventory -all_nodes before patching and save the output somewhere

[oracle@myclusterdb01]$ $ORACLE_HOME/OPatch/opatch lsinventory -all_nodes

 
 

  • Check disk_repair_time and set it to 24h

Oracle recommends setting this parameter to 8h. As we have had issues in the past with a very long cell patching, we now set this parameter to 24h, as Oracle recommended to us.
Please note that this prerequisite is only needed for a rolling patch application.

SQL> select dg.name as diskgroup, a.name as attribute, a.value from v$asm_diskgroup dg, v$asm_attribute a where dg.group_number=a.group_number and (a.name like '%repair_time' or a.name = 'compatible.asm');

DISKGROUP ATTRIBUTE VALUE
-------------------- ---------------------------------------- ----------------------------------------
DATA disk_repair_time 3.6h
DATA compatible.asm 11.2.0.2.0
DBFS_DG disk_repair_time 3.6h
DBFS_DG compatible.asm 11.2.0.2.0
RECO disk_repair_time 3.6h
RECO compatible.asm 11.2.0.2.0

6 rows selected.

SQL> connect / as sysasm
Connected.
SQL> ALTER DISKGROUP DATA SET ATTRIBUTE 'disk_repair_time' = '24h' ;

Diskgroup altered.

SQL> ALTER DISKGROUP DBFS_DG SET ATTRIBUTE 'disk_repair_time' = '24h' ;

Diskgroup altered.

SQL> ALTER DISKGROUP RECO SET ATTRIBUTE 'disk_repair_time' = '24h' ;

Diskgroup altered.

SQL>

 
If one of these prechecks points out a problem, resolve it before moving on to the next steps.
 
 
Now that everything is downloaded, unzipped and updated, we can safely jump to the patching procedure in part 2!
 
 

Quick links to Part 1 / Part 2 / Part 3 / Part 4 / Part 5 / Part 6
 

Categories: DBA Blogs

Log Buffer #507: A Carnival of the Vanities for DBAs

Mon, 2017-03-20 07:58

This Log Buffer edition covers Oracle, SQL Server and MySQL blog posts with a touch of Cloud.

Oracle:

Provisioning EBS on Oracle Cloud: Latest Enhancements

ORA-54002 when trying to create Virtual Column using REGEXP_REPLACE on Oracle 12cR2

Business rules, common sense and query performance

Problem with V$RECOVERY_AREA_USAGE view and FRA space not being reclaimed

Updated SQLcl: 2 New Commands, Bug Fixes, and 12cR2 Oh My!

SQL Server:

Windowing Functions: Tell me when that changes

SQL Server Bulk Insert Row Terminator Issues

Introducing DLM Techniques for a Multi-Database Multi-Server System

Ding – The World’s Largest Mobile Top-up Network Streamlines QA with SQL Server Containers

Enable Trace Flags in SQL Server

MySQL:

Migrating MySQL database from Amazon RDS to DigitalOcean

Monitoring Databases: A Product Comparison

New collations in MySQL 8.0.0

How to store date, time, and timestamps in UTC time zone with JDBC and Hibernate

MySQL Connector/NET 7.0.7 m6 development has been released

Categories: DBA Blogs

The Side Effects of Drop ‘Unused’ Index

Tue, 2017-03-07 10:54

In this blog post I’ll demonstrate a scenario where dropping a potentially ‘unused’ index can have a negative influence on the optimizer’s cardinality estimation. Having columns with logical correlation and skewed data can cause difficulties for the optimizer when calculating cardinality. This post will also address what we can do to help the optimizer with better cardinality estimates.

The inspiration for this post came from a recent index usage review. One of the requirements was to determine how to confirm which unused indexes qualify to be dropped. I decided to do some tests regarding extended statistics and the effect of potentially dropping an ‘unused’ index, so you can observe the kind of result that may follow from dropping an index which has not been used. It’s important to remember that this does not apply in all cases. Also, even if an index is used, it doesn’t mean that it’s needed.
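For context, a common first step in such a review is plain index usage monitoring; a minimal sketch is shown below (object names are examples), keeping in mind, as this post demonstrates, that “never used in a plan” is not the whole story.

SQL> alter index t1ix monitoring usage;

-- ... let a representative workload run ...

SQL> select index_name, used, monitoring from v$object_usage where index_name = 'T1IX';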

This is more or less linked to columns with skewed data which might have a logical relationship.
Hopefully, it can help you answer some of the following questions:

  1. Is the optimizer using the indexes behind the scenes?
  2. While there are methods to determine if an index has been used in an execution plan, can an index be dropped on this basis only?
  3. If we drop a composite index (constructed from correlated columns), can we do anything to avoid performance degradation?

Before we start with the use case, let’s briefly review some concepts.

The basic formula for selectivity is 1/NDV. The cardinality (CDN) is calculated as selectivity * total number of rows.
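For example, with an equality predicate on a column that has 200 distinct values in a 10,000-row table:

Sel = 1/NDV = 1/200 = 0.005
CDN = Sel * num_rows = 0.005 * 10000 = 50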

The selectivity of a join is defined as the selectivity of the most selective join column adjusted by the proportion of not null values in each join column.

Join Selectivity:
Sel = 1/max[NDV(t1.c1),NDV(t2.c2)] *
           ( (Card t1 - # t1.c1 NULLs) / Card t1) *
           ( (Card t2 - # t2.c2 NULLs) / Card t2)

Join Cardinality:
Card(Pj) = Card(T1) * Card(T2) * Sel(Pj)

In Oracle’s Doc ID 68992.1 you can find a more detailed explanation of the different selectivity calculations based on different predicates. For simplicity, I will use an equality predicate.

This blog post is divided into three sections:

  1. A use case where we demonstrate how dropping an “unused” index can mess up the optimizer's cardinality calculation.
  2. How to help the optimizer achieve better cardinality estimation with extended statistics.
  3. More explanation of column correlation (CorStregth).
Preparation

For the first and second part I’ll be using the following configuration.

drop table t1 purge;
drop table t2 purge;
drop table t3 purge;

create table t1
as
select rownum pk,
    mod(level,  20) n1,
    mod(level, 200) n2,
    mod(level,1000) n3,
    lpad('x',42,'x') c1
from dual
connect by level <= 10000
/

create table t2
as
select rownum pk,
    mod(level,  20) n1,
    mod(level, 200) n2,
    mod(level,1000) n3,
    lpad('x',42,'x') c1
from dual
connect by level <= 10 
/ 

CREATE TABLE t3
AS
SELECT rownum pk, lpad('x',rownum,'x') c1
FROM dual
CONNECT BY level <= 2000
/

exec dbms_stats.gather_table_stats(user,'T1',method_opt=>'for all columns size 1');
exec dbms_stats.gather_table_stats(user,'T2',method_opt=>'for all columns size 1');
exec dbms_stats.gather_table_stats(user,'T3',method_opt=>'for all columns size 1');


alter table t1 add constraint t1pk primary key(pk);
alter table t2 add constraint t2pk primary key(pk);
alter table t3 add constraint t3pk primary key(pk);

create index t1ix on t1(n1,n2);
create index t2ix on t2(n1,n2);

The script creates three tables (T1, T2 and T3) and a composite index on the (N1,N2) columns of both T1 and T2.
The statistics are as follows:

--Column statistics
TABLE_NAME	     COLUMN_NAME	  NUM_DISTINCT	  COL_SEL
-------------------- -------------------- ------------ ----------
T1		     C1 			     1		1
T1		     N1 			    20	      .05
T1		     N2 			   200	     .005
T1		     N3 			  1000	     .001
T2		     C1 			     1		1
T2		     N1 			    10	       .1
T2		     N2 			    10	       .1
T2		     N3 			    10	       .1

--Index statistics
TABLE_NAME	     INDEX_NAME 	    NUM_ROWS DISTINCT_KEYS    IND_SEL
-------------------- -------------------- ---------- ------------- ----------
T1		     T1IX		       10000	       200	 .005
T2		     T2IX			  10		10	   .1

Table T1 has 10000 rows and T2 only 10 rows. I’ve created composite indexes on both T1 and T2 since I am using both (N1,N2) columns in the selection. As you can see (from the data dictionary) we have 200 distinct keys for the T1IX index and 10 for the T2IX index.
Based on the formula we can easily derive selectivity as 1/NDV; e.g. for the T1IX index we have 1/200 = 0.005.

Drop of an “unused” index can mess up optimizer’s cardinality estimation

The query I’m going to use in this case is:

select t1.c1,t2.c1
from t1,t2
where t1.n1 = t2.n1
and t1.n2 = t2.n2
/

Let’s first check the cardinality (CDN) before we make any changes. The column we’re interested in is ROWS; that’s where the cardinality estimate is shown.

Execution Plan
----------------------------------------------------------
Plan hash value: 2959412835

---------------------------------------------------------------------------
| Id  | Operation	   | Name | Rows  | Bytes | Cost (%CPU)| Time	  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	  |   500 | 49500 |    29   (0)| 00:00:01 |
|*  1 |  HASH JOIN	   |	  |   500 | 49500 |    29   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T2   |    10 |   490 |	3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1   | 10000 |   488K|    26   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

Based on the formula provided, you may have expected to see a cardinality (CDN) of 25. Since we don’t have nulls, we can do the calculation as 1/max(NDV(t1.n1),NDV(t2.n1)) * 1/max(NDV(t1.n2),NDV(t2.n2)) = 1/20 * 1/200 = 0.05 * 0.005 = 0.00025. We came up with a selectivity of 0.00025.
Now, if we do the join cardinality calculation: 10 (T2) * 10000 (T1) * selectivity 0.00025 = 100000 * 0.00025 = 25.

But instead of 25, the optimizer calculated cardinality (CDN) of 500. In order to better understand the calculation, let’s review the 10053 event trace file.

  ColGroup (#1, Index) T1IX
    Col#: 1 2    CorStregth: 20.00
  ColGroup (#1, Index) T2IX
    Col#: 1 2    CorStregth: 10.00
Join ColGroups for T1[T1] and T2[T2] : (#1, #1)

What’s interesting here is the Join ColGroups analysis. Since we have a full match on the join with the indexed columns, the optimizer considers Join ColGroups between T1 and T2. It will use the ColGroup statistics to calculate the join cardinality (CDN).

Col#: column position (self-explanatory)
CorStregth: correlation strength. I will come back to correlation strength (CorStregth) later on.

Within the trace file, in the Nested Loop (NL) section (remember, join cardinality is only listed in the NL section), you should see something like:

Join selectivity using 1 ColGroups: 0.005000 (sel1 = 0.000000, sel2 = 0.000000)
Join Card:  500.000000 = outer (10.000000) * inner (10000.000000) * sel (0.005000)
Join Card - Rounded: 500 Computed: 500.000000

Based on the join selectivity formula, the optimizer chooses the T1IX index (ColGroup) statistics for the selectivity calculation, hence 1/NDV (T1IX index) = 1/200 = 0.005.
Afterwards, the join cardinality is calculated as num_rows (T2) * num_rows (T1) * selectivity (0.005), i.e. 10 * 10000 * 0.005 = 100000 * 0.005 = 500.

If you’re not aware of such cases, you may decide to drop an index based solely on the execution plan information. Even worse, developers can easily decide to drop an index without knowing the possible effect.

Now, let’s observe the cardinality estimation after T1IX index is dropped.

SQL> drop index t1ix;

Index dropped.

...

Execution Plan
----------------------------------------------------------
Plan hash value: 2959412835

---------------------------------------------------------------------------
| Id  | Operation	   | Name | Rows  | Bytes | Cost (%CPU)| Time	  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	  |    25 |  2475 |    29   (0)| 00:00:01 |
|*  1 |  HASH JOIN	   |	  |    25 |  2475 |    29   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T2   |    10 |   490 |	3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1   | 10000 |   488K|    26   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

After we dropped T1IX, the cardinality estimate dropped from 500 to 25! This can have a big impact on performance. If we add more tables to the join, the optimizer may re-order the tables, or switch the Hash Join to a Nested Loops or Merge Join.

So, why did the cardinality (CDN) change after the “unused” index was dropped?

Even if the index was not used, it doesn’t mean that its statistics were also not used. That’s the case here: the optimizer used the NDV of the column group created by the T1IX index to calculate the join selectivity.

Let’s check the 10053 event trace file again, this time with the query executed without the T1IX index.

  
ColGroup (#1, Index) T2IX
    Col#: 1 2    CorStregth: 10.00
Join ColGroups for T1[T1] and T2[T2] : Using cdn sanity check

...

ColGroup cardinality sanity check: ndv for  T1[T1] = 4000.000000  T2[T2] = 10.000000
Join selectivity using 1 ColGroups: 2.5000e-04 (sel1 = 0.000000, sel2 = 0.000000)
Join Card:  25.000000 = outer (10.000000) * inner (10000.000000) * sel (2.5000e-04)
Join Card - Rounded: 25 Computed: 25.000000

We don’t have a full match for the Join ColGroups (we’re missing the statistics from the T1IX index). The optimizer decided to perform a cardinality (CDN) sanity check, after which it ended up with a number of distinct values (NDV) of 4000.

How did the optimizer calculate 4000 ?

Since we no longer have the column group, the optimizer did a basic calculation of N1 (sel) * N2 (sel) = 0.00025. Having the (N1,N2) selectivity, we can extract the NDV as 1/0.00025 = 4000.
The representation in the trace file is slightly different: 2.5000e-04 = 2.5 * (1/10000) = 0.00025.

With an incorrect cardinality estimate, the optimizer can make wrong decisions about join methods, as we can see in the following example when we join the T3 table.

--query
select t1.c1,t2.c1,t3.c1
from t1,t2,t3
where t1.n1 = t2.n1
and t1.n2 = t2.n2
and t1.pk = t3.pk
/

Execution Plan
----------------------------------------------------------
Plan hash value: 1184213596

----------------------------------------------------------------------------
| Id  | Operation	    | Name | Rows  | Bytes | Cost (%CPU)| Time	   |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |	   |   500 |   541K|   120   (0)| 00:00:01 |
|*  1 |  HASH JOIN	    |	   |   500 |   541K|   120   (0)| 00:00:01 |
|*  2 |   HASH JOIN	    |	   |   500 | 51500 |	30   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T2   |	10 |   490 |	 3   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| T1   | 10000 |   527K|	27   (0)| 00:00:01 |
|   5 |   TABLE ACCESS FULL | T3   |  2000 |  1964K|	90   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."PK"="T3"."PK")
   2 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

SQL> drop index t1ix; 

Index dropped.

Execution Plan
----------------------------------------------------------

Plan hash value: 142045272

-------------------------------------------------------------------------------------
| Id  | Operation		     | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |	    |	 25 | 27725 |	 35   (0)| 00:00:01 |
|   1 |  NESTED LOOPS		     |	    |	 25 | 27725 |	 35   (0)| 00:00:01 |
|   2 |   NESTED LOOPS		     |	    |	 25 | 27725 |	 35   (0)| 00:00:01 |
|*  3 |    HASH JOIN		     |	    |	 25 |  2575 |	 30   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL	     | T2   |	 10 |	490 |	  3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL	     | T1   | 10000 |	527K|	 27   (0)| 00:00:01 |
|*  6 |    INDEX UNIQUE SCAN	     | T3PK |	  1 |	    |	  0   (0)| 00:00:01 |
|   7 |   TABLE ACCESS BY INDEX ROWID| T3   |	  1 |  1006 |	  1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")
   6 - access("T1"."PK"="T3"."PK")

Another example, with a different data set, where there is a change in the row source used for building the hash table.

--query
SELECT t1.c1, t2.c1
FROM t1,t2,t3
WHERE t1.n1 = t2.n1
AND t1.n2 = t2.n2
AND t1.n1 = t3.pk
/

Execution Plan
----------------------------------------------------------
Plan hash value: 2855555999

-------------------------------------------------------------------------------
| Id  | Operation	       | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |  5000K|  1525M|  2766	(1)| 00:00:01 |
|*  1 |  HASH JOIN	       |      |  5000K|  1525M|  2766	(1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL    | T2   |  1000 |   251K|    14	(0)| 00:00:01 |
|*  3 |   HASH JOIN	       |      |  1000K|    59M|  2739	(1)| 00:00:01 |
|   4 |    INDEX FAST FULL SCAN| T3PK |  2000 |  8000 |     3	(0)| 00:00:01 |
|   5 |    TABLE ACCESS FULL   | T1   |  1000K|    55M|  2734	(1)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")
   3 - access("T1"."N1"="T3"."PK")

SQL> drop index t1ix; 

Index dropped.

...

Execution Plan
----------------------------------------------------------
Plan hash value: 1993272941

------------------------------------------------------------------------------
| Id  | Operation	      | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |	 250K|	  76M|	2754   (1)| 00:00:01 |
|*  1 |  HASH JOIN	      |      |	 250K|	  76M|	2754   (1)| 00:00:01 |
|   2 |   INDEX FAST FULL SCAN| T3PK |	2000 |	8000 |	   3   (0)| 00:00:01 |
|*  3 |   HASH JOIN	      |      |	 250K|	  75M|	2750   (1)| 00:00:01 |
|   4 |    TABLE ACCESS FULL  | T2   |	1000 |	 251K|	  14   (0)| 00:00:01 |
|   5 |    TABLE ACCESS FULL  | T1   |	1000K|	  55M|	2734   (1)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="T3"."PK")
   3 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

This will not apply in all cases, but be careful before deciding to drop an index. It may seem like a good idea initially, yet it can have a big impact through wrong cardinality estimation. That can then lead to incorrect execution plan decisions, poor performance and unhappy customers.

So, what is the solution?
We don’t want huge indexes to be maintained by DML if they are not used, right?

This brings me to the second part, Extended Statistics.

Extended Statistics

Oracle introduced extended statistics with Oracle 11g. Their main purpose is to capture the relationship within a column group. Extended statistics help the optimizer become familiar with the logical relationship between the columns of a group, information that is otherwise unknown to it. Extended statistics are represented as virtual (hidden) columns on which statistics are gathered.

When we had the T1IX index, the optimizer used its NDV to calculate the correct cardinality: the index statistics served as statistics for the column group of the N1 and N2 columns. However, after we dropped the index (even though it was never used), we made a mess of the optimizer's cardinality estimates.

Extended statistics are maintained via the DBMS_STATS package, function CREATE_EXTENDED_STATS. Let’s create extended statistics and look at the cardinality estimation.

SQL> select dbms_stats.create_extended_stats(null,'T1','(N1,N2)') from dual
  2  /

DBMS_STATS.CREATE_EXTENDED_STATS(NULL,'T1','(N1,N2)')
------------------------------------------------------------
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS

SQL> set autotrace trace exp
SQL> 
SQL> select t1.c1,t2.c1
  2  from t1,t2
  3  where t1.n1 = t2.n1
  4  and t1.n2 = t2.n2
  5  /

Execution Plan
----------------------------------------------------------
Plan hash value: 2959412835

---------------------------------------------------------------------------
| Id  | Operation	   | Name | Rows  | Bytes | Cost (%CPU)| Time	  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	  |    25 |  2475 |    29   (0)| 00:00:01 |
|*  1 |  HASH JOIN	   |	  |    25 |  2475 |    29   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T2   |    10 |   490 |	3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1   | 10000 |   488K|    26   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

I know, you’re thinking “it didn’t help” and you’re right. The optimizer did not use the column group (extension). If you look a little deeper, into USER_TAB_COL_STATISTICS, you will see that the virtual column was not created; only the extension definition was created.

SQL> SELECT column_name, num_distinct, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'T1'
/

COLUMN_NAME				 NUM_DISTINCT HISTOGRAM
---------------------------------------- ------------ ---------------
N1						   20 NONE
N2						  200 NONE
N3						 1000 NONE
C1						    1 NONE

SQL> col extension_name for a40
SQL> SELECT extension_name, extension
FROM   user_stat_extensions
WHERE  table_name = 'T1'
/

EXTENSION_NAME				 EXTENSION
---------------------------------------- --------------------
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS		 ("N1","N2")

This behavior is somewhat expected, because DBMS_STATS.CREATE_EXTENDED_STATS will only create the extension; it won’t gather statistics for it. If you check the 10053 event trace file you will see NO STATISTICS (using defaults).

Column (#5): SYS_STUBZH0IHA7K$KEBJVXO5LOHAS(NUMBER)  NO STATISTICS (using defaults)

In order to fix this, we need to re-gather statistics on the table.

SQL> exec dbms_stats.gather_table_stats(null,'T1',method_opt=>'for all columns size 1');

PL/SQL procedure successfully completed.

SQL> SELECT column_name, num_distinct, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'T1'
/

COLUMN_NAME				 NUM_DISTINCT HISTOGRAM
---------------------------------------- ------------ ---------------
N1						   20 NONE
N2						  200 NONE
N3						 1000 NONE
C1						    1 NONE
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS			  200 NONE


SQL> select t1.c1,t2.c1
from t1,t2
where t1.n1 = t2.n1
and t1.n2 = t2.n2
/

Execution Plan
----------------------------------------------------------
Plan hash value: 2959412835

---------------------------------------------------------------------------
| Id  | Operation	   | Name | Rows  | Bytes | Cost (%CPU)| Time	  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	  |   500 | 49500 |    29   (0)| 00:00:01 |
|*  1 |  HASH JOIN	   |	  |   500 | 49500 |    29   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T2   |    10 |   490 |	3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1   | 10000 |   488K|    26   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

Now the optimizer calculated the correct cardinality. If you check the trace file more closely, you should see the same join cardinality information as when we had the index in place; the only difference is that this time it uses the extension statistics.

...
  ColGroup (#1, VC) SYS_STUBZH0IHA7K$KEBJVXO5LOHAS
    Col#: 1 2    CorStregth: 20.00
  ColGroup (#1, Index) T2IX
    Col#: 1 2    CorStregth: 10.00

...

Join selectivity using 1 ColGroups: 0.005000 (sel1 = 0.000000, sel2 = 0.000000)
Join Card:  500.000000 = outer (10.000000) * inner (10000.000000) * sel (0.005000)
Join Card - Rounded: 500 Computed: 500.000000

With extended statistics we were able to help the optimizer calculate a cardinality (CDN) of 500. It used the extension NDV in the same way it used the T1IX index statistics.

However, what if T1 were a big table and you had already gathered statistics before creating the extension (column group)? Re-gathering can be time-consuming and have a negative impact on performance. To answer this question, we can use the METHOD_OPT approach of DBMS_STATS.GATHER_TABLE_STATS.
Creating the extension via METHOD_OPT will automatically gather statistics on the column group.

SQL>
SQL> SELECT column_name, num_distinct, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'T1'
/  

COLUMN_NAME	     NUM_DISTINCT HISTOGRAM
-------------------- ------------ ---------------
C1				1 NONE
N3			     1000 NONE
N2			      200 NONE
N1			       20 NONE

SQL> exec dbms_stats.gather_table_stats(null,'T1',method_opt=>'for all columns size 1 for columns size 1 (N1,N2)');

PL/SQL procedure successfully completed.

SQL> col column_name for a40
SQL> SELECT column_name, num_distinct, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'T1'
/

COLUMN_NAME				 NUM_DISTINCT HISTOGRAM
---------------------------------------- ------------ ---------------
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS			  200 NONE
C1						    1 NONE
N3						 1000 NONE
N2						  200 NONE
N1						   20 NONE

In this case the extension was automatically created and the statistics were gathered at the same time.

If you find yourself in a situation where you used DBMS_STATS.CREATE_EXTENDED_STATS after gathering statistics, there are multiple ways to gather statistics on already existing extensions (a sketch of two of them follows the list):

1) DBMS_STATS.GATHER_TABLE_STATS
(it will include all virtual columns)

2) METHOD_OPT=>’for columns size … (column group)’
(you can list the actual column group as you do for CREATE_EXTENDED_STATS)

3) METHOD_OPT=>’for columns size … extension_name’
(you can specify the extension name obtained from DBA|ALL|USER_STAT_EXTENSIONS dictionary view)

4) METHOD_OPT=>’for all hidden columns’
(hidden columns are by default included in default DBMS_STATS.GATHER_TABLE_STATS configuration)
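For instance, options 3 and 4 could look like this; the extension name is the one generated earlier in this post, and you should always take yours from USER_STAT_EXTENSIONS.

-- Option 3: reference the extension by its generated name
SQL> exec dbms_stats.gather_table_stats(null,'T1',method_opt=>'for all columns size 1 for columns size 1 SYS_STUBZH0IHA7K$KEBJVXO5LOHAS');

-- Option 4: include all hidden (virtual) columns
SQL> exec dbms_stats.gather_table_stats(null,'T1',method_opt=>'for all hidden columns size 1');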

What’s the difference between METHOD_OPT and DBMS_STATS.CREATE_EXTENDED_STATS?

I believe it behaves this way because METHOD_OPT is part of a procedure that is designed for statistics gathering. Extended statistics (column groups) were introduced with 11g as additional functionality for METHOD_OPT, whose main job is to gather statistics; hence gathering statistics via METHOD_OPT creates the extension and gathers its statistics at the same time.
Perhaps Oracle could have used different naming, something like CREATE_COLUMN_GROUP instead of CREATE_EXTENDED_STATS (which is misleading).

Column Correlation (CorStregth)

Column correlation represents how strongly the columns within a given group are related to each other. It’s calculated as the product of the NDVs of all columns in the group divided by the NDV of the column group. This is clearly explained in MultiColumn/Column Group Statistics – Additional Examples (Doc ID 872406.1).

Product of individual NDVs / NDV for the column group.
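Using the numbers from the earlier test case as a quick check: for the (N1,N2) group the individual NDVs are 20 and 200 and the column group NDV is 200, so

CorStregth = (20 * 200) / 200 = 20

which matches the CorStregth: 20.00 reported for T1IX in the trace file above.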

If the product of all the column NDVs is smaller than the column group NDV, the optimizer will ignore the CG (extended) stats. Based on the tests I performed, CorStregth is taken into consideration only when the optimizer has to decide between multiple column groups for which a Full Match can be constructed. In the other cases, when a Partial Match is used, the optimizer picked the column group with the smaller NDV.

Here are two different cases; one where Full Match is constructed and another for Partial Match:

Column Group – Full Match

Let’s examine the first case where we can construct Full Match.

COLUMN_NAME		NUM_DISTINCT
---------------------------------------- ------------
N1						   20
N2						  200
N3						 1000
C1						    1
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS			  200 ---> (N1,N2)
SYS_STUXQEJ8WOFTK5R5EYM4DF_60V			 1000 ---> (N2,N3)

select *
from t1
where t1.n1 = 10
and t1.n2 = 40
and t1.n3 = 80
/

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 |     5 |   270 |   228   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1	 |     5 |   270 |   228   (1)| 00:00:01 |
--------------------------------------------------------------------------

Trace file
...
Access path analysis for T1
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T1[T1]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Column (#1): N1(NUMBER)
    AvgLen: 3 NDV: 20 Nulls: 0 Density: 0.050000 Min: 0.000000 Max: 19.000000
  Column (#2): N2(NUMBER)
    AvgLen: 4 NDV: 200 Nulls: 0 Density: 0.005000 Min: 0.000000 Max: 199.000000
  Column (#3): N3(NUMBER)
    AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 0.000000 Max: 999.000000
  Column (#6): SYS_STUXQEJ8WOFTK5R5EYM4DF_60V(NUMBER)
    AvgLen: 12 NDV: 1000 Nulls: 0 Density: 0.001000
  Column (#5): SYS_STUBZH0IHA7K$KEBJVXO5LOHAS(NUMBER)
    AvgLen: 12 NDV: 200 Nulls: 0 Density: 0.005000
  ColGroup (#1, VC) SYS_STUXQEJ8WOFTK5R5EYM4DF_60V
    Col#: 2 3    CorStregth: 200.00
  ColGroup (#2, VC) SYS_STUBZH0IHA7K$KEBJVXO5LOHAS
    Col#: 1 2    CorStregth: 20.00
  ColGroup Usage:: PredCnt: 3  Matches Full: #1  Partial:  Sel: 0.001000

We have the T1 table with two column groups, on (N1,N2) and (N2,N3).
In this case the product of the individual N1, N2, N3 NDVs is much higher than the column group NDVs, hence the optimizer will calculate the cardinality (CDN) based on the extensions. With a selection on (N1,N2,N3), the optimizer can perform a Full Match for both column groups. In this scenario, the cardinality can be expressed as:

1) Using CG #1 (N2,N3): 1/CG(NDV) * 1/N1(NDV) = 1/1000 * 1/20 = 0.001 * 0.05 = 0.00005 * 100000 = 5

2) Using CG #2 (N1,N2): 1/CG(NDV) * 1/N3(NDV) = 1/200 * 1/1000 = 0.005 * 0.0001 = 0.000005 * 100000 = 0.5 (rounded to 1)

The case above shows the optimizer deciding to use CG (N2,N3), i.e. SYS_STUXQEJ8WOFTK5R5EYM4DF_60V, because it has the higher CorStregth of 200 (vs. 20 for CG (N1,N2)).
To show how CorStregth can influence the optimizer, I’ve changed the N1 distinct value count to 210 to increase the CorStregth of CG (N1,N2).
This time, column group (N1,N2) was used in the cardinality estimates.

exec dbms_stats.set_column_stats(null,'T1','N1',distcnt=>210);

Execution Plan
----------------------------------------------------------

Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation	   | Name | Rows  | Bytes | Cost (%CPU)| Time	  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	  |	1 |    11 |   228   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE    |	  |	1 |    11 |	       |	  |
|*  2 |   TABLE ACCESS FULL| T1   |	1 |    11 |   228   (1)| 00:00:01 |
---------------------------------------------------------------------------



--Trace file
...
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T1[T1]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Column (#1): N1(NUMBER)
    AvgLen: 3 NDV: 210 Nulls: 0 Density: 0.004762 Min: 0.000000 Max: 19.000000
  Column (#2): N2(NUMBER)
    AvgLen: 4 NDV: 200 Nulls: 0 Density: 0.005000 Min: 0.000000 Max: 199.000000
  Column (#3): N3(NUMBER)
    AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 0.000000 Max: 999.000000
  Column (#6): SYS_STUXQEJ8WOFTK5R5EYM4DF_60V(NUMBER)
    AvgLen: 12 NDV: 1000 Nulls: 0 Density: 0.001000
  Column (#5): SYS_STUBZH0IHA7K$KEBJVXO5LOHAS(NUMBER)
    AvgLen: 12 NDV: 200 Nulls: 0 Density: 0.005000
  ColGroup (#2, VC) SYS_STUBZH0IHA7K$KEBJVXO5LOHAS
    Col#: 1 2    CorStregth: 210.00
  ColGroup (#1, VC) SYS_STUXQEJ8WOFTK5R5EYM4DF_60V
    Col#: 2 3    CorStregth: 200.00
  ColGroup Usage:: PredCnt: 3  Matches Full: #2  Partial:  Sel: 0.005000

We can see that the optimizer chose CG #2 based on CorStregth.

Column Group – Partial Match

This second case shows what happens when a Full Match can’t be constructed and the optimizer chooses the column group with the smaller NDV. I’ve built the same T1 table, but now with column groups on (N1,N2,N3) and (N1,N2,C1).

COLUMN_NAME		 NUM_DISTINCT
---------------------------------------- ------------
N1						   20
N2						  200
N3						 1000
C1						    1
SYS_STU2NGW2O#$ZX2CDZYOMNMFP64			  200
SYS_STUOYQUEIAZ7FI9DV53VLN$$$0			 1000


--Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 |   500 | 27000 |   228   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1	 |   500 | 27000 |   228   (1)| 00:00:01 |
--------------------------------------------------------------------------



--Trace file
...
Access path analysis for T1
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T1[T1]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Column (#1): N1(NUMBER)
    AvgLen: 3 NDV: 20 Nulls: 0 Density: 0.050000 Min: 0.000000 Max: 19.000000
  Column (#2): N2(NUMBER)
    AvgLen: 4 NDV: 200 Nulls: 0 Density: 0.005000 Min: 0.000000 Max: 199.000000
  Column (#6): SYS_STUOYQUEIAZ7FI9DV53VLN$$$0(NUMBER)
    AvgLen: 12 NDV: 1000 Nulls: 0 Density: 0.001000
  Column (#5): SYS_STU2NGW2O#$ZX2CDZYOMNMFP64(NUMBER)
    AvgLen: 12 NDV: 200 Nulls: 0 Density: 0.005000
  Column (#3): N3(NUMBER)
    AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 0.000000 Max: 999.000000
  Column (#4): C1(VARCHAR2)
    AvgLen: 43 NDV: 1 Nulls: 0 Density: 1.000000
  ColGroup (#1, VC) SYS_STUOYQUEIAZ7FI9DV53VLN$$$0
    Col#: 1 2 3    CorStregth: 4000.00
  ColGroup (#2, VC) SYS_STU2NGW2O#$ZX2CDZYOMNMFP64
    Col#: 1 2 4    CorStregth: 20.00
  ColGroup Usage:: PredCnt: 2  Matches Full:  Partial: #2 (1 2 )  Sel: 0.005000
  Table: T1  Alias: T1

Even though CG #1 has a much higher CorStregth of 4000 (vs. 20), the optimizer used CG #2. Let’s decrease CG #1’s NDV to 100 to see whether it will influence the optimizer to use CG #1.

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 |  1000 | 50000 |   228   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1	 |  1000 | 50000 |   228   (1)| 00:00:01 |
--------------------------------------------------------------------------



--Trace file
...
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T1[T1]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Column (#1): N1(NUMBER)
    AvgLen: 3 NDV: 20 Nulls: 0 Density: 0.050000 Min: 0.000000 Max: 19.000000
  Column (#2): N2(NUMBER)
    AvgLen: 4 NDV: 200 Nulls: 0 Density: 0.005000 Min: 0.000000 Max: 199.000000
  Column (#6): SYS_STUOYQUEIAZ7FI9DV53VLN$$$0(NUMBER)
    AvgLen: 12 NDV: 100 Nulls: 0 Density: 0.010000
  Column (#5): SYS_STU2NGW2O#$ZX2CDZYOMNMFP64(NUMBER)
    AvgLen: 12 NDV: 200 Nulls: 0 Density: 0.005000
  Column (#3): N3(NUMBER)
    AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 0.000000 Max: 999.000000
  Column (#4): C1(VARCHAR2)
    AvgLen: 43 NDV: 1 Nulls: 0 Density: 1.000000
  ColGroup (#1, VC) SYS_STUOYQUEIAZ7FI9DV53VLN$$$0
    Col#: 1 2 3    CorStregth: 40000.00
  ColGroup (#2, VC) SYS_STU2NGW2O#$ZX2CDZYOMNMFP64
    Col#: 1 2 4    CorStregth: 20.00
  ColGroup Usage:: PredCnt: 2  Matches Full:  Partial: #1 (1 2 ) Sel: 0.010000
  Table: T1  Alias: T1

Since CG #1 now has the smaller NDV (100, vs. 200 for #2), the optimizer calculated the cardinality (CDN) based on CG #1 as:
1/NDV (CG) * original cardinality = 1/100 * 100000 = 0.01 * 100000 = 1000

Summary

There are situations where dropping a potentially ‘unused’ index can have a negative influence on the optimizer’s cardinality estimation. Consider using extended statistics to help the optimizer become more familiar with the logical relationship between columns. Extended statistics are represented as virtual (hidden) columns. Regarding statistics gathering, note that DBMS_STATS.CREATE_EXTENDED_STATS won’t gather statistics automatically; use METHOD_OPT instead. If you have skewed data, you may want to consider creating proper histograms; when histograms exist on the base columns, you’ll also need to create a histogram on the column group for it to be taken into consideration, because Oracle gives priority to histograms. You can use column group extended statistics to preserve cardinality stability after dropping an index. While this might not be appropriate for all cases, it’s a viable option for helping the optimizer arrive at good cardinality estimates.

Lastly, you can use the DBMS_STATS.SEED_COL_USAGE procedure to monitor column relationships in filter predicates, join predicates and group by clauses. A report can then be generated with DBMS_STATS.REPORT_COL_USAGE, and its information can be used to identify appropriate column groups. I look forward to exploring that further in a future post.
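As a quick preview, a minimal sketch of that workflow; the 300-second monitoring window is an arbitrary example.

-- Record column usage for the next 300 seconds (no SQL tuning set, current workload)
SQL> exec dbms_stats.seed_col_usage(null, null, 300);

-- ... run the representative workload ...

-- Report the recorded column usage for T1
SQL> set long 100000
SQL> select dbms_stats.report_col_usage(user, 'T1') from dual;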

Categories: DBA Blogs

Reading Execution Plans for Parallel DML

Thu, 2017-02-23 20:36

Parallel execution in Oracle seems to be a topic of myths, mysteries and misunderstandings. In this post I hope to clear up at least one of them by diving into a discussion about parallel DML. Actually, I want to show you two things that can go wrong with it.

My demo setup and case are pretty simple, I just create two copies of dba_objects:

SQL> create table a as select * from dba_objects;

Table created.

SQL> create table b as select * from dba_objects;

Table created.

Now let’s delete from one of the tables with a subselect to the other one, and let’s make it parallel. When, if and how it makes sense to actually use parallel DML for a speedup is not the point of this post. In fact, this is a poor example as the overhead of extra communication and spinning up the processes just made the actual delete run slower.

SQL> delete /*+ PARALLEL(a 42) */ from a where object_id in (select object_id from b);

91277 rows deleted.

SQL> rollback;

And at first glance the execution plan for it seems to confirm that the operation was performed in parallel:

PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID	1q2m84vq594pj, child number 0
-------------------------------------
delete /*+ PARALLEL(a 42) */ from a where object_id in (select
object_id from b)

Plan hash value: 1475376193

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation	 	   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT	   |	      |       |       |   855 (100)|	      |        |      | 	   |
|   1 |  DELETE 		   | A	      |       |       | 	   |	      |        |      | 	   |
|   2 |   PX COORDINATOR	   |	      |       |       | 	   |	      |        |      | 	   |
|   3 |    PX SEND QC (RANDOM)	   | :TQ10002 | 91277 |   891K|   855	(1)| 00:00:01 |  Q1,02 | P->S | QC (RAND)  |
|*  4 |     HASH JOIN SEMI BUFFERED|	      | 91277 |   891K|   855	(1)| 00:00:01 |  Q1,02 | PCWP | 	   |
|   5 |      PX RECEIVE 	   |	      | 91277 |   445K|   427	(1)| 00:00:01 |  Q1,02 | PCWP | 	   |
|   6 |       PX SEND HASH	   | :TQ10000 | 91277 |   445K|   427	(1)| 00:00:01 |  Q1,00 | P->P | HASH	  |
|   7 |        PX BLOCK ITERATOR   |	      | 91277 |   445K|   427	(1)| 00:00:01 |  Q1,00 | PCWC | 	   |
|*  8 | 	TABLE ACCESS FULL  | A	      | 91277 |   445K|   427	(1)| 00:00:01 |  Q1,00 | PCWP | 	   |
|   9 |      PX RECEIVE 	   |	      | 91278 |   445K|   427	(1)| 00:00:01 |  Q1,02 | PCWP | 	   |
|  10 |       PX SEND HASH	   | :TQ10001 | 91278 |   445K|   427	(1)| 00:00:01 |  Q1,01 | S->P | HASH	  |
|  11 |        PX SELECTOR	   |	      |       |       | 	   |	      |  Q1,01 | SCWC | 	   |
|  12 | 	TABLE ACCESS FULL  | B	      | 91278 |   445K|   427	(1)| 00:00:01 |  Q1,01 | SCWP | 	   |
--------------------------------------------------------------------------------------------------------------------

But a closer look reveals that the actual delete we wanted to run in parallel is performed only after the parallel part of the plan is done with the query: the DELETE sits above and to the left of the PX COORDINATOR. Also, because we only specified parallelism for table A, the scan of table B is performed serially. The results are then distributed to parallel workers and hash joined against the parallel query of A. Those results are buffered and returned to the coordinator, which then serially performs the actual deletes. Sounds silly? It is. And the second issue is that even though the hint gets ignored for the DELETE part, it is still applied to the query and join, forcing a half-parallel operation.

Enable parallel DML mode

The issue to be aware of with parallel DML in Oracle is that it is disabled by default and you have to explicitly enable it in the session before running parallel DML operations.

And at least in 12c, DBMS_XPLAN will also warn you about not using parallel DML in its output:

Note
-----
   - Degree of Parallelism is 42 because of hint
   - PDML is disabled in current session

So let’s try the same thing again, but enable parallel DML this time:

SQL> alter session enable parallel dml;

Session altered.

SQL> delete /*+ PARALLEL(a 42) */ from a where object_id in (select object_id from b);

91277 rows deleted.

SQL> rollback;

PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID	1q2m84vq594pj, child number 1
-------------------------------------
delete /*+ PARALLEL(a 42) */ from a where object_id in (select
object_id from b)

Plan hash value: 2691386263

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation			 | Name     | Rows  | Bytes | Cost (%CPU)| Time     |	 TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT		 |	    |	    |	    |	438 (100)|	    |	    |	    |		 |
|   1 |  PX COORDINATOR 		 |	    |	    |	    |		 |	    |	    |	    |		 |
|   2 |   PX SEND QC (RANDOM)		 | :TQ10003 | 91277 |	891K|	438   (1)| 00:00:01 |  Q1,03 | P->S | QC (RAND)  |
|   3 |    DELETE			 | A	    |	    |	    |		 |	    |  Q1,03 | PCWP |		 |
|   4 |     PX RECEIVE			 |	    | 91277 |	891K|	438   (1)| 00:00:01 |  Q1,03 | PCWP |		 |
|   5 |      PX SEND HASH (BLOCK ADDRESS)| :TQ10002 | 91277 |	891K|	438   (1)| 00:00:01 |  Q1,02 | P->P | HASH (BLOCK|
|*  6 |       HASH JOIN SEMI BUFFERED	 |	    | 91277 |	891K|	438   (1)| 00:00:01 |  Q1,02 | PCWP |		 |
|   7 |        PX RECEIVE		 |	    | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,02 | PCWP |		 |
|   8 | 	PX SEND HASH		 | :TQ10000 | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,00 | P->P | HASH	 |
|   9 | 	 PX BLOCK ITERATOR	 |	    | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,00 | PCWC |		 |
|* 10 | 	  TABLE ACCESS FULL	 | A	    | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,00 | PCWP |		 |
|  11 |        PX RECEIVE		 |	    | 91278 |	445K|	427   (1)| 00:00:01 |  Q1,02 | PCWP |		 |
|  12 | 	PX SEND HASH		 | :TQ10001 | 91278 |	445K|	427   (1)| 00:00:01 |  Q1,01 | S->P | HASH	 |
|  13 | 	 PX SELECTOR		 |	    |	    |	    |		 |	    |  Q1,01 | SCWC |		 |
|  14 | 	  TABLE ACCESS FULL	 | B	    | 91278 |	445K|	427   (1)| 00:00:01 |  Q1,01 | SCWP |		 |
--------------------------------------------------------------------------------------------------------------------------

Now we see the DELETE operation below the PX COORDINATOR, which means it gets executed in parallel by the parallel workers. B is still read serially because we only specified table A in the hint. Let me just add this 100% parallel plan for the sake of completeness…

PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID	02w21a0rgz02m, child number 1
-------------------------------------
delete /*+ PARALLEL(42) */ from a where object_id in (select object_id
from b)

Plan hash value: 149866034

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation			 | Name     | Rows  | Bytes | Cost (%CPU)| Time     |	 TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT		 |	    |	    |	    |	 23 (100)|	    |	     |	    |		 |
|   1 |  PX COORDINATOR 		 |	    |	    |	    |		 |	    |	     |	    |		 |
|   2 |   PX SEND QC (RANDOM)		 | :TQ10003 | 91277 |	891K|	 23   (0)| 00:00:01 |  Q1,03 | P->S | QC (RAND)  |
|   3 |    DELETE			 | A	    |	    |	    |		 |	    |  Q1,03 | PCWP |		 |
|   4 |     PX RECEIVE			 |	    | 91277 |	891K|	 23   (0)| 00:00:01 |  Q1,03 | PCWP |		 |
|   5 |      PX SEND HASH (BLOCK ADDRESS)| :TQ10002 | 91277 |	891K|	 23   (0)| 00:00:01 |  Q1,02 | P->P | HASH (BLOCK|
|*  6 |       HASH JOIN SEMI BUFFERED	 |	    | 91277 |	891K|	 23   (0)| 00:00:01 |  Q1,02 | PCWP |		 |
|   7 |        PX RECEIVE		 |	    | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,02 | PCWP |		 |
|   8 | 	PX SEND HASH		 | :TQ10000 | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,00 | P->P | HASH	 |
|   9 | 	 PX BLOCK ITERATOR	 |	    | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,00 | PCWC |		 |
|* 10 | 	  TABLE ACCESS FULL	 | A	    | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,00 | PCWP |		 |
|  11 |        PX RECEIVE		 |	    | 91278 |	445K|	 11   (0)| 00:00:01 |  Q1,02 | PCWP |		 |
|  12 | 	PX SEND HASH		 | :TQ10001 | 91278 |	445K|	 11   (0)| 00:00:01 |  Q1,01 | P->P | HASH	 |
|  13 | 	 PX BLOCK ITERATOR	 |	    | 91278 |	445K|	 11   (0)| 00:00:01 |  Q1,01 | PCWC |		 |
|* 14 | 	  TABLE ACCESS FULL	 | B	    | 91278 |	445K|	 11   (0)| 00:00:01 |  Q1,01 | PCWP |		 |
--------------------------------------------------------------------------------------------------------------------------
Conclusion

Keep in mind that parallel DML has to be explicitly enabled at the session level. And when reviewing execution plans for parallel DML, make sure the update, insert or delete operation is below the PX COORDINATOR, or else the work will be serialized back to the coordinator before the actual operation is performed.
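
As a quick sanity check (not part of the original example), V$SESSION shows whether parallel DML is currently enabled for your session; PDML_STATUS switches to ENABLED after the alter session command shown above:

-- Check the parallel DML / DDL / query status of the current session
select pdml_status, pddl_status, pq_status
from   v$session
where  sid = sys_context('userenv', 'sid');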

Categories: DBA Blogs

Oracle New Public Cloud Licensing Policy – Good or Bad?

Wed, 2017-02-22 11:05

A little while ago after a question from a customer about supporting Oracle products on the Google Cloud Platform (GCP) I decided to take a look for any updates to the Oracle public cloud support policies. The document can be easily found on the Oracle website. I quickly noticed some significant changes in the new version of the document. More specifically, I’m referring to the changes that came with the latest version of that document dated January 23, 2017.

In this blog I will attempt to summarize the findings and my thoughts about the matter. Before proceeding any further, let’s begin with a safe harbor notice (as Oracle does) and mention that I do not work for Oracle and I am not a licensing expert. Some of my observations may be incorrect and everyone reading this post is strongly encouraged to make a fully informed decision only after consultation with an Oracle sales representative. After reading the Oracle licensing policy, you should bear in mind that the document provides only guidelines and it is published for “education purposes only”.

So, what do we have in the new edition? The document shares details about the Oracle licensing policy on public cloud environments, including the AWS EC2, AWS RDS and Microsoft Azure platforms. Alas, there is still no mention of Google Cloud Platform (GCP). That leaves GCP in uncharted territory: even though the policy doesn’t explicitly prohibit you from moving your Oracle products to GCP, it makes it difficult to estimate the impact and cost.

The first paragraph has a link to a list of all Oracle products to which the policy applies. Notably, the document explicitly lists almost all Oracle Database Enterprise Edition options and packs except Oracle RAC and Multitenant. While the absence of Oracle RAC may have some technical justification, the exclusion of the Multitenant option doesn’t make much sense to me.

The next paragraph reveals a lot of changes. The new version of the document officially recognizes an AWS vCPU as a thread, not as a core. Prior to January 23, 2017, we used to have an AWS-published document showing vCores by instance type for licensing calculations, and it was widely used even though it was never officially provided by Oracle. People took the number of cores from that document and applied the Oracle multi-core factor table on top of the core count. There was never a similar document for Azure, so a vCPU was treated as a vCore and the same multi-core factor calculation was applied. The new version of the Oracle document now states explicitly that two vCPUs on AWS have the same licensing cost as one vCPU on Azure.

It’s explicitly stated in the document that either two vCPUs from AWS or one vCPU from Azure are equivalent to one Oracle CPU license. Another statement confirms that, from now on, the Oracle multi-core factor no longer applies to the mentioned public cloud environments. This can have a serious impact on people migrating or planning to migrate to AWS or Azure under the Bring Your Own License (BYOL) policy. They may now find themselves in a difficult position: either plan the migration to a smaller environment or accept an increase in licensing cost. In this case, it’s important to keep in mind that the size of an AWS EC2 instance may have a direct impact not only on CPU power but also on the maximum available I/O on the storage layer.

Additionally, there is now a section containing formally defined rules for Oracle Database Standard Edition and Standard Edition 2. According to the paper, we count every four vCPUs on AWS or two vCPUs on Azure as one socket. This means that you cannot have more than 8 AWS vCPUs for a Standard Edition 2 (SE2) license. Standard Edition (SE) allows you to have a 16 vCPU machine, and hence still provides more options when sizing EC2 or RDS. Unfortunately, SE is only available up to version 12.1.0.1 and support for that release is coming to an end. So what can still be used for an Oracle SE2 on AWS? We can get one of the *.2xlarge instances on EC2 or pick a similarly sized RDS instance. Is that going to be big enough? That depends on your workload profile, but again, keep in mind the IO limitations per instance type: it is going to be a maximum of 8000 IOPS per instance. Not a small number, but you will need to measure and identify your IO requirements before going to the cloud.

On one hand, the new policies are way more clear and direct than they used to be and I believe that the clarification is good. It is always easier to plan your implementations and budget when you are confident of what to expect. On the other hand, it looks like we have to pay twice as much in licensing fees when moving to AWS or Azure when compared with any bare metal or OVM environment on premises. Will it make Oracle products more attractive for customers? I have some doubts that it will. Will it make the Oracle Cloud a more interesting target platform for cloud migrations? Possibly. That is likely the main goal of Oracle but we’ve yet to see if it works out for them as expected. I liked it when Oracle made Oracle Express Edition (XE) available for everyone for free and when Oracle Standard Edition came with the RAC option at no additional costs. While I don’t have any official numbers, I think that the Express Edition and RAC included with SE turned many customers onto Oracle products. However, I’m afraid that the new licensing policy for cloud may do the opposite and turn some people away from Oracle and consequently play out really badly for Oracle in the long term.

Categories: DBA Blogs

Log Buffer #505: A Carnival of the Vanities for DBAs

Tue, 2017-02-21 15:16

This Log Buffer Edition searches through various blogs of Oracle, SQL Server and MySQL and picks a few contemporary ones.

Oracle:

Comma separated search and search with check-boxes in Oracle APEX

Once you have defined your users for your Express Cloud Service, all users with the role of Database Developer or higher can access the database Service Console.

Big Data Lite 4.7.0 is now available on OTN!

Install and configure Oracle HTTP Server Standalone

Can I Customize EBS on Oracle Cloud?

SQL Server:

vCenter Server fails to start, Purge and Shrink Vcenter SQL Database

Introducing a DevOps Workgroup to Farm Credit Services of America

Scaling out SSRS on SQL Server Standard Edition

Using AT TIME ZONE to fix an old report

How to import data to Azure SQL Data Warehouse using SSIS

MySQL:

MySQL Bug 72804 Workaround: “BINLOG statement can no longer be used to apply query events”

Sysadmin 101: Troubleshooting

Making MaxScale Dynamic: New in MariaDB MaxScale 2.1

Improving the Performance of MariaDB MaxScale

Group Replication: Shipped Too Early

Categories: DBA Blogs

Backup Oracle Databases to AWS S3

Tue, 2017-02-21 10:17

There are different options for backing up Oracle databases to the cloud, but using the Oracle Secure Backup module to take backups into AWS S3 is one of the most efficient methods in terms of cost and backup/restore performance.

In this post I will show you how to install, configure and use Oracle Secure Backup to take your Oracle database backups to AWS S3. This method can be used for Oracle database version 9.2 or higher.

In this example, database version is 12c and platform is Linux x86_64.

The Oracle Secure Backup module must be installed into the database Oracle Home. Using the installed libraries, you can then take backups via RMAN into AWS S3 the same way you back up to sbt_tape.

Requirements:

1- An AWS account and an IAM user with access to S3:

For setting up backups to AWS you will need an AWS account and an IAM user with full access to AWS S3. During setup, the Access Key and Secret Access Key of this IAM user will be used. There is no need to have access to the AWS Console.

You can use the AWS Free Tier for test purposes.

2- Oracle Secure Backup module for AWS:
You can download Oracle Secure Backup module for AWS from here

3- OTN account:
During installation you need to provide an OTN account.

4- Java 1.7 or higher:
Java 1.7 or higher must be installed on your server before you can proceed.

Installation:

1- Create Oracle Wallet Directory:

If the Oracle Wallet directory does not exist, create one. This folder will be used to store the AWS Access Key and Secret Access Key.
Create this directory in $ORACLE_HOME/dbs/:


   $ cd $ORACLE_HOME/dbs/
   $ mkdir osbws_wallet

2- Download osbws_installer.zip from the link provided above and put it in your installation folder (in this example /mnt/stage/osb), then unzip the compressed file; it extracts two files as shown below:


   $ pwd
   /mnt/stage/osb
   $ unzip osbws_installer.zip
   Archive:  osbws_installer.zip
     inflating: osbws_install.jar
     inflating: osbws_readme.txt
   $ ls
   osbws_installer.zip  osbws_install.jar  osbws_readme.txt

3- Install OSB Cloud Module for Amazon S3 into your Oracle Home:


   $ cd /mnt/stage/osb
   $ java -jar osbws_install.jar -AWSID XxXxX -AWSKey XxXxX -walletDir $ORACLE_HOME/osbws_wallet -libDir $ORACLE_HOME/lib -location ap-southeast-2 -awsEndPoint  s3-ap-southeast-2.amazonaws.com  -otnUser bakhshandeh@pythian.com -otnPass

Parameters that you will need to set for installation are as below:


  -AWSID:       AWS Access Key

  -AWSKey:      AWS Secret Access Key

  -walletDir:   Location where Backup Module will store AWS keys

  -libDir:      Location where Backup Module libraries will be installed

  -location:    This is the AWS S3 location where you want to put your backups.
                The value for this parameter must be a valid region from the Amazon regions list.
                In this example "ap-southeast-2", the region for "Asia Pacific (Sydney)", has been used.

  -awsEndPoint: This should be a valid endpoint of the AWS region specified by the "location" parameter.
                In this example "s3-ap-southeast-2.amazonaws.com" has been used, which is one of the endpoints in "Asia Pacific (Sydney)".

  -otnUser:     OTN Account

  -otnPass:     OTN Password

In my example I did not pass any value for the -otnPass parameter; this was the only workaround I found for the error noted below during my tests:


   Downloading Oracle Secure Backup Web Service Software Library from file osbws_linux64.zip.
   Error: Library download failed. Please check your network connection to Oracle Public Cloud.

When I encountered this error I could only fix the issue by passing no value for -otnPass, but passing the password might work for you.

Running Backup using RMAN:

The installation will create a file in $ORACLE_HOME/dbs which is usually named osbws<SID>.ora, and you need to use the full path of this file in your allocate channel command in RMAN.

In my example SID is KAMRAN


   $ cd $ORACLE_HOME/dbs
   $ pwd
   /apps/oracle/product/12.1.0.2/db_1/dbs
   $ ls -al osb*.ora
   -rw-r--r-- 1 oracle oinstall 194 Jan  5 11:31 osbwsKAMRAN.ora
   $

Content of this file is as below:


   $ cat osbwsKAMRAN.ora
   OSB_WS_HOST=http://s3-ap-southeast-2.amazonaws.com
   OSB_WS_LOCATION=ap-southeast-2
   OSB_WS_WALLET='location=file:/apps/oracle/product/12.1.0.2/db_1/dbs/osbws_wallet CREDENTIAL_ALIAS=gordon-s_aws'
   $

This file can be used for any other database in the same Oracle Home. For this reason, I renamed it to osbwsCONFIG.ora so that the name is generic and there is no dependency on any particular database.

mv osbwsKAMRAN.ora osbwsCONFIG.ora

I will use osbwsCONFIG.ora in RMAN channel settings.

Now you just need to allocate a channel that uses this file in your backup/restore commands, as shown below:


   allocate channel c1 device type sbt parms='SBT_LIBRARY=libosbws.so,SBT_PARMS=(OSB_WS_PFILE=/apps/oracle/product/12.1.0.2/db_1/dbs/osbwsCONFIG.ora)';
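
If you prefer not to allocate the channel manually in every RUN block, a minimal alternative sketch (reusing the same SBT library and parameter file paths shown above) is to persist the settings in the RMAN configuration:

   CONFIGURE CHANNEL DEVICE TYPE sbt PARMS 'SBT_LIBRARY=libosbws.so,SBT_PARMS=(OSB_WS_PFILE=/apps/oracle/product/12.1.0.2/db_1/dbs/osbwsCONFIG.ora)';
   CONFIGURE DEFAULT DEVICE TYPE TO sbt;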

This is a complete example which shows the backup piece details and how the pieces are located in the AWS S3 region you specified during installation:


   $ . oraenv
   KAMRAN
   $ rman target /

   Recovery Manager: Release 12.1.0.2.0 - Production on Wed Dec 28 11:21:48 2016

   Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

   connected to target database: KAMRAN (DBID=283560064)

   RMAN>run{
   2> allocate channel c1 device type sbt parms='SBT_LIBRARY=libosbws.so,SBT_PARMS=(OSB_WS_PFILE=/apps/oracle/product/12.1.0.2/db_1/dbs/osbwsCONFIG.ora)';
   3> backup datafile 1;
   4> }

   released channel: ORA_DISK_1
   allocated channel: c1
   channel c1: SID=374 instance=KAMRAN device type=SBT_TAPE
   channel c1: Oracle Secure Backup Web Services Library VER=3.16.11.11

   Starting backup at 28-DEC-16
   channel c1: starting full datafile backup set
   channel c1: specifying datafile(s) in backup set
   input datafile file number=00001 name=+DATA/KAMRAN/DATAFILE/system.258.887023011
   channel c1: starting piece 1 at 28-DEC-16
   channel c1: finished piece 1 at 28-DEC-16
   piece handle=09rojka7_1_1 tag=TAG20161228T112807 comment=API Version 2.0,MMS Version 3.16.11.11
   channel c1: backup set complete, elapsed time: 00:00:45
   channel c1: starting full datafile backup set
   channel c1: specifying datafile(s) in backup set
   including current control file in backup set
   including current SPFILE in backup set
   channel c1: starting piece 1 at 28-DEC-16
   channel c1: finished piece 1 at 28-DEC-16
   piece handle=0arojkbl_1_1 tag=TAG20161228T112807 comment=API Version 2.0,MMS Version 3.16.11.11
   channel c1: backup set complete, elapsed time: 00:00:07
   Finished backup at 28-DEC-16
   released channel: c1

   RMAN> list backup tag TAG20161228T112807;


   List of Backup Sets
   ===================


   BS Key  Type LV Size       Device Type Elapsed Time Completion Time
   ------- ---- -- ---------- ----------- ------------ ---------------
   9       Full    741.75M    SBT_TAPE    00:00:38     28-DEC-16
           BP Key: 9   Status: AVAILABLE  Compressed: NO  Tag: TAG20161228T112807
           Handle: 09rojka7_1_1   Media: s3-ap-southeast-2.amazonaws.com/oracle-data-gordonsm-ap1
     List of Datafiles in backup set 9
     File LV Type Ckp SCN    Ckp Time  Name
     ---- -- ---- ---------- --------- ----
     1       Full 58915843   28-DEC-16 +DATA/KAMRAN/DATAFILE/system.258.887023011

   BS Key  Type LV Size       Device Type Elapsed Time Completion Time
   ------- ---- -- ---------- ----------- ------------ ---------------
   10      Full    22.50M     SBT_TAPE    00:00:01     28-DEC-16
           BP Key: 10   Status: AVAILABLE  Compressed: NO  Tag: TAG20161228T112807
           Handle: 0arojkbl_1_1   Media: s3-ap-southeast-2.amazonaws.com/oracle-data-gordonsm-ap1
     SPFILE Included: Modification time: 26-DEC-16
     SPFILE db_unique_name: KAMRAN
     Control File Included: Ckp SCN: 58915865     Ckp time: 28-DEC-16

   RMAN>
Some performance statistics:

I used a 340G database for performance testing and tried full backups into AWS S3 using different numbers of channels.
First, I allocated two channels and the backup to AWS completed in 48 minutes. I then tried four channels, and the backup completed in 27 minutes.

I predicted that increasing the number of channels to eight would make the backup complete even faster. Surprisingly, with 8 channels the backup completed in 27 minutes (exactly the same result as with four channels).
So in my case, the optimum number of channels for taking backups to AWS S3 was four.

I should mention that the same database, when backed up to NFS disks using four channels, completed in 22 minutes, so a backup time of 27 minutes to AWS was acceptable.

Restore was even faster: I tried a restore without recovering the database, and a full restore of the same 340G database from the AWS backups completed in 22 minutes, which again is acceptable.
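
For reference, a minimal sketch of that kind of restore test, reusing the same channel settings (it assumes the database is mounted, and deliberately skips the recovery step as in my test):

   run {
     allocate channel c1 device type sbt parms='SBT_LIBRARY=libosbws.so,SBT_PARMS=(OSB_WS_PFILE=/apps/oracle/product/12.1.0.2/db_1/dbs/osbwsCONFIG.ora)';
     restore database;
     # recover database;  -- recovery deliberately skipped in this test
   }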

Categories: DBA Blogs

Log Buffer #504: A Carnival of the Vanities for DBAs

Tue, 2017-02-14 19:15

This edition of Log Buffer covers Oracle, Microsoft SQL Server, and MySQL.

Oracle:

Data Pump or Data Pain Part02 – tablespace/user

Monday Spotlight: It’s About Security – Oracle Exadata SL6

0001_specify_date_format_mask.md copy

OGG Custom Adapters: How to include a unique identifier for every record in custom adapter?

One of the optimizer enhancements that appeared in 12.2 for SQL is the “band join” that makes certain types of merge join much more efficient.

Microsoft SQL Server:

How to import data to Azure SQL Data Warehouse using SSIS

Using Index-Organized Tables in Oracle

Query Performance Tuning – A Methodical Approach

SQL Server Temporal Tables: How-To Recipes

Fix Commvault Backup Failures After an Upgrade

MySQL:

MySQL may return results in non-deterministic order with ‘order by’

Shinguz: MySQL and MariaDB authentication against pam_unix

MySQL in the Cloud – Pros and Cons of Amazon RDS

With 500+ VividCortex Users, Shopify Eliminates High Latency Queries From Redis and MySQL

An Introduction to MariaDB’s Data at Rest Encryption (DARE)

Categories: DBA Blogs

Parting With ocm.rsp Isn’t Really Sweet Sorrow…

Tue, 2017-02-07 13:00

Now that we’re well into 2017, let’s sit back and ask the question: “What was the best news of 2016 in the Oracle ecosystem?”

I’ll give you a clue. The best news was something that sneakily appeared between April and October 2016 in your patch maintenance logs:

[root@myclusterdb02 OPatch]# ./opatchauto apply /patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103 -oh /u01/app/oracle/product/12.1.0.2/dbhome_1 -ocmrf /u01/app/oracle/product/12.1.0.2/dbhome_1/OPatch/ocm/bin/ocm.rsp
System initialization log file is /u01/app/oracle/product/12.1.0.2/dbhome_1/cfgtoollogs/opatchautodb/systemconfig2016-12-11_05-23-36PM.log.
Session log file is /u01/app/oracle/product/12.1.0.2/dbhome_1/cfgtoollogs/opatchauto/opatchauto2016-12-11_05-24-09PM.log
WARNING: the option -ocmrf is deprecated and no longer needed. OPatch no longer checks for OCM configuration. It will be removed in a future release.

The interesting part is in that 4th line :

WARNING: the option -ocmrf is deprecated and no longer needed. OPatch no longer checks for OCM configuration. It will be removed in a future release.

This is awesome! We no longer need this annoying ocm.rsp file!

When I discovered this back in November 2016, I searched for an official feature announcement but found nothing (it looks like an OPatch 13c new feature backported to 11g and 12c). This ocm.rsp file was still needed when I applied the April 2016 PSU, so in which opatch version was this behavior introduced?

First, I looked at the 2016 dev comments in opatch 11.2.0.3.15 and 12.2.0.1.7 (which were the latest versions when I was applying this Exadata October 2016 PSU) and I sadly discovered that no comments mentioned this new feature (even though it obviously deserves some):

#  opatch  04/08/16  Update CAS_VERSION to 12.2.0.1.3
#  opatch  04/11/16  Consolidate all 4 opatch scripts
#  opatch  06/07/16  Calling auto_patch.pl using the perl from Oracle Home Location
#  opatch  06/30/16  bug 23721730 - default memory was too high for 32 bit so removed and disabled opatch_env.sh script call

 

After a quick look at the code, I found the amazing piece of code that would ease our lives in the future (the function is named ocmOptionDetect()):

# Option 'ocmrf' is deprecated in OPatch 13.3 - Detect and give warning message
if [ "$OCMRF_OPTION" = "1" ] && [ "$VERSION_13_X" = "1" ]; then
echo "WARNING: the option \"-ocmrf\" is deprecated and no longer needed.  OPatch no longer checks for OCM configuration. It will be removed in a future release."

 

I then performed a few greps in different opatch versions that I had and I found that this feature has been implemented in:

  • 11.2.0.3.13 or 11.2.0.3.14 for 11g (I was unable to find any of these versions so I couldn’t check)
  • Between versions 12.1.0.1.10 and 12.2.0.1.7 for 12c (again, I was unable to find any of these versions so I couldn’t check)

If you come across any of the opatch versions specified above, please “grep -i ocmrf opatch” it and let me know so I can find when this feature has been implemented.

You can now upgrade all your opatches to become ocm.rsp-free, and this (for me) was definitely the best news of 2016!

Categories: DBA Blogs

Investigating IO performance on Amazon EC2

Fri, 2016-12-30 14:23

I published a blog post called “Investigating IO Performance on Amazon RDS for Oracle” recently, and soon after posting it I received several questions asking if IO worked the same way on EC2 instances. My immediate thought was that it did, mostly because RDS for Oracle is basically an EC2 instance with Oracle Database on top of it, where the configuration is fully managed by Amazon. But as always, it’s better to test than assume, so here we go!

Although the testing was done by running a workload in an Oracle database, the results will apply to any other type of workload because the performance characteristics purely depend on the type of instance, type of EBS volume and the size of the IO requests, and it doesn’t matter how the data is processed after it’s retrieved from the storage.

The Testing

The testing was done exactly the same way as described in the previous blog post; the only difference was that I had to create the Oracle database manually myself. I used the 12.1.0.2.0 database Enterprise Edition and ASM, and the EBS volume was used as an ASM disk.

Measuring the IO performance

On RDS we had the nice Enhanced Monitoring which I set up with a refresh interval of a few seconds and I used it to collect performance statistics quickly. For EC2 (specifically for EBS volumes), there is no such thing as enhanced monitoring, so I needed to use the standard CloudWatch monitoring with the minimum refresh rate of 5 minutes (very inconvenient, because a single test case would have to be run for 5 to 10 minutes to collect reliable data). This was not acceptable, so I looked for alternatives and found that iostat displayed the same values the monitoring graphs did:

[root@ip-172-31-21-241 ~]# iostat 5 500
...
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.32    0.00    0.11    0.11    0.43   99.04

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
xvda              0.00         0.00         0.00          0          0
xvdf              0.40         0.00         8.00          0         40
xvdg           3060.00     24496.00        14.40     122480         72
...

The “tps” column showed IOs per second, and “kB_read/s” + “kB_wrtn/s” allowed me to calculate the throughput (I actually ended up using just kB_read/s, as my workload was 100% read-only and the values in kB_wrtn/s were tiny).
iostat is even more convenient to use than Enhanced Monitoring; it didn’t take long to see the first benefit of EC2 over RDS!

The Results

It was no surprise that the outcome of testing on EC2 was quite similar to the results from testing on RDS.

Provisioned IOPS Storage on an EC2 Instance

As on RDS, the testing was done on db.m4.4xlarge with 100G of io1 with 1000 provisioned IO/s. The results are also very similar; the only notable difference I could observe (although I can’t explain it, and I’m not sure whether there is a pattern in it or not, as I didn’t do too many tests) was that the throughput for 64K-96K IOs didn’t reach the same level as 128K+ IOs.

Provisioned IOPS Throughput (on EC2)

Provisioned IOPS Measured IO/s (on EC2)

These results confirm that (the same as with RDS), there are several sizes of physical IOs: (8), 16, 32, 64 and 128, and starting with 64K, the performance is throughput-bound, but with IOs of smaller size, it’s IOPS-bound.

General Purpose Storage on an EC2 Instance

The testing with General Purpose SSDs (100G with 300 baseline IOPS) didn’t provide any surprises and the results were exactly the same as for RDS.
The only difference in the graphs is the “burst performance” measures for IOs of different sizes, which I’ve added to outline how bursting improves both IO/s and throughput.

General Purpose SSD Throughput (on EC2)

General Purpose SSD Measured IO/s (on EC2)

These results also confirm that (the same as with RDS), there are several sizes of physical IOs: 16, 32, 64 and 128, and starting with 32K, the performance is throughput-bound, but with IOs of smaller size, it’s IOPS-bound.

Additional Flexibility with EC2

Using Multiple gp2 Volumes

Unlike RDS, I can configure my storage and instance more freely, so instead of having just a single gp2 volume attached to the instance, I added five 1G-sized (yes, tiny) volumes to the +DATA disk group. The minimum IOPS for a gp2 volume is 100, so my 5 volumes gave a cumulative 500 baseline IOPS. As ASM was used, the IOs were more or less evenly distributed between the volumes.
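
For illustration, here is a minimal sketch of how such volumes can be added to the disk group from the ASM instance. The device names match the iostat output below, but the actual disk strings are assumptions and depend on how the EBS volumes are presented to ASM (udev rules, ASMLib, etc.):

-- Run as SYSASM in the ASM instance; device paths are assumptions
ALTER DISKGROUP DATA ADD DISK
  '/dev/xvdg', '/dev/xvdi', '/dev/xvdj', '/dev/xvdk', '/dev/xvdl'
  REBALANCE POWER 4;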

I didn’t do too thorough testing, but still I noticed a few things.
Take a look at these iostat outputs from testing done with 8K reads (this is burst performance):

[root@ip-172-31-21-241 ~]# iostat 5 500
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
xvda              0.00         0.00         0.00          0          0
xvdf              3.40         0.00        16.80          0         84
xvdg           1203.40      9632.00         1.60      48160          8
xvdi           1199.60      9596.80         0.00      47984          0
xvdj           1211.60      9691.20         3.20      48456         16
xvdk           1208.60      9670.40         0.00      48352          0
xvdl           1203.00      9625.60         3.20      48128         16

 

  • Bursting performance applies to each volume separately. It should allow getting up to 3000 IOPS per volume, but I reached only ~1200 per volume with a cumulative throughput of 48214 KB/s (not even close to the limit). So there’s some other limit or threshold that applies to this configuration (and it’s not the CPU). But look! I’ve got 6024 IO/s of burst performance, which is quite remarkable for just 5G.
  • As I was not hitting the maximum 3000 bursting IOPS per volume, the burst credit was running out much more slowly. If it normally lasts ~40 minutes at 3000 IOPS, it lasts ~3 times longer at ~1200 IOPS, which would allow running at better performance for longer (i.e. if one used 5x2G volumes instead of a 1x10G volume).

This iostat output is from testing done with 1M reads (this is burst performance):

[root@ip-172-31-21-241 ~]# iostat 5 500
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
xvda              0.00         0.00         0.00          0          0
xvdf              3.40         0.00        16.80          0         84
xvdg            384.40     48820.80         0.80     244104          4
xvdi            385.80     49155.20         0.00     245776          0
xvdj            385.00     49014.40         6.40     245072         32
xvdk            386.80     49225.60         0.00     246128          0
xvdl            385.00     48897.60         6.40     244488         32

 

  • The cumulative throughput is 245111 KB/s, which is very close to the throughput limit of the instance. I wasn’t able to reach such throughput on a single gp2 volume, where the maximum I observed was just 133824 KB/s; even 163840 KB/s, the throughput limit for a single gp2 volume, was exceeded here. It appears that configuring multiple volumes allows reaching the instance throughput limit, which was not possible with a single volume.

I didn’t run any non-burst tests as it required too much time (2 hours of waiting to exhaust the burst credits).

Database with a 32K Block Size

We have observed that starting with 32K block reads the EBS volume becomes throughput-bound, not IOPS-bound. Obviously I wanted to see how it performed if the database was created with a 32K block size.
I ran a few very simple tests using IOs of one data block (32K) on these two configurations:

  1. db.m4.4xlarge with 100G / 1000 PIOPS (io1)
  2. db.m4.4xlarge with 20G / 100 IOPS (gp2)

There were no surprises on the Provisioned IOPS storage: I got the 1000 IOPS that were provisioned (actually slightly better – 1020 IO/s), and the throughput was 32576.00 KB/s.
On General Purpose SSD, the story was different – we know that starting from 32K-sized IOs the performance becomes throughput-bound, and that was confirmed here too:

  • During the burst period I measured up to 4180 IO/s at 133779 KB/s, which was 4 times faster than Provisioned SSD.
  • During the non-burst period I measured up to 764 IO/s at 24748 KB/s throughput, which is somewhat slower than Provisioned SSD. Also, 24748 KB/s was slower than the throughput I measured on a 100G gp2 volume (we already know that the non-burst throughput limit for gp2 depends on the size of the disk). If I used a 100G gp2 volume, I’d get 2359 IO/s at 75433 KB/s (this is from the graph above), which is also better than what one can get from a Provisioned SSD volume, and costs less.
Conclusions

Most of the conclusions were already outlined in the previous blog post, and they also apply to the EC2 instances when a single EBS volume is used for storage.

On the other side, an EC2 instance allows system administrators and DBAs (or should I say “cloud administrators”) to work around some of the limitations by changing the “variables” that can’t be altered on RDS – like the block size of the database (which is 8K on RDS) and the number of EBS volumes behind the storage configuration. Using a 32K block size for a database residing on a General Purpose volume allows bypassing the IOPS limitation completely, so that only the throughput limits stay in effect. However, if a 32K block size is not an option (as for Oracle e-Business Suite), then IOPS and throughput can still be maximized by using a configuration of multiple gp2 volumes.
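
As an illustration of the block size workaround, here is a minimal sketch. The original test used a whole database created with a 32K block size; this is the lighter-weight per-tablespace variant of the same idea (the tablespace name and sizes are illustrative), and it requires a 32K buffer cache to be configured first:

-- A 32K buffer cache must exist before a 32K-block tablespace can be created
ALTER SYSTEM SET db_32k_cache_size = 256M;
CREATE TABLESPACE data32k DATAFILE '+DATA' SIZE 10G BLOCKSIZE 32K;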

After having done all these tests, I think the only reason for using RDS instead of EC2 is the database management that is provided by Amazon. If that is something very critical for your requirements, it’s the way to go. If it’s not something you require, EC2 can be configured to perform better for the same price, but you need to handle its maintenance yourself.

Categories: DBA Blogs

Investigating IO Performance on Amazon RDS for Oracle

Tue, 2016-12-13 15:55

I’ve recently been involved in quite a few database migrations to Oracle RDS. One thing that I noticed when dealing with post-migration performance issues was related to queries that used TABLE SCAN FULL in their execution. It seemed that, in many cases, it just took a single query to max out the allocated IOPS (IOs per second) or bandwidth, which in turn would cause overall slowness of the RDS instance.

A search of the documentation showed that it could have been caused by how IO operations are counted on Amazon RDS, as it’s quite different from what a routine Oracle DBA like me would expect. For multi-block reads the database (depending on storage) would typically issue IOs of sizes up to 1MB, so if an 8K block size was used, the table scans would read up to 128 blocks in a single IO of db file scattered read or direct path read.

Now, pay attention to what the AWS documentation says:
While Provisioned IOPS (io1 storage) can work with I/O sizes up to 256 KB, most databases do not typically use such large I/O. An I/O request smaller than 32 KB is handled as one I/O; for example, 1000 16 KB I/O requests are treated the same as 1000 32 KB requests. I/O requests larger than 32 KB consume more than one I/O request; Provisioned IOPS consumption is a linear function of I/O request size above 32 KB. For example, a 48 KB I/O request consumes 1.5 I/O requests of storage capacity; a 64 KB I/O request consumes 2 I/O requests, etc. … Note that I/O size does not affect the IOPS values reported by the metrics, which are based solely on the number of I/Os over time. This means that it is possible to consume all of the IOPS provisioned with fewer I/Os than specified if the I/O sizes are larger than 32 KB. For example, a system provisioned for 5,000 IOPS can attain a maximum of 2,500 IOPS with 64 KB I/O or 1,250 IOPS with 128 KB IO.
… and …
I/O requests larger than 32 KB are treated as more than one I/O for the purposes of PIOPS capacity consumption. A 40 KB I/O request will consume 1.25 I/Os, a 48 KB request will consume 1.5 I/Os, a 64 KB request will consume 2 I/Os, and so on. The I/O request is not split into separate I/Os; all I/O requests are presented to the storage device unchanged. For example, if the database submits a 128 KB I/O request, it goes to the storage device as a single 128 KB I/O request, but it will consume the same amount of PIOPS capacity as four 32 KB I/O requests.

Based on the statements above it looked like the large 1M IOs issued by the DB would be accounted as 32 separate IO operations, which would obviously exhaust the allocated IOPS much sooner than expected. The documentation talks only about Provisioned IOPS, but I think this would apply to General Purpose SSDs (gp2 storage) too, for which the IOPS baseline is 3 IOPS/GB (i.e. 300 IOPS if allocated size is 100GB of gp2).

I decided to do some testing to find out how RDS for Oracle handles large IOs.

The Testing

For testing purposes I used the following code to create a 1G table (Thanks Connor McDonald and AskTom):

ORCL> create table t(a number, b varchar2(100)) pctfree 99 pctused 1;

Table T created.

ORCL> insert into t  values (1,lpad('x',100));

1 row inserted.

ORCL> commit;

Commit complete.

ORCL> alter table t minimize records_per_block;

Table T altered.

ORCL> insert into t select rownum+1,lpad('x',100) from dual connect by level<131072;

131,071 rows inserted.

ORCL> commit;

Commit complete.

ORCL> exec dbms_stats.gather_table_stats(user,'T');

PL/SQL procedure successfully completed.

ORCL> select sum(bytes)/1024/1024 sizemb from user_segments where segment_name='T';
SIZEMB
1088


ORCL> select value/1024/1024 buffer_cache from v$sga where name='Database Buffers';
BUFFER_CACHE
1184

The code for testing will be:

exec rdsadmin.rdsadmin_util.flush_buffer_cache;
alter session set "_serial_direct_read"=always;
alter session set db_file_multiblock_read_count=&1;

-- Run FTS against T table forever.
declare
  n number:=1;
begin
  while n>0
  loop
    select /*+ full(t) */ count(*) into n from t;
  end loop;
end;
/

Basically, I’ll flush the buffer cache, force direct path reads by setting _serial_direct_read to “ALWAYS”, and then choose db_file_multiblock_read_count based on how big the IOs I want to issue are (note: by default db_file_multiblock_read_count is not set on RDS, and it resolves to 128, so the maximum size of an IO from the DB is 1 MB). I’ll test with different sizes of IOs and will capture the throughput and effective IOPS by using the “Enhanced Monitoring” of the RDS instance.

Side note: the testing turned out to be more complex than I had expected before I started. In a few cases, I was limited by the instance throughput before I could reach the maximum allocated IOPS, and because of this, the main testing needed to be done on a large enough instance (db.m4.4xlarge) that had more dedicated EBS throughput.

The Results

Provisioned IOPS Storage

Testing was done on a db.m4.4xlarge instance that was allocated 100GB of io1 storage with 1000 Provisioned IOPS. The EBS-optimized throughput for such an instance is 256000 KB/s.
The tests were completed using db_file_multiblock_read_count values of 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 16, 32, 64 and 128.
For each test the throughput and IO/s were captured (from RDS CloudWatch graphs), and the effective IO size was also derived.
The DB instance was idle, but still, there could be a few small IOs happening during the test.

Provisioned IOPS Measured Throughput

Provisioned IOPS Measured IO/s

From the graphs above, the following undocumented behaviors can be observed:

  • The RDS instance is dynamically choosing the physical IO size (I’ll call them “physical” just to distinguish the IOs to storage; in fact, that’s only what I see in the CloudWatch graphs, and the real physical IO could be something different) based on the size of the IO request from the database. The possible physical IO sizes appear to be 16K, 32K, 64K, 128K and probably also 8K (this could in fact be a 16K physical IO reading just 8K of data).
  • The IOPS limit applies only to smaller physical IO sizes (up to 32K); for larger physical IOs (64K, 128K) the throughput is the limiting factor of the IO capability. The throughput limit appears to be quite close to the maximum throughput that the instance is capable of delivering, but at this point it’s not clear how the throughput limit for a particular situation is calculated.
Throughput Limits for Provisioned IOPS

I ran additional tests on differently sized instances with io1 storage to understand better how the maximum throughput was determined. The graph below represents the throughput achieved on different instances, all with the same 100G of 1000 PIOPS io1 storage. The throughput was measured using db_file_multiblock_read_count=128:

PIOPS Throughput by Instance Type

It appears that the maximum throughput is indeed limited by the instance type, except for the very largest instance, db.m4.10xlarge. (For this instance the situation is somewhat strange even in the documentation, because the maximum throughput is listed as 500 MB/s, while the maximum throughput for a single io1 EBS volume, which should be what sits underneath the RDS, is just 320 MB/s – and I was unable to reach either of these limits.)

General Purpose SSD storage

Testing was done on a db.m4.4xlarge instance that was allocated 100GB of gp2 storage with a 300 IOPS baseline. The EBS-optimized throughput for such an instance is 256000 KB/s.
The tests were conducted similarly to how they were done for Provisioned IOPS above (note: this is the baseline performance, not burst performance).

General Purpose SSD Measured Throughput

General Purpose SSD Measured IO/s

Similarly to Provisioned IOPS, the General Purpose SSD storage behaves differently from what’s explained in the documentation:

  • The physical IO size is again calculated dynamically based on the size of the IO request from the database. The possible sizes appear to be the same as for io1: (8K), 16K, 32K, 64K and 128K.
  • The IOPS limit (at the baseline level) appears to apply only to IO sizes up to 16K (compared to 32K in the case of Provisioned IOPS); for larger physical IOs, starting from 32K, the limit appears to be throughput-driven.
  • It’s not clear how the throughput limit is determined for a particular instance/storage combination, but in this case it appeared to be around 30% of the maximum throughput for the instance; however, I didn’t confirm the same ratio for db.m4.large, where the maximum achievable throughput depended on the allocated size of the gp2 storage.
Burst Performance

I haven’t collected enough data to derive anything concrete, but during my testing I observed that burst performance applied to both the maximum IOPS and the maximum throughput. For example, while testing on db.m4.large (max instance throughput of 57600 KB/s) with 30G of 90 IOPS baseline performance, I saw that for small physical IOs it allowed bursting up to 3059 IOPS for short periods of time, while normally it indeed allowed only 300 IOPS. For larger IOs (32K+), the baseline maximum throughput was around 24500 KB/s, but the burst throughput was 55000 KB/s.

Throughput Limits for General Purpose SSD storage

I don’t really know how the maximum allowed throughput is calculated for different instance type and storage configurations on gp2, but one thing is clear: both the instance size and the size of the allocated gp2 storage are considered in determining the maximum throughput. I was able to achieve the following throughput measures when gp2 storage was used:

  • 75144 KB/s (133776 KB/s burst) on db.m4.4xlarge (100G gp2)
  • 54500 KB/s (same as burst, this is close to the instance limit) on db.m4.large (100G gp2)
  • 24537 KB/s (54872 KB/s burst) on db.m4.large (30G gp2)
  • 29116 KB/s (burst was not measured) on db.m4.large (40G gp2)
  • 37291 KB/s (burst was not measured) on db.m4.large (50G gp2)
Conclusions

The testing provided some insight into how the maximum IO performance is determined on Amazon RDS for Oracle, based on the instance type, storage type, and volume size. Despite finding some clues, I also understood that managing IO performance on RDS is far more difficult than expected for the mixed-size IO requests that are typically issued by Oracle databases. There are many questions that still need to be answered (i.e. how the maximum throughput is calculated for gp2 storage instances) and it’d take many, many hours to find all the answers.

On the other hand, the testing already revealed a few valuable findings:

  1. Contrary to the documentation, which states that all IOs are measured and accounted for in 32KB units, we found that an IO reported by Amazon can be of size 8K (probably), 16K, 32K, 64K or 128K.
  2. For small physical IOs (up to 32K in case of Provisioned IOPS and up to 16K in case of General Purpose SSD) the allocated IOPS is used as the limit for the max performance.
  3. For larger physical IOs (from 64K in case of Provisioned IOPS and from 32K in case of General Purpose SSD) the throughput is used as the limit for the max performance, and the IOPS limit no longer applies.
  4. The Burst performance applies to both IOPS and throughput

P.S. As to my original issue of a single TABLE SCAN FULL severely impacting the overall performance: I found that in many cases we were using small RDS instances (db.m3.large or db.m4.large), for which the maximum throughput was ridiculously small, and we were hitting the throughput limitation, not the IOPS limit, which actually didn’t apply to the larger physical IOs on gp2 storage.

Categories: DBA Blogs

Enqueue Bytes – Is that a Pun?

Mon, 2016-11-07 13:53

Sometimes it is necessary to put on your uber-geek hat and start using cryptic bits of code to retrieve information from an Oracle database. Troubleshooting enqueue locking events in Oracle databases is one of the times some advanced SQL may be necessary.

Likely you have used SQL similar to the following when troubleshooting Oracle enqueues, probably in connection with row lock contention.

SQL# l
  1  SELECT
  2     s.username username,
  3     s.sid,
  4     e.event event,
  5     e.p1text,
  6     e.p1,
  7     e.state
  8  FROM v$session s, v$session_wait e
  9  WHERE s.username IS NOT NULL
 10     AND s.sid = e.sid
 11     AND e.event LIKE '%enq:%'
 12* ORDER BY s.username, UPPER(e.event)
 
USERNAME    SID EVENT                          P1TEXT                P1 STATE
---------- ---- ------------------------------ ---------- ------------- ----------
JKSTILL      68 enq: TX - ROW LOCK contention  name|mode     1415053318 WAITING
 
1 ROW selected.

The value for P1 is not very useful as is; Oracle has encoded the type of enqueue and the requested mode into the column. When working with current events such as when selecting from v$session, it is simple to determine the type of lock and the mode requested by querying v$lock, such as in the following example:

  1* SELECT sid, TYPE, request, block FROM v$lock WHERE sid=68 AND request > 0
SQL# /
 
 SID TY    REQUEST      BLOCK
---- -- ---------- ----------
  68 TX          6          0
 
1 ROW selected.

Session 68 is waiting on a TX enqueue with a requested lock mode of 6. Seasoned Oracle DBAs will recognize this as classic row lock contention.

Why bother to find out just which type of enqueue this is? There are many types of locks in Oracle, and they occur for differing reasons. The TX lock is interesting as it can occur not only in Mode 6 but Mode 4 as well; Mode 4 refers to locks that involve unique keys, such as when 2 or more sessions try to insert the same value for a primary key. The following example shows just that:

SQL# @s
 
USERNAME    SID EVENT                          P1TEXT                P1 STATE
---------- ---- ------------------------------ ---------- ------------- ----------
JKSTILL      68 enq: TX - ROW LOCK contention  name|mode     1415053316 WAITING
 
 
1 ROW selected.
 
SQL# @l
 
 SID TY    REQUEST      BLOCK
---- -- ---------- ----------
  68 TX          4          0
 
1 ROW selected.

Knowing just which lock mode is requested is vital, as the troubleshooting for TX Mode 4 locks will be different from what is used to troubleshoot Mode 6.

Though we can find the lock name and mode information in v$lock, there is still value in being able to decipher that cryptic P1 column.

The ASH and AWR facilities do not include any historical information for the lock name and mode; the P1 column found in v$active_session_history and dba_hist_active_sess_history does not have a corresponding dba_hist_lock view. Any research done after an event has occurred does require decoding this information.

Deciphering v$session.p1

Oracle Support document 34566.1 is the enqueue reference note that provides information needed to get the lock name and mode from the p1 column. As you will see this information is a bit puzzling.

The rest of this article will focus on TX Mode 6 locks. The value shown for this lock and mode in the P1 column is always 1415053318. Following is the SQL recommended by Oracle:

 SELECT chr(to_char(bitand(p1,-16777216))/16777215)||
         chr(to_char(bitand(p1, 16711680))/65535) "Lock",
         to_char( bitand(p1, 65535) )    "Mode"
    FROM v$session_wait
   WHERE event = 'enqueue'

As I currently have some planned row lock contention in a test database we can run this query:

  1   SELECT chr(to_char(bitand(p1,-16777216))/16777215)||
  2           chr(to_char(bitand(p1, 16711680))/65535) "Lock",
  3           to_char( bitand(p1, 65535) )    "Mode"
  4      FROM v$session_wait
  5*    WHERE event LIKE '%enq:%'
SQL# /
 
Lo Mode
-- ----------------------------------------
TX 4

Probably it is not very clear why this SQL works. Let’s try and understand it.
(Note that a small change had to be made to the WHERE clause.)

Converting the P1 value to hex may be helpful

1415053318 = 0x54580006

The two lower order bytes represent the lock mode that has been requested. This can be seen here to be 0x06, which translates simply to decimal 6 (I can do this one in my head).

The next two bytes are also in hex and represent the two letters of the lock name.

0x54 = 84 = ‘T’
0x58 = 88 = ‘X’

Using string functions it is simple to extract the values from the hex string, convert them to numbers and retrieve the lock name and mode.

SQL# define p1 = 1415053318
 
SQL# l
  1  WITH hex AS (
  2     SELECT TRIM(to_char(&p1,'XXXXXXXXXXXXXXXX')) hexnum FROM dual
  3  ),
  4  hexbreak AS (
  5     SELECT hexnum
  6        , to_number(substr(hexnum,1,2),'XXXXXXXX') enq_name_byte_1
  7        , to_number(substr(hexnum,3,2),'XXXXXXXX') enq_name_byte_2
  8        , to_number(substr(hexnum,5),'XXXXXXXX') enq_mode
  9  FROM hex
 10  )
 11  SELECT
 12     hexnum
 13     , chr(enq_name_byte_1)
 14     || chr(enq_name_byte_2) enqueue_type
 15     , enq_mode
 16* FROM hexbreak
SQL# /
 
HEXNUM            EN   ENQ_MODE
----------------- -- ----------
54580006          TX          6

While that does work, my inner geek wants to investigate those bitmasks and find out why they work. Next are the bitmasks in decimal along with the hex equivalent.

-16777216 = 0xFFFFFFFFFF000000
 16777215 = 0xFFFFFF
 16711680 = 0xFF0000
    65535 = 0xFFFF

The bitand function is used to mask all unwanted bits to 0. The number is then divided by the value needed to remove all of the now-zeroed-out lower order bytes.

The values being used as bitmasks are -16777216 and 16711680. The use of -16777216 does not seem to make sense. As the intent is to mask all but one byte, I would expect to find an FF surrounded by a number of zeroes. The bit mask of 16711680, however, looks fine.

Now let’s run the Oracle support query again, but modified to show just the integer values rather than converting them to ASCII.

 
  1  SELECT bitand(p1,-16777216)/16777215,
  2           bitand(p1, 16711680)/65535,
  3           bitand(p1, 65535)
  4      FROM v$session_wait
  5*    WHERE event LIKE '%enq:%'
SQL# /
 
BITAND(P1,-16777216)/16777215 BITAND(P1,16711680)/65535 BITAND(P1,65535)
----------------------------- ------------------------- ----------------
                    84.000005                88.0013428                6

Well, that is interesting. An implicit conversion is taking place with to_char() that is removing the decimal portion of these numbers. Is that being done with trunc(), round(), or something else? I don’t know the answer to that. What seems more important is just doing the math correctly.

There are a couple of things here that can be changed to make this work as expected.

A New BitMask

Let’s modify the first bitmask to something that seems more reasonable than -16777216. Let’s use this instead, as it masks only the single byte we need:

4278190080 = 0xFF000000

Let's try it out:

SQL# l
  1  SELECT bitand(p1,4278190080)/16777215,
  2           bitand(p1, 16711680)/65535,
  3           bitand(p1, 65535)
  4      FROM v$session_wait
  5*    WHERE event LIKE '%enq:%'
SQL# /
 
BITAND(P1,4278190080)/16777215 BITAND(P1,16711680)/65535 BITAND(P1,65535)
------------------------------ ------------------------- ----------------
                     84.000005                88.0013428                6

While the new bitmask didn’t break anything, it does not appear to have helped either.

Off By One Error

The solution is to consider the divisors used to remove the lower order zeroes; each of them is off by one. That is easy enough to verify:

SQL# l
  1  SELECT bitand(p1,4278190080)/16777216,
  2           bitand(p1, 16711680)/65536,
  3           bitand(p1, 65535)
  4      FROM v$session_wait
  5*    WHERE event LIKE '%enq:%'
SQL# /
 
BITAND(P1,4278190080)/16777216 BITAND(P1,16711680)/65536 BITAND(P1,65535)
------------------------------ ------------------------- ----------------
                           84                        88                6

Ah, that did it! But what was the problem previously?

Old Divisor Values

The original divisors are off by 1, which does not completely remove the lower order values.

 16777215 = 0xFFFFFF
    65535 = 0xFFFF

Increasing each by one has the desired effect.

New Divisor Values
 16777216 = 0x1000000
    65536 = 0x10000
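
Putting it all together, a corrected version of the original decode query could look like this (a sketch using the adjusted bitmask and divisors derived above):

 SELECT chr(bitand(p1, 4278190080)/16777216) ||
        chr(bitand(p1, 16711680)/65536) "Lock",
        to_char(bitand(p1, 65535)) "Mode"
   FROM v$session_wait
  WHERE event LIKE '%enq:%';
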
Conclusion

Those odd bitmasks have been in the back of my mind for some time, and today it seemed like a good idea to dig in and find out more about them. It isn’t too hard to imagine that in some cases the wrong values might be returned, leading to some long and unproductive troubleshooting sessions.

There is a demo script enqueue-bitand.sql containing much of the SQL found in this article. There is also a script awr-top-sqlid-events.sql that incorporates the enqueue lock decoding. This script could be made better than it is, so please issue a pull request if you have some useful modifications.

Categories: DBA Blogs

Distinguish Real SQL Execution Plans from Fake Ones!

Thu, 2016-11-03 10:59

Distinguish Real Execution Plans from Fake Ones!

As Oracle DBAs, one of our daily tasks is to optimize bad SQL statements that are affecting the system and causing performance degradation. First we identify the culprit SQL, then we extract its execution plan, and after that we start the cycle of SQL tuning and optimization as appropriate.

 

There are many methods to extract the execution plan for a specific SQL statement. However, not all of these methods provide the actual (real) plan that the optimizer follows; some only provide a suggestion or expectation of the execution plan, which is not always accurate and can be misleading.

 

In the following blog, I will demonstrate various execution plan extraction methods, create a scenario to test them, and see which ones provide the actual execution plan that the optimizer really follows to execute a specific SQL, and which ones provide only a suggestion of the execution plan that the optimizer does not necessarily use.

 

So first things first, what is an “execution plan”?

 

An execution plan is the set of steps that the database performs when executing a SQL statement, as determined by the optimizer.

 

There are many ways to extract the execution plan for a specific SQL, like:

  • Enterprise Manager
  • SQL*Plus AUTOTRACE
  • EXPLAIN PLAN command
  • SQL Trace (event 10046) with tkprof
  • DBMS_XPLAN package to view plans from:
    • Automatic Workload Repository
    • V$SQL_PLAN
    • SQL Tuning Sets
    • Plan table

 

Some of these methods will provide the "actual" execution plan, while other methods will provide a "suggestion" of the execution plan, that is, the steps that Oracle expects the optimizer to follow in order to execute the SQL statement, which may not always be what actually happens.

 

In this test case, I will create a table with two columns, and insert skewed data into one column, then I will build an index and histogram on that column. I will then query that column using bind variables and see if Oracle will use the index or not.

 

It should be mentioned that any method that provides a suggestion of the execution plan is essentially a synonym of the EXPLAIN PLAN command. Because EXPLAIN PLAN does not use bind variable peeking, it will not generate the optimal plan for each different bind value in this test case, and that is the crucial point this test case depends on.
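
As an aside: once a plan is in the cursor cache, DBMS_XPLAN.DISPLAY_CURSOR can also show the bind values that were peeked at hard-parse time by adding the PEEKED_BINDS format modifier (assuming your version supports it). A minimal sketch, run right after the statement of interest:

SQL> select * from table(dbms_xplan.display_cursor(null, null, 'TYPICAL +PEEKED_BINDS'));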

So… without further ado, let’s see that in action:

 

>  Environment preparation for scenario testing:

[oracle@testhost ~]$ sqlplus / as sysdba


SQL> alter system flush shared_pool;

System altered.


SQL> alter system flush buffer_cache;

System altered.


SQL> drop user xp_test cascade;

User dropped.


SQL> grant dba to xp_test identified by xp_test;

Grant succeeded.


SQL> conn xp_test/xp_test


SQL> create table xp_table (emp_id varchar2(10), team varchar2(10)) pctfree 99;

Table created.


SQL> insert into xp_table select * from (SELECT 'EMP' || level , 'TEAM1' from dual connect by level <= 1000); 

1000 rows created. 


SQL> insert into xp_table select * from (SELECT 'EMP' || (level + 1000) , 'TEAM2' from dual connect by level <= 10); 

10 rows created. 


SQL> commit;

Commit complete.


SQL> create index index_test on xp_table (team);

Index created.


SQL> exec dbms_stats.gather_table_stats('XP_TEST','XP_TABLE',method_opt=>'FOR ALL COLUMNS SIZE AUTO FOR COLUMNS SIZE 254 TEAM',cascade=>TRUE);

PL/SQL procedure successfully completed.


SQL> select TABLE_NAME,COLUMN_NAME,HISTOGRAM from dba_tab_columns where table_name = 'XP_TABLE';

TABLE_NAME                   COLUMN_NAME                 HISTOGRAM
--------------------   ------------------------------ ---------------
XP_TABLE                   TEAM                           FREQUENCY
XP_TABLE                   EMP_ID                         NONE


SQL> select count(*),team from xp_table group by team order by 2 asc;

  COUNT(*)            TEAM
   ----------      ----------
    1000             TEAM1
     10              TEAM2

OK, we have our environment ready for testing, so let's try each execution plan gathering method. First, let's extract the actual execution plan:

 

> DBMS_XPLAN.DISPLAY_CURSOR:

 

Using this method, we will extract the execution plan directly from the shared pool, which will always be the real execution plan:

-- Define a bind variable:

SQL> variable x varchar2(10);  
SQL> exec :x:='TEAM2';

PL/SQL procedure successfully completed.


SQL> set lines 200    
SQL> col PLAN_TABLE_OUTPUT for a100
SQL> select count(emp_id) from xp_table where team = :x;

COUNT(EMP_ID)
-------------
     10


SQL> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID    31gbpz4ncsvp3, child number 0
-------------------------------------
select count(emp_id) from xp_table where team = :x

Plan hash value: 726202289
-------------------------------------------------------------------------------------------
| Id  | Operation             | Name      | Rows  | Bytes | Cost (%CPU)| Time      |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |      |      |    4 (100)|      |
|   1 |  SORT AGGREGATE          |          |    1 |    13 |           |      |

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
|   2 |   TABLE ACCESS BY INDEX ROWID| XP_TABLE   |    10 |   130 |    4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN         | INDEX_TEST |    10 |      |    1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("TEAM"=:X)

20 rows selected.

We can see clearly that the optimizer accessed the table using an index range scan, which is good. We are now 100% sure that the real execution plan used an INDEX RANGE SCAN, so let's compare this result with the other methods.
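
As a side note, DBMS_XPLAN.DISPLAY_CURSOR can also be pointed at a specific cursor in the shared pool by passing the SQL_ID and child number reported above; a minimal sketch:

SQL> select * from table(dbms_xplan.display_cursor('31gbpz4ncsvp3', 0));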

 

> SQL*Plus AUTOTRACE:

 

AUTOTRACE is a very useful way to get SQL statistics, but will it provide the real execution plan?

SQL> set autotrace on explain;


SQL> select count(emp_id) from xp_table where team = :x;

COUNT(EMP_ID)
-------------
      10

Execution Plan
----------------------------------------------------------
Plan hash value: 3545047802

-------------------------------------------------------------------------------
| Id  | Operation       | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |    13 |   102    (0)| 00:00:02 |
|   1 |  SORT AGGREGATE    |          |     1 |    13 |        |          |
|*  2 |   TABLE ACCESS FULL| XP_TABLE |   505 |  6565 |   102    (0)| 00:00:02 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("TEAM"=:X)

SQL> set autotrace off

Oops! AUTOTRACE suggests that the optimizer will use a full table scan to access the table, which is different from the execution plan obtained in the previous step. So AUTOTRACE provides only a suggestion of the execution plan, which may not always be true. Now let's continue with the next method.

 

> EXPLAIN PLAN:

 

The EXPLAIN PLAN command provides the execution plan without even running the SQL, so we can easily deduce that we will get a suggestion of the execution plan:

SQL> explain plan for select count(emp_id) from xp_table where team = :x;

Explained.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 3545047802

-------------------------------------------------------------------------------
| Id  | Operation       | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |    13 |   102    (0)| 00:00:02 |
|   1 |  SORT AGGREGATE    |          |     1 |    13 |        |          |
|*  2 |   TABLE ACCESS FULL| XP_TABLE |   505 |  6565 |   102    (0)| 00:00:02 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

   2 - filter("TEAM"=:X)

14 rows selected.

Confirmed: this method also suggests that the optimizer will do a full table scan, which we know from the first method is not true. Let's continue.

 

> SQL Trace (event 10046) with tkprof

 

SQL Trace (event 10046) with tkprof will always provide the real execution plan. Optionally, we can use the "explain" keyword with tkprof to include the suggested plan as well:

SQL> alter session set tracefile_identifier='xp_test';

Session altered.


SQL> alter session set events '10046 trace name context forever, level 20';

Session altered.


SQL> select count(emp_id) from xp_table where team = :x;

COUNT(EMP_ID)
-------------
       10

SQL> alter session set events '10046 trace name context off';

Session altered.


SQL> select value from v$diag_info where name like '%Trace%';

VALUE
------------------------------------------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_24595_xp_test.trc

SQL> !


$ tkprof /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_24595_xp_test.trc sys=no explain=xp_test/xp_test output=/tmp/with_explain


$ cat /tmp/with_explain.prf


Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=5 pr=0 pw=0 time=67 us)
        10         10         10   TABLE ACCESS BY INDEX ROWID XP_TABLE (cr=5 pr=0 pw=0 time=60 us cost=4 size=130 card=10)
        10         10         10    INDEX RANGE SCAN INDEX_TEST (cr=2 pr=0 pw=0 time=44 us cost=1 size=0 card=10)(object id 81349)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      1   SORT (AGGREGATE)
     10    TABLE ACCESS   MODE: ANALYZED (FULL) OF 'XP_TABLE' (TABLE)

As we can see, the tkprof output shows both the real plan (Row Source Operation) and the suggested plan (Execution Plan), and in this case they differ.

So far we have tested four execution plan methods: AUTOTRACE and EXPLAIN PLAN, which provide suggestions of the execution plan; DBMS_XPLAN.DISPLAY_CURSOR, which always provides the real execution plan; and SQL Trace, which can provide both the suggested and the real execution plan.

 

> Creating SQL Plan Baseline and repeating the test for AUTOTRACE and EXPLAIN PLAN:

-- Checking for SQL plan baselines:


SQL> select sql_handle, plan_name, sql_text, enabled, accepted, fixed from dba_sql_plan_baselines;

no rows selected


--Let’s create a baseline:

SQL> show parameter baseline 

NAME                                  TYPE         VALUE
------------------------------------ ----------- ------------------------------
optimizer_capture_sql_plan_baselines  boolean       FALSE
optimizer_use_sql_plan_baselines      boolean       TRUE


SQL> alter session set optimizer_capture_sql_plan_baselines = TRUE; 

Session altered.


SQL> select count(emp_id) from xp_table where team = :x;

COUNT(EMP_ID)
-------------
       10


SQL> select sql_handle, plan_name, sql_text, enabled, accepted, fixed from dba_sql_plan_baselines;

SQL_HANDLE                PLAN_NAME                           SQL_TEXT                                              ENA    ACC   FIX
----------------------   ---------------------------------   ----------------------------------------------------  ------ ----- ------
SQL_65dc367505be1804      SQL_PLAN_6br1qfn2vw604bc04bcd8      select count(emp_id) from xp_table where team = :x    YES    YES    NO


SQL> alter session set optimizer_capture_sql_plan_baselines = FALSE;

Session altered.


SQL> alter system flush buffer_cache;

System altered.


SQL> alter system flush shared_pool;

System altered.

 

> SQL*Plus AUTOTRACE

 

Trying AUTOTRACE tool after creating SQL Plan Baseline:

SQL> set autotrace on explain; 


SQL> select count(emp_id) from xp_table where team = :x;

COUNT(EMP_ID)
-------------
      10


Execution Plan
----------------------------------------------------------
Plan hash value: 726202289

-------------------------------------------------------------------------------------------
| Id  | Operation             | Name      | Rows  | Bytes | Cost (%CPU)| Time      |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |    1 |    13 |   129   (0)| 00:00:02 |
|   1 |  SORT AGGREGATE          |          |    1 |    13 |           |      |
|   2 |   TABLE ACCESS BY INDEX ROWID| XP_TABLE   |   505 |  6565 |   129   (0)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN         | INDEX_TEST |   505 |      |    2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("TEAM"=:X)

Note
-----
   - SQL plan baseline "SQL_PLAN_6br1qfn2vw604bc04bcd8" used for this statement

SQL> set autotrace off

Notice that this time AUTOTRACE is aware of the baseline and hence shows that the optimizer will perform an index range scan. Next, let's try the EXPLAIN PLAN command:

 

> EXPLAIN PLAN:

 

Trying EXPLAIN PLAN tool after creating SQL Plan Baseline:

SQL> explain plan for select count(emp_id) from xp_table where team = :x;

Explained.


SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 726202289

-------------------------------------------------------------------------------------------
| Id  | Operation             | Name      | Rows  | Bytes | Cost (%CPU)| Time      |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |    1 |    13 |   129   (0)| 00:00:02 |
|   1 |  SORT AGGREGATE          |          |    1 |    13 |           |      |
|   2 |   TABLE ACCESS BY INDEX ROWID| XP_TABLE   |   505 |  6565 |   129   (0)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN         | INDEX_TEST |   505 |      |    2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------


PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("TEAM"=:X)

Note
-----
   - SQL plan baseline "SQL_PLAN_6br1qfn2vw604bc04bcd8" used for this statement

19 rows selected.

As we can see, EXPLAIN PLAN is also aware of the baseline and shows that the optimizer will perform an index range scan.

 

Summary:

There are many methods to extract the execution plan; some provide the actual execution plan, while others provide only a suggestion that the optimizer may not follow at run time. In this test case we tested four extraction methods (DBMS_XPLAN, AUTOTRACE, EXPLAIN PLAN, and SQL Trace). We then created a SQL Plan Baseline and repeated the test for AUTOTRACE and EXPLAIN PLAN to confirm that even the suggestion-based methods are aware of the SQL Plan Baseline.
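
If you want to double-check which cached cursors are actually covered by a baseline, V$SQL exposes this directly; a quick sketch against the statement used in this test case:

SQL> select sql_id, child_number, sql_plan_baseline
       from v$sql
      where sql_text = 'select count(emp_id) from xp_table where team = :x';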
Categories: DBA Blogs

AWS DMS Tips for Oracle Migrations

Thu, 2016-10-27 10:54

Some time ago I published a post about the Database Migration Service (DMS) on AWS. The service had just appeared at that time and looked very promising. It is still appealing, has good uses, and has the potential to mature and get better. I was able to test it in several cases, moving data to AWS and between RDS databases inside AWS. During those exercises I ran into some issues, and I want to make other people aware of some things to keep in mind when starting or planning to use DMS. Most of my experience with DMS is related to migrations to and from Oracle databases, so the following tips are about Oracle migrations and replications.

Before planning any kind of log-based logical replication, please check what data types you have and whether you have primary keys on all your replicated tables. In some cases a primary key is not only desirable but required to replicate the objects properly. As an example, if you want to replicate a table with LOB columns using DMS, you must have a primary key on that table; if you don't, the LOB columns will be excluded from replication and you will end up with NULL values instead of the LOBs on the target database.
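
For example, a quick (and by no means exhaustive) way to list candidate tables without a primary key in a given schema could look like this; the schema name is just a placeholder:

SELECT t.owner, t.table_name
  FROM dba_tables t
 WHERE t.owner = 'SCOTT'   -- placeholder schema
   AND NOT EXISTS (SELECT 1
                     FROM dba_constraints c
                    WHERE c.owner = t.owner
                      AND c.table_name = t.table_name
                      AND c.constraint_type = 'P');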

Check for datatypes used for the tables. All logical replications have some limitations in terms of replicated data types. You may find some of your tables can be replicated with some limitations or not replicated at all.

Now let's speak about DMS itself. When you plan your DMS replication tasks, keep in mind that you can combine the replication of several schemas into one task. That can significantly reduce the load on the source system, because each migration or replication task can put considerable load on it. In my experience, we hit 100% CPU and maxed out IO running 8 replication tasks against a source m3.large system.

Remember that you cannot change a task once it has been created. You are able to stop, start or resume the task, but you cannot alter any of its rules or parameters. This may change soon, but currently it is not possible.

If you choose the truncate option for your migration and replication task, it may change your metadata. You may find your indexes and constraints renamed, and you can even lose some indexes. In my case, the primary key and unique key constraints were renamed and a couple of indexes were lost, so please be careful. After that experience I switched to the "do nothing" mode and prepared the target for data migration and replication myself.

You can use an RDS snapshot to instantiate your data if you plan to set up replication between two RDS instances of the same type. In our case this was done to minimize downtime when a database was migrated to encrypted storage. When you use an RDS snapshot, you can use the snapshot creation time as the "Custom CDC start time" for your DMS replication task.

If you use one universal user for your endpoints to replicate multiple schemas, you will need transformation rules, because by default the replication uses the schema of the endpoint user as the destination. In other words, if you set up a target endpoint using the user "DMS" and try to replicate the schema SCOTT, the data will land in the schema "DMS" unless you have transformation rules in place.

You have to enable minimal supplemental logging at the database level and supplemental logging for all columns explicitly for each replicated table. Even if you enable supplemental logging for all columns at the database level using "alter database", you still need to add it at the table level; without it the DMS task will abort, complaining about the lack of supplemental logging for all columns of the table.
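
For reference, that combination is roughly the following two statements (the table name is only an example):

ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

ALTER TABLE scott.emp ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;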

If you create a new table in one of your replicated schemas, you need to include a supplemental logging clause, for example "CREATE TABLE … ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;" (or issue the equivalent ALTER TABLE afterwards). In that case the table and its data will be replicated.

It is better to use the AWS CLI to get the most out of the service, since new features and additions show up in the CLI first. As an example, if you want debug logging for your replication, you have to use the AWS CLI.

If the only unique index on a table is a function-based index, your data migration and replication task can fail. It may also fail if you have more than one unique index on a table.

I hit some issues with monitoring. The idea looks good but it requires some fixing. It looks like it doesn’t work correctly in Firefox and Safari. At least for me, it was not working right.

The status of a task may not tell you everything. Sometimes it shows the state "ERROR" but nevertheless keeps working and replicating data behind the scenes, so it can be a bit misleading. I look at the statistics and monitoring pages for the task to get the full picture.

In summary, the service deserves attention and can be considered a valuable option when you plan your migration strategy, and the AWS DMS team is working hard to make it better. Happy migrations!

Categories: DBA Blogs

While upgrading to 12.1.0.2 I faced error ORA-01830 / ORA-06512

Thu, 2016-10-20 10:35

The other day I was running an upgrade from 11.2.0.3 to 12.1.0.2 for a client that is using ACLs (Access Control Lists). If you have been doing upgrades to 12c, you know that running catctl.pl -n 4 catupgrd.sql entails 73 steps. This upgrade failed at step 65 with the following error (I have trimmed the output for readability):

Serial   Phase #:65 Files: 1 Calling sqlpatch with LD_LIBRARY_PATH=/u01/dev/product/12.1.0/lib; export LD_LIBRARY_PATH;/u01/dev/product/12.1.0/perl/bin/perl -I /u01/dev/product/12.1.0/rdbms/admin -I /u01/dev/product/12.1.0/rdbms/admin/../../sqlpatch /u01/dev/product/12.1.0/rdbms/admin/../../sqlpatch/sqlpatch.pl -verbose -upgrade_mode_only > catupgrd_datapatch_upgrade.log 2> catupgrd_datapatch_upgrade.err
returned from sqlpatch
    Time: 80s
Serial   Phase #:66 Files: 1     Time: 71s
Serial   Phase #:67 Files: 1     Time: 1s
Serial   Phase #:68 Files: 1     Time: 0s
Serial   Phase #:69 Files: 1     Time: 20s

Grand Total Time: 4946s

*** WARNING: ERRORS FOUND DURING UPGRADE ***
...
REASON:
catuppst.sql unable to run in Database: DEVSTAR Id: 0
        ERRORS FOUND: during upgrade CATCTL ERROR COUNT=5
------------------------------------------------------
Identifier XDB 16-09-25 12:27:05 Script = /u01/dev/product/12.1.0/rdbms/admin/
ERROR = [ORA-01830: date format picture ends before converting entire input string ORA-06512: at "SYS.XS_OBJECT_MIGRATION", line 167
ORA-06512: at line 28
ORA-06512: at line 69
]
------------------------------------------------------
------------------------------------------------------
Identifier XDB 16-09-25 12:27:05 Script = /u01/dev/product/12.1.0/rdbms/admin/
ERROR = [ORA-06512: at "SYS.XS_OBJECT_MIGRATION", line 167 ORA-06512: at line 28
ORA-06512: at line 69
]
STATEMENT = [as above]
------------------------------------------------------
------------------------------------------------------
Identifier XDB 16-09-25 12:27:05 Script = /u01/dev/product/12.1.0/rdbms/admin/
ERROR = [ORA-06512: at line 28 ORA-06512: at line 69
]
STATEMENT = [as above]
------------------------------------------------------
------------------------------------------------------
Identifier XDB 16-09-25 12:27:05 Script = /u01/dev/product/12.1.0/rdbms/admin/
ERROR = [ORA-06512: at line 69]
STATEMENT = [as above]
------------------------------------------------------
------------------------------------------------------
Identifier ORDIM 16-09-25 12:28:53 Script = /u01/dev/product/12.1.0/rdbms/admin/
ERROR = [ORA-20000: Oracle XML Database component not valid. Oracle XML Database must be installed and valid prior to Oracle Multimedia install, upgrade, downgrade, or patch.
ORA-06512: at line 3
]

And the worst part was that the upgrade also corrupted my database, which is a good point to stress: have a good backup before attempting an upgrade.

Sun Sep 25 13:55:52 2016
Checker run found 59 new persistent data failures
Sun Sep 25 14:00:18 2016
Hex dump of (file 5, block 1) in trace file /u01/app/diag/rdbms/dev/dev/trace/de_ora_13476.trc
Corrupt block relative dba: 0x01400001 (file 5, block 1)
Bad header found during kcvxfh v8
Data in bad block:
 type: 0 format: 2 rdba: 0x01400001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000001
 check value in block header: 0xa641
 computed block checksum: 0x0
Reading datafile '/u01/dev/oradata/dev/system_01.dbf' for corruption at rdba: 0x01400001 (file 5, block 1)
Reread (file 1, block 1) found same corrupt data (no logical check)

So what I had to do was restore my database to the point before the upgrade, as I couldn't even use flashback due to the corrupt block.

But to fix this error, I had to apply patch 20369415 to the 12c binaries before running catupgrd.sql.
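
For reference, applying such a one-off patch typically looks roughly like this; the zip file name is illustrative and the patch README remains the authority on the exact steps:

$ unzip p20369415_121020_Linux-x86-64.zip     # illustrative file name
$ cd 20369415
$ $ORACLE_HOME/OPatch/opatch apply            # with everything using this home shut down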

[oracle@dev 20369415]$ opatch lsinventory | grep 20369415
Patch  20369415     : applied on Sun Sep 25 14:49:59 CDT 2016
     20369415

Once the patch was applied, I reran the upgrade, and this time it finished successfully:

Serial   Phase #:65      Files: 1     Time: 133s
Serial   Phase #:66      Files: 1     Time: 78s
Serial   Phase #:68      Files: 1     Time: 0s
Serial   Phase #:69      Files: 1     Time: 275s
Serial   Phase #:70      Files: 1     Time: 171s
Serial   Phase #:71      Files: 1     Time: 0s
Serial   Phase #:72      Files: 1     Time: 0s
Serial   Phase #:73      Files: 1     Time: 20s

------------------------------------------------------
Phases [0-73]         End Time:[2016_09_26 17:42:54]
------------------------------------------------------

Grand Total Time: 5352s

LOG FILES: (catupgrd*.log)
...
COMP_ID              COMP_NAME                                VERSION  STATUS
-------------------- ---------------------------------------- -------- ---------------
APEX                 Oracle Application Express               4.2.5.00 VALID
                                                              .08

OWB                  OWB                                      11.2.0.3 VALID
                                                              .0

AMD                  OLAP Catalog                             11.2.0.3 OPTION OFF
                                                              .0

SDO                  Spatial                                  12.1.0.2 VALID
                                                              .0

ORDIM                Oracle Multimedia                        12.1.0.2 VALID
                                                              .0

XDB                  Oracle XML Database                      12.1.0.2 VALID
                                                              .0

CONTEXT              Oracle Text                              12.1.0.2 VALID
                                                              .0

OWM                  Oracle Workspace Manager                 12.1.0.2 VALID
                                                              .0

CATALOG              Oracle Database Catalog Views            12.1.0.2 VALID
                                                              .0

CATPROC              Oracle Database Packages and Types       12.1.0.2 VALID
                                                              .0

JAVAVM               JServer JAVA Virtual Machine             12.1.0.2 VALID
                                                              .0

XML                  Oracle XDK                               12.1.0.2 VALID
                                                              .0

CATJAVA              Oracle Database Java Packages            12.1.0.2 VALID
                                                              .0

APS                  OLAP Analytic Workspace                  12.1.0.2 VALID
                                                              .0

XOQ                  Oracle OLAP API                          12.1.0.2 VALID
                                                              .0

Conclusion

This was a small post to make you aware that if you are using ACLs, you need to apply patch 20369415 to the 12c binaries before upgrading, so that you don't face possible database corruption and a much harder time upgrading your database.

Note: This post was originally posted in rene-ace.com

Categories: DBA Blogs

Oracle Service Secrets: Migrate Transparently

Mon, 2016-09-12 15:02

Databases or schemas tend to get moved around between different servers or even datacenters for hardware upgrades, consolidations or other migrations. And while the work that needs to be done is pretty straightforward for DBAs, I find the most annoying aspect is updating all the client connect strings and tns entries with the new IP addresses and, if you are not using services, also the SID, since the instance name might have changed.

That process can be simplified a lot by following a simple good practice: create an extra service for each application or schema and, along with that service, a DNS name for the IP it runs on. With that in place, a database can be migrated without touching client connection parameters or tns aliases. All that is needed is to migrate the database or schema, create the service name on the new instance and update the DNS record to point to the new machine.

Demo

Here is an example. I am migrating a schema from an 11g single instance on my laptop to a RAC database in the Oracle Public Cloud. I am connecting to that database with blog_demo.pythian.com as both the hostname (faked through /etc/hosts instead of proper DNS for this demo) and the service name. As the application, I am connecting to the database with sqlcl and a static connection string. Just remember that the whole and only point of this demo is to migrate the schema without having to change that connect string.
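
In case you are wondering how the blog_demo service exists on the plain 11g single instance in the first place: on an instance without Grid Infrastructure it can be created with the DBMS_SERVICE package (a minimal sketch, using the names from this demo):

SQL> exec DBMS_SERVICE.CREATE_SERVICE(service_name => 'blog_demo.pythian.com', network_name => 'blog_demo.pythian.com');
SQL> exec DBMS_SERVICE.START_SERVICE('blog_demo.pythian.com');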

brost$ ping -c 1 blog_demo.pythian.com
PING blog_demo.pythian.com (192.168.78.101): 56 data bytes
64 bytes from 192.168.78.101: icmp_seq=0 ttl=64 time=0.790 ms

brost$ ./sqlcl/bin/sql brost/******@blog_demo.pythian.com/blog_demo.pythian.com

SQLcl: Release 4.2.0.16.175.1027 RC on Mon Sep 05 17:50:11 2016

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options


SQL> select instance_name from v$instance;

INSTANCE_NAME   
----------------
ORCL        

Next I migrated my schema with datapump and imported to a PDB running on a 12c RAC database.

Then added the service name BLOG_DEMO to PDB1 on the database ORCL42.

$ srvctl add service -db orcl42 -pdb pdb1 -service blog_demo -preferred orcl421,orcl422
$ srvctl start service -db orcl42 -service blog_demo

Updated the DNS (or, as in this simplified demo, my /etc/hosts) and now I can connect with the same connection string. Note that the IP, the instance_name and the version have all changed without any need to modify the connection string.

brost$ ping -c 1 blog_demo.pythian.com
PING blog_demo.pythian.com (140.86.42.42): 56 data bytes

brost$ ./sqlcl/bin/sql brost/******@blog_demo.pythian.com/blog_demo.pythian.com

SQLcl: Release 4.2.0.16.175.1027 RC on Mon Sep 05 18:05:11 2016

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Last Successful login time: Mon Sep 05 2016 18:04:50 +02:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Advanced Analytics
and Real Application Testing options

SQL> select instance_name from v$instance;
INSTANCE_NAME   
----------------
orcl421

Note that with a proper DNS and a RAC target you would want to create A-records for the 3 SCAN IPs.
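
As an illustration only (the second and third SCAN addresses are placeholders, and the TTL is up to you), the zone entries could look something like this:

blog_demo.pythian.com.   300   IN   A   140.86.42.42
blog_demo.pythian.com.   300   IN   A   <second SCAN IP>
blog_demo.pythian.com.   300   IN   A   <third SCAN IP>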

Other posts in this series

You can watch me talk briefly about this and other things that you can do with properly configured services in the video below or follow the links to other parts in this series.

tl;dr

When creating services for your applications to connect to a schema, also create a DNS entry for them and use this DNS name and the service for all client and application connections instead of the hostname and SID. This might initially look like overhead, but it allows for flexibility when migrating schemas or databases to other systems. Updating DNS and creating a new service on the target machine are changes made in central places, and they save you from updating potentially hundreds of client connect strings or tnsnames entries across the enterprise.

Categories: DBA Blogs

Oracle Service Secrets: quiesce tactically

Fri, 2016-09-02 10:18

In the last post of this series about Oracle net services, I talked about how services can help you identify performance issues faster and easier by tagging connections with service names. Today I am introducing you to the idea of temporarily disabling connections during maintenance with the help of services.

During deployments, testing or reorganizations it might be necessary to prevent clients from connecting to the database while still allowing access for DBAs to do their work. Some methods to do this include temporarily locking application user accounts or putting the database in quiesce mode. But with services, you now also have a more tactical approach to this issue.

My example assumes a single instance with two services DEMO_BATCH and DEMO_OLTP. And let’s assume that we need to temporarily disable batch services, maybe just to reduce system load due to those activities or maybe because we are reorganizing the objects used by the batch processes.

To disable a service in a single instance we can either remove it from the SERVICE_NAMES instance parameter or use the DBMS_SERVICE package.

SELECT NAME FROM V$ACTIVE_SERVICES;

NAME
----------------------------------------------------------------
DEMO_BATCH
DEMO_OLTP
ORCLXDB
ORCL.PYTHIAN.COM
SYS$BACKGROUND
SYS$USERS

exec DBMS_SERVICE.STOP_SERVICE('DEMO_BATCH');

PL/SQL procedure successfully completed.

New sessions using the service name will receive an ORA-12514 error when trying to connect:

brbook:~ brost$ ./sqlcl/bin/sql brost/******@192.168.78.101:1521/DEMO_BATCH.PYTHIAN.COM

SQLcl: Release 4.2.0.16.175.1027 RC on Thu Aug 18 13:12:27 2016

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

  USER          = brost
  URL           = jdbc:oracle:thin:@192.168.78.101:1521/DEMO_BATCH.PYTHIAN.COM
  Error Message = Listener refused the connection with the following error:
ORA-12514, TNS:listener does not currently know of service requested in connect descriptor

Existing sessions are allowed to continue

Note that stopping will only affect new connections. Existing sessions that used the DEMO_BATCH service are allowed to continue until they disconnect or you kill them. This gives you the flexibility of a grace period where you just wait for existing sessions to finish their work and disconnect by themselves.

SELECT NAME FROM V$ACTIVE_SERVICES WHERE NAME = 'DEMO_BATCH';
no rows selected

SELECT SERVICE_NAME, USERNAME FROM V$SESSION WHERE SERVICE_NAME='DEMO_BATCH';

SERVICE_NAME         USERNAME
-------------------- ------------------------------
DEMO_BATCH           BROST
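
If you need to get rid of those remaining sessions on a single instance, you can identify and kill them manually. A minimal sketch (sid and serial# are placeholders taken from the first query):

SQL> select sid, serial#, username from v$session where service_name = 'DEMO_BATCH';
SQL> alter system kill session '<sid>,<serial#>' immediate;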

Grid Infrastructure has an option to force disconnects

If you are using Grid Infrastructure and manage services through srvctl, the behaviour is basically the same, but you get an extra "force" switch to also disconnect existing sessions while stopping a service.

[oracle@ractrial1 ~]$ srvctl stop service -db orcl42 -service racdemo_batch [-force]

[oracle@ractrial1 ~]$ srvctl stop service -h

Stops the service.

Usage: srvctl stop service -db <db_unique_name> [-service  "<service_name_list>"] [-serverpool <pool_name>] [-node <node_name> | -instance <inst_name>] [-pq] [-global_override] [-force [-noreplay]] [-eval] [-verbose]
    -db <db_unique_name>           Unique name for the database
    -service "<serv,...>"          Comma separated service names
    -serverpool <pool_name>        Server pool name
    -node <node_name>              Node name
    -instance <inst_name>          Instance name
    -pq                            To perform the action on parallel query service
    -global_override               Override value to operate on a global service.Ignored for a non-global service
    -force                         Disconnect all sessions during stop or relocate service operations
    -noreplay                      Disable session replay during disconnection
    -eval                          Evaluates the effects of event without making any changes to the system
    -verbose                       Verbose output
    -help                          Print usage

Conclusion

Creating extra services on a database allows you to stop and start them for maintenance, which is a convenient way to lock out only certain parts of an application while leaving user accounts unlocked so they can still connect via other services.

Categories: DBA Blogs

The rlwrap utility for DBA.

Thu, 2016-08-18 08:36

I spend most of my time as a DBA in a Linux terminal and sqlplus. Everybody who works with Oracle sqlplus knows about its power, but also about its limitations. For many years I have used the rlwrap utility developed by Hans Lub. It gives me command history, the ability to edit my sqlplus commands, and auto-completion if I set it up. In this post I will share some tips about installation and basic usage.

First we need to install the utility, and there are several options for that. We will check a few of them below. Please keep in mind that all of my examples are tested on Oracle Linux 6.

For the first option we need the git, yum and automake packages. We will use the latest and greatest source code from the project site on GitHub (https://github.com/hanslub42/rlwrap) and your standard Yum repository, assuming you have connectivity to GitHub and to your Yum repo.
Let’s run it step-by-step:

[root@sandbox ~]# yum install readline-devel
....
[root@sandbox ~]# yum install automake
....
[root@sandbox ~]# yum install git
....
[root@ovmcloud01 ~]# git clone https://github.com/hanslub42/rlwrap.git
Initialized empty Git repository in /root/rlwrap/.git/
remote: Counting objects: 1250, done.
remote: Total 1250 (delta 0), reused 0 (delta 0), pack-reused 1250
Receiving objects: 100% (1250/1250), 565.53 KiB, done.
Resolving deltas: 100% (867/867), done.
[root@ovmcloud01 ~]# cd rlwrap
[root@ovmcloud01 rlwrap]# autoreconf --install
configure.ac:32: installing `tools/config.guess'
configure.ac:32: installing `tools/config.sub'
configure.ac:34: installing `tools/install-sh'
configure.ac:34: installing `tools/missing'
src/Makefile.am: installing `tools/depcomp'
[root@ovmcloud01 rlwrap]# automake  --add-missing
[root@ovmcloud01 rlwrap]# ./configure
....
[root@ovmcloud01 rlwrap]# make install
....

That’s it. You have it installed in your system.

The second way is to compile it from source downloaded from http://utopia.knoware.nl/~hlub/uck/rlwrap/ . This may be useful if you don't have a connection to Yum and GitHub.
Keep in mind you will need the GNU readline and ncurses libraries and headers installed on your system. So, we download the source tarball, unpack it, compile and install:

[root@sandbox]$wget http://utopia.knoware.nl/~hlub/uck/rlwrap/rlwrap-0.42.tar.gz
....
[root@sandbox]$tar xfz rlwrap-0.42.tar.gz
[root@sandbox]$cd rlwrap-0.42
[root@sandbox]$./configure
[root@sandbox]$make install

The third way is to copy a previously compiled rlwrap executable to the new system, placing it in a bin directory on the standard path.
This works if you have several similar, binary-compatible systems and don't want to spend time compiling the same binaries on each one.

[root@sandbox]$cd rlwrap-0.42
[root@sandbox]$ls -l src/rlwrap
-rwxr-xr-x. 1 root root 225023 Aug 16 12:49 src/rlwrap
[root@sandbox]$cp src/rlwrap /usr/local/bin/
[root@sandbox]$rlwrap --help
Usage: rlwrap [options] command ...

Options:
  -a[password prompt]        --always-readline[=password prompt]
  -A                         --ansi-colour-aware
.....

Of course, you may consider making your own rpm, or use the EPEL (Extra Packages for Enterprise Linux) yum repository and install it from there. Just keep in mind that the version you get from EPEL may be slightly outdated.

[root@sandbox]$yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
[root@sandbox]$yum install rlwrap

Having rlwrap installed, you will find plenty of uses for it.
Here are some basic examples of how to use the utility:
Create an alias in your .bashrc for sqlplus:

vi ~/.bashrc

and add

alias sqlp='rlwrap sqlplus'

You can do the same for rman:

alias rman='rlwrap rman'

And for the Oracle GoldenGate command-line utility:

alias ggsci='rlwrap ./ggsci' 

In rlwrap you can use ! and TAB to call up a list of commands, or type a prefix and press CTRL+R to search for a certain command in the command history. You can also create your own dictionary and use it for auto-completion.
Let's build a small dictionary for auto-completion.
I created a file “lsql.lst” with the following contents:

[oracle@sandbox ~]$ vi lsql.lst
~
select
from
where
and
update
insert
delete
tablespace
v$database

....
[oracle@sandbox ~]$alias sqlp="rlwrap -f lsql.lst sqlplus / as sysdba"
[oracle@sandbox ~]$ sqlp

SQL*Plus: Release 12.1.0.2.0 Production on Wed Aug 17 15:36:04 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
.............................................
cdb> desc v$t 

Here we press TAB and get a list of suggestions:

cdb> desc v$t 
table       tablespace
cdb> desc v$tablespace
 Name														   Null?    Type
 ----------------------------------------------------------------------------------------------------------------- -------- ----------------------------------------------------------------------------
 TS#															    NUMBER

We can make it even better. Let's load the names of all the V$ fixed views into the dictionary.
In our sqlplus session we run:

cdb> spool lsql.lst append
cdb> select name from V$FIXED_TABLE;
......
cdb>spool off
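
The same trick works for the DBA_ dictionary views if you want them on the completion list as well; a sketch:

cdb> spool lsql.lst append
cdb> select table_name from dictionary where table_name like 'DBA%';
cdb> spool off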

Log off and log back on (so that rlwrap rereads the file) and try:

cdb> select * from V$PA -- TAB
PARALLEL_DEGREE_LIMIT_MTH  PARAMETER                  PARAMETER2                 PARAMETER_VALID_VALUES     PATCHES
cdb> select * from V$B -- TAB
BACKUP                       BACKUP_CONTROLFILE_SUMMARY   BACKUP_DATAFILE_SUMMARY      BACKUP_SET                   BACKUP_SYNC_IO               BSP                          BUFFERED_PUBLISHERS
BACKUP_ARCHIVELOG_DETAILS    BACKUP_COPY_DETAILS          BACKUP_DEVICE                BACKUP_SET_DETAILS           BGPROCESS                    BTS_STAT                     BUFFERED_QUEUES
BACKUP_ARCHIVELOG_SUMMARY    BACKUP_COPY_SUMMARY          BACKUP_NONLOGGED             BACKUP_SET_SUMMARY           BH                           BT_SCAN_CACHE                BUFFERED_SUBSCRIBERS
BACKUP_ASYNC_IO              BACKUP_CORRUPTION            BACKUP_PIECE                 BACKUP_SPFILE                BLOCKING_QUIESCE             BT_SCAN_OBJ_TEMPS            BUFFER_POOL
BACKUP_COMPRESSION_PROGRESS  BACKUP_DATAFILE              BACKUP_PIECE_DETAILS         BACKUP_SPFILE_DETAILS        BLOCK_CHANGE_TRACKING        BUFFER                       BUFFER_POOL_STATISTICS
BACKUP_CONTROLFILE_DETAILS   BACKUP_DATAFILE_DETAILS      BACKUP_REDOLOG               BACKUP_SPFILE_SUMMARY        BMAPNONDURSUB                BUFFER2
cdb> select * from V$B

You can see we now have all the "V$" views, which can be extremely handy when you don't have time to search for a view name and only vaguely remember that there is a view with the information you need.

rlwrap may not be the most sophisticated program, but it can make your life much easier. There are more advanced tools for sqlplus, like SQLcl, that provide a lot more options. But the beauty of rlwrap is its lightness and its ability to work not only with sqlplus, but with practically any command-line tool.

Categories: DBA Blogs

ORA-15418: Appliance Mode Not Supported For Compatible.asm 12.1.0.0.0

Wed, 2016-07-20 09:45

The other day, I was raising the compatible.asm parameter to 12.1 on an Exadata and I faced this error for the first time:

View the code on Gist.

Indeed, a diskgroup can have the appliance.mode attribute set to TRUE or FALSE:

View the code on Gist.
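
The gist essentially shows the diskgroup attribute; a query along these lines (diskgroup names will obviously differ on your system) returns the same information:

SQL> select g.name diskgroup, a.value appliance_mode
       from v$asm_diskgroup g, v$asm_attribute a
      where a.group_number = g.group_number
        and a.name = 'appliance.mode';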

I then found this note on Metalink: EXADATA : What Is Oracle ASM appliance.mode Attribute (Doc ID 1664261.1), which explains that starting from Exadata 11.2.0.3.3, "The Oracle ASM appliance.mode attribute improves disk rebalance completion time when dropping one or more Oracle ASM disks. This means that redundancy is restored faster after a failure."

Wow, that looks like a pretty cool feature! But it seems that (sadly) we cannot set compatible.asm to 12.1 while this feature is enabled.

Let's give it a try: deactivate it, then set my compatible.asm to 12.1:

View the code on Gist.
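
For reference, the commands in the gist are essentially of this form (DATA is a placeholder diskgroup name):

SQL> alter diskgroup DATA set attribute 'appliance.mode' = 'FALSE';
SQL> alter diskgroup DATA set attribute 'compatible.asm' = '12.1.0.0.0';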

It works! Now that compatible.asm is set to 12.1, can we enable the appliance.mode feature again?

View the code on Gist.

Sadly, no. I hit one of the restrictions; indeed, "The Oracle ASM disk group attribute compatible.asm is set to release 11.2.0.4, 12.1.0.2 and later. (Appliance mode is not valid with 12.1.0.1)". So I guess that even if the documentation does not say so, appliance mode is not valid with compatible.asm set to 12.1.0.0.0 either.
Even if it is very weird that Oracle development "forgot" (?) this cool feature when they released 12c on Exadata, they fortunately brought it back with 12.1.0.2. I'll follow up on that when moving compatible.asm to 12.1.0.2!

Categories: DBA Blogs
