Pythian Group


Log Buffer #513: A Carnival of the Vanities for DBAs

Mon, 2017-06-12 08:57

This Log Buffer Edition includes blog posts from Oracle, SQL Server and MySQL.

Oracle:

In 12.2 you can: Convert a simple table to partitioned with multi-column automatic list partitions, partially indexed, with read only segments, filtering out unwanted data, online in one operation.

12c How to UN Expire Password for CDB Users

rman auxiliary (for duplicate)

How long will Oracle APEX remain an included feature of the Oracle Database?

How to easily delete files in the Oracle Cloud using CloudBerry Explorer

SQL Server:

Understanding OPENJSON and FOR JSON in Azure and SQL 2016

How to setup Machine Learning Services in SQL Server 2017 CTP2

The OUTPUT Clause for the MERGE Statements

Simple script to backup all SQL Server databases

Working with Windows Containers and Docker: Into your Stride

MySQL:

MySQL on Docker: Running Galera Cluster on Kubernetes

Summary of recent performance tests for MySQL 5.6, 5.7 and 8

A Quick Look at Parallel Rsync and How it Can Save a System Hours

Docker, MySQL and Experience in Containers

HopsFS running on top of MySQL Cluster 7.5 wins IEEE Scale Challenge 2017

Categories: DBA Blogs

Log Buffer #512: A Carnival of the Vanities for DBAs

Mon, 2017-06-05 09:50

This Log Buffer Edition covers Oracle, SQL Server and MySQL.

Oracle:

Upgrade Existing TDE to Use New Unified Key Management in 12c Upgraded Database (non-cdb)

Instrumentation … not just for debugging

12.2 Index Deferred Invalidation (Atomica)

Collation in 12cR2 – when AA equals Å (or not?)

Configuring Oracle Traffic Director 12c with WebGate

SQL Server:

PowerShell Scripting to Replicate Cloud & SaaS Data to SQL Server

Comparing and Synchronizing Two Folders with Azure

How to Handle Hybrid Cloud Database Security with SQL Secure

Use T-SQL to Find Folders Using the Most Disk Space

Generate charts rapidly with SQL Server using R and T-SQL

MySQL:

Hybrid Data Encryption by Example using MySQL Enterprise Edition

Announcing Scotch Box 3.0 and Scotch Box Pro

Testing MySQL 8.0 – let me create a ton of undo files

MariaDB 10.1.24 and Connector/C 2.3.3 now available

Migration from MySQL Master-Slave pair to MySQL InnoDB Cluster: howto

Categories: DBA Blogs

Upgrade Existing TDE to Use New Unified Key Management in 12c Upgraded Database (non-cdb)

Tue, 2017-05-30 10:04

I'm really excited to be writing this post and I hope it serves as helpful content. While reviewing the new unified key management in RDBMS 12c, I noticed that the old commands, such as 'ALTER SYSTEM', for managing TDE keys are still supported. As proof, I didn't see any TDE-specific prerequisites in the following upgrade-related MOS note.

Complete Checklist for Manual Upgrades to Oracle Database 12c Release 1 (12.1) (Doc ID 1503653.1)

I had a test database running RDBMS 11.2.0.4 with TDE configured (segments located in a dedicated tablespace created with the encryption clause). I upgraded this database to version 12.1.0.2 without any issues using the above note. No issues were found accessing the encrypted segments using the old keys, with the autologin option in place under the "/etc/oracle/WALLETS/Test/" location on the server.

I wanted to use the new unified key management on this upgraded database. I checked MOS and, surprisingly, found no note that directly covers this. After doing multiple tests, I was able to get the following method working.

1. First, I copied my existing keys to the "/home/oracle/keys" location as a backup and confirmed that my existing TDE keys worked fine.
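
A minimal sketch of that backup step, assuming the wallet and backup paths used in this post:

$ mkdir -p /home/oracle/keys
$ cp -p /etc/oracle/WALLETS/Test/* /home/oracle/keys/
$ ls -l /home/oracle/keys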

SQL> select count(1) from test.tstbl;

COUNT(1)
----------
261440

2. I brought down my database and removed the keys from the "/etc/oracle/WALLETS/Test/" location. I then started the database instance and created a dummy keystore using the new syntax.

SQL> startup
ORACLE instance started.

Total System Global Area 524288000 bytes
Fixed Size 2926320 bytes
Variable Size 197134608 bytes
Database Buffers 318767104 bytes
Redo Buffers 5459968 bytes
Database mounted.
Database opened.
SQL> select * from v$encryption_wallet;

WRL_TYPE        : FILE
WRL_PARAMETER   : /etc/oracle/WALLETS/Test/
STATUS          : NOT_AVAILABLE
WALLET_TYPE     : UNKNOWN
WALLET_ORDER    : SINGLE
FULLY_BACKED_UP : UNDEFINED
CON_ID          : 0
SQL> select key_id,activation_time from v$encryption_keys;

no rows selected

SQL> ADMINISTER KEY MANAGEMENT CREATE KEYSTORE '/etc/oracle/WALLETS/Test' IDENTIFIED BY "<keystore_password>";

keystore altered.

SQL> ADMINISTER KEY MANAGEMENT SET KEYSTORE OPEN IDENTIFIED BY "<keystore_password>";

keystore altered.

SQL> select key_id,activation_time from v$encryption_keys;

no rows selected

SQL> select * from v$encryption_wallet;

WRL_TYPE        : FILE
WRL_PARAMETER   : /etc/oracle/WALLETS/Test/
STATUS          : OPEN_NO_MASTER_KEY
WALLET_TYPE     : PASSWORD
WALLET_ORDER    : SINGLE
FULLY_BACKED_UP : UNDEFINED
CON_ID          : 0

3. Next, I imported the keys from the files copied earlier, as per MOS note 2254960.1, but I still saw the same "OPEN_NO_MASTER_KEY" status in the data dictionary.

How to Merge a TDE Wallet From 11gR2 Into a New 12c Database Keystore? (Doc ID 2254960.1)

SQL> ADMINISTER KEY MANAGEMENT MERGE KEYSTORE '/home/oracle/keys' IDENTIFIED BY "<11g_wallet_password>" INTO EXISTING KEYSTORE '/etc/oracle/WALLETS/Test' IDENTIFIED BY "<keystore_password>" WITH BACKUP;

keystore altered.

SQL> !ls -ltr /etc/oracle/WALLETS/Test
total 16
-rw-r--r-- 1 oracle oinstall 3112 May 4 23:16 ewallet.p12
-rw-r--r-- 1 oracle oinstall 2408 May 4 23:16 ewallet_2017050503161274.p12

SQL> select * from v$encryption_wallet;

WRL_TYPE        : FILE
WRL_PARAMETER   : /etc/oracle/WALLETS/Test/
STATUS          : OPEN_NO_MASTER_KEY
WALLET_TYPE     : PASSWORD
WALLET_ORDER    : SINGLE
FULLY_BACKED_UP : UNDEFINED
CON_ID          : 0
SQL> select key_id,activation_time from v$encryption_keys;

no rows selected

4. I restarted my database instance and saw the CLOSED status, which is expected since autologin was not enabled yet.

SQL> shut immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> startup
ORACLE instance started.

Total System Global Area 524288000 bytes
Fixed Size 2926320 bytes
Variable Size 197134608 bytes
Database Buffers 318767104 bytes
Redo Buffers 5459968 bytes
Database mounted.
Database opened.
SQL> select key_id,activation_time from v$encryption_keys;

no rows selected

SQL> select * from v$encryption_wallet;

WRL_TYPE        : FILE
WRL_PARAMETER   : /etc/oracle/WALLETS/Test/
STATUS          : CLOSED
WALLET_TYPE     : UNKNOWN
WALLET_ORDER    : SINGLE
FULLY_BACKED_UP : UNDEFINED
CON_ID          : 0
SQL> !ls -ltr /etc/oracle/WALLETS/Test/
total 16
-rw-r--r-- 1 oracle oinstall 3112 May 4 23:16 ewallet.p12
-rw-r--r-- 1 oracle oinstall 2408 May 4 23:16 ewallet_2017050503161274.p12

SQL> select count(1) from test.tstbl;
select count(1) from test.tstbl
*
ERROR at line 1:
ORA-28365: wallet is not open

5. The only option available to open the wallet was to use the old syntax.

SQL> ALTER SYSTEM SET ENCRYPTION WALLET OPEN IDENTIFIED BY "<11g_wallet_password>";

System altered.

SQL> select * from v$encryption_wallet;

WRL_TYPE        : FILE
WRL_PARAMETER   : /etc/oracle/WALLETS/Test/
STATUS          : OPEN
WALLET_TYPE     : PASSWORD
WALLET_ORDER    : SINGLE
FULLY_BACKED_UP : NO
CON_ID          : 0
SQL> select count(1) from test.tstbl;

COUNT(1)
----------
261440

6. I configured the AUTOLOGIN feature using the new syntax and restarted the instance several times; no further issues were reported with key availability.

SQL> ADMINISTER KEY MANAGEMENT CREATE AUTO_LOGIN KEYSTORE FROM KEYSTORE '/etc/oracle/WALLETS/Test' IDENTIFIED BY "<keystore_password>";

keystore altered.

SQL> shut immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 524288000 bytes
Fixed Size 2926320 bytes
Variable Size 197134608 bytes
Database Buffers 318767104 bytes
Redo Buffers 5459968 bytes
Database mounted.
Database opened.
SQL> select * from v$encryption_wallet;

WRL_TYPE        : FILE
WRL_PARAMETER   : /etc/oracle/WALLETS/Test/
STATUS          : OPEN
WALLET_TYPE     : AUTOLOGIN
WALLET_ORDER    : SINGLE
FULLY_BACKED_UP : NO
CON_ID          : 0
SQL> select count(1) from test.tstbl;

COUNT(1)
----------
261440

SQL> !ls -ltr /etc/oracle/WALLETS/Test
total 24
-rw-r--r-- 1 oracle oinstall 3112 May 4 23:16 ewallet.p12
-rw-r--r-- 1 oracle oinstall 2408 May 4 23:16 ewallet_2017050503161274.p12
-rw-r--r-- 1 oracle oinstall 3157 May 4 23:21 cwallet.sso

Verification:
------------------
SQL> select key_id,activation_time from v$encryption_keys;

KEY_ID
------------------------------------------------------------------------------
ACTIVATION_TIME
---------------------------------------------------------------------------
AQ**************************************AA

SQL> select name,utl_raw.cast_to_varchar2( utl_encode.base64_encode('01'||substr(mkeyid,1,4))) || utl_raw.cast_to_varchar2( utl_encode.base64_encode(substr(mkeyid,5,length(mkeyid)))) masterkeyid_base64 FROM (select t.name, RAWTOHEX(x.mkid) mkeyid from v$tablespace t, x$kcbtek x where t.ts#=x.ts#);

NAME                           MASTERKEYID_BASE64
------------------------------ ------------------------------------------------------------
TDE_TBS                        AQ*********************

The keys generated under an older version of Oracle RDBMS can be made active using the new TDE unified key management interface in RDBMS 12c.

Categories: DBA Blogs

Automating Password Rotation for Oracle Databases

Fri, 2017-05-26 14:03

Password rotation is not the most exciting task in the world, and that's exactly why it's a perfect candidate for automation. Automating routine tasks like this is good for everyone – DBAs can work on something more exciting, companies save costs as less time is spent on changing the passwords, and there's no place for human error, either. At Pythian, we typically use Ansible for task automation, and I like it mainly because of its non-intrusive configuration (no agents need to be installed on the target servers) and its scalability (tasks are executed in parallel on the target servers). This post will briefly describe how I automated password rotation for Oracle database users using Ansible.

Overview

This blog post is not an intro to what Ansible is and how to use it; rather, it's an example of how a simple task can be automated using Ansible in a way that's scalable, flexible and easily reusable, and that also allows other tasks to pick up the new passwords from a secure password store.

  • Scalability – I'd like to take advantage of Ansible's ability to execute tasks on multiple servers at the same time. For example, in a large environment of tens or hundreds of machines, a solution that executes password change tasks serially would not be suitable. Below is an example of such a "serial" task (not a real task, just an illustration): it hardcodes a few attributes (the environment file, the username and the hostname), so a separate task would be required for every user/database whose password you'd want to change:
    - hosts: ora-serv01
      remote_user: oracle
      tasks:
      - name: change password for SYS
        shell: | 
          . TEST1.env && \
      sqlplus / as sysdba @change_password.sql SYS \
          \"{{lookup('password','/dev/null length=8')}}\"
    
  • Flexible – I want to be able to adjust the list of users whose passwords are changed, and the list of servers/databases they are changed on, in a simple way that doesn't involve changing the main task list.
  • Reusable – this comes together with flexibility. The idea is that the playbook is so generic that it wouldn't require any changes when implemented in a completely separate environment (i.e. for another client of Pythian).
  • Secure password store – the new passwords are to be generated by the automated password rotation tool, and a method of storing passwords securely is required so that the new passwords can be picked up by the DBAs, application owners or the next automated task that reconfigures the application.
The Implementation

Prerequisites

I chose to do the implementation using Ansible 2.3 because it introduces the passwordstore lookup, which enables interaction with the pass utility (read more about it at Passwordstore.org). pass is very cool. It stores passwords in gpg-encrypted files, and it can also be configured to automatically push the changes to a git repository, which relieves us of the headache of password distribution. The passwords can then be retrieved from git on the servers that need access to the new passwords.
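
As a quick illustration of the pass workflow (a minimal sketch; the GPG key ID below is a placeholder, and the entry names simply mirror the pwdstore paths used later in this post):

$ pass init "ansible@example.com"     # initialize the store for a GPG id (placeholder)
$ pass git init                       # optionally track all changes in a git repository
$ pass insert orcl1/TEST1/system      # store a password manually
$ pass show orcl1/TEST1/system        # retrieve it later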

Ansible 2.3 runs on Python 2.6; unfortunately, the passwordstore lookup requires Python 2.7, which can be an issue if the control host for Ansible runs Oracle Linux 6 or RHEL 6, as they don't provide Python 2.7 in the official yum repositories. Still, there are ways of getting it done, and I'll write another blog post about it.

So, what we’ll need is:

  • Ansible 2.3
  • jmespath plugin on Ansible control host (pip install jmespath)
  • jinja2 plugin on Ansible control host (I had to update it using pip install -U jinja2 in few cases)
  • Python 2.7 (or Python 3.5)
  • pass utility
The Playbook

This is the whole list of files that are included in the playbook:

./chpwd.yml
./inventory/hosts
./inventory/orcl1-vagrant-private_key
./inventory/orcl2-vagrant-private_key
./roles/db_users/files/change_password.sql
./roles/db_users/files/exists_user.sql
./roles/db_users/defaults/main.yml
./roles/db_users/tasks/main.yml

Let’s take a quick look at all of them:

  • ./chpwd.yml – the playbook itself; in this case it's extremely simple, as I want to run the password change against all defined hosts:
    $ cat ./chpwd.yml
    ---
    
      - name: password change automation
        hosts: all
        roles:
          - db_users
    
  • ./inventory/hosts, ./inventory/orcl1-vagrant-private_key, ./inventory/orcl2-vagrant-private_key – these files define the hosts and the connectivity. In this case we have 2 hosts – orcl1 and orcl2 – and we'll connect as the vagrant user using the private keys.
    $ cat ./inventory/hosts
    [orahosts]
    orcl1 ansible_host=127.0.0.1 ansible_port=2201 ansible_ssh_private_key_file=inventory/orcl1-vagrant-private_key ansible_user=vagrant
    orcl2 ansible_host=127.0.0.1 ansible_port=2202 ansible_ssh_private_key_file=inventory/orcl2-vagrant-private_key ansible_user=vagrant
  • ./roles/db_users/files/change_password.sql – a SQL script that I'll execute on the database to change the passwords. It takes 2 parameters: the username and the password:
    $ cat ./roles/db_users/files/change_password.sql
    set ver off pages 0
    alter user &1 identified by "&2";
    exit;
  • ./roles/db_users/files/exists_user.sql – a SQL script that verifies the existence of the users. It takes 1 argument – the username. It outputs "User exists." when the user is there, and "User {username} does not exist." when it's not.
    $ cat ./roles/db_users/files/exists_user.sql
    set ver off pages 0
    select 'User exists.' from all_users where username=upper('&1')
    union all
    select 'User '||upper('&1')||' does not exist.' from (select upper('&1') from dual minus select username from all_users);
    exit;
  • ./roles/db_users/defaults/main.yml – the defaults file for the db_users role. I use this file to define, for each host and database, the users whose passwords need to be changed:
    $ cat ./roles/db_users/defaults/main.yml
    ---
    
      db_users:
        - name: TEST1
          host: orcl1
          env: ". ~/.bash_profile && . ~/TEST1.env > /dev/null"
          pwdstore: "orcl1/TEST1/"
          os_user: oracle
          become_os_user: yes
          users:
            - dbsnmp
            - system
        - name: TEST2
          host: orcl2
          env: ". ~/.bash_profile && . ~/TEST2.env > /dev/null"
          pwdstore: "orcl2/TEST2/"
          os_user: oracle
          become_os_user: yes
          users:
            - sys
            - system
            - ctxsys
        - name: TEST3
          host: orcl2
          env: ". ~/.bash_profile && . ~/TEST3.env > /dev/null"
          pwdstore: "orcl2/TEST3/"
          os_user: oracle
          become_os_user: yes
          users:
            - dbsnmp

    In this data structure, we define everything that needs to be known to connect to the database and change the passwords. Each entry in the list contains the following data:

    • name – just a descriptive name for the entry in this list; normally it would be the name of the database described by the entry.
    • host – the host on which the database resides. It should match one of the hosts defined in ./inventory/hosts.
    • env – how to set the correct environment to be able to connect to the DB (currently it requires sysdba connectivity).
    • pwdstore – the path to the folder in the passwordstore where the new passwords will be stored.
    • os_user and become_os_user – these are used in case sudo to another user on the target host is required. In a typical configuration, I connect to the target host using a dedicated user for Ansible, and then sudo to the DB owner. If Ansible connects to the DB owner directly, then become_os_user should be set to "no".
    • users – this is the list of all users for which the passwords need to be changed.

    As you see, this structure greatly enhances the flexibility and reusability, because adding new databases, hosts or users to the list is done by a simple change to the "db_users:" structure in this defaults file, as shown in the sketch below. In this example, the dbsnmp and system passwords are rotated for TEST1@orcl1; sys, system and ctxsys for TEST2@orcl2; and dbsnmp for TEST3@orcl2.
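
    For instance, rotating passwords for one more (hypothetical) database would just mean appending an entry like this – the TEST4 name and its user list below are made up for illustration:

        - name: TEST4
          host: orcl1
          env: ". ~/.bash_profile && . ~/TEST4.env > /dev/null"
          pwdstore: "orcl1/TEST4/"
          os_user: oracle
          become_os_user: yes
          users:
            - dbsnmp
            - system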

  • ./roles/db_users/tasks/main.yml – this is the task file of the db_users role, the soul of the playbook and the main part that does the password change based on the contents of the defaults file described above. Instead of pasting the whole file at once, I'll break it up task by task and provide some comments about what's being done.
    • populate host_db_users – This task simply filters the whole db_users data structure that's defined in the defaults file, and creates a host_db_users fact with only the DBs that belong to the host the task is currently running on. The list could also be filtered with the Ansible "when" conditional, but in that case a lot of "skipped" entries are displayed when the task is executed, so I prefer filtering the list before it's even passed to the Ansible task.
      ---
      
        - name: populate host_db_users
          set_fact: host_db_users="{{ db_users | selectattr('host','equalto',ansible_hostname) | list }}"
      
    • create directory for target on db hosts – for each unique combination of os_user and become_os_user on the target host, an "ansible" directory is created. A json_query is used here to filter out just the os_user and become_os_user attributes that are needed. It would also work with with_items: "{{ host_db_users }}", but in that case the output becomes cluttered, as all the attributes are displayed during the execution.
        - name: create directory for target on db hosts
          file:
            path: "ansible"
            state: directory
          become_user: "{{ item.os_user }}"
          become: "{{ item.become_os_user }}"
          with_items: "{{ host_db_users | json_query('[*].{os_user: os_user, become_os_user: become_os_user }') | unique | list }}"
      
    • copy sql scripts to db_hosts – the missing scripts are copied from the Ansible control host to the target "ansible" directories. "with_nested" is the way to create a nested loop in Ansible.
        - name: copy sql scripts to db_hosts
          copy:
            src="{{ item[1] }}"
            dest=ansible/
            mode=0644
          become_user: "{{ item[0].os_user }}"
          become: "{{ item[0].become_os_user }}"
          with_nested:
            - "{{ host_db_users | json_query('[*].{os_user: os_user, become_os_user: become_os_user }') | unique | list }}"
            - ['files/change_password.sql','files/exists_user.sql']
      
    • verify user existence – I'm using the shell module to execute the SQL script after setting the environment. The outputs are collected in the "exists_output" variable. This task will never fail and will never show as "changed" because failed_when and changed_when are set to "false".
        - name: verify user existence
          shell: |
             {{ item[0].env }} && \
             sqlplus -S / as sysdba \
             @ansible/exists_user.sql {{ item[1] }}
          register: exists_output
          become_user: "{{ item[0].os_user }}"
          become: "{{ item[0].become_os_user }}"
          with_subelements:
            - "{{ host_db_users |json_query('[*].{env: env, os_user: os_user, users: users, become_os_user: become_os_user }') }}"
            - users
          failed_when: false
          changed_when: false
      
    • User existence results – this task fails when any of the users doesn't exist, and displays which user it was. This is done in a separate task to produce cleaner output. If you don't want the play to fail when some users don't exist (i.e. continue changing passwords for the existing users), this task can simply be commented out, or the "failed_when: false" line can be uncommented.
        - name: User existence results
          fail: msg="{{ item }}"
          with_items: "{{ exists_output.results|rejectattr('stdout','equalto','User exists.')|map(attribute='stdout')|list }}"
          #failed_when: false
      
    • generate and change the user passwords – finally, this is the task that actually changes the passwords. A successful password change is detected by checking the output from the SQL script, which should produce "User altered." The rather complex use of lookups is there for a reason: the passwordstore lookup can also generate passwords, but it doesn't allow defining the character classes that the new password should contain, whereas the "password" lookup does. Additionally, the 1st character is generated from "ascii_letters" only, as there are usually some applications that "don't like" passwords starting with numbers (this is why generating the 1st character of the password is separated from the remaining 11 characters). And lastly, the "passwordstore" lookup is used with the "userpass=" parameter to pass the generated password into the passwordstore and store it there (it also keeps the previous passwords). This part could use some improvement, as in some cases different rules for the generated password complexity may be required. The password change outputs are recorded in "change_output", which is checked in the last task.
        - name: generate and change the user passwords
          shell: |
             {{ item[0].env }} && \
             sqlplus -S / as sysdba \
             @ansible/change_password.sql \
             {{ item[1] }} \"{{ lookup('passwordstore',item[0].pwdstore + item[1] + ' create=true overwrite=true userpass=' +
                                       lookup('password','/dev/null chars=ascii_letters length=1') +
                                       lookup('password','/dev/null chars=ascii_letters,digits,hexdigits length=11')) }}\"
          register: change_output
          become_user: "{{ item[0].os_user }}"
          become: "{{ item[0].become_os_user }}"
          with_subelements:
            - "{{ host_db_users |json_query('[*].{env: env, os_user: os_user, users: users, pwdstore: pwdstore, become_os_user: become_os_user}') }}"
            - users
          failed_when: false
          changed_when: "'User altered.' in change_output.stdout"
      
    • Password change errors – The “change_output” data are verified here, and failed password changes are reported.
         # fail if the password change failed.
        - name: Password change errors
          fail: msg="{{ item }}"
          with_items: "{{ change_output.results|rejectattr('stdout','equalto','\nUser altered.')|map(attribute='stdout')|list }}"
      
It really works!

Now that you know how it's built, it's time to show how it works!
Please pay attention to the following:

  • The password store is empty at first
  • The whole password change playbook completes in 12 seconds
  • The tasks on both hosts are executed in parallel (see the order of execution feedback for each task)
  • The passwordstore contains the password entries after the playbook completes, and they can be retrieved by using the pass command
$ pass
Password Store

$ time ansible-playbook -i inventory/hosts chpwd.yml

PLAY [password change automation] ******************************************************

TASK [Gathering Facts] *****************************************************************
ok: [orcl1]
ok: [orcl2]

TASK [db_users : populate host_db_users] ***********************************************
ok: [orcl1]
ok: [orcl2]

TASK [db_users : create directory for target on db hosts] ******************************
changed: [orcl1] => (item={'become_os_user': True, 'os_user': u'oracle'})
changed: [orcl2] => (item={'become_os_user': True, 'os_user': u'oracle'})

TASK [db_users : copy sql scripts to db_hosts] *****************************************
changed: [orcl1] => (item=[{'become_os_user': True, 'os_user': u'oracle'}, u'files/change_password.sql'])
changed: [orcl2] => (item=[{'become_os_user': True, 'os_user': u'oracle'}, u'files/change_password.sql'])
changed: [orcl1] => (item=[{'become_os_user': True, 'os_user': u'oracle'}, u'files/exists_user.sql'])
changed: [orcl2] => (item=[{'become_os_user': True, 'os_user': u'oracle'}, u'files/exists_user.sql'])

TASK [db_users : verify user existence] ************************************************
ok: [orcl2] => (item=({'become_os_user': True, 'os_user': u'oracle', 'env': u'. ~/.bash_profile && . ~/TEST2.env > /dev/null'}, u'sys'))
ok: [orcl1] => (item=({'become_os_user': True, 'os_user': u'oracle', 'env': u'. ~/.bash_profile && . ~/TEST1.env > /dev/null'}, u'dbsnmp'))
ok: [orcl1] => (item=({'become_os_user': True, 'os_user': u'oracle', 'env': u'. ~/.bash_profile && . ~/TEST1.env > /dev/null'}, u'system'))
ok: [orcl2] => (item=({'become_os_user': True, 'os_user': u'oracle', 'env': u'. ~/.bash_profile && . ~/TEST2.env > /dev/null'}, u'system'))
ok: [orcl2] => (item=({'become_os_user': True, 'os_user': u'oracle', 'env': u'. ~/.bash_profile && . ~/TEST2.env > /dev/null'}, u'ctxsys'))
ok: [orcl2] => (item=({'become_os_user': True, 'os_user': u'oracle', 'env': u'. ~/.bash_profile && . ~/TEST3.env > /dev/null'}, u'dbsnmp'))

TASK [db_users : User existence results] ***********************************************

TASK [db_users : generate and change the user passwords] *******************************
changed: [orcl2] => (item=({'become_os_user': True, 'os_user': u'oracle', 'pwdstore': u'orcl2/TEST2/', 'env': u'. ~/.bash_profile && . ~/TEST2.env > /dev/null'}, u'sys'))
changed: [orcl1] => (item=({'become_os_user': True, 'os_user': u'oracle', 'pwdstore': u'orcl1/TEST1/', 'env': u'. ~/.bash_profile && . ~/TEST1.env > /dev/null'}, u'dbsnmp'))
changed: [orcl2] => (item=({'become_os_user': True, 'os_user': u'oracle', 'pwdstore': u'orcl2/TEST2/', 'env': u'. ~/.bash_profile && . ~/TEST2.env > /dev/null'}, u'system'))
changed: [orcl1] => (item=({'become_os_user': True, 'os_user': u'oracle', 'pwdstore': u'orcl1/TEST1/', 'env': u'. ~/.bash_profile && . ~/TEST1.env > /dev/null'}, u'system'))
changed: [orcl2] => (item=({'become_os_user': True, 'os_user': u'oracle', 'pwdstore': u'orcl2/TEST2/', 'env': u'. ~/.bash_profile && . ~/TEST2.env > /dev/null'}, u'ctxsys'))
changed: [orcl2] => (item=({'become_os_user': True, 'os_user': u'oracle', 'pwdstore': u'orcl2/TEST3/', 'env': u'. ~/.bash_profile && . ~/TEST3.env > /dev/null'}, u'dbsnmp'))

TASK [db_users : Password change errors] ***********************************************

PLAY RECAP *****************************************************************************
orcl1                      : ok=6    changed=3    unreachable=0    failed=0
orcl2                      : ok=6    changed=3    unreachable=0    failed=0

real    0m12.418s
user    0m8.590s
sys     0m3.900s

$ pass
Password Store
|-- orcl1
|   |-- TEST1
|       |-- dbsnmp
|       |-- system
|-- orcl2
    |-- TEST2
    |   |-- ctxsys
    |   |-- sys
    |   |-- system
    |-- TEST3
        |-- dbsnmp

$ pass orcl1/TEST1/system
HDecEbjc6xoO
lookup_pass: First generated by ansible on 26/05/2017 14:28:50
Conclusions

For the past 2 months I've been learning Ansible and trying it out for various DBA tasks. It hasn't always been a smooth ride, as I had to learn quite a lot: I hadn't been exposed much to beasts like jinja2, json_query, YAML, Python (very handy for troubleshooting) and Ansible itself before. I feel that my former PL/SQL coder's experience had created some expectations of Ansible that turned out not to be true. The biggest challenges for me were getting used to the linear execution of a playbook (whereas with PL/SQL I can call packages, functions, etc. to process the data "outside" the main linear code line), and the lack of execution feedback: one has to learn to create Ansible tasks so that they either succeed or fail (no middle states like 'this is a special case – process it differently'), and the amount of visual output is close to none. That does make sense to some degree; it's "automation" after all, right? Nobody should be watching :)
A separate struggle for me was working with the complex data structure that I created for storing the host/database/user information. It's a mix of a YAML "dictionary" and "list", and it turned out to be difficult to process in the way I wanted – this is why I used json_query at times (although not in a very complex way in this case). There are probably simpler ways that I didn't know of (or didn't manage to find), and I'd be glad if you'd let me know of possible improvements, or even other approaches to such tasks that you have worked on and implemented.
Despite all the complaining above, I think it's really worth investing time in automating tasks like this: it really works, and once done it doesn't require much attention. Happy Automating!

Categories: DBA Blogs

Log Buffer #511: A Carnival of the Vanities for DBAs

Tue, 2017-05-23 09:50

This Log Buffer Edition covers Oracle, SQL Server and MySQL.

Oracle:

A Sneak Peek at Oracle’s Chatbot Cloud Service and 5 Key Factors Necessary for Bot ROI

Oracle JET Hybrid – NavDrawer Template Menu/Header Structure

Oracle Enterprise Linux 6.6 AMI Available on AWS

Datascape Podcast Episode 9 – What’s Up with Oracle These Days?

Sequential Asynchronous calls in Node.JS – using callbacks, async and ES6 Promises

SQL Server:

Fixing an SSRS Password Error while Changing Credentials

Azure DWH Part 8: Accessing Azure SQL Data Warehouse with C#

Personal Data, Privacy, and the GDPR

Performance Myths : Truncate Can’t Be Rolled Back

Troubleshooting CPU Performance on VMware

MySQL:

MySQL Shell: eye candy for a future release !

MySQL 8.0: It’s doxygen time for MTR

How to login in MariaDB with OS user without password

MySQL Enterprise backup : How to setup a slave for a standalone server with zero downtime?

Command Line Aficionados: Introducing s9s for ClusterControl

Categories: DBA Blogs

Datascape Podcast Episode 9 – What’s Up with Oracle These Days?

Thu, 2017-05-18 09:06

On today’s episode of the Datascape Podcast we are going to talk about the Oracle Database. With the hype around cloud and cloud first initiatives, automation, DevOps, and very specialized cloud-based databases, I wondered what was going on with one of the datascape’s old faithful technologies, Oracle Database.

Joining us is Simon Pane, who is an Oracle consultant and expert. Simon is very much immersed in the community and truly has his feet on the ground. He's been working in the Oracle DBA space as a consultant for almost his entire career, with a focus on Oracle's core database technologies. Simon often speaks at various conferences around North America and Europe. He was recently inducted as a board member of the Independent Oracle User Group (IOUG), one of the main Oracle user groups.

You can find it on iTunes, Google Play, Sound Cloud, and just about anywhere that shares podcasts.

Key Points From This Episode:

  • Learn more about which types of companies and organizations use the Oracle Database.
  • Understand the different barriers of adoption for using the Oracle Database.
  • Simon shares more about the average size implementation that they see at Oracle.
  • Discover why the main use case for Oracle XE is for teaching and training purposes.
  • Learn more about the current version of Oracle, and the new exciting features.
  • Understand why it is complicated to upgrade an Oracle database, and preparation needed.
  • Simon shares the top features and benefits of the Oracle database.
  • Find out what the ODA and Exadata Appliances are, and the differences between them.
  • Learn more about the Oracle Cloud and how they are working to become a major player.
  • Simon shares his thoughts on Oracle as a product, and whether or not it is dying/dead.
  • Hear about the project that Simon is most proud of, the best tool he uses, and much more!

Links Mentioned in Today’s Episode:

Oracle Database
Oracle Database Appliance
Oracle Exadata
DBA Course
Simon on Twitter
More about Simon
ODTUG
ODTUG Kscope17 Conference

Categories: DBA Blogs

Oracle E-Business Suite and Java Web Start. Finally!

Thu, 2017-04-27 20:36

Great things always seem to happen overnight – at least that's how it feels for European Oracle Apps DBAs in the same situation as myself.

This morning I read Steven Chan's latest blog post about Java Web Start certification with Oracle E-Business Suite and its release to the public. We had heard about it during some informal discussions at the Collaborate 17 conference: everything was ready, the documentation was prepared, and they were just waiting for a final bug related to Workflow Activity Monitor to be addressed…

…and about twenty days later,  it was released.

This is a superb evolutionary event! And, of course, I made time to fit in some good introductory, hands-on experience.

 

Initial thoughts while going through Doc ID 2188898.1

 

1. Java Runtime version for client.

  • JRE 8 Update 131 b31 or JRE 8 Update 121 b33 are required (as a minimum).
    This isn't clearly visible – these are special update releases (note the b31 and b33 build numbers), available only for download from My Oracle Support via Patches 25767257 and 25450542. A public release containing the support is only scheduled for next quarter and is set to ship with Update 141.
  • JRE 6 and JRE 7 are out of scope! If you don't have JRE 8 support in your system yet, now is the right time to think about it.

2. Support for R12.2 and also for R12.1.3. There is no support for 11i.

The patching requirement is minimal (three small patches for AD/TXK/FND), but you need to be on the latest AD/TXK.C.8 and ATG_PF.C.6, plus a new PSU overlay for the 10.1.2 Oracle Home, which nicely supersedes the previously released overlay.

There are eight functional module patches posted so far. There are still known issues mentioned for other modules. These are less popular modules, and patches for them are to be implemented and added to the list at some point, I believe.
I would include this list in your patch maintenance, especially the Oracle Workflow patch related to Activity Monitor.

A new context variable, s_forms_launch_method, is introduced; it sets the required profile options at Site level. However, you can still configure exceptions for individual users via User level profile option values. Both methods are available: you can leave the browser plugin method as the default and enable JWS only for particular users, let's say, for testing purposes.

3. “Only one instance per Java applet can be running on the client PC.”

Yes, it's only possible to have one environment open at a time. This is a JWS limitation. Even IE's noframemerging or the "hey, I can use my second browser" trick is not an option here.

4. Chrome full usage support is finally available there for Forms.

Basically, browser support no longer depends on browser plugin availability. Microsoft Edge support is also worth mentioning, and we can finally start using 64-bit Java, as there is no longer a browser plugin dependency.

5. Known issues section.

From a management and housekeeping perspective, the docs mention that the process cleans up after itself; for example, downloaded frmservlet.jnlp files should be automatically cleared from disk after launch. There is a known issues section mentioning some nuances, open issues and workarounds for specific browsers and functionality areas in EBS. As always, it will be kept updated and will be periodically reviewed by many of us Apps DBAs for news.

 

Now to talk about my hands-on…

I used my R12.2.6 Vision lab instance built on AWS. The patching exercise took something like 30 minutes in total (10.1.2 patch, EBS patches, JAR regeneration), but I was doing this in hotpatch mode and not through an online patch cycle (yes, not the right way, but my goal was to avoid the time-consuming ADOP tasks).
I also installed the required JRE 8 Update 131 b31 64-bit version on my laptop.

I made a huge mistake by starting this testing on my MacOS, which introduced a number of issues that almost led me to a huge "why, Oracle?" facepalm. And it's not just the Gatekeeper security feature known issue mentioned for Safari on MacOS. But I'll talk about this later.

So I brought up my Windows 7 VM.
IE 11, Firefox's latest public update, and Chrome all just worked like a charm. I didn't experience any of the issues I had faced previously on my Mac: "Save File" to the Downloads folder and then a double-click, or "Open with" to open the applet immediately, and your Forms session is up. Look at the screenshot – CHROME!

The applet window opens transparently.

The example below is from Workflow Activity Monitor using Firefox. No extra browser tab/window was opened in my case, although a known issue with Firefox is mentioned.

The downloaded applet files are automatically cleared once the applet is loaded; you will not find them on disk. Chrome, as an example, updates the status of each item under its Downloads tab.

So again… the Windows platform is covered well. What about the rest?

Linux is not a certified platform for Oracle E-Business Suite end user desktops. Although I have used it successfully in the past, and it should work, the Oracle team still does not test or certify it. You may use it and play around with it at your own risk, but it shouldn't be your production direction. I personally haven't tested it for JWS yet, but my guess is that it will have the same problems as MacOS. Maybe a to-do for a future update of this blog post.

MacOS… Steven Chan's blog post, as mentioned above, states that "Safari on MacOS" is not certified because of the MacOS Gatekeeper security feature that is "making the Java Web Start user experience very challenging". That's fine. We know the workaround: go to the System Preferences app and click the Open Anyway button, though this is required every time we launch a Forms session.

But… we don't even reach this point. In the example below, I will outline the sequence of "nuances" I faced. It is based on Safari screenshots, as only Safari is officially certified on MacOS, but I faced absolutely the same issues with Firefox and Chrome on the same MacOS.

First, we try to open Forms, but just get this familiar screen.

Why? Because the URL still goes to browser plugin mode – “/forms/frmservlet?config=browserMode&appletmode=…”

We can go and set the ICX_FORMS_LAUNCHER profile option to "/forms/frmservlet?config=jws" at Site level, as the MOS note outlines.

This works, but it will be required every time we run AutoConfig, as the profile option is always reset to just "/forms/frmservlet" at Site level, and the Forms launch process is then supposed to be driven by the FND_ENABLE_JAVA_WEB_START value.
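
If you end up reapplying that fix regularly, a small helper like the sketch below can save a few clicks. It assumes the standard FND_PROFILE API run as the APPS user; the host and port in the URL are placeholders for your environment:

-- check the current effective value of the profile option
select fnd_profile.value('ICX_FORMS_LAUNCHER') from dual;

-- reset it to the JWS launcher after AutoConfig (fnd_profile.save returns TRUE on success)
begin
  if fnd_profile.save('ICX_FORMS_LAUNCHER',
                      'http://ebs.example.com:8000/forms/frmservlet?config=jws',
                      'SITE') then
    commit;
  end if;
end;
/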

After the manual fix, this is what happens: Safari downloads the applet.

We launch it and, only now, face the known Gatekeeper issue.

Applet is loaded, but not Forms.

I would expect Safari to handle this itself, but in my case I have Firefox configured as the default web browser on the system.
That caused a Firefox tab to be opened (called by the initial applet) and a second applet download to be requested.

Once the second applet is up, I finally get the Forms session running. In a way it's a similar flow to what happens with the browser plugin, but it just kills the user experience.
Initially the second applet was also blocked by the same Gatekeeper and I had to use the known workaround, but in later testing MacOS consistently blocked only the first applet while launching Forms, and no longer the second one. I suspect the issue is the missing jnlp extension on the file.

I got Forms running. But what a process it was…
And nothing is cleaned up: my Downloads folder is full of these downloaded frmservlet.jnlp files.

To summarize

This is a great moment, and a long-awaited feature has finally been released by the Oracle EBS team. But is it ready?

I would say that if your company follows strict standards controlling what software is used, including the browsers, you are most probably using the same IE on Windows, and this feature will work well. Forms will even keep working in the good old browser plugin mode with your current JRE 6 or JRE 7, so you shouldn't worry or rush into this implementation.
The same comments apply to Firefox ESR: it will work as before, but not for long – only until the next ESR release branch merge.

If your company follows a BYOD approach, with multiple platforms such as MacOS in use, and you are looking for Chrome browser support, then at this stage only the Windows platform is covered well. Other platforms are not officially certified, including "Safari on MacOS". You can still use them, but you need to accept the "headache" of a process like the one described in the example above. And that's not only about Safari, as other browsers, like Firefox and Chrome, behave in the same way on MacOS.

And not all Apps DBAs will agree to manually fix ICX_FORMS_LAUNCHER every time AutoConfig is executed, though User level exceptions might work for a defined list of end users.

Categories: DBA Blogs

Simple Steps to Perform oPatch Maintenance with Ansible

Mon, 2017-04-24 10:44

Like the Universe, IT growth seems to be infinite: we always have more environments, more servers, more users, more disk usage and more databases to manage, and it won't stop. In fact, we are pretty sure that this expansion is only going to get faster and faster.

We then have to adapt to this new, ever-changing IT environment by being more productive, in order to manage more and more targets in less time.

How to achieve this goal? Like human beings have always done from the early days – by using tools and by making better tools with the tools we have.

1/ The Ansible Tool

 

1.1/ A Word on Ansible

Ansible is an open source IT automation tool that was launched in early 2013 and bought by Red Hat in 2015. The most recent version, 2.3, was released a few days ago.

1.2/ Why Ansible?

Other automation tools also claim to be easy, fast, able to manage thousands and thousands of targets, etc… so why Ansible instead of Puppet or Chef? For me, it's because Ansible is agentless and does everything through standard SSH (or Paramiko, which is a Python SSH implementation).

Indeed, 'no agent' really means easy to deploy, no agent to maintain (!), and it is very secure since it uses SSH. I am accustomed to working with companies that have tough security processes and challenging approval processes for any kind of installation. Rest assured that it is much easier to get things deployed quickly when the answers to the usual questions look like this:

  • Is it secure? Yes, it goes through SSH.
  • Anything to install on the targets? No.
  • Do you need root access? No, as long as what I need to do is doable with no root privilege.
  • Can it go through sudo? Yes, no worries.
  • What do you need then? An SSH key deployed on the targets (which also means that it is very easy to undo – you just have to remove that SSH key from the target).

For more information on the differences between Ansible, Puppet and Chef, just perform an online search. You will find many in-depth comparisons.

2/ Manage oPatch with Ansible

To illustrate how quick and easy it is to use Ansible, I will demonstrate how to update oPatch with Ansible. oPatch is a very good candidate for Ansible as it needs to be frequently updated, exists in every Oracle home and also needs to be current every time you apply a patch (and for those who read my previous blogs, you know that I like to update opatch :))

2.1/ Install Ansible

The best way to install Ansible is to first refer to the official installation documentation. There you will find the specific commands for your favorite platform (note that Ansible is not designed for Windows).

2.2/ Configure Ansible

To start, Ansible has to know the hosts you want to manage; they are listed in a "hosts" file like this one:

oracle@control:~/work$ cat hosts_dev
[loadbalancer]
lb01

[database]
db01
db02 ansible_host=192.168.135.101
oracle@control:~/work$

We can split the hosts into groups like [loadbalancer] and [database] to have various host groups. It is also possible that the host you are running Ansible on cannot resolve a target host name; we can then use the ansible_host parameter to specify its IP, as I did for the db02 server. In fact, ansible_host defines the host Ansible will connect to, and the name at the start of the line is an alias used when ansible_host is not defined.

Note that I named the hosts file "hosts_dev" in my example. This was done so I would not use the default Ansible hosts file, which makes the setup more modular. We then have to tell Ansible, in the ansible.cfg configuration file, that we want to use this file instead of the default one.

oracle@control:~/work$ cat ansible.cfg
[defaults]
inventory=./hosts_dev
oracle@control:~/work$

Please remember that Ansible uses SSH connectivity, so you'll need to exchange the SSH key of your "control" server with your targets. More extensive documentation on the subject can be found online. Here is an example with ssh-copy-id (if you don't know the target user's password, search online for authorized_keys and you will find how to exchange an SSH key in that case):

  oracle@control:~$ ssh-keygen                          # This will generate your SSH keys

  ... press ENTER at all prompts) ...

  oracle@control:~$ ssh-copy-id oracle@db01
  ...
  Are you sure you want to continue connecting (yes/no)? yes
  ...
  oracle@db01's password:                             # You will be prompted for the target password once
  ...
  Now try logging into the machine, with:   "ssh 'oracle@db01'"
  and check to make sure that only the key(s) you wanted were added.

  oracle@control:~$ ssh oracle@db01                    # Try to connect now
  Welcome to Ubuntu 14.04.5 LTS (GNU/Linux 3.13.0-112-generic x86_64)
  Last login: Thu Apr 20 02:17:24 2017 from control
  oracle@db01:~$                                       # We are now connected with no password
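
A quick way to confirm that Ansible itself can reach the targets is an ad-hoc call to the built-in ping module against the [database] group defined above; a minimal sketch, with output that should look roughly like this:

  oracle@control:~/work$ ansible database -m ping
  db01 | SUCCESS => {
      "changed": false,
      "ping": "pong"
  }
  db02 | SUCCESS => {
      "changed": false,
      "ping": "pong"
  }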

 

2.3/ A First Playbook

A playbook is a collection of Ansible commands that are used to orchestrate what you want to do. Ansible uses the YAML language (please have a look at the official YAML website) for this purpose.

Let’s start with a first easy playbook that checks if the /etc/oratab file exists on my [database] hosts:

oracle@control:~/work$ cat upgrade_opatch.yml
---
- hosts: database                              # Specify only the hosts contained in the [database] group
  tasks:
  - name: Check if /etc/oratab exists          # A name for the task
    stat:                                      # I will use the stat module to check if /etc/oratab exists
      path: /etc/oratab                        # The file or directory I want to check the presence
    register: oratab                           # Put the return code in a variable named "oratab"

  - debug:                                     # A debug task to show an error message if oratab does not exist
      msg: "/etc/oratab does not exists"       # The debug message
    when: oratab.stat.exists == false          # The message is printed only when the /etc/oratab file does not exist

oracle@control:~/work$

Let’s run it now (we use ansible-playbook to run a playbook):

oracle@control:~/work$ ansible-playbook upgrade_opatch.yml

PLAY [database] ***************************************************************************************************************************************************************************************************

TASK [Gathering Facts] ********************************************************************************************************************************************************************************************
ok: [db02]
ok: [db01]

TASK [Check if /etc/oratab exists] ********************************************************************************************************************************************************************************
ok: [db02]
ok: [db01]

TASK [debug] ******************************************************************************************************************************************************************************************************
skipping: [db01]
ok: [db02] => {
    "changed": false,
    "msg": "/etc/oratab does not exists"
}

PLAY RECAP ********************************************************************************************************************************************************************************************************
db01                       : ok=2    changed=0    unreachable=0    failed=0
db02                       : ok=3    changed=0    unreachable=0    failed=0

oracle@control:~/work$

Since I removed /etc/oratab from db02 on purpose, I received the “/etc/oratab does not exists” error message (as expected).

Before going further, let's add a test to see if unzip exists (we'll need unzip to unzip the opatch zipfile). Put db02's oratab file back where it should be and run the playbook again:

  oracle@control:~/work$ cat upgrade_opatch.yml
  ---
  - hosts: database
    tasks:
    - name: Check if /etc/oratab exists
      stat:
        path: /etc/oratab
      register: oratab

    - debug:
        msg: "/etc/oratab does not exists"
      when: oratab.stat.exists == false

    - name: Check if unzip exists (if not we wont be able to unzip the opatch zipfile)
      shell: "command -v unzip"
      register: unzip_exists

    - debug:
        msg: "unzip cannot be found"
      when: unzip_exists == false
  oracle@control:~/work$ ansible-playbook upgrade_opatch.yml

  PLAY [database] ***************************************************************************************************************************************************************************************************

  TASK [Gathering Facts] ********************************************************************************************************************************************************************************************
  ok: [db02]
  ok: [db01]

  TASK [Check if /etc/oratab exists] ********************************************************************************************************************************************************************************
  ok: [db01]
  ok: [db02]

  TASK [debug] ******************************************************************************************************************************************************************************************************
  skipping: [db01]
  skipping: [db02]

  TASK [Check if unzip exists (if not we wont be able to unzip the opatch zipfile)] *********************************************************************************************************************************
  changed: [db02]
  changed: [db01]

  TASK [debug] ******************************************************************************************************************************************************************************************************
  skipping: [db01]
  skipping: [db02]

  PLAY RECAP ********************************************************************************************************************************************************************************************************
  db01                       : ok=3    changed=1    unreachable=0    failed=0
  db02                       : ok=3    changed=1    unreachable=0    failed=0

  oracle@control:~/work$

Please note that I used the shell built-in module to test if unzip is present or not.

2.4/ Upgrade oPatch

To upgrade oPatch, we need to copy the zipfile to the target Oracle home and then unzip it — easy and straightforward. Let’s ask Ansible to do it for us.

First, let’s use the copy module to copy the oPatch zipfile to the target Oracle home:

- name: Copy the opatch zipfile to the target oracle home
   copy:
     src: p6880880_112000_Linux-x86-64.zip
     dest: /u01/oracle/11204

Unzip the zipfile in the target Oracle home (I use the shell module to unzip instead of the unarchive module on purpose; this will trigger a warning during the playbook execution, but I am not a big fan of the unarchive module… we can discuss that another time):

  - name: Upgrade opatch
    shell: unzip -o /u01/oracle/11204/p6880880_112000_Linux-x86-64.zip -d /u01/oracle/11204
    register: unzip
    failed_when: unzip.rc != 0
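
For reference, a minimal sketch of the unarchive-based alternative that the warning suggests, using the same paths; by default the module copies the archive from the control host before unpacking it on the target:

  - name: Upgrade opatch with the unarchive module (alternative)
    unarchive:
      src: p6880880_112000_Linux-x86-64.zip
      dest: /u01/oracle/11204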

Let's clean up the zipfile we copied earlier using the file module (note that it is the state: absent keyword that removes the file); we do not want to leave any leftovers:

  - name: Cleanup the zipfile from the target home
    file:
      name: /u01/oracle/11204/p6880880_112000_Linux-x86-64.zip
      state: absent

Now review the whole playbook:

  oracle@control:~/work$ cat upgrade_opatch.yml
---
- hosts: database
  tasks:
  - name: Check if /etc/oratab exists
    stat:
      path: /etc/oratab
    register: oratab

  - debug:
      msg: "/etc/oratab does not exists"
    when: oratab.stat.exists == false

  - name: Check if unzip exists (if not we wont be able to unzip the opatch zipfile)
    shell: "command -v unzip"
    register: unzip_exists

  - debug:
      msg: "unzip cannot be found"
    when: unzip_exists == false

  - name: Copy the opatch zipfile to the target oracle home
    copy:
      src: p6880880_112000_Linux-x86-64.zip
      dest: /u01/oracle/11204

  - name: Upgrade opatch
    shell: unzip -o /u01/oracle/11204/p6880880_112000_Linux-x86-64.zip -d /u01/oracle/11204
    register: unzip
    failed_when: unzip.rc != 0

  - name: Cleanup the zipfile from the target home
    file:
      name: /u01/oracle/11204/p6880880_112000_Linux-x86-64.zip
      state: absent

oracle@control:~/work$

and execute it:

oracle@control:~/work$ ansible-playbook upgrade_opatch.yml

PLAY [database] ***************************************************************************************************************************************************************************************************

TASK [Gathering Facts] ********************************************************************************************************************************************************************************************
ok: [db02]
ok: [db01]

TASK [Check if /etc/oratab exists] ********************************************************************************************************************************************************************************
ok: [db01]
ok: [db02]

TASK [debug] ******************************************************************************************************************************************************************************************************
skipping: [db01]
skipping: [db02]

TASK [Check if unzip exists (if not we wont be able to unzip the opatch zipfile)] *********************************************************************************************************************************
changed: [db02]
changed: [db01]

TASK [debug] ******************************************************************************************************************************************************************************************************
skipping: [db01]
skipping: [db02]

TASK [Copy the opatch zipfile to the target oracle home] **********************************************************************************************************************************************************
changed: [db01]
changed: [db02]

TASK [Upgrade opatch] *********************************************************************************************************************************************************************************************
 [WARNING]: Consider using unarchive module rather than running unzip

changed: [db01]
changed: [db02]

TASK [Cleanup the zipfile from the target home] *******************************************************************************************************************************************************************
changed: [db02]
changed: [db01]

PLAY RECAP ********************************************************************************************************************************************************************************************************
db01                       : ok=6    changed=4    unreachable=0    failed=0
db02                       : ok=6    changed=4    unreachable=0    failed=0

oracle@control:~/work$

We now have a playbook that can update all your oPatches in a blink!

Please note that this example is a very basic one, since the goal is to give an overview of how to manage oPatch with Ansible.
Many more features could be implemented here (and are implemented in the code we use here at Pythian), like:

  • Check the list of Oracle homes on each server — there are often many.
  • Check the version of each Oracle home’s oPatch.
  • Manage different oPatch versions: 11, 12 and 13.
  • Use the Ansible roles to make the code more modular and reusable.
  • Upgrade oPatch only if it actually needs to be upgraded, and more…

I hope you enjoyed this Ansible overview!

Categories: DBA Blogs

Error ORA-01033 After Doing a Switchover in a 12.1 RAC Environment

Thu, 2017-04-13 08:29

The other day I did a switchover in a RAC environment, which went pretty smoothly, but after the switchover I kept getting the following error on the new primary:

select dest_name,status,error from gv$archive_dest_status where dest_id=2;

DEST_NAME
--------------------------------------------------------------------------------
STATUS	  ERROR
--------- -----------------------------------------------------------------
LOG_ARCHIVE_DEST_2
ERROR	  ORA-01033: ORACLE initialization or shutdown in progress

LOG_ARCHIVE_DEST_2
ERROR	  ORA-01033: ORACLE initialization or shutdown in progress

I went and checked the standby and saw that it was in recovery mode, waiting for the redo log:

PROCESS STATUS	     CLIENT_P CLIENT_PID	  THREAD#	 SEQUENCE#	     BLOCK#    ACTIVE_AGENTS	 KNOWN_AGENTS
------- ------------ -------- ---------- ---------------- ---------------- ---------------- ---------------- ----------------
ARCH	CONNECTED    ARCH     44474			0		 0		  0		   0		    0
RFS	IDLE	     ARCH     133318			0		 0		  0		   0		    0
RFS	IDLE	     ARCH     50602			0		 0		  0		   0		    0
ARCH	CLOSING      ARCH     44470			1	     21623	      14336		   0		    0
ARCH	CLOSING      ARCH     44476			1	     21624		  1		   0		    0
ARCH	CLOSING      ARCH     44472			2	     19221	      96256		   0		    0
RFS	IDLE	     LGWR     133322			1	     21625	      17157		   0		    0
RFS	IDLE	     LGWR     50620			2	     19222	      36611		   0		    0
MRP0	WAIT_FOR_LOG N/A      N/A			2	     19222	      36617		  0		   0

My first train of thought was that the password file was incorrect, so I recreated it and copied it from the primary to the standby nodes, but I still kept getting the same error. I reviewed the environment with the scripts in DOC ID 1581388.1 and everything seemed alright. It kept bugging me that the logs were not being applied even though they were being shipped to the standby (so it did have to do with the password file), but what really bothered me was that I had just recreated the password file in $ORACLE_HOME/dbs and I still kept getting the same error.

So, after a while of troubleshooting, I found that on the new primary the password file was residing in an ASM diskgroup, and that was the main culprit. This meant that I had to copy the password file from the ASM diskgroup on the primary to the standby.
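
As a side note, a quick way to confirm which password file a running instance is actually using is to query V$PASSWORDFILE_INFO (just a sketch, not part of the original troubleshooting); on the new primary it would have shown the ASM location rather than $ORACLE_HOME/dbs:

-- shows the password file the running instance is using
SQL> select file_name from v$passwordfile_info;
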
Primary

[oracle@localhost trace]$ srvctl config database -d renedb
Database unique name: renedb
Database name: 
Oracle home: /u01/app/oracle/product/12.1.0.2/db_1
Oracle user: oracle
Spfile: +DATA1/renedb/spfilerenedb.ora
Password file: +DATA1/renedb/PASSWORD/pwrenedb
Domain: 
Start options: open
Stop options: immediate
Database role: PHYSICAL_STANDBY
Management policy: AUTOMATIC
Server pools: 
Disk Groups: ARCH1,DATA1,REDO
Mount point paths: 
Services: 
Type: RAC
Start concurrency: 
Stop concurrency: 
OSDBA group: dba
OSOPER group: oper
Database instances: renedb1,renedb2
Configured nodes: localhost,localhost
Database is administrator managed
[oracle@localhost trace]$ exit
-bash-4.1$ sudo su - grid
[sudo] password for pythian: 
[grid@localhost ~]$ . oraenv
ORACLE_SID = [+ASM1] ? 
The Oracle base remains unchanged with value /u01/app/grid
[grid@localhost ~]$ asmcmd
ASMCMD> pwcopy +DATA1/renedb/PASSWORD/pwrenedb /tmp/pwrenedb
copying +DATA1/renedb/PASSWORD/pwrenedb -> /tmp/pwrenedb
ASMCMD> exit

Standby

[oracle@localhost dbs]$ scp 10.10.0.1:/tmp/pwrenedb /tmp/pwrenedb_stby_phdb
pwrenedb_stby_phdb                                                                                                                                                                                                    100% 7680     7.5KB/s   00:00    
[oracle@localhost dbs]$ exit
logout
[pythian@localhost ~]$ sudo su - grid
[sudo] password for pythian: 
Last login: Fri Mar 31 21:55:53 MST 2017
[grid@localhost ~]$ . oraenv
ORACLE_SID = [grid] ? +ASM1
The Oracle base has been set to /u01/app/grid
[grid@localhost ~]$ asmcmd
ASMCMD> mkdir DATA/RENEDB/PASSWORD
ASMCMD> pwcopy /tmp/pwrenedb_stby_phdb +DATA/RENEDB/PASSWORD/pwrenedb_stby
copying /tmp/pwrenedb_stby_phdb -> +DATA/RENEDB/PASSWORD/pwrenedb_stby
ASMCMD> exit
[grid@localhost ~]$ exit
logout
[pythian@localhost ~]$ sudo su - oracle
Last login: Sat Apr  1 01:35:46 MST 2017 on pts/4
The Oracle base has been set to /u01/app/oracle
[oracle@localhost dbs]$ srvctl modify database -d renedb_stby -pwfile +DATA/RENEDB/PASSWORD/pwrenedb_stby
[oracle@localhost dbs]$ srvctl config  database -d renedb_stby
Database unique name: renedb_stby
Database name: 
Oracle home: /u01/app/oracle/product/12.1.0/dbhome_1
Oracle user: oracle
Spfile: /u01/app/oracle/product/12.1.0/dbhome_1/dbs/spfilerenedb_stby.ora
Password file: +DATA/RENEDB/PASSWORD/pwrenedb_stby
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: 
Disk Groups: ARCH,DATA,REDO
Mount point paths: 
Services: 
Type: RAC
Start concurrency: 
Stop concurrency: 
OSDBA group: dba
OSOPER group: oper
Database instances: renedb_stby1,renedb_stby2
Configured nodes: *******,***********
Database is administrator managed

Once I did this, the standby started applying the redo logs, and after the gap was closed the primary’s switchover status was “TO STANDBY”.
Primary

Primary Site last generated SCN

*******************************

DB_UNIQUE_NAME	SWITCHOVER_STATUS	  CURRENT_SCN
--------------- -------------------- ----------------
renedb	TO STANDBY		 134480468945

Standby

Data Guard Apply Lag

********************

NAME	     LAG_TIME		  DATUM_TIME	       TIME_COMPUTED
------------ -------------------- -------------------- --------------------
apply lag    +00 00:00:00	  04/01/2017 04:05:51  04/01/2017 04:05:52

1 row selected.


Data Guard Gap Problems

***********************

no rows selected

PROCESS STATUS	     CLIENT_P CLIENT_PID	  THREAD#	 SEQUENCE#	     BLOCK#    ACTIVE_AGENTS	 KNOWN_AGENTS
------- ------------ -------- ---------- ---------------- ---------------- ---------------- ---------------- ----------------
ARCH	CONNECTED    ARCH     44474			0		 0		  0		   0		    0
RFS	IDLE	     ARCH     133318			0		 0		  0		   0		    0
RFS	IDLE	     ARCH     50602			0		 0		  0		   0		    0
ARCH	CLOSING      ARCH     44470			1	     21623	      14336		   0		    0
ARCH	CLOSING      ARCH     44476			1	     21624		  1		   0		    0
ARCH	CLOSING      ARCH     44472			2	     19221	      96256		   0		    0
RFS	IDLE	     LGWR     133322			1	     21625	      17157		   0		    0
RFS	IDLE	     LGWR     50620			2	     19222	      36611		   0		    0
MRP0	APPLYING_LOG N/A      N/A			2	     19222	      36617		  33		   33

9 rows selected.

Conclusion
In 12.1 it is recommended, as per DOC ID 1984091.1, to have the password file in an ASM diskgroup. Once I did this, I was able to work around error ORA-01033 and sleep well!

Note: This was originally published on rene-ace.com

Categories: DBA Blogs

Bushy Join Trees in Oracle 12.2

Tue, 2017-04-04 10:05

There are multiple optimizer features introduced in the Oracle Database 12.2 Release. Bushy Join Trees is the one that I’ll be demonstrating in this post.

First, I’ll create four tables with two indexes:

create table t1 as select rownum n1, rownum n2 from dual connect by  level <= 1000000;
create table t2 as select rownum n1, rownum n2 from dual connect by  level <= 100;
create table t3 as select rownum n1, rownum n2 from dual connect by  level <= 1000000;
create table t4 as select rownum n1, rownum n2 from dual connect by  level <= 100;

create index idx_t1 on t1(n1);
create index idx_t3 on t3(n1);

Now, I’ll run the next query:

select * from t1, t2, t3, t4 where t1.n1 = t2.n1 and t3.n1 = t4.n1 and t1.n2=t3.n2;

The most efficient “Bushy” execution plan for this query looks like the next one:

    -----HJ----- 
    |           |
    |           |
---NL---    ---NL---
|      |    |       |
T1     T2   T3      T4

This plan joins the T1 and T2 tables with a Nested Loop using the idx_t1 index, joins the T3 and T4 tables with a Nested Loop using the idx_t3 index, and finally joins the results of the two previous joins with a Hash Join.
But Oracle has never been able to generate such an execution plan automatically. You had to rewrite the query with subqueries and a bunch of hints in order to force this kind of execution plan; a sketch of such a rewrite is shown below.
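
For illustration only, such a rewrite could look something like the following (just a sketch: each table pair is wrapped in a non-mergeable inline view so the two Nested Loop joins are evaluated first and their results are then hash joined on N2; the exact hints needed would depend on the data):

select /*+ use_hash(v12 v34) */ v12.c1, v12.t2_c1, v34.c1, v34.t4_c1
from (select /*+ no_merge use_nl(t2) */ t1.n2, t1.c1, t2.c1 t2_c1
        from t1, t2
       where t1.n1 = t2.n1) v12,
     (select /*+ no_merge use_nl(t4) */ t3.n2, t3.c1, t4.c1 t4_c1
        from t3, t4
       where t3.n1 = t4.n1) v34
where v12.n2 = v34.n2;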

Without such a rewrite, the following example shows the typical execution plan that Oracle generates for the original query:

Execution Plan
----------------------------------------------------------
Plan hash value: 1007837908

-----------------------------------------------------------------------------------------
| Id  | Operation		       | Name	| Rows	| Bytes | Cost (%CPU)| Time	|
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	       |	|     1 |    32 |   888   (1)| 00:00:01 |
|*  1 |  HASH JOIN		       |	|     1 |    32 |   888   (1)| 00:00:01 |
|*  2 |   HASH JOIN		       |	|   100 |  2600 |   885   (1)| 00:00:01 |
|   3 |    NESTED LOOPS 	       |	|   100 |  1600 |   303   (0)| 00:00:01 |
|   4 |     NESTED LOOPS	       |	|   100 |  1600 |   303   (0)| 00:00:01 |
|   5 |      TABLE ACCESS FULL	       | T2	|   100 |   600 |     3   (0)| 00:00:01 |
|*  6 |      INDEX RANGE SCAN	       | IDX_T1 |     1 |	|     2   (0)| 00:00:01 |
|   7 |     TABLE ACCESS BY INDEX ROWID| T1	|     1 |    10 |     3   (0)| 00:00:01 |
|   8 |    TABLE ACCESS FULL	       | T3	|  1000K|  9765K|   579   (1)| 00:00:01 |
|   9 |   TABLE ACCESS FULL	       | T4	|   100 |   600 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T3"."N1"="T4"."N1")
   2 - access("T1"."N2"="T3"."N2")
   6 - access("T1"."N1"="T2"."N1")

We can see a full table scan of T3, and the T3 table can be significantly large.

Oracle 12.2 has introduced a new BUSHY_JOIN hint and a bunch of hidden “_optimizer_bushy” parameters:
_optimizer_bushy_join
_optimizer_bushy_fact_min_size
_optimizer_bushy_fact_dim_ratio
_optimizer_bushy_cost_factor.

The _optimizer_bushy_join parameter is ‘off’ by default, so you either have to set it to ‘on’ or use the BUSHY_JOIN hint.
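
If you prefer trying the parameter instead of the hint, a minimal sketch would be to enable it at session level only (as with any hidden parameter, test it carefully and ideally validate with Oracle Support before using it more widely; the ‘on’/‘off’ values are the ones mentioned above):

-- enable the bushy join transformation for this session only
alter session set "_optimizer_bushy_join" = 'on';

-- then re-run the original query without the BUSHY_JOIN hint
select * from t1, t2, t3, t4 where t1.n1 = t2.n1 and t3.n1 = t4.n1 and t1.n2=t3.n2;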

Let’s try with a hint:

select   /*+ qb_name(main) BUSHY_JOIN(@"MAIN" ( "T1"@"MAIN"   "T2"@"MAIN" )) */
 * from t1, t2, t3, t4 where t1.n1 = t2.n1 and t3.n1 = t4.n1 and t1.n2=t3.n2;

100 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1929967733

----------------------------------------------------------------------------------------------------
| Id  | Operation		       | Name		   | Rows  | Bytes | Cost (%CPU)| Time	   |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	       |		   |   100 |  6800 |   606   (0)| 00:00:01 |
|*  1 |  HASH JOIN		       |		   |   100 |  6800 |   606   (0)| 00:00:01 |
|   2 |   NESTED LOOPS		       |		   |   100 |  1600 |   303   (0)| 00:00:01 |
|   3 |    NESTED LOOPS 	       |		   |   100 |  1600 |   303   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL	       | T4		   |   100 |   600 |	 3   (0)| 00:00:01 |
|*  5 |     INDEX RANGE SCAN	       | IDX_T3 	   |	 1 |	   |	 2   (0)| 00:00:01 |
|   6 |    TABLE ACCESS BY INDEX ROWID | T3		   |	 1 |	10 |	 3   (0)| 00:00:01 |
|   7 |   VIEW			       | VW_BUSHY_D96D1B60 |   100 |  5200 |   303   (0)| 00:00:01 |
|   8 |    NESTED LOOPS 	       |		   |   100 |  1600 |   303   (0)| 00:00:01 |
|   9 |     NESTED LOOPS	       |		   |   100 |  1600 |   303   (0)| 00:00:01 |
|  10 |      TABLE ACCESS FULL	       | T2		   |   100 |   600 |	 3   (0)| 00:00:01 |
|* 11 |      INDEX RANGE SCAN	       | IDX_T1 	   |	 1 |	   |	 2   (0)| 00:00:01 |
|  12 |     TABLE ACCESS BY INDEX ROWID| T1		   |	 1 |	10 |	 3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ITEM_1"="T3"."N2")
   5 - access("T3"."N1"="T4"."N1")
  11 - access("T1"."N1"="T2"."N1")

We can see the VW_BUSHY_D96D1B60 internal view at step 7, and this is definitely a “bushy” plan. The feature is still not enabled by default, but you no longer need to rewrite the query to get a proper plan.

Categories: DBA Blogs

Log Buffer #508: A Carnival of the Vanities for DBAs

Mon, 2017-04-03 08:41

This Log Buffer Edition covers Oracle, SQL Server and MySQL.

Oracle:

Compiling views: When the FORCE Fails You

Goldengate 12c Troubleshooting XAGENABLE

A performance Deep Dive into Tablespace Encryption

EBS Release 12 Certified with Safari 10 and MacOS Sierra 10.12

Oracle Database 12c (12.2.0.1.0) on VirtualBox

SQL Server:

A Single-Parameter Date Range in SQL Server Reporting Services

Generating Plots Automatically From PowerShell and SQL Server Using Gnuplot

Justify the Remote Access Requirement for Biztalk Log Shipping

Building Better Entity Framework Applications

Performing a Right and Comprehensive Age Calculation

MySQL:

Basics of MySQL Administration and Best Practices

Bonanza Cuts Load in Half with VividCortex

Experiments with MySQL 5.7’s Online Buffer Pool Resize

MySQL 8.0 Collations: The Devil is in the Details.

How to Encrypt MySQL Backups on S3

Categories: DBA Blogs

How to Patch an Exadata (Part 6) – Timing

Tue, 2017-03-28 15:25

Quick links to Part 1 / Part 2 / Part 3 / Part 4 / Part 5 / Part 6
 

 

6: Timing

Now that we know how to patch every component and the different options available to do so (rolling, non-rolling), which one is the best? How much time does it take?

The answer is obviously “it depends”, but I will try to bring a few insights so you have a ready answer when your manager inevitably asks you: “How long will that patch take? I need to negotiate the maintenance window with the business… they aren’t happy…” ;)
 
 
Here is a summary of the length of the patch application in a Rolling fashion and in a Non-Rolling fashion (as well as the downtime for each method). Please note that I put in green what I recommend.
 

Cells

  • Rolling : 1h30 x number of cells
  • Rolling downtime : 0 minute
  • Non-rolling : 2h (1h30 to patch a cell + 30 minutes to stop and start everything before and after the patch)
  • Non-rolling downtime : 2h

Note : Refer to my notes at the end of this page about this choice
 
IB Switches

  • Rolling : 45 minutes per switch then 1h30 total
  • Rolling downtime : 0 minute
  • Non-rolling : not available
  • Non-rolling downtime : not available

Note: There’s no non-rolling method for the IB Switches, so here the choice is an easy one!
 
Database Servers

Note: Refer to my notes at the end of this page about this choice
 
Grid

Note: No green color here? To patch the grid, I recommend going for a mix like the following:

  • Rebalance the services away from node 1
  • Patch the node 1
  • Verify that everything is well restarted on the node 1
  • Move all the services to node 1 (if one node can handle the whole workload; we usually patch during a quiet period anyway)
  • Apply the patch in a non-rolling manner (for the Grid this means launching the patch manually, in parallel, on the remaining nodes)
  • Once the grid has been patched on all the nodes, restart all the services as they were before the patch

 
Databases Oracle homes

  • Rolling: 20 – 30 minutes per node + ~ 20 minutes per database for the post installation steps
  • Rolling downtime:

    – Can be 0 minute if you rebalance the services before patching a node (as described here for the Grid patching, you can apply the same concept for the database patching as well) + ~ 20 minutes per database for the post installation steps.

    Please note that if you have 30 databases sharing the same ORACLE_HOME, you won’t easily be able to apply 30 post-install steps at the same time, so the 30th database will suffer a bigger outage than the 1st one restarted on the patched ORACLE_HOME. This is why I strongly recommend the quicker method.

    – A ~20-minute downtime per database, which you can schedule as you choose when using the quicker way!

  • Non-rolling: 20 – 30 minutes
  • Non-rolling downtime: 20 – 30 minutes for all the databases running on the patched Oracle home + ~20 minutes per database for the post installation steps. Note that if you have 30 databases sharing the same ORACLE_HOME, you won’t be able to apply 30 post-install steps at the same time, so the 30th database will suffer a bigger outage than the 1st one restarted on the patched ORACLE_HOME.

Note: In this instance, I would definitely go for the quicker way: clone the Oracle home you want to patch to another one, apply the patch to the clone and move the databases one by one to the newly patched Oracle home.
 
 

Notes on my recommendations

Yes, I always prefer the rolling method for the infrastructure components (Grid and Database Servers). This is because I can mitigate the outage, and I avoid any outage created by the patch itself or by anything preventing, for example, a reboot, since we do not reboot those servers frequently.

Imagine if you go for a cell rolling upgrade and one cell does not reboot after the patch. You’ll have no issue here as the patch will stop automatically; everything will work as before with one cell down, no one will notice anything, and you are still supported, as it is supported to run different versions across different servers. You can then quietly check the troubleshooting section of this blog or go to the pool while Oracle finds a solution for you.

It happened to us in production (it hadn’t happened on the DEV or QA Exadatas before…): we warned the client and it took Oracle a few days to provide an action plan. Everything ran perfectly for a week with a cell down; we then applied Oracle’s action plan over the next weekend and could properly finish the patch. The result is that we applied the patch successfully: we had an issue that caused no outage nor performance degradation, and we still fit in the maintenance window. A very good job from a client and process point of view!

But if you go for non-rolling cell patching and all your cells (or a few of them) do not reboot after the patch, then you are in trouble and you will lose ten times the time you thought you could save by going non-rolling. You will most likely have a failed patch outside of the maintenance window, a Root Cause Analysis to provide to the process guys, and you probably won’t patch this Exadata again for a while as the client will be… hmmm… a bit chilly about that question in the future.

And this risk is the same for the database servers.
 
I am not saying that the Bundle won’t work and will create a big outage; it is just all about risk mitigation. But remember: “highest level of patch = highest level of bug” :)
 
 
 
If you’ve reached this point, I hope that you enjoyed this Odyssey into the Exadata patching world as much as I enjoy working with it on a daily basis!
 
 

Quick links to Part 1 / Part 2 / Part 3 / Part 4 / Part 5 / Part 6
 

Categories: DBA Blogs

How to Patch an Exadata (Part 1) – Introduction and Prerequisites

Tue, 2017-03-28 12:30

Once you have installed your new Exadata machine, a time will come when you’ll be asked:

“Shouldn’t we patch the Exadata?”

And the answer is “yes, definitely“.

 
Indeed, Oracle releases huge (~10 GB) “Quarterly Full Stack” patches (aka Bundles) every quarter (for example: Patch 24436624 – Quarterly Full Stack Download For Oracle Exadata (Oct 2016 – 12.1.0.2)); these Bundles contain all the patches for all the components that make up an Exadata. You will need (almost :)) nothing else to be able to patch your whole Exadata.
 
Even if it looks like a tough operation at first sight, it is not that bad, and this blog’s aim is to clearly describe every step to make it easier for all of us. Let’s start with a preview of this patching: the order we will proceed in and the tools we will be using:
 
 

 
 
As it is quite a long odyssey, I will split this blog into different parts, which also follow a logical order for patching all the components:
 
0/ An advice

1/ General Information

2/ Some prerequisites it is worth doing before the maintenance

3/ The patching procedure

3.1/ Patching the cells (aka Storage servers)
3.2/ Patching the IB switches
3.3/ Patching the Database servers (aka Compute Nodes)
3.4/ Patching the Grid Infrastructure
3.5/ Patching the databases ORACLE_HOMEs

4/ The Rollback procedure

4.1/ Cell Rollback
4.2/ DB nodes Rollback
4.3/ IB Switches Rollback

5/ Troubleshooting

5.1/ Cell patching issue
5.2/ CRS does not restart issue
5.3/ A procedure to add instances to a database
5.4/ OPatch resume

6/ Timing

 

 
 
 

0/ An advice

First of all, please keep this advice strongly in mind:

Do NOT continue to the next step before a failed step is properly resolved.

Indeed, everything that needs to be redundant is redundant and it is supported to run different versions between servers. In the MOS note “Exadata Patching Overview and Patch Testing Guidelines (Doc ID 1262380.1)“, we can read that :
 

It is supported to run different Exadata versions between servers. For example, some storage servers may run 11.2.2.4.2 while others run 11.2.3.1.1, or all storage servers may run 11.2.3.1.1 while database servers run 11.2.2.4.2. However, it is highly recommended that this be only a temporary configuration that exists for the purpose and duration of rolling upgrade.

 
So if, when patching your cells, one cell is not rebooting, stop here; do not continue and do not force-patch the next one. Indeed, everything will still work fine and in a supported manner with one cell down (I have been through it in production and no user noticed anything), but that will most likely not be the case with 2 cells down. If this kind of issue happens, have a look at the troubleshooting section of this blog and open a MOS Sev 1.

 
 
 

1/ General Information

Some information you need to know before starting to patch your Exadata :

  • It is better to have a basic understanding of what an Exadata is before jumping into this patching procedure
  • This procedure does not apply to an ODA (Oracle Database Appliance)
  • I will use the /patches/OCT2016_bundle_patch FS to save the Bundle in the examples of this blog
  • I use the term “DB node” here; it means “database node”, aka “Compute node” (the nodes where the Grid Infrastructure and the databases are running). I will also use db01 for database node number 1, usually named db01
  • I use the word “cell”, aka “storage server”, for the servers that manage your storage. I will also use cel01 for storage server number 1, usually named cel01
  • It is good to have the screen utility installed; if not, use nohup
  • Almost all the procedure will be executed as root
  • I will patch the IB Switches from the DB node 1 server
  • I will patch the cells from the DB node 1 server
  • I will patch the DB nodes from the cel01 server

 
 
 

2/ Some prerequisites it is worth doing before the maintenance

2.1/ Download and unzip the Bundle

Review the Exadata general note (Exadata Database Machine and Exadata Storage Server Supported Versions (Doc ID 888828.1)) to find the latest Bundle, then download and unzip it; be sure that every directory is owned by oracle:dba to avoid any issues in the future:

/patches/OCT2016_bundle_patch >
oracle@myclusterdb01) ls -ltr
total 9609228
-rw-r--r-- 1 oracle oinstall 560430690 Nov 16 18:24 p24436624_121020_Linux-x86-64_10of10.zip
-rw-r--r-- 1 oracle oinstall 1030496554 Nov 16 18:26 p24436624_121020_Linux-x86-64_1of10.zip
-rw-r--r-- 1 oracle oinstall 1032681260 Nov 16 18:27 p24436624_121020_Linux-x86-64_2of10.zip
-rw-r--r-- 1 oracle oinstall 1037111138 Nov 16 18:29 p24436624_121020_Linux-x86-64_3of10.zip
-rw-r--r-- 1 oracle oinstall 1037009057 Nov 16 18:31 p24436624_121020_Linux-x86-64_4of10.zip
-rw-r--r-- 1 oracle oinstall 1037185003 Nov 16 18:33 p24436624_121020_Linux-x86-64_5of10.zip
-rw-r--r-- 1 oracle oinstall 1026218494 Nov 16 18:35 p24436624_121020_Linux-x86-64_6of10.zip
-rw-r--r-- 1 oracle oinstall 1026514887 Nov 16 18:36 p24436624_121020_Linux-x86-64_7of10.zip
-rw-r--r-- 1 oracle oinstall 1026523343 Nov 16 18:39 p24436624_121020_Linux-x86-64_8of10.zip
-rw-r--r-- 1 oracle oinstall 1025677014 Nov 16 18:41 p24436624_121020_Linux-x86-64_9of10.zip

/patches/OCT2016_bundle_patch >
oracle@myclusterdb01) for I in `ls p24436624_121020_Linux-x86-64*f10.zip`
do
unzip $I
done
Archive: p24436624_121020_Linux-x86-64_10of10.zip
 inflating: 24436624.tar.splitaj
...
Archive: p24436624_121020_Linux-x86-64_9of10.zip
 inflating: 24436624.tar.splitai

/patches/OCT2016_bundle_patch >
oracle@myclusterdb01) cat *.tar.* | tar -xvf -
24436624/
24436624/automation/
24436624/automation/bp1-out-of-place-switchback.xml
24436624/automation/bp1-auto-inplace-rolling-automation.xml

...

 
 

2.2/ SSH keys

For this step, if you are not familiar with the dbs_group, cell_group, etc. files, here is how to create them, as I have described in this post (look for “dbs_group” in the post).

[root@myclusterdb01 ~]# ibhosts | sed s'/"//' | grep db | awk '{print $6}' | sort > /root/dbs_group
[root@myclusterdb01 ~]# ibhosts | sed s'/"//' | grep cel | awk '{print $6}' | sort > /root/cell_group
[root@myclusterdb01 ~]# cat /root/dbs_group ~/cell_group > /root/all_group
[root@myclusterdb01 ~]# ibswitches | awk '{print $10}' | sort > /root/ib_group
[root@myclusterdb01 ~]#

We need a few SSH keys deployed in order to ease the patch application:

  • root ssh keys deployed from the db01 server to the IB Switches (you will have to enter the root password once for each IB Switch)
[root@myclusterdb01 ~]# cat ~/ib_group
myclustersw-ib2
myclustersw-ib3
[root@myclusterdb01 ~]# dcli -g ~/ib_group -l root -k -s '-o StrictHostKeyChecking=no'
root@myclustersw-ib3's password:
root@myclustersw-ib2's password:
myclustersw-ib2: ssh key added
myclustersw-ib3: ssh key added
[root@myclusterdb01 ~]#
  • root ssh keys deployed from the cel01 server to all the database nodes (you will have to enter the root password once for each database server)
[root@myclustercel01 ~]# cat ~/dbs_group
myclusterdb01
myclusterdb02
myclusterdb03
myclusterdb04
[root@myclustercel01 ~]# dcli -g ~/dbs_group -l root -k -s '-o StrictHostKeyChecking=no'
root@myclusterdb01's password:
root@myclusterdb03's password:
root@myclusterdb04's password:
root@myclusterdb02's password:
myclusterdb01: ssh key added
myclusterdb02: ssh key added
myclusterdb03: ssh key added
myclusterdb04: ssh key added
[root@myclustercel01 ~]#
  • root ssh keys deployed from the db01 server to all the cells (you will have to enter the root password once for each cell)
[root@myclusterdb01 ~]# dcli -g ~/cell_group -l root hostname
myclustercel01: myclustercel01.mydomain.com
myclustercel02: myclustercel02.mydomain.com
myclustercel03: myclustercel03.mydomain.com
myclustercel04: myclustercel04.mydomain.com
[root@myclusterdb01 ~]# dcli -g ~/cell_group -l root -k -s '-o StrictHostKeyChecking=no'
root@myclustercel04's password:
...
root@myclustercel03's password:
myclustercel01: ssh key added
...
myclustercel06: ssh key added
[root@myclusterdb01 ~]#
2.3/ Upgrade opatch

It is highly recommended to upgrade opatch before any patching activity and this Bundle is not an exception. Please find the procedure to quickly upgrade opatch with dcli in this post.

Please note that upgrading opatch will also allow you to be ocm.rsp-free !

 
 

2.4/ Run the prechecks

It is very important to run these prechecks and to review their outputs carefully. They have to be 100% successful to ensure a smooth application of the patches.

  • Cell patching prechecks (launch them from the DB Node 1 as you will patch them from here)
[root@myclusterdb01 ~]# cd /patches/OCT2016_bundle_patch/24436624/Infrastructure/12.1.2.3.3/ExadataStorageServer_InfiniBandSwitch/patch_12.1.2.3.3.161013/
[root@myclusterdb01 ~]# ./patchmgr -cells ~/cell_group -patch_check_prereq -rolling

 
 

  • DB Nodes prechecks (launch them from the cel01 server as you will patch them from here)

As we will use the cell node 1 server to patch the database servers, we first need to copy patchmgr and the ISO file to this server

[root@myclusterdb01 ~]#  scp /patches/OCT2016_bundle_patch/24436624/Infrastructure/SoftwareMaintenanceTools/DBServerPatch/5.161014/p21634633_121233_Linux-x86-64.zip root@myclustercel01:/tmp/.                    # This is patchmgr
[root@myclusterdb01 ~]#  scp /patches/OCT2016_bundle_patch/24436624/Infrastructure/12.1.2.3.3/ExadataDatabaseServer_OL6/p24669306_121233_Linux-x86-64.zip root@myclustercel01:/tmp/.                               # This is the ISO file, do NOT unzip it
[root@myclusterdb01 ~]#  ssh root@myclustercel01
[root@myclustercel01 ~]#  cd /tmp
[root@myclustercel01 ~]#  nohup unzip p21634633_121233_Linux-x86-64.zip &
[root@myclustercel01 ~]# cd /tmp/dbserver_patch_5.161014
[root@myclustercel01 dbserver_patch_5.161014]# ./patchmgr -dbnodes ~/dbs_group -precheck -iso_repo /tmp/p24669306_121233_Linux-x86-64.zip -target_version 12.1.2.3.3.161013
[root@myclustercel01 dbserver_patch_5.161014]#

Note: if you have some NFS filesystems mounted, you will see some error messages; you can ignore them at this stage, as we will unmount the NFS shares before patching the DB nodes.
 
 

  • IB Switches prechecks (launch them from the DB Node 1 as you will patch them from here)
[root@myclusterdb01]# cd /patches/OCT2016_bundle_patch/24436624/Infrastructure/12.1.2.3.3/ExadataStorageServer_InfiniBandSwitch/patch_12.1.2.3.3.161013/
[root@myclusterdb01 patch_12.1.2.3.3.161013]# ./patchmgr -ibswitches ~/ib_group -upgrade -ibswitch_precheck

 
 

  • Grid Infrastructure prechecks
[root@myclusterdb01]# . oraenv <<< +ASM1
[root@myclusterdb01]# $ORACLE_HOME/OPatch/opatchauto apply /patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103 -oh /u01/app/12.1.0.2/grid -analyze
[root@myclusterdb01]#

Notes :

  • You will most likely see some warnings here; check the logfiles. The warnings will probably be due to some patches being rolled back because they are no longer needed.

 
 

[root@myclusterdb01]# $ORACLE_HOME/OPatch/opatch prereq CheckSystemSpace -phBaseFile /tmp/patch_list_gihome.txt

With the following /tmp/patch_list_gihome.txt file (check the README as the patch numbers will change with the versions)

[root@myclusterdb01]#cat /tmp/patch_list_gihome.txt
/patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/21436941
/patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24007012
/patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24846605
/patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24340679
[root@myclusterdb01]#

 
 

  • Database patch prechecks
[oracle@myclusterdb01]$ . oraenv <<< A_DATABASE_WITH_THE_ORACLE_HOME_YOU_WANT_TO_PATCH
[oracle@myclusterdb01]$ $ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir /patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24340679
[oracle@myclusterdb01]$ $ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir /patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24846605
[oracle@myclusterdb01]$ $ORACLE_HOME/OPatch/opatch prereq CheckSystemSpace -phBaseFile /tmp/patch_list_dbhome.txt

The file /tmp/patch_list_dbhome.txt contains the following (check the README; the patch numbers will change depending on the versions):

/patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24340679
/patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103/24846605

 
 

  • OJVM prechecks
[oracle@myclusterdb01]$ cd /patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018OJVMPSU/24315824
[oracle@myclusterdb01]$ $ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -ph ./
[oracle@myclusterdb01]$

- Do an lsinventory -all_nodes before patching and save the output somewhere:

[oracle@myclusterdb01]$ $ORACLE_HOME/OPatch/opatch lsinventory -all_nodes

 
 

  • Check disk_repair_time and set it to 24h

Oracle recommends setting this parameter to 8h. As we have had issues in the past with a very long cell patching, we now set this parameter to 24h, as Oracle recommended to us.
Please note that this prerequisite is only needed for a rolling patch application.

SQL> select dg.name as diskgroup, a.name as attribute, a.value from v$asm_diskgroup dg, v$asm_attribute a where dg.group_number=a.group_number and (a.name like '%repair_time' or a.name = 'compatible.asm');

DISKGROUP            ATTRIBUTE                                VALUE
-------------------- ---------------------------------------- ----------------------------------------
DATA                 disk_repair_time                         3.6h
DATA                 compatible.asm                           11.2.0.2.0
DBFS_DG              disk_repair_time                         3.6h
DBFS_DG              compatible.asm                           11.2.0.2.0
RECO                 disk_repair_time                         3.6h
RECO                 compatible.asm                           11.2.0.2.0

6 rows selected.

SQL> connect / as sysasm
Connected.
SQL> ALTER DISKGROUP DATA SET ATTRIBUTE 'disk_repair_time' = '24h' ;

Diskgroup altered.

SQL> ALTER DISKGROUP DBFS_DG SET ATTRIBUTE 'disk_repair_time' = '24h' ;

Diskgroup altered.

SQL> ALTER DISKGROUP RECO SET ATTRIBUTE 'disk_repair_time' = '24h' ;

Diskgroup altered.

SQL>

 
If one of these prechecks points out a problem, resolve it before heading to the next steps.
 
 
Now that everything is downloaded, unzipped and updated, we can safely jump to the patching procedure in Part 2!
 
 

Quick links to Part 1 / Part 2 / Part 3 / Part 4 / Part 5 / Part 6
 

Categories: DBA Blogs

Log Buffer #507: A Carnival of the Vanities for DBAs

Mon, 2017-03-20 07:58

This Log Buffer edition covers Oracle, SQL Server and MySQL blog posts with a touch of Cloud.

Oracle:

Provisioning EBS on Oracle Cloud: Latest Enhancements

ORA-54002 when trying to create Virtual Column using REGEXP_REPLACE on Oracle 12cR2

Business rules, common sense and query performance

Problem with V$RECOVERY_AREA_USAGE view and FRA space not being reclaimed

Updated SQLcl: 2 New Commands, Bug Fixes, and 12cR2 Oh My!

SQL Server:

Windowing Functions: Tell me when that changes

SQL Server Bulk Insert Row Terminator Issues

Introducing DLM Techniques for a Multi-Database Multi-Server System

Ding – The World’s Largest Mobile Top-up Network Streamlines QA with SQL Server Containers

Enable Trace Flags in SQL Server

MySQL:

Migrating MySQL database from Amazon RDS to DigitalOcean

Monitoring Databases: A Product Comparison

New collations in MySQL 8.0.0

How to store date, time, and timestamps in UTC time zone with JDBC and Hibernate

MySQL Connector/NET 7.0.7 m6 development has been released

Categories: DBA Blogs

The Side Effects of Drop ‘Unused’ Index

Tue, 2017-03-07 10:54

In this blog post I’ll demonstrate a scenario where dropping a potentially ‘unused’ index can have a negative influence on the optimizer’s cardinality estimation. Having columns with logical correlation and skewed data can cause difficulties for the optimizer when calculating cardinality. This post will also address what we can do to help the optimizer with better cardinality estimates.

The inspiration for this post was derived from a recent index usage review. One of the requirements was to determine how to confirm which unused indexes qualify to be dropped. I decided to do some tests regarding extended statistics and the effect of potentially dropping an ‘unused’ index. You will observe the kind of result that can come from dropping an index which has not been used. It’s important to remember that this does not apply in all cases. Conversely, even if an index is used, it doesn’t mean that it’s needed.

This is more or less linked to columns with skewed data which might also have a logical relationship.
Hopefully, it can help you answer some of the following questions:

  1. Is the optimizer using the indexes behind the scenes?
  2. While there are methods to determine if an index has been used in an execution plan (see the quick monitoring sketch right after this list), can an index be dropped on this basis alone?
  3. If we drop a composite index (constructed from correlated columns), can we do anything to avoid performance degradation?
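
Regarding question 2, a common (though limited) way to check whether an index shows up in execution plans at all is basic index usage monitoring; here is a minimal sketch using the T1IX index created later in this post (keep in mind that, as this post demonstrates, “not used in a plan” does not mean “its statistics are not used”):

-- start monitoring the index
alter index t1ix monitoring usage;

-- ... run a representative workload ...

-- USED = YES means the index appeared in at least one execution plan
select index_name, used, start_monitoring
from   v$object_usage
where  index_name = 'T1IX';

-- stop monitoring when done
alter index t1ix nomonitoring usage;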

Before we start with the use case, let’s briefly review some concepts.

The basic formula for selectivity is 1/NDV. The cardinality (CDN) is calculated as selectivity * total number of rows.
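
As a quick worked example (the numbers match the N1 column of the T1 table created below): a column with 20 distinct values has a selectivity of 1/20 = 0.05, so an equality predicate on it against a 10,000-row table gets a cardinality estimate of 0.05 * 10000 = 500 rows.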

The selectivity of a join is defined as the selectivity of the most selective join column adjusted by the proportion of not null values in each join column.

Join Selectivity:
Sel = 1/max[NDV(t1.c1),NDV(t2.c2)] *
           ( (Card t1 - # t1.c1 NULLs) / Card t1) *
           ( (Card t2 - # t2.c2 NULLs) / Card t2)

Join Cardinality:
Card(Pj) = Card(T1) * Card(T2) * Sel(Pj)

In Oracle’s Doc ID 68992.1 you can find a more detailed explanation of the different selectivity calculations based on different predicates. For simplicity, I will use an equality predicate.

This blog post is divided in three sections.

  1. A use case where we demonstrate how the drop of an “unused” index can mess up the optimizer’s cardinality calculation.
  2. How to help the optimizer get better cardinality estimates with extended statistics.
  3. More explanation on column correlation (CorStregth).
Preparation

For the first and second part I’ll be using the following configuration.

drop table t1 purge;
drop table t2 purge;
drop table t3 purge;

create table t1
as
select rownum pk,
    mod(level,  20) n1,
    mod(level, 200) n2,
    mod(level,1000) n3,
    lpad('x',42,'x') c1
from dual
connect by level <= 10000
/

create table t2
as
select rownum pk,
    mod(level,  20) n1,
    mod(level, 200) n2,
    mod(level,1000) n3,
    lpad('x',42,'x') c1
from dual
connect by level <= 10 
/ 

CREATE TABLE t3
AS
SELECT rownum pk, lpad('x',rownum,'x') c1
FROM dual
CONNECT BY level <= 2000
/

exec dbms_stats.gather_table_stats(user,'T1',method_opt=>'for all columns size 1');
exec dbms_stats.gather_table_stats(user,'T2',method_opt=>'for all columns size 1');
exec dbms_stats.gather_table_stats(user,'T3',method_opt=>'for all columns size 1');


alter table t1 add constraint t1pk primary key(pk);
alter table t2 add constraint t2pk primary key(pk);
alter table t3 add constraint t3pk primary key(pk);

create index t1ix on t1(n1,n2);
create index t2ix on t2(n1,n2);

It creates three tables (T1, T2 and T3; T3 is used later), with a composite index on the (N1,N2) columns of both T1 and T2.
The statistics are as follows:

--Column statistics
TABLE_NAME	     COLUMN_NAME	  NUM_DISTINCT	  COL_SEL
-------------------- -------------------- ------------ ----------
T1		     C1 			     1		1
T1		     N1 			    20	      .05
T1		     N2 			   200	     .005
T1		     N3 			  1000	     .001
T2		     C1 			     1		1
T2		     N1 			    10	       .1
T2		     N2 			    10	       .1
T2		     N3 			    10	       .1

--Index statistics
TABLE_NAME	     INDEX_NAME 	    NUM_ROWS DISTINCT_KEYS    IND_SEL
-------------------- -------------------- ---------- ------------- ----------
T1		     T1IX		       10000	       200	 .005
T2		     T2IX			  10		10	   .1

Table T1 has 10000 rows and T2 only 10 rows. I’ve created two composite indexes, on T1 and on T2, since I am using both (N1,N2) columns in the selection. As you can see (from the data dictionary) we have 200 distinct keys for the T1IX index and 10 for the T2IX index.
e.g. based on the formula we can easily derive the selectivity as 1/NDV; for the T1IX index we have 1/200 = 0.005.
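
For reference, the figures above can be pulled from the dictionary with queries along these lines (just a sketch; the COL_SEL and IND_SEL columns are simply derived as 1/NDV):

-- column-level NDV and derived selectivity
select table_name, column_name, num_distinct, round(1/num_distinct,5) col_sel
from   user_tab_col_statistics
where  table_name in ('T1','T2')
order  by table_name, column_name;

-- index-level distinct keys and derived selectivity
select table_name, index_name, num_rows, distinct_keys, round(1/distinct_keys,5) ind_sel
from   user_indexes
where  table_name in ('T1','T2')
order  by table_name;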

Drop of an “unused” index can mess up optimizer’s cardinality estimation

The query I’m going to use in this case is:

select t1.c1,t2.c1
from t1,t2
where t1.n1 = t2.n1
and t1.n2 = t2.n2
/

Let’s first check the cardinality (CDN) before we make any changes. The column we’re interested in is Rows; that’s where the cardinality estimate is shown.

Execution Plan
----------------------------------------------------------
Plan hash value: 2959412835

---------------------------------------------------------------------------
| Id  | Operation	   | Name | Rows  | Bytes | Cost (%CPU)| Time	  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	  |   500 | 49500 |    29   (0)| 00:00:01 |
|*  1 |  HASH JOIN	   |	  |   500 | 49500 |    29   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T2   |    10 |   490 |	3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1   | 10000 |   488K|    26   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

Based on the formula provided, you may have expected to see a cardinality (CDN) of 25. Since we don’t have nulls, we can do the calculation as 1/max(NDV(t1.n1),NDV(t2.n1)) * 1/max(NDV(t1.n2),NDV(t2.n2)) = 1/20 * 1/200 = 0.05 * 0.005 = 0.00025. We came up with a selectivity of 0.00025.
Now, if we do the join cardinality calculation: 10 (T2) * 10000 (T1) * selectivity 0.00025 = 100000 * 0.00025 = 25

But instead of 25, the optimizer calculated cardinality (CDN) of 500. In order to better understand the calculation, let’s review the 10053 event trace file.

  ColGroup (#1, Index) T1IX
    Col#: 1 2    CorStregth: 20.00
  ColGroup (#1, Index) T2IX
    Col#: 1 2    CorStregth: 10.00
Join ColGroups for T1[T1] and T2[T2] : (#1, #1)

What’s interesting here is the Join ColGroups analysis. Since we have a full match on the join with the indexed columns, the optimizer considers Join ColGroups between T1 and T2. It will use the ColGroup statistics to calculate the join cardinality (CDN).

Col#: column position (self-explanatory)
CorStregth: correlation strength, as it is spelled in the trace file. I will come back to correlation strength (CorStregth) later.

Within the trace file, in the Nested Loop (NL) section (remember, the join cardinality is only listed in the NL section), you should see something like:

Join selectivity using 1 ColGroups: 0.005000 (sel1 = 0.000000, sel2 = 0.000000)
Join Card:  500.000000 = outer (10.000000) * inner (10000.000000) * sel (0.005000)
Join Card - Rounded: 500 Computed: 500.000000

Based on the join selectivity formula, the optimizer chooses the T1IX index (ColGroup) statistics for the selectivity calculation, hence 1/NDV (T1IX index) = 1/200 = 0.005.
Afterwards, the join cardinality is calculated as num_rows (T2) * num_rows (T1) * selectivity (0.005), or 10 * 10000 * 0.005 = 100000 * 0.005 = 500.

If you’re not aware of such cases, you may decide to drop an index based solely on the execution plan information. Even worse, the developers can easily decide to drop an index without knowing the possible effect.

Now, let’s observe the cardinality estimation after T1IX index is dropped.

SQL> drop index t1ix;

Index dropped.

...

Execution Plan
----------------------------------------------------------
Plan hash value: 2959412835

---------------------------------------------------------------------------
| Id  | Operation	   | Name | Rows  | Bytes | Cost (%CPU)| Time	  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	  |    25 |  2475 |    29   (0)| 00:00:01 |
|*  1 |  HASH JOIN	   |	  |    25 |  2475 |    29   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T2   |    10 |   490 |	3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1   | 10000 |   488K|    26   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

After we dropped T1IX, the cardinality estimate dropped from 500 to 25! This can have a big impact on performance. If we add more tables to the join, the optimizer may re-order the tables or switch from a Hash Join to Nested Loops or a Merge Join.

So, why did the cardinality (CDN) change after the “unused” index was dropped?

Even if the index was not used, it doesn’t mean that its statistics were also not used. That’s the case here. The optimizer used the NDV of the column group created by the T1IX index to calculate the join selectivity.

Let’s check the 10053 event trace file again, now that the query has been executed without the T1IX index.

  
ColGroup (#1, Index) T2IX
    Col#: 1 2    CorStregth: 10.00
Join ColGroups for T1[T1] and T2[T2] : Using cdn sanity check

...

ColGroup cardinality sanity check: ndv for  T1[T1] = 4000.000000  T2[T2] = 10.000000
Join selectivity using 1 ColGroups: 2.5000e-04 (sel1 = 0.000000, sel2 = 0.000000)
Join Card:  25.000000 = outer (10.000000) * inner (10000.000000) * sel (2.5000e-04)
Join Card - Rounded: 25 Computed: 25.000000

We don’t have a full match for the Join ColGroups (we’re missing the statistics from the T1IX index). The optimizer decided to perform a cardinality (CDN) sanity check, after which it ended up with a number of distinct values (NDV) of 4000.

How did the optimizer calculate 4000 ?

Since we don’t have a column group, the optimizer did a basic calculation of N1 (sel) * N2 (sel) = 0.05 * 0.005 = 0.00025. Having the (N1,N2) selectivity, we can extract the NDV as 1/0.00025 = 4000.
The representation in the trace file is slightly different: 2.5000e-04 = 2.5000 * (1/10000) = 0.00025.

With an incorrect cardinality estimate, the optimizer can make a wrong decision about join methods, as we can see in the following example when we join the T3 table.

--query
select t1.c1,t2.c1,t3.c1
from t1,t2,t3
where t1.n1 = t2.n1
and t1.n2 = t2.n2
and t1.pk = t3.pk
/

Execution Plan
----------------------------------------------------------
Plan hash value: 1184213596

----------------------------------------------------------------------------
| Id  | Operation	    | Name | Rows  | Bytes | Cost (%CPU)| Time	   |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |	   |   500 |   541K|   120   (0)| 00:00:01 |
|*  1 |  HASH JOIN	    |	   |   500 |   541K|   120   (0)| 00:00:01 |
|*  2 |   HASH JOIN	    |	   |   500 | 51500 |	30   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T2   |	10 |   490 |	 3   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| T1   | 10000 |   527K|	27   (0)| 00:00:01 |
|   5 |   TABLE ACCESS FULL | T3   |  2000 |  1964K|	90   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."PK"="T3"."PK")
   2 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

SQL> drop index t1ix; 

Index dropped.

Execution Plan
----------------------------------------------------------

Plan hash value: 142045272

-------------------------------------------------------------------------------------
| Id  | Operation		     | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |	    |	 25 | 27725 |	 35   (0)| 00:00:01 |
|   1 |  NESTED LOOPS		     |	    |	 25 | 27725 |	 35   (0)| 00:00:01 |
|   2 |   NESTED LOOPS		     |	    |	 25 | 27725 |	 35   (0)| 00:00:01 |
|*  3 |    HASH JOIN		     |	    |	 25 |  2575 |	 30   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL	     | T2   |	 10 |	490 |	  3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL	     | T1   | 10000 |	527K|	 27   (0)| 00:00:01 |
|*  6 |    INDEX UNIQUE SCAN	     | T3PK |	  1 |	    |	  0   (0)| 00:00:01 |
|   7 |   TABLE ACCESS BY INDEX ROWID| T3   |	  1 |  1006 |	  1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")
   6 - access("T1"."PK"="T3"."PK")

Here is another example, with a different data set, where there is a change in the row source used for building the hash table.

--query
SELECT t1.c1, t2.c1
FROM t1,t2,t3
WHERE t1.n1 = t2.n1
AND t1.n2 = t2.n2
AND t1.n1 = t3.pk
/

Execution Plan
----------------------------------------------------------
Plan hash value: 2855555999

-------------------------------------------------------------------------------
| Id  | Operation	       | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |  5000K|  1525M|  2766	(1)| 00:00:01 |
|*  1 |  HASH JOIN	       |      |  5000K|  1525M|  2766	(1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL    | T2   |  1000 |   251K|    14	(0)| 00:00:01 |
|*  3 |   HASH JOIN	       |      |  1000K|    59M|  2739	(1)| 00:00:01 |
|   4 |    INDEX FAST FULL SCAN| T3PK |  2000 |  8000 |     3	(0)| 00:00:01 |
|   5 |    TABLE ACCESS FULL   | T1   |  1000K|    55M|  2734	(1)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")
   3 - access("T1"."N1"="T3"."PK")

SQL> drop index t1ix; 

Index dropped.

...

Execution Plan
----------------------------------------------------------
Plan hash value: 1993272941

------------------------------------------------------------------------------
| Id  | Operation	      | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |	 250K|	  76M|	2754   (1)| 00:00:01 |
|*  1 |  HASH JOIN	      |      |	 250K|	  76M|	2754   (1)| 00:00:01 |
|   2 |   INDEX FAST FULL SCAN| T3PK |	2000 |	8000 |	   3   (0)| 00:00:01 |
|*  3 |   HASH JOIN	      |      |	 250K|	  75M|	2750   (1)| 00:00:01 |
|   4 |    TABLE ACCESS FULL  | T2   |	1000 |	 251K|	  14   (0)| 00:00:01 |
|   5 |    TABLE ACCESS FULL  | T1   |	1000K|	  55M|	2734   (1)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="T3"."PK")
   3 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

Although this will not apply in all cases, be careful before deciding to drop an index. While it may seem like a good idea initially, it can have a big impact through wrong cardinality estimates, which can then lead to incorrect execution plan decisions, poor performance and unhappy customers.

So, what is the solution?
We don’t want huge indexes to be (DML) maintained if not used, right?

This brings me to the second part, Extended Statistics.

Extended Statistics

Oracle introduced extended statistics in Oracle 11g. Their main purpose is to capture the relationship within a column group: extended statistics help the optimizer become aware of the logical relation between the columns of a group, information it otherwise would not have. Extended statistics are represented as virtual (hidden) columns on which statistics are gathered.

When we had the T1IX index, the optimizer used its NDV to calculate the correct cardinality; the index statistics served as column group statistics for the (N1,N2) columns. However, after we dropped the index (even though it was never used in a plan), we made a mess of the optimizer’s cardinality estimates.

Extended statistics are maintained via the DBMS_STATS package, function CREATE_EXTENDED_STATS. Let’s create extended statistics and check the cardinality estimate.

SQL> select dbms_stats.create_extended_stats(null,'T1','(N1,N2)') from dual
  2  /

DBMS_STATS.CREATE_EXTENDED_STATS(NULL,'T1','(N1,N2)')
------------------------------------------------------------
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS

SQL> set autotrace trace exp
SQL> 
SQL> select t1.c1,t2.c1
  2  from t1,t2
  3  where t1.n1 = t2.n1
  4  and t1.n2 = t2.n2
  5  /

Execution Plan
----------------------------------------------------------
Plan hash value: 2959412835

---------------------------------------------------------------------------
| Id  | Operation	   | Name | Rows  | Bytes | Cost (%CPU)| Time	  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	  |    25 |  2475 |    29   (0)| 00:00:01 |
|*  1 |  HASH JOIN	   |	  |    25 |  2475 |    29   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T2   |    10 |   490 |	3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1   | 10000 |   488K|    26   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

I know, you’re thinking “It didn’t help”, and you’re right. The optimizer did not use the column group (extension). If you look a little deeper, at USER_TAB_COL_STATISTICS, you will see that there are no statistics for the virtual column; only the extension definition was created.

SQL> SELECT column_name, num_distinct, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'T1'
/

COLUMN_NAME				 NUM_DISTINCT HISTOGRAM
---------------------------------------- ------------ ---------------
N1						   20 NONE
N2						  200 NONE
N3						 1000 NONE
C1						    1 NONE

SQL> col extension_name for a40
SQL> SELECT extension_name, extension
FROM   user_stat_extensions
WHERE  table_name = 'T1'
/

EXTENSION_NAME				 EXTENSION
---------------------------------------- --------------------
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS		 ("N1","N2")

This behavior is expected, because DBMS_STATS.CREATE_EXTENDED_STATS will only create the extension; it won’t gather statistics for it. If you check the 10053 event trace file, you will see NO STATISTICS (using defaults).

Column (#5): SYS_STUBZH0IHA7K$KEBJVXO5LOHAS(NUMBER)  NO STATISTICS (using defaults)

In order to fix this, we need to re-gather statistics on the table.

SQL> exec dbms_stats.gather_table_stats(null,'T1',method_opt=>'for all columns size 1');

PL/SQL procedure successfully completed.

SQL> SELECT column_name, num_distinct, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'T1'
/

COLUMN_NAME				 NUM_DISTINCT HISTOGRAM
---------------------------------------- ------------ ---------------
N1						   20 NONE
N2						  200 NONE
N3						 1000 NONE
C1						    1 NONE
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS			  200 NONE


SQL> select t1.c1,t2.c1
from t1,t2
where t1.n1 = t2.n1
and t1.n2 = t2.n2
/

Execution Plan
----------------------------------------------------------
Plan hash value: 2959412835

---------------------------------------------------------------------------
| Id  | Operation	   | Name | Rows  | Bytes | Cost (%CPU)| Time	  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	  |   500 | 49500 |    29   (0)| 00:00:01 |
|*  1 |  HASH JOIN	   |	  |   500 | 49500 |    29   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T2   |    10 |   490 |	3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1   | 10000 |   488K|    26   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

Now the optimizer calculated the correct cardinality. If you check the trace file more closely, you should see join cardinality information comparable to when we had the index in place. The only difference is that this time it's using the extension statistics.

...
  ColGroup (#1, VC) SYS_STUBZH0IHA7K$KEBJVXO5LOHAS
    Col#: 1 2    CorStregth: 20.00
  ColGroup (#1, Index) T2IX
    Col#: 1 2    CorStregth: 10.00

...

Join selectivity using 1 ColGroups: 0.005000 (sel1 = 0.000000, sel2 = 0.000000)
Join Card:  500.000000 = outer (10.000000) * inner (10000.000000) * sel (0.005000)
Join Card - Rounded: 500 Computed: 500.000000

With extended statistics we were able to help the optimizer calculate a cardinality (CDN) of 500. It used the extension NDV in the same way it used the T1IX index statistics.

However, what if T1 were a big table and you had already gathered statistics before creating the extension (column group)? Re-gathering can be time-consuming and can have a negative impact on performance. To address this, we can use the METHOD_OPT parameter of DBMS_STATS.GATHER_TABLE_STATS.
Creating the extension via METHOD_OPT automatically gathers statistics on the column group at the same time.

SQL>
SQL> SELECT column_name, num_distinct, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'T1'
/  

COLUMN_NAME	     NUM_DISTINCT HISTOGRAM
-------------------- ------------ ---------------
C1				1 NONE
N3			     1000 NONE
N2			      200 NONE
N1			       20 NONE

SQL> exec dbms_stats.gather_table_stats(null,'T1',method_opt=>'for all columns size 1 for columns size 1 (N1,N2)');

PL/SQL procedure successfully completed.

SQL> col column_name for a40
SQL> SELECT column_name, num_distinct, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'T1'
/

COLUMN_NAME				 NUM_DISTINCT HISTOGRAM
---------------------------------------- ------------ ---------------
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS			  200 NONE
C1						    1 NONE
N3						 1000 NONE
N2						  200 NONE
N1						   20 NONE

In this case the extension was created automatically and statistics were gathered at the same time.

If you find yourself in a situation where you used DBMS_STATS.CREATE_EXTENDED_STATS after gathering statistics, there are multiple ways to gather statistics on already existing extensions (a short sketch of the last two options follows the list):

1) DBMS_STATS.GATHER_TABLE_STATS
(it will include all virtual columns)

2) METHOD_OPT=>’for columns size … (column group)’
(you can list the actual column group as you do for CREATE_EXTENDED_STATS)

3) METHOD_OPT=>’for columns size … extension_name’
(you can specify the extension name obtained from DBA|ALL|USER_STAT_EXTENSIONS dictionary view)

4) METHOD_OPT=>’for all hidden columns’
(hidden columns are by default included in default DBMS_STATS.GATHER_TABLE_STATS configuration)
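
As a quick illustration of options 3 and 4, here is a minimal sketch based on the extension from this example (the exact extension name comes from the DBA|ALL|USER_STAT_EXTENSIONS view):

SQL> -- option 3: reference the extension (virtual column) by name
SQL> exec dbms_stats.gather_table_stats(null,'T1',method_opt=>'for all columns size 1 for columns size 1 SYS_STUBZH0IHA7K$KEBJVXO5LOHAS');

SQL> -- option 4: gather statistics on all hidden columns (extensions are hidden virtual columns)
SQL> exec dbms_stats.gather_table_stats(null,'T1',method_opt=>'for all hidden columns size 1');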

What’s the difference between METHOD_OPT and DBMS_STATS.CREATE_EXTENDED_STATS?

I believe the difference comes down to purpose: METHOD_OPT is part of a procedure designed for statistics gathering, while extended statistics (column groups), introduced in 11g, are additional functionality layered on top of it. Since gathering statistics is that procedure's main job, using METHOD_OPT creates the extension and gathers its statistics at the same time, whereas CREATE_EXTENDED_STATS only creates the definition.
Perhaps Oracle could have used different naming, something like CREATE_COLUMN_GROUP instead of CREATE_EXTENDED_STATS, which is misleading.

Column Correlation (CorStregth)

Column correlation represents how strongly the columns within a given group are related to each other. It's calculated as the product of the NDVs of all columns in the group divided by the NDV of the column group. This is clearly explained in MultiColumn/Column Group Statistics – Additional Examples (Doc ID 872406.1).

Product of individual NDVs / NDV for the column group.

For example, for the column group on (N1,N2) created earlier: 20 * 200 / 200 = 20, which matches the CorStregth: 20.00 reported in the 10053 trace. If the product of the individual column NDVs is smaller than the CG NDV, the optimizer will ignore the CG (extended) stats. Based on the tests I performed, CorStregth is taken into consideration only when the optimizer has to decide between multiple column groups for which a Full Match can be constructed. In the other cases, when a Partial Match is used, the optimizer picks the column group with the smaller NDV.

Here are two different cases; one where Full Match is constructed and another for Partial Match:

Column Group – Full Match

Let’s examine the first case where we can construct Full Match.

COLUMN_NAME		NUM_DISTINCT
---------------------------------------- ------------
N1						   20
N2						  200
N3						 1000
C1						    1
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS			  200 ---> (N1,N2)
SYS_STUXQEJ8WOFTK5R5EYM4DF_60V			 1000 ---> (N2,N3)

select *
from t1
where t1.n1 = 10
and t1.n2 = 40
and t1.n3 = 80
/

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 |     5 |   270 |   228   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1	 |     5 |   270 |   228   (1)| 00:00:01 |
--------------------------------------------------------------------------

Trace file
...
Access path analysis for T1
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T1[T1]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Column (#1): N1(NUMBER)
    AvgLen: 3 NDV: 20 Nulls: 0 Density: 0.050000 Min: 0.000000 Max: 19.000000
  Column (#2): N2(NUMBER)
    AvgLen: 4 NDV: 200 Nulls: 0 Density: 0.005000 Min: 0.000000 Max: 199.000000
  Column (#3): N3(NUMBER)
    AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 0.000000 Max: 999.000000
  Column (#6): SYS_STUXQEJ8WOFTK5R5EYM4DF_60V(NUMBER)
    AvgLen: 12 NDV: 1000 Nulls: 0 Density: 0.001000
  Column (#5): SYS_STUBZH0IHA7K$KEBJVXO5LOHAS(NUMBER)
    AvgLen: 12 NDV: 200 Nulls: 0 Density: 0.005000
  ColGroup (#1, VC) SYS_STUXQEJ8WOFTK5R5EYM4DF_60V
    Col#: 2 3    CorStregth: 200.00
  ColGroup (#2, VC) SYS_STUBZH0IHA7K$KEBJVXO5LOHAS
    Col#: 1 2    CorStregth: 20.00
  ColGroup Usage:: PredCnt: 3  Matches Full: #1  Partial:  Sel: 0.001000

We have T1 table with two column groups on (N1,N2) and (N2,N3).
In this case the product of the individual NDVs of N1, N2 and N3 is much higher than the column group NDVs, hence the optimizer will calculate the cardinality (CDN) based on the extensions. With predicates on (N1,N2,N3) the optimizer can construct a Full Match for both of the column groups. In this scenario, the cardinality can be expressed as:

1) Using CG #1 (N2,N3): 1/CG(NDV) * 1/N1(NDV) = 1/1000 * 1/20 = 0.001 * 0.05 = 0.00005 * 100000 = 5

2) Using CG #2 (N1,N2): 1/CG(NDV) * 1/N3(NDV) = 1/200 * 1/1000 = 0.005 * 0.001 = 0.000005 * 100000 = 0.5 (rounded to 1)

The example above shows that the optimizer decided to use CG (N2,N3), i.e. SYS_STUXQEJ8WOFTK5R5EYM4DF_60V, because it has the higher CorStregth of 200 (vs. 20 for CG (N1,N2)), as reported by "Matches Full: #1" in the trace.
To show how CorStregth can influence the optimizer, I've changed the N1 distinct value to 210 to increase the CorStregth of CG (N1,N2).
This time, column group (N1,N2) was used in the cardinality estimate.

exec dbms_stats.set_column_stats(null,'T1','N1',distcnt=>210);

Execution Plan
----------------------------------------------------------

Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation	   | Name | Rows  | Bytes | Cost (%CPU)| Time	  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |	  |	1 |    11 |   228   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE    |	  |	1 |    11 |	       |	  |
|*  2 |   TABLE ACCESS FULL| T1   |	1 |    11 |   228   (1)| 00:00:01 |
---------------------------------------------------------------------------



--Trace file
...
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T1[T1]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Column (#1): N1(NUMBER)
    AvgLen: 3 NDV: 210 Nulls: 0 Density: 0.004762 Min: 0.000000 Max: 19.000000
  Column (#2): N2(NUMBER)
    AvgLen: 4 NDV: 200 Nulls: 0 Density: 0.005000 Min: 0.000000 Max: 199.000000
  Column (#3): N3(NUMBER)
    AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 0.000000 Max: 999.000000
  Column (#6): SYS_STUXQEJ8WOFTK5R5EYM4DF_60V(NUMBER)
    AvgLen: 12 NDV: 1000 Nulls: 0 Density: 0.001000
  Column (#5): SYS_STUBZH0IHA7K$KEBJVXO5LOHAS(NUMBER)
    AvgLen: 12 NDV: 200 Nulls: 0 Density: 0.005000
  ColGroup (#2, VC) SYS_STUBZH0IHA7K$KEBJVXO5LOHAS
    Col#: 1 2    CorStregth: 210.00
  ColGroup (#1, VC) SYS_STUXQEJ8WOFTK5R5EYM4DF_60V
    Col#: 2 3    CorStregth: 200.00
  ColGroup Usage:: PredCnt: 3  Matches Full: #2  Partial:  Sel: 0.005000

We can see that the optimizer chose CG #2 based on CorStregth.

Column Group – Partial Match

This second case shows what happens when Full Match can’t be constructed and the optimizer chooses the column group with smaller NDV. I’ve built the same T1 table, but now with Column Groups (N1,N2,N3) and (N1,N2,C1).

COLUMN_NAME		 NUM_DISTINCT
---------------------------------------- ------------
N1						   20
N2						  200
N3						 1000
C1						    1
SYS_STU2NGW2O#$ZX2CDZYOMNMFP64			  200
SYS_STUOYQUEIAZ7FI9DV53VLN$$$0			 1000


--Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 |   500 | 27000 |   228   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1	 |   500 | 27000 |   228   (1)| 00:00:01 |
--------------------------------------------------------------------------



--Trace file
...
Access path analysis for T1
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T1[T1]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Column (#1): N1(NUMBER)
    AvgLen: 3 NDV: 20 Nulls: 0 Density: 0.050000 Min: 0.000000 Max: 19.000000
  Column (#2): N2(NUMBER)
    AvgLen: 4 NDV: 200 Nulls: 0 Density: 0.005000 Min: 0.000000 Max: 199.000000
  Column (#6): SYS_STUOYQUEIAZ7FI9DV53VLN$$$0(NUMBER)
    AvgLen: 12 NDV: 1000 Nulls: 0 Density: 0.001000
  Column (#5): SYS_STU2NGW2O#$ZX2CDZYOMNMFP64(NUMBER)
    AvgLen: 12 NDV: 200 Nulls: 0 Density: 0.005000
  Column (#3): N3(NUMBER)
    AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 0.000000 Max: 999.000000
  Column (#4): C1(VARCHAR2)
    AvgLen: 43 NDV: 1 Nulls: 0 Density: 1.000000
  ColGroup (#1, VC) SYS_STUOYQUEIAZ7FI9DV53VLN$$$0
    Col#: 1 2 3    CorStregth: 4000.00
  ColGroup (#2, VC) SYS_STU2NGW2O#$ZX2CDZYOMNMFP64
    Col#: 1 2 4    CorStregth: 20.00
  ColGroup Usage:: PredCnt: 2  Matches Full:  Partial: #2 (1 2 )  Sel: 0.005000
  Table: T1  Alias: T1

Even though CG #1 has a much higher CorStregth of 4000 (vs. 20), the optimizer used CG #2 because it has the smaller NDV. Let's decrease the NDV of CG #1 to 100 to see whether that will influence the optimizer to use CG #1.
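
The NDV change can be made the same way as before, with DBMS_STATS.SET_COLUMN_STATS applied directly to the virtual column; a minimal sketch using the extension name from this test case:

SQL> -- lower the column group (virtual column) NDV to 100
SQL> exec dbms_stats.set_column_stats(null,'T1','SYS_STUOYQUEIAZ7FI9DV53VLN$$$0',distcnt=>100);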

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation	  | Name | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	 |  1000 | 50000 |   228   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1	 |  1000 | 50000 |   228   (1)| 00:00:01 |
--------------------------------------------------------------------------



--Trace file
...
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T1[T1]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Column (#1): N1(NUMBER)
    AvgLen: 3 NDV: 20 Nulls: 0 Density: 0.050000 Min: 0.000000 Max: 19.000000
  Column (#2): N2(NUMBER)
    AvgLen: 4 NDV: 200 Nulls: 0 Density: 0.005000 Min: 0.000000 Max: 199.000000
  Column (#6): SYS_STUOYQUEIAZ7FI9DV53VLN$$$0(NUMBER)
    AvgLen: 12 NDV: 100 Nulls: 0 Density: 0.010000
  Column (#5): SYS_STU2NGW2O#$ZX2CDZYOMNMFP64(NUMBER)
    AvgLen: 12 NDV: 200 Nulls: 0 Density: 0.005000
  Column (#3): N3(NUMBER)
    AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 0.000000 Max: 999.000000
  Column (#4): C1(VARCHAR2)
    AvgLen: 43 NDV: 1 Nulls: 0 Density: 1.000000
  ColGroup (#1, VC) SYS_STUOYQUEIAZ7FI9DV53VLN$$$0
    Col#: 1 2 3    CorStregth: 40000.00
  ColGroup (#2, VC) SYS_STU2NGW2O#$ZX2CDZYOMNMFP64
    Col#: 1 2 4    CorStregth: 20.00
  ColGroup Usage:: PredCnt: 2  Matches Full:  Partial: #1 (1 2 ) Sel: 0.010000
  Table: T1  Alias: T1

Since CG #1 now has the smaller NDV (100 vs. 200 for #2), the optimizer calculated the cardinality (CDN) based on CG #1 as:
1/NDV (CG) * Original Cardinality = 1/100 * 100000 = 0.01 * 100000 = 1000

Summary

There are situations where dropping a potentially ‘unused’ index can have a negative influence on the optimizer’s cardinality estimation. Consider using extended statistics to make the optimizer aware of the logical relationship between columns. Extended statistics are represented by a virtual (hidden) column. Regarding statistics gathering, note that DBMS_STATS.CREATE_EXTENDED_STATS won’t gather statistics automatically; use METHOD_OPT instead. If you have skewed data, you may want to consider creating proper histograms. When histograms exist on the base columns, you also need to create a histogram on the column group for it to be taken into consideration, because Oracle gives priority to histograms. Column group extended statistics can be used to preserve cardinality stability when dropping an index. While this might not be appropriate for all cases, it’s a viable way to help the optimizer arrive at good cardinality estimates.

Lastly, you can use the DBMS_STATS.SEED_COL_USAGE procedure to monitor column usage in filter predicates, join predicates and GROUP BY clauses. A report can then be generated with DBMS_STATS.REPORT_COL_USAGE, and its output can be used to identify appropriate column groups. I look forward to exploring more on that in a future post.
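
For reference, a minimal sketch of that workflow (the 300-second monitoring window and the table name are just examples):

SQL> -- record column usage for the next 300 seconds of workload
SQL> exec dbms_stats.seed_col_usage(null, null, 300);

SQL> -- report the recorded column usage for T1
SQL> select dbms_stats.report_col_usage(user, 'T1') from dual;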

Categories: DBA Blogs

Reading Execution Plans for Parallel DML

Thu, 2017-02-23 20:36

Parallel execution in Oracle seems to be a topic of myths, mysteries and misunderstandings. In this post I hope to clear up at least one of them by diving into a discussion about parallel DML. Actually, I want to show you two things that can go wrong with it.

My demo setup and case are pretty simple, I just create two copies of dba_objects:

SQL> create table a as select * from dba_objects;

Table created.

SQL> create table b as select * from dba_objects;

Table created.

Now let’s delete from one of the tables with a subselect to the other one, and let’s make it parallel. When, if and how it makes sense to actually use parallel DML for a speedup is not the point of this post. In fact, this is a poor example as the overhead of extra communication and spinning up the processes just made the actual delete run slower.

SQL> delete /*+ PARALLEL(a 42) */ from a where object_id in (select object_id from b);

91277 rows deleted.

SQL> rollback;

And at first glance the execution plan for it seems to confirm that the operation was performed in parallel:

PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID	1q2m84vq594pj, child number 0
-------------------------------------
delete /*+ PARALLEL(a 42) */ from a where object_id in (select
object_id from b)

Plan hash value: 1475376193

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation	 	   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT	   |	      |       |       |   855 (100)|	      |        |      | 	   |
|   1 |  DELETE 		   | A	      |       |       | 	   |	      |        |      | 	   |
|   2 |   PX COORDINATOR	   |	      |       |       | 	   |	      |        |      | 	   |
|   3 |    PX SEND QC (RANDOM)	   | :TQ10002 | 91277 |   891K|   855	(1)| 00:00:01 |  Q1,02 | P->S | QC (RAND)  |
|*  4 |     HASH JOIN SEMI BUFFERED|	      | 91277 |   891K|   855	(1)| 00:00:01 |  Q1,02 | PCWP | 	   |
|   5 |      PX RECEIVE 	   |	      | 91277 |   445K|   427	(1)| 00:00:01 |  Q1,02 | PCWP | 	   |
|   6 |       PX SEND HASH	   | :TQ10000 | 91277 |   445K|   427	(1)| 00:00:01 |  Q1,00 | P->P | HASH	  |
|   7 |        PX BLOCK ITERATOR   |	      | 91277 |   445K|   427	(1)| 00:00:01 |  Q1,00 | PCWC | 	   |
|*  8 | 	TABLE ACCESS FULL  | A	      | 91277 |   445K|   427	(1)| 00:00:01 |  Q1,00 | PCWP | 	   |
|   9 |      PX RECEIVE 	   |	      | 91278 |   445K|   427	(1)| 00:00:01 |  Q1,02 | PCWP | 	   |
|  10 |       PX SEND HASH	   | :TQ10001 | 91278 |   445K|   427	(1)| 00:00:01 |  Q1,01 | S->P | HASH	  |
|  11 |        PX SELECTOR	   |	      |       |       | 	   |	      |  Q1,01 | SCWC | 	   |
|  12 | 	TABLE ACCESS FULL  | B	      | 91278 |   445K|   427	(1)| 00:00:01 |  Q1,01 | SCWP | 	   |
--------------------------------------------------------------------------------------------------------------------

But a closer look reveals that the actual DELETE we wanted to run in parallel is performed only after the parallel part is done with the query: it sits above and to the left of the PX COORDINATOR in the execution plan. Also, because we only specified parallelism for table A, the scan of table B is performed serially. Its results are then distributed to the parallel workers and hash joined against the parallel scan of A. Those results are buffered and returned to the coordinator, which then performs the actual deletes serially. Sounds silly? It is. The second issue is that even though the hint gets ignored for the DELETE part, it is still applied to the query and join, forcing a half-parallel operation.

Enable parallel DML mode

The issue to be aware of with parallel DML in Oracle is that it is disabled by default and you have to explicitly enable it in the session before running parallel DML operations.

And at least DBMS_XPLAN in 12c will also warn you about not using parallel DML in its output:

Note
-----
   - Degree of Parallelism is 42 because of hint
   - PDML is disabled in current session
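
You can also confirm the state of the current session by querying V$SESSION; a quick sketch (PDML_STATUS shows ENABLED, DISABLED or FORCED):

SQL> -- parallel DML, parallel DDL and parallel query status for this session
SQL> select pdml_status, pddl_status, pq_status
  2  from   v$session
  3  where  sid = sys_context('userenv', 'sid');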

So let’s try the same thing again, but enable parallel DML this time:

SQL> alter session enable parallel dml;

Session altered.

SQL> delete /*+ PARALLEL(a 42) */ from a where object_id in (select object_id from b);

91277 rows deleted.

SQL> rollback;

PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID	1q2m84vq594pj, child number 1
-------------------------------------
delete /*+ PARALLEL(a 42) */ from a where object_id in (select
object_id from b)

Plan hash value: 2691386263

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation			 | Name     | Rows  | Bytes | Cost (%CPU)| Time     |	 TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT		 |	    |	    |	    |	438 (100)|	    |	    |	    |		 |
|   1 |  PX COORDINATOR 		 |	    |	    |	    |		 |	    |	    |	    |		 |
|   2 |   PX SEND QC (RANDOM)		 | :TQ10003 | 91277 |	891K|	438   (1)| 00:00:01 |  Q1,03 | P->S | QC (RAND)  |
|   3 |    DELETE			 | A	    |	    |	    |		 |	    |  Q1,03 | PCWP |		 |
|   4 |     PX RECEIVE			 |	    | 91277 |	891K|	438   (1)| 00:00:01 |  Q1,03 | PCWP |		 |
|   5 |      PX SEND HASH (BLOCK ADDRESS)| :TQ10002 | 91277 |	891K|	438   (1)| 00:00:01 |  Q1,02 | P->P | HASH (BLOCK|
|*  6 |       HASH JOIN SEMI BUFFERED	 |	    | 91277 |	891K|	438   (1)| 00:00:01 |  Q1,02 | PCWP |		 |
|   7 |        PX RECEIVE		 |	    | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,02 | PCWP |		 |
|   8 | 	PX SEND HASH		 | :TQ10000 | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,00 | P->P | HASH	 |
|   9 | 	 PX BLOCK ITERATOR	 |	    | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,00 | PCWC |		 |
|* 10 | 	  TABLE ACCESS FULL	 | A	    | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,00 | PCWP |		 |
|  11 |        PX RECEIVE		 |	    | 91278 |	445K|	427   (1)| 00:00:01 |  Q1,02 | PCWP |		 |
|  12 | 	PX SEND HASH		 | :TQ10001 | 91278 |	445K|	427   (1)| 00:00:01 |  Q1,01 | S->P | HASH	 |
|  13 | 	 PX SELECTOR		 |	    |	    |	    |		 |	    |  Q1,01 | SCWC |		 |
|  14 | 	  TABLE ACCESS FULL	 | B	    | 91278 |	445K|	427   (1)| 00:00:01 |  Q1,01 | SCWP |		 |
--------------------------------------------------------------------------------------------------------------------------

Now we see the DELETE operation below the PX COORDINATOR, which means it gets executed in parallel by the parallel workers. B is still read serially because we only specified table A in the hint. Let me just add the 100% parallel plan, using a statement-level PARALLEL(42) hint, for the sake of completeness…

PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID	02w21a0rgz02m, child number 1
-------------------------------------
delete /*+ PARALLEL(42) */ from a where object_id in (select object_id
from b)

Plan hash value: 149866034

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation			 | Name     | Rows  | Bytes | Cost (%CPU)| Time     |	 TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT		 |	    |	    |	    |	 23 (100)|	    |	     |	    |		 |
|   1 |  PX COORDINATOR 		 |	    |	    |	    |		 |	    |	     |	    |		 |
|   2 |   PX SEND QC (RANDOM)		 | :TQ10003 | 91277 |	891K|	 23   (0)| 00:00:01 |  Q1,03 | P->S | QC (RAND)  |
|   3 |    DELETE			 | A	    |	    |	    |		 |	    |  Q1,03 | PCWP |		 |
|   4 |     PX RECEIVE			 |	    | 91277 |	891K|	 23   (0)| 00:00:01 |  Q1,03 | PCWP |		 |
|   5 |      PX SEND HASH (BLOCK ADDRESS)| :TQ10002 | 91277 |	891K|	 23   (0)| 00:00:01 |  Q1,02 | P->P | HASH (BLOCK|
|*  6 |       HASH JOIN SEMI BUFFERED	 |	    | 91277 |	891K|	 23   (0)| 00:00:01 |  Q1,02 | PCWP |		 |
|   7 |        PX RECEIVE		 |	    | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,02 | PCWP |		 |
|   8 | 	PX SEND HASH		 | :TQ10000 | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,00 | P->P | HASH	 |
|   9 | 	 PX BLOCK ITERATOR	 |	    | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,00 | PCWC |		 |
|* 10 | 	  TABLE ACCESS FULL	 | A	    | 91277 |	445K|	 11   (0)| 00:00:01 |  Q1,00 | PCWP |		 |
|  11 |        PX RECEIVE		 |	    | 91278 |	445K|	 11   (0)| 00:00:01 |  Q1,02 | PCWP |		 |
|  12 | 	PX SEND HASH		 | :TQ10001 | 91278 |	445K|	 11   (0)| 00:00:01 |  Q1,01 | P->P | HASH	 |
|  13 | 	 PX BLOCK ITERATOR	 |	    | 91278 |	445K|	 11   (0)| 00:00:01 |  Q1,01 | PCWC |		 |
|* 14 | 	  TABLE ACCESS FULL	 | B	    | 91278 |	445K|	 11   (0)| 00:00:01 |  Q1,01 | PCWP |		 |
--------------------------------------------------------------------------------------------------------------------------
Conclusion

Keep in mind that parallel DML has to be enabled explicitly at the session level. And when reviewing execution plans for parallel DML, make sure the UPDATE, INSERT or DELETE operation is below the PX COORDINATOR, or else you will serialize again before performing the actual operation.

Categories: DBA Blogs

Oracle New Public Cloud Licensing Policy – Good or Bad?

Wed, 2017-02-22 11:05

A little while ago after a question from a customer about supporting Oracle products on the Google Cloud Platform (GCP) I decided to take a look for any updates to the Oracle public cloud support policies. The document can be easily found on the Oracle website. I quickly noticed some significant changes in the new version of the document. More specifically, I’m referring to the changes that came with the latest version of that document dated January 23, 2017.

In this blog I will attempt to summarize the findings and my thoughts about the matter. Before proceeding any further, let’s begin with a safe harbor notice (as Oracle does) and mention that I do not work for Oracle and I am not a licensing expert. Some of my observations may be incorrect and everyone reading this post is strongly encouraged to make a fully informed decision only after consultation with an Oracle sales representative. After reading the Oracle licensing policy, you should bear in mind that the document provides only guidelines and it is published for “education purposes only”.

So, what do we have in the new edition? The document shares details about the Oracle licensing policy on public cloud environments, including the AWS EC2, AWS RDS and Microsoft Azure platforms. Alas, there is still no mention of Google Cloud Platform (GCP). This leaves GCP in uncharted territory: even though the policy doesn't explicitly prohibit you from moving your Oracle products to GCP, it makes it difficult to estimate the impact and cost.

The first paragraph has a link to a listing of all Oracle products the policy applies to. Notably, the document explicitly lists almost all Oracle Database Enterprise Edition options and packs except Oracle RAC and Multitenant. While the absence of Oracle RAC may have some technical justification, the exclusion of the Multitenant option doesn't make much sense to me.

The next paragraph reveals a lot of changes. The new version of the document officially recognizes an AWS vCPU as a thread, not as a core. Prior to January 23, 2017, we used to rely on an AWS-published document showing vCores by instance type for licensing calculations, and it was widely used even though it was never officially provided by Oracle. People took the number of cores from that document and applied the Oracle multi-core factor table on top of the core count. There was never a similar document for Azure, so an Azure vCPU was simply treated as a vCore, with the same calculation using the multi-core factor table. The new version of the Oracle document now states explicitly that two vCPUs on AWS have the same licensing cost as one vCPU on Azure.

It’s explicitly stated in the document that either two vCPUs on AWS or one vCPU on Azure are equivalent to one Oracle processor license. Another statement confirms that, from now on, the Oracle multi-core factor no longer applies to the mentioned public cloud environments. This can seriously impact people migrating, or planning to migrate, to AWS or Azure under the Bring Your Own License (BYOL) policy. They may now find themselves in a difficult position: either plan the migration to a smaller environment or increase the licensing cost. In this case, it's important to keep in mind that the size of an AWS EC2 instance has a direct impact not only on CPU power but also on the maximum available I/O at the storage layer.
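
To make the arithmetic concrete, here is an illustrative BYOL calculation for Enterprise Edition (the instance size and core counts are my own example, not taken from the policy document):

   EC2 instance with 16 vCPUs:           16 vCPUs / 2 = 8 processor licenses (no core factor)
   On-premises, 8 Intel cores (x 0.5):   8 cores * 0.5 = 4 processor licenses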

Additionally, there is now a section containing formally defined rules for Oracle Database Standard Edition and Standard Edition 2. According to the paper, every four vCPUs on AWS or two vCPUs on Azure count as one socket. This means that you cannot have more than 8 AWS vCPUs for a Standard Edition 2 (SE2) license. Standard Edition (SE) allows you to have a 16 vCPU machine, and hence still provides more flexibility when sizing EC2 or RDS. Unfortunately, SE is only available up to version 12.1.0.1 and support for that release is coming to an end. So what can still be used for Oracle SE2 on AWS? We can get one of the *.2xlarge instances on EC2 or pick a similarly sized RDS instance. Is that going to be big enough? That depends on your workload profile, but again, keep in mind the I/O limitations per instance type: a maximum of 8000 IOPS per instance. Not a small number, but you will need to measure and identify your I/O requirements before going to the cloud.

On one hand, the new policies are much clearer and more direct than they used to be, and I believe that the clarification is good. It is always easier to plan your implementations and budget when you are confident about what to expect. On the other hand, it looks like we have to pay twice as much in licensing fees when moving to AWS or Azure compared with any bare metal or OVM environment on premises. Will it make Oracle products more attractive for customers? I have my doubts. Will it make the Oracle Cloud a more interesting target platform for cloud migrations? Possibly. That is likely the main goal of Oracle, but we've yet to see if it works out for them as expected. I liked it when Oracle made Oracle Express Edition (XE) available for everyone for free, and when Oracle Standard Edition came with the RAC option at no additional cost. While I don't have any official numbers, I think that Express Edition and RAC included with SE turned many customers onto Oracle products. However, I'm afraid that the new licensing policy for cloud may do the opposite, turn some people away from Oracle, and consequently play out badly for Oracle in the long term.

Categories: DBA Blogs

Log Buffer #505: A Carnival of the Vanities for DBAs

Tue, 2017-02-21 15:16

This Log Buffer Edition searches through various blogs of Oracle, SQL Server and MySQL and picks a few contemporary ones.

Oracle:

Comma separated search and search with check-boxes in Oracle APEX

Once you have defined your users for your Express Cloud Service, all users with the role of Database Developer or higher can access the database Service Console.

Big Data Lite 4.7.0 is now available on OTN!

Install and configure Oracle HTTP Server Standalone

Can I Customize EBS on Oracle Cloud?

SQL Server:

vCenter Server fails to start, Purge and Shrink Vcenter SQL Database

Introducing a DevOps Workgroup to Farm Credit Services of America

Scaling out SSRS on SQL Server Standard Edition

Using AT TIME ZONE to fix an old report

How to import data to Azure SQL Data Warehouse using SSIS

MySQL:

MySQL Bug 72804 Workaround: “BINLOG statement can no longer be used to apply query events”

Sysadmin 101: Troubleshooting

Making MaxScale Dynamic: New in MariaDB MaxScale 2.1

Improving the Performance of MariaDB MaxScale

Group Replication: Shipped Too Early

Categories: DBA Blogs

Backup Oracle Databases to AWS S3

Tue, 2017-02-21 10:17

There are different options for backing up Oracle databases to the cloud, but using the Oracle Secure Backup module to take backups into AWS S3 is one of the most efficient methods in terms of cost and backup/restore performance.

In this post I will show you how to install, configure and use Oracle Secure Backup to take your Oracle database backups to AWS S3. This method can be used for Oracle database version 9.2 or higher.

In this example, database version is 12c and platform is Linux x86_64.

The Oracle Secure Backup module must be installed into the database Oracle Home. Using the installed libraries, you can then take backups via RMAN into AWS S3 the same way you back up to sbt_tape.

Requirements:

1- An AWS account and an IAM user with access to S3:

For setting up backups to AWS you will need an AWS account and an IAM user with full access to AWS S3. During setup, the Access Key ID and Secret Access Key of this IAM user will be used. There is no need to have access to the AWS Console.

You can use the AWS Free Tier for test purposes.

2- Oracle Secure Backup module for AWS:
You can download Oracle Secure Backup module for AWS from here

3- OTN account:
During installation you need to provide an OTN account.

4- Java 1.7 or higher:
Java 1.7 or higher must be installed on your server before you can proceed.

Installation:

1- Create Oracle Wallet Directory:

If the Oracle Wallet directory does not exist, create one. This folder will be used to store the AWS Access Key ID and Secret Access Key.
Create this directory in $ORACLE_HOME/dbs/:


   $ cd $ORACLE_HOME/dbs/
   $ mkdir osbws_wallet

2- Download osbws_installer.zip from the link provided above and put it in your installation folder, in this example /mnt/stage/osb. Unzip the compressed file and you will get two files as shown below:


   $ pwd
   /mnt/stage/osb
   $ unzip osbws_installer.zip
   Archive:  osbws_installer.zip
     inflating: osbws_install.jar
     inflating: osbws_readme.txt
   $ ls
   osbws_installer.zip  osbws_install.jar  osbws_readme.txt

3- Install OSB Cloud Module for Amazon S3 into your Oracle Home:


   $ cd /mnt/stage/osb
   $ java -jar osbws_install.jar -AWSID XxXxX -AWSKey XxXxX -walletDir $ORACLE_HOME/dbs/osbws_wallet -libDir $ORACLE_HOME/lib -location ap-southeast-2 -awsEndPoint s3-ap-southeast-2.amazonaws.com -otnUser bakhshandeh@pythian.com -otnPass

Parameters that you will need to set for installation are as below:


  -AWSID:       AWS Access Key

  -AWSKey:      AWS Secret Access Key

  -walletDir:   Location where Backup Module will store AWS keys

  -libDir:      Location where Backup Module libraries will be installed

  -location:    This is the AWS S3 location where you want to put your backups.
                The value for this parameter must be a valid region from the Amazon regions.
                In this example "ap-southeast-2", the region for "Asia Pacific (Sydney)", has been used.

  -awsEndPoint: This should be a valid endpoint in the AWS region specified by the "location" parameter.
                In this example "s3-ap-southeast-2.amazonaws.com" has been used, which is one of the endpoints in "Asia Pacific (Sydney)".

  -otnUser:     OTN Account

  -otnPass:     OTN Password

In my example I did not pass any value for the -otnPass parameter, and this was the only workaround I found for the error noted below during my tests:


   Downloading Oracle Secure Backup Web Service Software Library from file osbws_linux64.zip.
   Error: Library download failed. Please check your network connection to Oracle Public Cloud.

When I encountered this error, the only fix was to pass no value for -otnPass, but supplying the password might work for you.

Running Backup using RMAN:

Installation will create a file in $ORACLE_HOME/dbs which is usually named osbws&lt;SID&gt;.ora, and you need to use the full path of this file in your allocate channel command in RMAN.

In my example SID is KAMRAN


   $ cd $ORACLE_HOME/dbs
   $ pwd
   /apps/oracle/product/12.1.0.2/db_1/dbs
   $ ls -al osb*.ora
   -rw-r--r-- 1 oracle oinstall 194 Jan  5 11:31 osbwsKAMRAN.ora
   $

Content of this file is as below:


   $ cat osbwsKAMRAN.ora
   OSB_WS_HOST=http://s3-ap-southeast-2.amazonaws.com
   OSB_WS_LOCATION=ap-southeast-2
   OSB_WS_WALLET='location=file:/apps/oracle/product/12.1.0.2/db_1/dbs/osbws_wallet CREDENTIAL_ALIAS=gordon-s_aws'
   $

This file can be used for any other database in the same Oracle Home. For this reason, I renamed it to osbwsCONFIG.ora so that the name is generic and there is no dependency on any particular database.

mv osbwsKAMRAN.ora osbwsCONFIG.ora

I will use osbwsCONFIG.ora in RMAN channel settings.

Now you just need to allocate a channel for your backup/restore commands, referencing the file above, as shown below:


   allocate channel c1 device type sbt parms='SBT_LIBRARY=libosbws.so,SBT_PARMS=(OSB_WS_PFILE=/apps/oracle/product/12.1.0.2/db_1/dbs/osbwsCONFIG.ora)';
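
Alternatively, if you don't want to allocate the channel manually in every run block, you can persist these settings in the RMAN configuration. A minimal sketch using the same parameters:

   RMAN> CONFIGURE DEFAULT DEVICE TYPE TO sbt;
   RMAN> CONFIGURE CHANNEL DEVICE TYPE sbt PARMS 'SBT_LIBRARY=libosbws.so,SBT_PARMS=(OSB_WS_PFILE=/apps/oracle/product/12.1.0.2/db_1/dbs/osbwsCONFIG.ora)';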

This is a complete example which shows backup piece details and how they have been located in AWS S3 regions you specified during installation:


   $ . oraenv
   KAMRAN
   $ rman target /

   Recovery Manager: Release 12.1.0.2.0 - Production on Wed Dec 28 11:21:48 2016

   Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

   connected to target database: KAMRAN (DBID=283560064)

   RMAN>run{
   2> allocate channel c1 device type sbt parms='SBT_LIBRARY=libosbws.so,SBT_PARMS=(OSB_WS_PFILE=/apps/oracle/product/12.1.0.2/db_1/dbs/osbwsCONFIG.ora)';
   3> backup datafile 1;
   4> }

   released channel: ORA_DISK_1
   allocated channel: c1
   channel c1: SID=374 instance=KAMRAN device type=SBT_TAPE
   channel c1: Oracle Secure Backup Web Services Library VER=3.16.11.11

   Starting backup at 28-DEC-16
   channel c1: starting full datafile backup set
   channel c1: specifying datafile(s) in backup set
   input datafile file number=00001 name=+DATA/KAMRAN/DATAFILE/system.258.887023011
   channel c1: starting piece 1 at 28-DEC-16
   channel c1: finished piece 1 at 28-DEC-16
   piece handle=09rojka7_1_1 tag=TAG20161228T112807 comment=API Version 2.0,MMS Version 3.16.11.11
   channel c1: backup set complete, elapsed time: 00:00:45
   channel c1: starting full datafile backup set
   channel c1: specifying datafile(s) in backup set
   including current control file in backup set
   including current SPFILE in backup set
   channel c1: starting piece 1 at 28-DEC-16
   channel c1: finished piece 1 at 28-DEC-16
   piece handle=0arojkbl_1_1 tag=TAG20161228T112807 comment=API Version 2.0,MMS Version 3.16.11.11
   channel c1: backup set complete, elapsed time: 00:00:07
   Finished backup at 28-DEC-16
   released channel: c1

   RMAN> list backup tag TAG20161228T112807;


   List of Backup Sets
   ===================


   BS Key  Type LV Size       Device Type Elapsed Time Completion Time
   ------- ---- -- ---------- ----------- ------------ ---------------
   9       Full    741.75M    SBT_TAPE    00:00:38     28-DEC-16
           BP Key: 9   Status: AVAILABLE  Compressed: NO  Tag: TAG20161228T112807
           Handle: 09rojka7_1_1   Media: s3-ap-southeast-2.amazonaws.com/oracle-data-gordonsm-ap1
     List of Datafiles in backup set 9
     File LV Type Ckp SCN    Ckp Time  Name
     ---- -- ---- ---------- --------- ----
     1       Full 58915843   28-DEC-16 +DATA/KAMRAN/DATAFILE/system.258.887023011

   BS Key  Type LV Size       Device Type Elapsed Time Completion Time
   ------- ---- -- ---------- ----------- ------------ ---------------
   10      Full    22.50M     SBT_TAPE    00:00:01     28-DEC-16
           BP Key: 10   Status: AVAILABLE  Compressed: NO  Tag: TAG20161228T112807
           Handle: 0arojkbl_1_1   Media: s3-ap-southeast-2.amazonaws.com/oracle-data-gordonsm-ap1
     SPFILE Included: Modification time: 26-DEC-16
     SPFILE db_unique_name: KAMRAN
     Control File Included: Ckp SCN: 58915865     Ckp time: 28-DEC-16

   RMAN>
Some performance statistics:

I used a 340G database for testing performance and tried full backups into AWS S3 using different numbers of channels.
First, I allocated two channels and the backup to AWS completed in 48 minutes. I then tried four channels, and the backup completed in 27 minutes.

I expected that increasing the number of channels to eight would make the backup complete even faster. Surprisingly, with eight channels the backup completed in 27 minutes, exactly the same result as with four channels.
So in my case, the optimum number of channels for taking backups to AWS S3 was four.
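
For reference, a four-channel full backup would look roughly like this (a sketch based on the channel settings shown earlier):

   run {
     allocate channel c1 device type sbt parms='SBT_LIBRARY=libosbws.so,SBT_PARMS=(OSB_WS_PFILE=/apps/oracle/product/12.1.0.2/db_1/dbs/osbwsCONFIG.ora)';
     allocate channel c2 device type sbt parms='SBT_LIBRARY=libosbws.so,SBT_PARMS=(OSB_WS_PFILE=/apps/oracle/product/12.1.0.2/db_1/dbs/osbwsCONFIG.ora)';
     allocate channel c3 device type sbt parms='SBT_LIBRARY=libosbws.so,SBT_PARMS=(OSB_WS_PFILE=/apps/oracle/product/12.1.0.2/db_1/dbs/osbwsCONFIG.ora)';
     allocate channel c4 device type sbt parms='SBT_LIBRARY=libosbws.so,SBT_PARMS=(OSB_WS_PFILE=/apps/oracle/product/12.1.0.2/db_1/dbs/osbwsCONFIG.ora)';
     backup database;
   }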

I should mention that the same database, when backed up to NFS disks using four channels, completed in 22 minutes, so a backup time of 27 minutes to AWS was acceptable.

Restore was even faster. I tried a restore without recovering the database; a full restore of the same 340G database from the AWS backups completed in 22 minutes, which again is acceptable.

Categories: DBA Blogs

Log Buffer #504: A Carnival of the Vanities for DBAs

Tue, 2017-02-14 19:15

This edition of Log Buffer covers Oracle, Microsoft SQL Server, and MySQL.

Oracle:

Data Pump or Data Pain Part02 – tablespace/user

Monday Spotlight: It’s About Security – Oracle Exadata SL6

0001_specify_date_format_mask.md copy

OGG Custom Adapters: How to include a unique identifier for every record in custom adapter?

One of the optimizer enhancements that appeared in 12.2 for SQL is the “band join” that makes certain types of merge join much more efficient.

Microsoft SQL Server:

How to import data to Azure SQL Data Warehouse using SSIS

Using Index-Organized Tables in Oracle

Query Performance Tuning – A Methodical Approach

SQL Server Temporal Tables: How-To Recipes

Fix Commvault Backup Failures After an Upgrade

MySQL:

MySQL may return results in non-deterministic order with ‘order by’

Shinguz: MySQL and MariaDB authentication against pam_unix

MySQL in the Cloud – Pros and Cons of Amazon RDS

With 500+ VividCortex Users, Shopify Eliminates High Latency Queries From Redis and MySQL

An Introduction to MariaDB’s Data at Rest Encryption (DARE)

Categories: DBA Blogs
