Skip navigation.

Yann Neuhaus

Syndicate content
All blog entries from
Updated: 8 hours 57 min ago

SQL Server 2014: Readable Secondary Replica cannot access In-Memory objects

Sun, 2014-02-09 21:09

In-Memory OLTP is one of the best new functionalities of SQL Server 2014 - in the same way as AlwaysOn with SQL Server 2012. Indeed, In-Memory OLTP is fully integrated with AlwaysOn and also with AlwaysOn Availability Group. During a test of AlwaysOn Availability Group under SQL Server 2014 CTP2, I discovered that a Readable Secondary Replica was not able to access the In-Memory OLTP objects. I have a Readable Secondary Replica named pc1 with a database containing two tables: one disk table and one In-Memory table.




When I try to query the disk table to discover how many rows it contains, I receive the following result:




But if I try to access my In-Memory table, it fails...




Apparently, it is impossible to access In-Memory objects from a Readable Secondary replica.

Bad news! The Readable Secondary feature is very interesting for reporting and if we cannot query all objects from our database, it is really restricted.

After some research I found the solution. A trace flag will solve this issue. To enable it, you have two solutions, as usual:


Via the SQL Server Configuration Manager

  • Open it
  • Navigate to SQL Server Services
  • Right click on your SQL Server instance name and select Properties
  • In pane "Startup Parameters", specify a startup parameter, here: -T9989
  • Click the Add button and Apply
  • You will have to restart your SQL Server Services



Via script

  • Open a new query screen
  • Type: DBCC TRACEON(9989,-1) to enable the trace flag for the complete instance
  • Type: DBCC TRACEON(9989) to enable the trace flag for the current session
  • Execute the script


Now, our new trace flag is enabled and we are able to query the Memory Optimized Tables and also use the Natively Compiled Stored Procedure.




Hopefully there is a way to bypass this restriction and use the full functionality of Readable Secondary.

I will try to go through all new available trace flags for SQL Server 2014 in my following blog postings ;-)

Oracle is hanging? Don't forget hanganalyze !

Thu, 2014-02-06 20:16
sqlplus / as sysdba
 oradebug setmypid
 oradebug unlimit
 oradebug hanganalyze 3
 oradebug dump ashdumpseconds 30
 oradebug systemstate 266
 oradebug tracefile_name

Your Oracle database - production DB, of course - is hanging. All users are blocked. You quickly check the obvious suspects (archivelog destination full, system swapping, etc.) but it's something else. Even you, the Oracle DBA, cannot do anything: any select is hanging. And maybe you're even not able to connect with a simple 'sqlplus / as sysdba'.

What do you do ? There may be several ways to investigate deeper (strace or truss for example) but it will take time. And your boss is clear: the only important thing is to get the production running again as soon as possible. No time to investigate. SHUTDOWN ABORT and restart.

Ok, but now that everything is back to normal, your boss rules has changed: the system was down for 15 minutes. We have to provide an explanation. Root Cause Analysis.

But how will you investigate now ? You have restarted everything, so all V$ information is gone. You have Diagnostic Pack ? But the system was hanged: no ASH information went to disk. You can open an SR but what information will you give?

Hang Analyze

The next time it happens, you need to have a way to get some information that can be analyzed post mortem. But you need to be able to do that very quickly just before your boss shouts 'shutdown abort now'. And this is why I've put it at the begining of the post, so that you can find it quickly if you need it...

That takes only a few seconds to generate all post-mortem necessary information. If you can take 1 more minute, you will even be able to read the first lines of hanganalyze output, and you will be able to identify a true hanging situation and maybe just kill the root of the blocking sessions instead of a merciless restart.

In order to show you the kind of output you get, I've run a few jobs locking the same resources (TM locks) - which is not a true hanging situation because the blocking session can resolve the situation.

Here is the first lines from the oradebug hanganalyze:

Chains most likely to have caused the hang:
 [a] Chain 1 Signature: 'PL/SQL lock timer'<='enq: TM - contention'<='enq: TM - contention'
     Chain 1 Signature Hash: 0x3b645f57
 [b] Chain 2 Signature: 'PL/SQL lock timer'<='enq: TM - contention'<='enq: TM - contention'
     Chain 2 Signature Hash: 0x3b645f57
 [c] Chain 3 Signature: 'PL/SQL lock timer'<='enq: TM - contention'<='enq: TM - contention'
     Chain 3 Signature Hash: 0x3b645f57
You know what is at the top of the blocking chain. Here the first job that locked the table was idle, using dbms_lock.sleep.
And just below you have all information about the sessions involved:
Chain 1:
    Oracle session identified by:
                instance: 1 (demo.demo)
                   os id: 7929
              process id: 42, oracle@vboxora12c (J002)
              session id: 23
        session serial #: 7
    is waiting for 'enq: TM - contention' with wait info:
                      p1: 'name|mode'=0x544d0003
                      p2: 'object #'=0x1737c
                      p3: 'table/partition'=0x0
            time in wait: 3.142454 sec
           timeout after: 36.857546 sec
                 wait id: 10
                blocking: 0 sessions
            wait history:
              * time between current wait and wait #1: 0.027475 sec
              1.       event: 'jobq slave wait'
                 time waited: 0.459162 sec
                     wait id: 9
              * time between wait #1 and #2: 0.000026 sec
              2.       event: 'jobq slave wait'
                 time waited: 0.500681 sec
                     wait id: 8
              * time between wait #2 and #3: 0.000030 sec
              3.       event: 'jobq slave wait'
                 time waited: 0.500928 sec
                     wait id: 7
    and is blocked by [...]
System State That was very useful information. We can have more (more process dumps) with higher level. But if we have a serious situation where we will restart the database, It's better to get a systemstate dump with level 266 - so that you have something comprehensive to provide to Oracle Support.
Systemstate has all information about System Objects (sessions, processes, ...) but you have to navigate into it in order to understand the wait chain. In my example:
SO: 0x914ada70, type: 4, owner: 0x91990478, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
 proc=0x91990478, name=session, file=ksu.h LINE:13580, pg=0 conuid=0
(session) sid: 23 ser: 7 trans: 0x8ea8e3e8, creator: 0x91990478
service name: SYS$USERS
client details:
O/S info: user: oracle, term: UNKNOWN, ospid: 7929
machine: vboxora12c program: oracle@vboxora12c (J002)
Current Wait Stack:
 0: waiting for 'enq: TM - contention'
    name|mode=0x544d0003, object #=0x1737c, table/partition=0x0
    wait_id=10 seq_num=11 snap_id=1
    wait times: snap=15.991474 sec, exc=15.991474 sec, total=15.991474 sec
    wait times: max=40.000000 sec, heur=15.991474 sec
    wait counts: calls=6 os=6
    in_wait=1 iflags=0x15a0
There is at least one session blocking this session.
Dumping 1 direct blocker(s):
  inst: 1, sid: 254, ser: 5
Dumping final blocker:
  inst: 1, sid: 256, ser: 5
This is a session that is waiting, and we have the final blocker: inst: 1, sid: 256, ser: 5

Then we get to the final blocker by searching the sid: 256:

SO: 0x9168a408, type: 4, owner: 0x9198d058, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
 proc=0x9198d058, name=session, file=ksu.h LINE:13580, pg=0 conuid=0
(session) sid: 256 ser: 5 trans: 0x8ea6b618, creator: 0x9198d058
service name: SYS$USERS
client details:
O/S info: user: oracle, term: UNKNOWN, ospid: 7925
machine: vboxora12c program: oracle@vboxora12c (J000)
Current Wait Stack:
 0: waiting for 'PL/SQL lock timer'
    duration=0x0, =0x0, =0x0
    wait_id=0 seq_num=1 snap_id=1
    wait times: snap=25.936165 sec, exc=25.936165 sec, total=25.936165 sec
    wait times: max=50.000000 sec, heur=25.936165 sec
    wait counts: calls=1 os=9
    in_wait=1 iflags=0x5a0
There are 5 sessions blocked by this session.
Dumping one waiter:
  inst: 1, sid: 254, ser: 5
  wait event: 'enq: TM - contention'
    p1: 'name|mode'=0x544d0004
    p2: 'object #'=0x1737c
    p3: 'table/partition'=0x0
  row_wait_obj#: 95100, block#: 0, row#: 0, file# 0
  min_blocked_time: 19 secs, waiter_cache_ver: 44

Analysing the System State takes much longer than the hanganalyze, but it has more information.


When the blocking situation is not so desesperate, but you just want to see what is blocking, the hanganalyze information is also available online in V$WAIT_CHAINS. The advantage over ASH is that you see all processes (not only foreground, not only active ones).

Here is an example:

CHAIN_ID CHAIN CHAIN_SIGNATURE INSTANCE OSID PID SID BLOCK 1 FALSE 'PL/SQL lock timer' <='enq: TM - contention' <='enq: TM - contention' 1 7929 42 23 TRUE 1 FALSE 'PL/SQL lock timer' <='enq: TM - contention' <='enq: TM - contention' 1 7927 41 254 TRUE 1 FALSE 'PL/SQL lock timer' <='enq: TM - contention' <='enq: TM - contention' 1 7925 39 256 FALSE 2 FALSE 'PL/SQL lock timer' <='enq: TM - contention' <='enq: TM - contention' 1 7933 46 25 TRUE 3 FALSE 'PL/SQL lock timer' <='enq: TM - contention' <='enq: TM - contention' 1 7931 45 260 TRUE 4 FALSE 'PL/SQL lock timer' <='enq: TM - contention' <='enq: TM - contention' 1 7935 47 262 TRUE ASH Dump

There is something else that you can get if you have Diagnostic Pack. The ASH information can be dumped to trace file even if it cannot be collected in the database.

oradebug dump ashdumpseconds 30

that will gather ASH from latest 30 seconds, and the trace file will even have the sqlldr ctl file to load it in an ASH like table.

sqlplus -prelim

But what can you do if you can't even connect / as sysdba ? There is the 'preliminary connection' that does not create a session: sqlplus -prelim / as sysdba
With that you will be able to get a systemstate. You will be able to get a ashdump. But unfortunately, since you cannot get a hanganalyze:

ERROR: Can not perform hang analysis dump without a process state object and a session state object.

But there is a workaround for that (from Tanel Poders's blog): try to use a session that is already connected.
For exemple I use the DIAG background process (it's better not to use vital processes for that)

SQL> oradebug setorapname diag
Oracle pid: 8, Unix process pid: 7805, image: oracle@vboxora12c (DIAG)
Core message Even in hurry,
  • Always check an hanganalyze to understand the problem.
  • Always get a systemstate before a shutdown abort.
and you will have information to investigate later, or to provide to Oracle Support.

SQL Server 2014: Rebuilding Clustered Columnstore Indexes

Thu, 2014-01-30 01:35

I was surprised that you have to rebuild indexes that are stored In Memory. My previous posting "SQL Server 2014 - New Features: xVelocity memory optimized columnstore index" explains all processes connected with columnstore indexes. In this article, I will explain why, when, and how to rebuild them.

Archivelog deletion policy for Standby Database in Oracle Data Guard

Mon, 2014-01-27 18:56

Do you use ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY for your Oracle databases in Data Guard? Maybe you also use the Fast Recovery Area as archive log destination. That's good practice! But did you ever check that it works as expected?

What I mean is this:

  • The archived logs that you don't need are reclaimable by the FRA when space is needed
  • And the archived logs that are required for availability (standby or backup) are not deleted.

It's not an easy thing to check because Oracle doesn't show which archive log is reclaimable. Only the total reclaimable space is shown in v$recovery_area_usage. But that is not sufficient to validate which archivelog sequence is concerned. I'll show you below a query that returns the reclaimable status from the archived logs. And you will see that until 12c the APPLIED ON ALL STANDBY does not work as expected. You've probably seen a FRA full at standby site and solved it by deleting archived logs. But this is not the right solution because the FRA is supposed to do that.

Let's look at an example I encountered recently. The archivelog deletion policy is set correctly:


RMAN> show archivelog deletion policy;

RMAN configuration parameters for database with db_unique_name DATABASE_SITE2 are:


This configuration, an 11g feature, allows to delete an archive log as soon as it is applied to all standby destinations. Note that it works if I manually do a 'delete archivelog all;' but I expect that the archivelogs in the FRA becomes reclaimable automatically.

Unfortunately, this is not the case and the FRA is growing:


SQL> select * from v$recovery_area_usage where file_type='ARCHIVED LOG';

-------------------- ------------------ ------------------------- ---------------
ARCHIVED LOG                      61.11                     43.02             467


Let's check everything. We are on the standby database:


SQL> select open_mode,database_role from v$database;

-------------------- ----------------


The archivelogs are going to the Fast Recovery Area:


SQL> show parameter log_archive_dest_1
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_1                   string      location=USE_DB_RECOVERY_FILE_
                                                 DEST, valid_for=(ALL_LOGFILES,


All archived logs are applied (we are in SYNC AFFIRM):


DGMGRL> show database 'DATABASE_SITE2';


  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds
  Apply Lag:       0 seconds
  Real Time Query: OFF

Database Status:


Well, with that configuration, I expect that all archivelogs are reclaimable - except the current one.

Let's investigate. V$RECOVERY_AREA_USAGE is an aggregate view. If we check its definition, we see that the reclaimable size comes from x$kccagf.rectype.

So I'll use it in in conjunction with v$archived_log in order to give the detail about which archived logs are reclaimable:


SQL> select applied,deleted,decode(rectype,11,'YES','NO') reclaimable
     from v$archived_log left outer join sys.x$kccagf using(recid) 
     where is_recovery_dest_file='YES' and name is not null
     group by applied,deleted,decode(rectype,11,'YES','NO') order by 5

--------- ------- ----------- ---------- -------------- --------------
YES       NO      YES                429           5938           6366
YES       NO      NO                  37           6367           6403
IN-MEMORY NO      NO                   1           6404           6404


The problem is there: Because of a bug (Bug 14227959 : STANDBY DID NOT RELEASE SPACE IN FRA) the archivelogs are not marked as reclaimable when the database is in mount mode.

The workaround is to execute dbms_backup_restore.refreshagedfiles. This is what must be scheduled (maybe daily) on the standby. It can be a good idea to do it at the same time as a daily 'delete obsolete', so here is the way to call it from RMAN:

RMAN> sql "begin dbms_backup_restore.refreshagedfiles; end;";

But I've found another workaround: just run the CONFIGURE ARCHIVELOG POLICY from RMAN as it refreshes the reclaimable flag - even when there is no change.

Then, you can run a daily job that does CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY on your database, whatever the role is.

It's different for the database where you do the backup, because you want to be sure that the backup is done before an archivelog is deleted: CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY BACKED UP 1 TIMES TO DISK;

This is a good way to prevent anyone from changing the configuration and keep it close to the backup scripts. At dbi services, we advise to keep the same configuration on all Data Guard sites for the same database so that a switchover can be done without any problem. For this reason, having a script that depends on the place where the backups are done is a better alternative than a configuration that depends on the database role.

Finally, here is the state of our reclaimable archivelogs after any of these solutions:


--------- ------- ----------- ---------- -------------- --------------
YES       NO      YES                466           5938           6403
IN-MEMORY NO      NO                   1           6404           6404

-------------------- ------------------ ------------------------- ---------------
ARCHIVED LOG                      61.11                     61.09             467


All applied archived logs are reclaimable and the FRA will never be full.

You can check the primary as well. Are you sure that Oracle will never delete an archived log that has not been backed up ? Check your deletion policy.

Here is the full query I use for that:


column deleted format a7
column reclaimable format a11
set linesize 120
select applied,deleted,backup_count
 ,decode(rectype,11,'YES','NO') reclaimable,count(*)
 ,to_char(min(completion_time),'dd-mon hh24:mi') first_time
 ,to_char(max(completion_time),'dd-mon hh24:mi') last_time
 ,min(sequence#) first_seq,max(sequence#) last_seq
from v$archived_log left outer join sys.x$kccagf using(recid)
where is_recovery_dest_file='YES'
group by applied,deleted,backup_count,decode(rectype,11,'YES','NO') order by min(sequence#)


This is the result on primary where the last archivelog backup has run around 21:00

--------- ------- ------------ ----------- -------- ------------ ------------ --------- --------
NO        YES                1 NO               277 15-jan 17:56 19-jan 09:49      5936     6212
NO        NO                 1 YES              339 19-jan 10:09 22-jan 21:07      6213     6516
NO        NO                 0 NO                33 22-jan 21:27 23-jan 07:57      6517     6549

That is fine according to my policy APPLIED ON ALL STANDBY BACKED UP 1 TIMES TO DISK

And here is the result on standby where the workaround job has run around 06:00 and redo apply is in SYNC

--------- ------- ------------ ----------- -------- ------------ ------------ --------- --------
YES       YES                0 NO               746 07-jan 13:27 17-jan 11:17      5320     6065
YES       NO                 0 YES              477 17-jan 11:37 23-jan 05:37      6066     6542
YES       NO                 0 NO                 8 23-jan 05:57 23-jan 08:14      6543     6550
IN-MEMORY NO                 0 NO                 1 23-jan 08:15 23-jan 08:15      6551     6551

This is good for my policy APPLIED ON ALL STANDBY - except that because of the bug mentioned above, redo applied since 06:00 are not yet reclaimable.


Oracle LOBs: Infer the file type with dbms_lob package

Thu, 2014-01-16 20:06

LOBs (Large OBjects) first appeared in Oracle 7 and were created to store large amount of data such as document files, video, pictures, etc. Today, we can store up to 128 TB of data in a LOB, depending on the DB_BLOCK_SIZE parameter settings. In this posting, I will show you how to load LOB files into the Oracle database and will present a way to identify the file type of the LOB stored in the database, based on the LOB value column only.

Each file type is associated to a "file signature". This signature is composed of several bytes (i. e. 00 01), and allows to uniquely identify a file type (i.e. zip file, png file, etc). The signature corresponds to the first bytes of the file. To identify the file type of a LOB stored in a database, we just have to extract the first bytes of the LOB value and then compare them to a list of known bytes. I have created a PL/SQL script for that purpose.

The first step is to create a table containing LOB files. Here, I will create a simple table to store internal Binary objects (BLOB). The statement used is:


CREATE TABLE app_doc (  doc_id NUMBER,  doc_value BLOB);


SQL> desc app_doc;
Name                                      Null?    Type
----------------------------------------- -------- ----------------------------
DOC_ID                                             NUMBER
DOC_VALUE                                          BLOB


Now, I want to load a file from the operating file system to the table APP_DOC. The file to load is a picture saved as PNG format. The path is "/oracle/documents/my_pic.png".

First, we must create an Oracle directory in order to access the file:




We then use the DBMS_LOB package through a PL/SQL block to load a file in the database:


  src_lob  BFILE := BFILENAME('DIR_DOCUMENTS', 'my_pic.png');
  dest_lob BLOB;
     RETURNING doc_value INTO dest_lob;
  DBMS_LOB.LoadFromFile( DEST_LOB => dest_lob,
                         SRC_LOB  => src_lob,
                         AMOUNT   => DBMS_LOB.GETLENGTH(src_lob) );


The table has now a new record:


SQL> select count (*) from app_doc;


I have populated the table with 28 files of several types (png, pdf, bmp, etc.) and the table has 28 records. Here is the list of files loaded into the table:


oracle@vmtest:/oracle/documents/ [rdbms11203] ls -1


Here is an extract of the PL/SQL written to identify the LOB file type:


-- Create a temporary table to store known document types


-- Populate the temporary table
INSERT INTO temp_doc_types VALUES (1,'jpeg','FFD8');
INSERT INTO temp_doc_types VALUES (2,'gif','4749');
INSERT INTO temp_doc_types VALUES (3,'png','8950');
INSERT INTO temp_doc_types VALUES (4,'bmp','424D');
INSERT INTO temp_doc_types VALUES (5,'pdf','2550');
INSERT INTO temp_doc_types VALUES (6,'doc','D0CF');
INSERT INTO temp_doc_types VALUES (7,'zip','504B');
INSERT INTO temp_doc_types VALUES (8,'rar','5261');


In this example, I chose to compare only the four first bytes of the LOB value. The number of bytes composing a file signature can be larger, but to simplify the example, I only used file signatures that have four bytes.

The number of files of each extension is returned by this statement:


SQL> select decode(type,NULL,'Unknown document type',type) as "TYPE OF DOCUMENT",
  2  count(*) as "NUMBER OF DOCUMENTS"
  3  from temp_doc_types a,
  4  (select (dbms_lob.substr(doc_value,2,1)) as FILE_TYPE from app_doc) b
  5  where a.sign(+)=b.FILE_TYPE group by a.type;


--------------------- -------------------
Unknown document type                   1
doc                                     1
bmp                                     2
png                                    18
pdf                                     3
jpeg                                    2
zip                                     1


The statement gets the first four bytes of each LOB and compares them to the temporary table, matching the LOB to a type.

  • The decode function is used to display 'Unknown document type' when NULL is returned and no extension matches between the LOB and the temporary table. This is the case for text files which have no file signature.
  • The DBMS_LOB.SUBSTR function is used to retrieve only the first four bytes from the LOB value. Parameters are the source LOB value, the amount of bytes to read (4), and the starting position (1).
  • An outer join is used to count LOBs not matching the temporary table. This will correspond to unidentified files types.


To finish, I drop the temporary table before exiting the job:


-- The temporary table is dropped before exit
DROP TABLE temp_doc_types;


And just for the eyes, this is the result in a HTML fashion, generated directly from sqlplus using the MARKUP option and an css style sheet. I let you read the Oracle documentation in order to know how to use the MARKUP option ;-)



Oracle 12c: Applying PSU with Multitenant DB & unplug/plug

Thu, 2014-01-16 01:23

The concept of Multitenant databases, which was introduced with Oracle 12c in June 2013, allows to run several databases on a single instance. Oracle presents this feature as a good solution for Oracle patching. The reason behind it is that it is now possible to unplug a container database (called PDB) from its original container (called CDB), in order to plug it into a new local or remote Container with a higher level of PSU. In this post, I will show how you can install the new PSU for Oracle 12c (released in October 2013) using Multitenant databases to keep the downtime as low as possible.

The following schema shows the concept:




We assume that a company is using a Container database CDB1 with two pluggable databases PDB1 and PDB2 running on it:


SQL> select name, open_mode from v$pdbs;


NAME                           OPEN_MODE
------------------------------ ----------
PDB$SEED                       READ ONLY
PDB1                           READ WRITE
PDB2                           READ WRITE


These two databases are accessed by users and applications and a window maintenance is planned in order to install the latest PSU for Oracle 12c.

Using the OPatch utility, we can see the currently installed patches:


$ opatch lsinventory




We can see that the patch 16527374 is already installed. It has been installed in order to fix a bug with Enterprise Manager Express 12c and Multitenant databases. Also note the presence of the latest OPatch utility

To unplug PDB2 from the container CDB1 and plug it into a new CDB with a higher PSU level, we need to install a second rdbms software with the same patching level ( + EM Express bug fix), on which the new PSU will be installed.

An empty container database CDB2 has been created on the second environment, with the Oracle Home /u00/app/oracle/product/12.1.0/db_2. Now it is time to install the new PSU on the second environment. The listener for this installation is named LISTENER_DB_2:


Step 1: Upgrade OPatch utility to the latest version (see note 6880880) if not already done.


Step 2: Download and unzip the PSU on the server (patch 17027533).

$ unzip -d /tmp


Step 3: Check conflicts between the PSU and already installed patches.

$ opatch prereq CheckConflictAgainstOHWithDetail -ph ./




The patch already installed to fix the EM Express bug is in conflict with the new PSU. Oracle will remove the existing patch before installing the new PSU. This is not an issue here, since Oracle provides the patch 16527374 specific to Oracle It will be possible to reinstall this patch once the PSU is applied.


Step 4: Shut down all databases and listeners running on the Oracle Home of the second environment.

Shut down CDB2:

SQL> shutdown immediate;


I did not create any listener for my second environment. If you plan to drop the current ORACLE_HOME after upgrade, you will have to create a new listener in the new ORACLE_HOME and to shut it down before upgrading.


Step 5: Install the new PSU.

$ cd /tmp/17027533
$ opatch apply



Note that the patch for EM Express bug fix has to be reinstalled for release, since it has been removed during the PSU install. This bug is not covered by the PSU.


Step 6: Restart databases and listeners.

Restart the CDB2 database:

SQL> startup;


Step 7: Load modified SQL files into the database with Datapatch tool.


$ cd $ORACLE_HOME/OPatch$ ./datapatch -verbose




It is possible that the following error occurs:

DBD::Oracle::st execute failed: ORA-20001: Latest xml inventory is not loaded into table


In this case, the parameter _disable_directory_link_check must be set to TRUE (see Oracle note 1602089.1) and the database must be restarted:

alter system set "_disable_directory_link_check"=TRUE scope=spfile;


We have now CDB1 running in PSU level, and CDB2 running in PSU level. Until now, no downtime occured on PDB1 and PDB2 databases. All configuration and installation steps for the PSU have been performed on a non-productive environment.

The next steps consist in unplugging PDB2 from CDB1, in order to plug it into CDB2.


Step 8: Stop the user application and shutdown PDB2 from CDB1.

From this step on, the pluggable database must be closed. The downtime will start now.


SQL> connect sys/manager@CDB1 as sysdba


Session altered.


Pluggable database altered.


Step 9: Unplug PDB2 from CDB1.


Session altered.


Pluggable database altered.


Step 10: Plug PDB2 into CDB2.


SQL> connect sys/manager@CDB2 as sysdba


     USING '/u01/oradata/CDB1/PDB2/PDB2.xml'
     MOVE FILE_NAME_CONVERT = ('/u01/oradata/CDB1/PDB2','/u01/oradata/CDB2/PDB2');
Pluggable database created.


The use of the MOVE clause makes the new pluggable database creation very quick, since the database files are not copied but only moved on the file system. This operation is immediate if using the same file system.


Session altered.


Pluggable database altered.


The database PDB2 is now opened, and users can access to the database. Note that if installing CDB2 on a different host, users may have to update the TNS connect string.


Step 11: Load modified SQL files into the database with Datapatch tool.

Since the rdbms has been upgraded to before any pluggable database has been plugged into CDB2, all newly plugged databases must execute the "datapatch" script in order to load the modified SQL files.

Run the following command once the CDB2 environment is set:


$ cd $ORACLE_HOME/OPatch
$ ./datapatch -verbose




Important: If you are using static listener registration, do not forget to change your listener.ora in order to provide the true ORACLE_HOME path corresponding to the new environment.

The pluggable database is now fully ready and downtime only occured between steps 8 and 10. By using the MOVE clause, the only downtime corresponds to the time required for shutting down the database from the source CDB and starting the database on the destination CDB. It represents a few seconds... And if both CDB are running on the same host, the users will not have to update their TNS connect string in order to access the database.

Patching PDBs using this method might ease the DBA life in the future :-)

Sockets, Cores, Virtual CPU, Logical CPU, Hyper-Threading: What is a CPU nowadays?

Mon, 2014-01-13 14:42

Because people know that I really like Oracle AWR reports, they send them to me from time to time. Here is one I have received for Christmas with the following question: 'Since our AIX server is virtualized, the response times are 4x longer. I feel that we are CPU bound, but cpu usage is mostly idle. Is there something hidden?' It is a question that is frequently raised with virtualized environments. Here, we will see that the system reports the CPU utilization as being only 30% busy. However, despite appearances, the report is coming from a very busy system suffering from CPU starvation. So, yes, something is hidden and we will see how we can reveal it. 

A CPU nowadays is not the physical processor we used to have. The example I have here comes from an IBM LPAR running AIX and I will explain Virtual CPU, Logical CPU and CPU multi-threading. Even if it's not Oracle specific, I'll use the figures from the AWR report which show the statistics gathered from the OS.

The AWR report covers 5 hours of activity where users are experiencing long response times:

 Snap IdSnap TimeSessionsCursors/Session Begin Snap: 27460 20-Dec-13 15:00:07 142 2.8 End Snap: 27465 20-Dec-13 20:00:49 101 2.2 Elapsed:   300.72 (mins)     DB Time:   855.93 (mins)    

Here is the average CPU load gathered from the OS:

Host CPU (CPUs: 20 Cores: 5 Sockets: )

Load Average BeginLoad Average End%User%System%WIO%Idle 9.82 9.26 18.7 12.0 4.4 69.3

 And because the report covers 5 hours we have an hourly detail in 'Operating System Statistics - Detail':

Snap TimeLoad%busy%user%sys%idle%iowait 20-Dec 15:00:07 9.82           20-Dec 16:00:15 12.50 26.67 14.09 12.58 73.33 6.21 20-Dec 17:00:26 15.53 41.47 24.44 17.03 58.53 5.84 20-Dec 18:00:34 6.44 36.91 25.20 11.70 63.09 4.55 20-Dec 19:00:40 6.52 24.39 15.00 9.40 75.61 3.35 20-Dec 20:00:49 9.26 24.34 14.99 9.35 75.66 1.88

The point is that when you look at that you think that the system is more than 50% idle. But that's wrong.

Look at the second line. Being only 26.67% busy with a load of 12.5 running processes would mean that the system is able to run 12.50/0.2667=47 processes concurrently. But that's wrong again: we don't have 47 CPU on the whole system.

Ratios are evil, because they hide the real numbers they come from. Let's see how the %cpu utilization is calculated.

StatisticValueEnd Value BUSY_TIME 11,162,236   IDLE_TIME 25,169,672   IOWAIT_TIME 1,584,726   SYS_TIME 4,360,698   USER_TIME 6,801,538   LOAD 10 9 NUM_CPUS 20   NUM_CPU_CORES 5   NUM_LCPUS 20   NUM_VCPUS 5  

During the 5 hours where the statistics were collected, we had 11,162,236 centiseconds of CPU that were used by running processes,  and 25,169,672 centiseconds of CPU resources were not used. Which supposes that we have in total 11,162,236+25,169,672 centiseconds, and that is about 100 hours of CPU time available. 100 hours of CPU during 5 hours of elapsed time supposes that we have 100/5=20 CPU in the system.

But that's wrong. We cannot run 20 processes concurrently. And the %cpu that is calculated from that is a lie because we cannot reach 100%.

This is virtualization: it lies about the available CPU power. So, do we have to calculate the %cpu from the number of cores that is 5 here ? Let's do it. We used 11,162,236 centiseconds of CPU, that is 31 hours. If we have 5 cpu during 5 hours then the cpu utilization is 31/(5x5)=124%

This is hyper-threading: we have more CPU than the number of cores.

Now it's time to define what is a CPU if you want to understand the meaning of the numbers.

The CPU is the hardware that executes the instructions of a running process. One CPU can cope with one process which is always working on CPU, or with two processes that spend half of their time outside of CPU (waiting for I/O for example), etc.

With multi-core processors, we have a socket that has several CPU cores in it. Cores in a socket can share some resources, but the core is processing the process instructions. So the number of CPU we're interrested in is the number of cores. The OS will calculate the %cpu from the number of cores, and that's the right thing to do. And Oracle uses the number of cores to calculate the license that you have to buy, and that's fair: it's the number of instances of their software that can run concurrently.

Now comes virtualization. In our exemple, we are on an IBM LPAR with 5 virtual CPU (the NUM_VCPU from the V$OSSTAT). What does that mean ? Can we have 5 processes running on CPU efficiently ? Yes if the other LPARs of the same hardware are not too busy. Then our 5 virtual CPU will actually run on 5 physical cores. But if all the LPAR are demanding CPU resources, and there's not enough physical CPU then the physical CPU resources will be shared. We will run on 5 virtual CPU, but those VCPU will be slower than physical ones, because there are shared.

Besides that, when the CPU has to access to RAM, there is a latency where the CPU is waiting before executing the next instruction. So the processor manufacturers introduced the ability to run another thread of process during that time (and without any context switch). This is called hyper-threading by Intel, or Symmetric Multiprocessing (SMP) by IBM. In our exemple we are on POWER7 processors that can do 4-way SMT. So theorically each of our 5 VCPU can run 4 threads: here are the 20 logical CPU that are reported. But we will never be able to run 4x more concurrent processes, so calculating %cpu utilization from that is misleading.

Then, how to get the real picture ? My favorite numbers come from load average and runqueue. 

Look at the line where load average was 15.53 and cpu utilization shows 41.47%. Because the 41.47% was calculated from the 20 CPU hypothesis, we know that we had on average 0.4147*20=8.294 process threads being running on CPU. And then among the 15.53 processes willing to run, 15.53-8.294=7.236 were waiting in the runqueue. When you wait nearly as much as you work you know that you are lacking resources: this is CPU starvation.

So hyperthreading is a nice feature. We can run 8 threads on 5 CPU. But don't be fooled by virtualization or hyper-threading. You should always try to know the physical resources that are behind it, and that usually requires the help of your system admin.