
DBA Blogs

Getting started with 12c

Bobby Durrett's DBA Blog - Thu, 2014-01-16 15:12

Back in July I finally got Oracle 12c installed on my laptop as documented in this post: url

But, that was as far as I got.  The last thing I did was get an error message creating a user.  Well, I figured out how to create a new user and a few other things.  I’m working with the ORCL database that comes with the install and all the parameters, etc. that come with it.

Evidently the default install comes with a PDB called PDBORCL.  So, I have two tns entries, one for the parent CDB and one for the child PDB.  Both point at port 1521 and differ only in the service name.

I guess the service name has the name of the PDB in it.
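A typical pair of entries along those lines would look something like this (the hostname is a placeholder, not my actual machine name):

```
ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = myhost)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl)
    )
  )

PDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = myhost)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = pdborcl)
    )
  )
```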

So, if I connect as SYSTEM/password@orcl I’m connected to the CDB, and if I connect as SYSTEM/password@pdb I’m connected to the PDB.  When I connected to the PDB I could create a new user without getting an error.

But, when I first tried connecting to the PDB I got this error, even though the database was up:

ORA-01033: ORACLE initialization or shutdown in progress

So, to bring the database up (by the way, I’m on 64 bit Linux) after booting the Linux VM the following steps were required:

lsnrctl start

sqlplus / as sysdba

startup

alter session set container=PDBORCL;

alter database open;
Probably this could all be scripted but that’s what I did today.
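A minimal sketch of such a script (assuming the default PDBORCL name, run after booting the VM) might be:

```
lsnrctl start

sqlplus / as sysdba <<EOF
startup
alter pluggable database PDBORCL open;
exit
EOF
```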

Interestingly there is only one pmon:

$ ps -ef | grep pmon
oracle   29495     1  0 06:52 ?        00:00:00 ora_pmon_orcl

But you get different results when you query dba_data_files depending on whether you are connected to the CDB or the PDB:





So, I guess each PDB has its own SYSTEM and SYSAUX tablespaces?
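(It does: in 12c each PDB has its own SYSTEM and SYSAUX tablespaces.)  A query like the following, run once in each container, shows the difference; file paths will vary by install:

```sql
select tablespace_name, file_name
from dba_data_files
order by tablespace_name;
```

In the CDB root you see the root's own SYSTEM, SYSAUX, UNDOTBS1 and USERS datafiles; connected to the PDB you see the PDB's separate SYSTEM and SYSAUX files instead.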

Lastly when running my scripts to poke around I edited my sqlplus header script to report which container you are in.  It looks like this now:

set linesize 32000
set pagesize 1000
set long 2000000000
set longchunksize 1000
set head off;
set verify off;
set termout off;

column u new_value us noprint;
column n new_value ns noprint;
column c new_value cs noprint;

select name n from v$database;
select user u from dual;
select sys_context('userenv','con_name') c from dual;

set sqlprompt &ns:&cs:&us>

set head on
set echo on
set termout on
set trimspool on

spool &ns..&cs..logfilename.log

Replace “logfilename” with whatever you want for your script name.

It puts out a prompt like this:

ORCL:PDBORCL:SYSTEM>

And the log file names look like this:

ORCL.PDBORCL.logfilename.log

Anyway, this is just a quick post about my first attempts to get around in 12c.

- Bobby






Categories: DBA Blogs

Adaptive Optimization Limitation Example

Bobby Durrett's DBA Blog - Thu, 2014-01-16 11:58

I’ve been reading up on Oracle 12c to get certified and to help advise my company on potential uses for the new version.  I’ve been looking forward to researching the new Adaptive Optimization features because it makes so much sense that the database should change its plans when it finds differences between the expected number of rows each part of a plan sees and the actual number of rows.

I’ve written blog posts in the past about limitations of the optimizer related to its ability to determine the number of rows (cardinality) that steps in a plan would see.  I took the scripts from some of these and ran them on a 12c instance to see if the new features would cause any of the inefficient plans to change to the obvious efficient plans.

Sadly, none of my examples ran differently on 12c.  I don’t doubt that there are examples that run better because of the new features but the ones I constructed earlier didn’t see any improvement.  So, I thought I would blog about one such example.

Here is the original blog post with an example run on 11.2 Oracle: url

Here is the same script run on 12c: zip

Here is the query with the bad plan:

select B.DIVNUM
from SALES A, DIVISION B
where A.DIVNUM=B.DIVNUM and
B.DIVNAME='Mayberry';

Plan hash value: 480645376

------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |   421 (100)|
|*  1 |  HASH JOIN         |          |   500K|  8300K|   421   (2)|
|*  2 |   TABLE ACCESS FULL| DIVISION |     1 |    14 |     2   (0)|
|   3 |   TABLE ACCESS FULL| SALES    |  1000K|  2929K|   417   (1)|
------------------------------------------------------------------

There is only 1 SALES row that has the DIVNUM associated with DIVNAME=’Mayberry’.  There are 1,000,001 SALES rows in total, and there is an index on SALES.DIVNUM, so an index scan would be the most efficient access method and a nested loops join the most efficient join method.  But the 12c optimizer chooses a hash join and full table scan instead.
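For comparison, the efficient plan can be forced with hints; something like this sketch, where the index name SALES_DIVNUM_IX is hypothetical:

```sql
select /*+ leading(b) use_nl(a) index(a sales_divnum_ix) */ b.divnum
from sales a, division b
where a.divnum = b.divnum
and b.divname = 'Mayberry';
```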

According to the 12c SQL Tuning manual there are two types of Adaptive Optimization that might help in this case: Adaptive Plans and Adaptive Statistics.  I tried to tweak my test script to get Adaptive Statistics to kick in by commenting out the dbms_stats calls but it didn’t help.  Also, I tried running the query several times in a row but it never changed plan.
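One way to check whether the adaptive features even kicked in is to look at the new adaptive columns in V$SQL after running the query:

```sql
select sql_id, child_number,
       is_resolved_adaptive_plan, is_reoptimizable
from v$sql
where sql_text like 'select B.DIVNUM%';
```

IS_RESOLVED_ADAPTIVE_PLAN is 'Y' when the final plan differs from the default plan via an adaptive plan, and IS_REOPTIMIZABLE is 'Y' when the statement is a candidate for statistics feedback on the next execution.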

I can see why Adaptive Plans wouldn’t work.  How long will it let the full scan of SALES go before it decides to switch to a nested loops join with an index scan?  If it gets half way through the table it is too late.  So, I’m not sure how Adaptive Plans could change the plan on the fly when it expects a lot of rows and only finds a few.

On the Adaptive Statistics side, I guess this is just a case that it still can’t handle.  It is like needing a histogram across a join, which would be pretty complex to solve in general.

Anyway, this all reminds me of when I first learned about histograms.  I got all excited that histograms would solve all my query performance problems and then came crashing down to earth when I realized it wasn’t the case.  I think the analogy fits.  Histograms improved cardinality estimates and can help in certain cases.  I think the new adaptive features will help improve plans by using real cardinality figures where it can, but they aren’t a cure-all.

I’m not sure that getting cardinality right is a solvable problem in the general case.  You have to have a fast optimizer so there are limits to how much it can do.

I ran all this as the user SYSTEM on the base 12c 64 bit Linux install with all the default parameters unchanged on the standard default database.

- Bobby








Categories: DBA Blogs

WebLogic Server 12c - PerDomain Node Manager Configuration Model

Let's start by giving a brief definition of Node Manager. Server instances in a WebLogic Server production environment are often distributed across multiple domains, machines, and geographic...

We share our skills to maximize your revenue!
Categories: DBA Blogs

oracle security patch notification now out

Grumpy old DBA - Wed, 2014-01-15 20:02
Most people have seen this already, but just in case: oracle jan 2014 patch notification ...

From what I have seen, there are huge differences from company to company and site to site in how patching and the testing of patches occurs.  That's a whole different blog entry though, right?
Categories: DBA Blogs

New Features Guide Highlights

Bobby Durrett's DBA Blog - Wed, 2014-01-15 12:03

I just finished reading the main section of the Oracle 12c New Features Guide.  I read pages 17-107 of the version I have – Section 1 titled “Oracle Database 12c Release 1 (12.1) New Features”.  I underlined and asterisked the things that stuck out as interesting in this pass and thought I would include them in this post:

New PL/SQL DBMS_UTILITY.EXPAND_SQL_TEXT Procedure - Expands views into SQL automatically.

Default Values for Columns Based on Oracle Sequences - Use a sequence in a column definition for default values.

Increased Size Limit for VARCHAR2, NVARCHAR2, and RAW Data Types - 32K varchar2 columns.

JDBC Support for Database Resident Connection Pool - Possible alternative to shared servers.

Asynchronous Global Index Maintenance for DROP and TRUNCATE Partition - Global indexes not made unusable by partition maintenance.

Adaptive Query Optimization - Plans change when the database sees that its cardinality estimates were wrong.

Session-Private Statistics for Global Temporary Tables - Gather stats on global temp tables for your session only - cool.

SecureFiles is the Default for LOB Storage - Not sure what that means, but good to know the default.

1.4.1 Database Consolidation - Subsections 2-6 give some clues to the way the multitenant architecture works.

Cloning a Database - Sounds similar to Delphix.

Oracle Data Pump Change Table Compression at Import Time - Imported data can be compressed using HCC on the target.

Multiple Indexes on Same Set of Columns - Can have different kinds of indexes on the same set of columns (same order I assume).

Active Database Duplication Enhancements - Faster clone of an open database using RMAN.

Enterprise Manager Database Express - Sounds like a better EM tool - would like to check it out and review the 2-Day DBA manuals which show uses of it.

Queryable Patch Inventory - Don't have to run opatch lsinventory to see your patches?

Multi-Process Multi-Threaded Oracle - Wondering what platforms this is on and what it really means.

Unified Context-Based Database Audit Architecture - Sounds like this may replace some third party tools.  Improved auditing.

Parallel Upgrade - May speed up upgrade by parallelising.
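To give a flavor of one of these, the sequence-default feature boils down to something like this sketch (table and sequence names are made up):

```sql
create sequence order_seq;

create table orders
( order_id   number default order_seq.nextval,
  order_date date );

insert into orders (order_date) values (sysdate);
-- order_id is populated from order_seq automatically
```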

I read all 91 pages but there were sections that didn’t really interest me since they are about features we don’t use such as Spatial.  If you are interested in 12c I encourage you to read this as I did.  I printed out about 10 pages at a time.  It’s only 91 pages so it doesn’t take forever to read it.

- Bobby




Categories: DBA Blogs

12c Online Partitioned Table Reorganisation Part II (Move On)

Richard Foote - Tue, 2014-01-14 23:06
In Part I, we looked at some of the issues associated with locking and unusable indexes when Moving both tables and table partitions. The Oracle 12c Database has introduced a number of great new capabilities associated with online DDL activities. One of these really useful features is the capability to now move table partitions online while […]
Categories: DBA Blogs

Deinstalling GoldenGate 12c

VitalSoftTech - Tue, 2014-01-14 20:30

In GoldenGate 12c, the installation and de-installation processes have been standardized to use the Oracle Universal Installer (OUI) interface. In this article we will look at how we can deinstall Oracle GoldenGate.

The post Deinstalling GoldenGate 12c appeared first on VitalSoftTech.

Categories: DBA Blogs

Initialization Parameter Handling for Pluggable Databases in #Oracle 12c

The Oracle Instructor - Tue, 2014-01-14 15:25

In a Multitenant Database, the handling of initialization parameters has changed. This post shows some major differences. The playing field:

$ sqlplus / as sysdba

SQL*Plus: Release Production on Tue Jan 14 21:44:10 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL>  select name,open_mode,con_id from v$pdbs;

NAME                           OPEN_MODE      CON_ID
------------------------------ ---------- ----------
PDB$SEED                       READ ONLY           2
PDB1                           MOUNTED             3

Apart from the mandatory pluggable seed database, there is one pluggable database which is presently not opened. Can I modify initialization parameters for pdb1 now?

SQL> alter session set container=pdb1;

Session altered.

SQL> alter system set ddl_lock_timeout=30;
alter system set ddl_lock_timeout=30
ERROR at line 1:
ORA-01219: database or pluggable database not open: queries allowed on fixed
tables or views only

SQL> alter system set ddl_lock_timeout=30 scope=spfile;

System altered.

In spite of the syntax, this did not modify the spfile:

SQL> show parameter spfile

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      /u01/app/oracle/product/12.1.0
SQL> host strings /u01/app/oracle/product/12.1.0/dbhome_1/dbs/spfilecdb1.ora
cdb1.__oracle_base='/u01/app/oracle'#ORACLE_BASE set from environment
*.dispatchers='(PROTOCOL=TCP) (SERVICE=cdb1XDB)'

Initialization parameters for pluggable databases are not kept in the spfile – if they are not inherited from the container database, that is. Instead, they are stored in a dictionary table of the container database. The documented view V$SYSTEM_PARAMETER displays them. But only if the pluggable database is opened:

SQL> connect / as sysdba
SQL> select name,value,con_id from v$system_parameter where name='ddl_lock_timeout';

NAME                           VALUE                                    CON_ID
------------------------------ ---------------------------------------- ------
ddl_lock_timeout               0                                             0

SQL> select name,value$,con_id from pdb_spfile$ a join v$pdbs b on (a.pdb_uid=b.con_uid);

NAME                           VALUE$                                   CON_ID
------------------------------ ---------------------------------------- ------
ddl_lock_timeout               30                                            3

The undocumented table PDB_SPFILE$ is internally queried and the parameters are set accordingly when the pluggable database is opened:

SQL> alter pluggable database pdb1 open;

Pluggable database altered.

SQL> select name,value,con_id from v$system_parameter where name='ddl_lock_timeout';

NAME                           VALUE                                    CON_ID
------------------------------ ---------------------------------------- ------
ddl_lock_timeout               0                                             0
ddl_lock_timeout               30                                            3

Note that show parameter now displays different results depending on the current container:

SQL> show con_name

CON_NAME
------------------------------
CDB$ROOT

SQL> show parameter ddl_lock_timeout

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
ddl_lock_timeout                     integer     0

SQL> alter session set container=pdb1;

Session altered.

SQL> show parameter ddl_lock_timeout

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
ddl_lock_timeout                     integer     30

Upon unplugging the pluggable database, an xml file is generated containing the description of the pluggable database.  I thought that this file would also contain the modified initialization parameters of that pluggable database, but that doesn’t seem to be the case. I checked it for the parameters DDL_LOCK_TIMEOUT and RESUMABLE_TIMEOUT, and neither shows up in the xml file. Instead, they get stored in the system tablespace of the unplugged database! Again, that is not documented and may change in the future.
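For reference, the unplug that generates that xml file looks like this (the path is a placeholder):

```sql
SQL> alter pluggable database pdb1 close immediate;
SQL> alter pluggable database pdb1 unplug into '/home/oracle/pdb1.xml';
```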

Isn’t it fun to play with this exciting new stuff? At least I enjoyed it, and I hope you found it useful :-)

Tagged: 12c New Features
Categories: DBA Blogs

Final reminder ... see Jeff Smith @thatJeffSmith Jan 24 2014 for NEOOUG quarterly meeting

Grumpy old DBA - Tue, 2014-01-14 07:40
Jeff Smith aka @thatJeffSmith
Jeff Smith is a Senior Principal Product Manager in the Database Tools group at Oracle. He works on the team that brings you SQL Developer, SQL Developer Data Modeler, APEX Listener, the Public Cloud database offering, and 11gR2 XE. Jeff has presented online and in-person at many of the top user groups around the world for the past decade including ODTUG, IOUG, RMOUG, NEOUG, Suncoast Oracle User Group, and more.

Two presentations ( with plenty of time for questions/discussion ):
1) What’s New in Oracle SQL Developer and SQL Developer Data Modeler v4.0
2) Oracle SQL Developer Tips and Tricks

Also NEOOUG January Business meeting topics: Financial report/Election of 2014 officers/GLOC 2014 conference update.

Please note: lunch at 12:15, meeting starts at 1 pm ... EVENT REGISTRATION ends 11:00 am on Jan 20 2014

* * *
No day-of-event walk-ins (apologies, security procedures).  You must register in advance to attend.

Complete information and event registration:
Categories: DBA Blogs

My first week at Pythian

Pythian Group - Mon, 2014-01-13 09:13

“You are lucky,” the taxi driver told me when he picked me up at the Ottawa Airport. “It began snowing right after your flight landed.” I personally like snow — just like the Chinese saying, “heavy snow brings heavy harvest”. Yes, I am lucky to be able to work at Pythian where the world’s top 5% talented data lovers come together. I am lucky to begin my first week at Pythian with hard weather and hard work which eventually leads to heavy harvest, I always believe.

I felt that Pythian is like a warm home when Dag met me early Monday morning to bring me to the office. After that, Jennifer and Sarah introduced me to every person we met, and they all showed their kindness to me. I felt that Pythian is transparent when the HR, Marketing, Sales and SD departments shared as much information as possible with us new employees. We can even see everyone’s calendar and work postings. Yes, I mean it, EVERYONE! Which is really good, because being transparent means being confident, reliable and responsible. I was also inspired when Paul, Pythian’s founder, shared his story and his thoughts regarding the company.

Besides all the above good things, you need to work hard as well, since there is a lot of information and many skills waiting for you.  No worries, you will have an experienced SDC to help you, as we have Shawna. Here are 3 tips to help you, as a DBA, survive and enjoy your first week at Pythian.

1. Taking those BORG sessions is important, but that is only part of your tasks. The Onboarding checklist your SDC sent to you before your arrival is really important. Make sure you walk through all the items in the list and try your best to understand them.

2. Start using Support Track to book your time as early as possible. Make sure you book your time under the right CR numbers listed in the email that comes with the checklist.

3. Your TTL may send you an email asking you to do some Bootcamp tasks. Spend some time on them, because they cover the technical side of your job. Just follow the instructions; it is not that hard.

Go relax and enjoy your first week at Pythian!

By the way, if you are lucky enough to have a seat near Michael S. Abbey, he will share his experiences with you, calm you down, and ease your stress, which is really helpful.

Last but not least, I would like to say thank you to Christina, Danil, Amanda, Manoj and Steve, who made the wonderful decision to bring me in!

Categories: DBA Blogs

Deduping 100 Gigs Worth of Files? Gimme 5 Minutes…

Pythian Group - Mon, 2014-01-13 08:57

Shortly before the Holidays, I became aware of the Dallas/Fort Worth Perl Mongers Winter Hackaton Contest. The challenge was simple enough: Here’s a directory structure filled with 100G worth of files, go and find all the duplicate files in there, as fast as you can.

You know me: can’t resist a contest. But since I knew I didn’t have heaps of time to sink into it, and because — let’s face it — the Perl community is despairingly brim-full of smart people who always beat me to the finish line, I decided to compete on slightly different terms. My main objective would be to leverage a maximum from modern Perl tools so that effort to get a working solution would be minimal. As for the performance of the solution, I decided that I’d be satisfied, and that the case for modernity and laziness would be made if it was at least in the ballpark of the other entries.

And so it began…

The Application Interface

Writing code to interact with the users is booooooring.

The hard part, the fun part, the challenging part, is to come up with the function do_it( $foo, $bar ). Once this is done, all that remains is to turn the handle and give the user a way to set $foo and $bar. And to validate the incoming values for $foo and $bar. And to document what $foo and $bar are. Blergh. It’s dotting the i’s and crossing the t’s. It’s writing the police report after that blood-pumping drug cartel take-down.

Fortunately, there are a few modules that will take that tedium off your hands. One of my favorites is MooseX-App, and for this project I used its single-command variation, MooseX::App::Simple. Its use is pretty straightforward. The main module of the application is a Moose class

package Deduper;

use strict;
use warnings;

use MooseX::App::Simple;

Attributes of the class that you want to be accessible as options of the script are defined using the option keyword. Same story for positional parameters, with the parameter keyword. Other attributes stay untouched, and won’t be visible to the user.

parameter root_dir => (
    is => 'ro',
    required => 1,
    documentation => 'path to dedupe',
);

option hash_size => (
    isa => 'Int',
    is => 'ro',
    default => '1024',
    trigger => sub {
        my $self = shift;
        Deduper::File->hash_size( $self->hash_size );
    },
    documentation => 'size of the file hash',
);

option stats => (
    isa => 'Bool',
    is => 'ro',
    documentation => 'report statistics',
);

option max_files => (
    traits => [ 'Counter' ],
    isa => 'Int',
    is => 'ro',
    predicate => 'has_max_files',
    default => 0,
    documentation => 'max number of files to scan (for testing)',
    handles => {
        dec_files_to_scan => 'dec',
    },
);

has start_time => (
    is => 'ro',
    isa => 'Int',
    default => sub { 0 + time },
);

Last required touch for the class: a run() method that will be invoked when the app is run:

sub run {
    my $self = shift;

    say join "\t", map { $_->path } @$_ for $self->all_dupes;
}

With that, your app is in working condition. Command-line parsing, argument validation, --help text, it’s all taken care of for you:

$ cat ./

use Deduper;
Deduper->new_with_options->run;

$ ./
Required parameter 'root_dir' missing
usage: [long options...] --help

    root_dir  path to dedupe [Required]

    --hash_size           size of the file hash [Default:"1024"; Integer]
    --help -h --usage -?  Prints this usage information. [Flag]
    --max_files           max number of files to scan (for testing) [Default:
                          "0"; Integer]
    --stats               report statistics [Flag]

Traversing Directory Structures

Traversing directory structures isn’t terribly hard. But there are a few things like symlinks that you have to watch for. What we care about, really, is to have each file of the structure handed to us on a silver platter. Let’s have somebody else deal with the underlying menial work.

Here, this somebody else is Path::Iterator::Rule. With it, visiting all files of the directory structure boils down to creating an attribute

has file_iterator => (
    is => 'ro',
    lazy => 1,
    default => sub {
        my $self = shift;
        return Path::Iterator::Rule->new->file->iter_fast(
            $self->root_dir, {
                follow_symlinks => 0,
                sorted          => 0,
            },
        );
    },
);
and, well, using it

while( my $file = $self->file_iterator->() ) {
    # do stuff with $file...
}

A Couple Of No-Brainish Optimizations

Everybody loves optimizations that require no more effort than pressing a ‘turbo’ button. On this project, two modules providing that kind of boost were just begging to be used.

The first is MooseX::XSAccessor, which invests the accessors of the class with super-speedy XS powers. Alas, that module does not speed up lazily defined attributes, which I used in spades. But since its use only requires dropping a

use MooseX::XSAccessor;

at the top of the file, it’s not like I have much to lose anyway.

The second module is MooseX::ClassAttribute. As you probably realized from the previous code snippets, my code creates an object per file. For 100G worth of files, that turns out to be lots of objects. So having configuration attributes that all end up with the same value over and over again for each object would be quite wasteful. In that case, using an attribute whose value is shared by all objects of the class makes much more sense. And, again, using that module is dead simple:

package Deduper::File;

use Moose;
use MooseX::ClassAttribute;

class_has hash_size => (
    isa => 'Int',
    is => 'rw',
    default => 1024,
);

sub foo {
    my $self = shift;

    # hash_size can be used as a regular attribute
    my $hs = $self->hash_size;
}

The cherry on top of everything is that those class attributes are totally transparent to the rest of the code. If you ever find that you want per-object values after all, change ‘class_has’ to ‘has’, and you’re done.

Keeping Different Functionalities Nicely Segregated

A bonus that using a Moose-based class brought to the project was to keep different functionalities logically apart via method modifiers. For example, the run() method of an app typically juggles with the different options requested:

sub run {
    my $self = shift;

    my $n = $self->max_files || -1;

    until ( $self->finished ) {
        print $self->next_dupe;
        if ( $n > -1 ) {
            $self->finished(1) if $n-- == 0;
        }
    }

    $self->print_stats if $self->stats;
}

Instead, one can encapsulate those different behaviors in separate functions

sub check_max_files {
    my $self = shift;

    $self->finished(1) if $self->files_to_scan == 0;
}

sub print_stats {
    # print a lot of stuff
}

sub run {
    my $self = shift;

    print $self->next_dupe until $self->finished;
}

and stitch in whatever is needed at creation time:

sub BUILD {
    my $self = shift;

    $self->meta->add_after_method_modifier(
        next_dupe => \&check_max_files,
    ) if $self->max_files;

    $self->meta->add_after_method_modifier( run => \&print_stats )
        if $self->stats;
}


If managed properly, this makes each method much smaller and much more single-minded. As a nice side effect, the final application object will only contain the code that it requires. If the option --max_files isn't passed, the algorithm won't have an additional 'if' statement to deal with on each iteration. It's not much, but it makes the default case just a tad faster. I must say, however, that this is not a free lunch: if the option is passed, the running time will be slower than if a simple 'if' was used. But that trade-off can make sense, as here, where I'm ready to accept a slight penalty on my test runs, but really want the tires screeching for the main event.

Finally, The Core Algorithm

With all boilerplate stuff dealt with, we can focus on the core problem. After some experimenting, I ended up with a recipe that isn’t exactly ground-breaking, but seems to be efficient. The encountered files are first classified by file size. For files of the same size, we compare the first X bytes of the file (where ‘X’ is fairly small). If they share that beginning, we compare their last X bytes (in case we’re dealing with file types with the same preamble). Then, ultimately, we compare full-file digests (using Digest::xxHash which is pretty damn fast). Everything is only computed if required, and then cached so that we do the work only once.

Translated into code, it looks pretty much like

# in class Deduper

sub next_dupe {
    my $self = shift;

    return if $self->finished;

    while( my $file = $self->file_iterator->() ) {
        $file = Deduper::File->new( path => $file );
        my $orig = $self->is_dupe($file) or next;
        return $orig => $file;
    }

    # no more files to scan
    $self->finished(1);
    return;
}

sub is_dupe {
    my( $self, $file ) = @_;

    return $_ for $self->find_orig($file);

    # not a dupe? enter it in the registry
    $self->add_file($file);
    return;
}

sub find_orig {
    my( $self, $file ) = @_;

    # do we have any file of the same size?
    my $candidates = $self->files->{$file->size}
        or return;

    my @c;

    # we only have a sub-hash if we have more than one
    # file, so as not to compute the 'hash' (beginning of the file)
    # needlessly
    if( ref $candidates eq 'Deduper::File' ) {
        return if $candidates->hash ne $file->hash;
        @c = ( $candidates );
    }
    else {
        @c = @{ $candidates->{$file->hash} || return };
    }

    # first check if any share the same inode
    my $inode = $file->inode;
    for ( @c ) {
        return $_ if $_->inode == $inode;
    }

    # then check if dupes
    for ( @c ) {
        return $_ if $_->is_dupe($file);
    }

    return;
}

sub add_file {
    my( $self, $file ) = @_;

    if( my $ref = $self->files->{$file->size} ) {
        if ( ref $ref eq 'Deduper::File' ) {
            $ref = $self->files->{$file->size} = { $ref->hash => [ $ref ] };
        }
        push @{ $ref->{$file->hash} }, $file;
    }
    else {
        # nothing yet, just put the sucker in
        $self->files->{$file->size} = $file;
    }
}

# and then in Deduper::File

sub is_dupe {
    my( $self, $other ) = @_;

    # if we are here, it's assumed the sizes are the same
    # and the beginning hashes are the same

    # compare the end hashes, then the full digests
    return $self->end_hash eq $other->end_hash
        && $self->digest eq $other->digest;
}

And I Give You The Pudding, Fully Cooked

The application, as submitted to the contest, can be found on GitHub.

“And how did it fare, performance-wise?”, you ask? If the contest results are to be trusted, it processed the 100G monster in a little less than 4 minutes which, I think, ain’t too shabby.

Mostly considering that, y’know, it turned out to be the best time of all submitted entries. :-)

Categories: DBA Blogs

Join Alex, Carlos, Mauro, and Scott, and speak at the Great Lakes Oracle Conference 2014

Grumpy old DBA - Sat, 2014-01-11 08:34
The Great Lakes Oracle Conference is May 12-14, 2014 in Cleveland at CSU. Beautiful campus environment and a state-of-the-art student center main ballroom. Plus, of course, the chance to speak in the city where the river caught on fire (long story, ask me over a beer). We have half-day workshops on Monday (Alex Gorbachev / (Carlos Sierra and Mauro Pagano) / Scott Spendolini), and then the main conference starts Tuesday with keynotes by Steven Feuerstein and Tom Kyte. The Call for Abstracts is open now here:  Please consider joining us at a rapidly growing regional conference that Carol Dacko considers "one of the best choices for any Oracle professional to attend."
Categories: DBA Blogs

Changing OEM12c Repository retention settings

DBASolved - Fri, 2014-01-10 16:14

Today, I’ve been working on a script to pull some sizing/metric information out of the Oracle Enterprise Manager (OEM) repository database.  As I was working through the details (pulling my hair out…lol), I had a discussion with a good friend.  He mentioned that I needed to change the retention settings on the repository, both for use with the script I was working on and for long-term historical tracking.  Although I already knew this needed to be done, it took some poking around to find the document number in MOS.

In case you want to check out the document or documentation, you can find this information at these points of interest:
MOS Doc ID: 1405036.1

Armed with the documentation and a few more grey hairs, I took at look at what default settings were configured after building the repository.  The script in Listing 1 will display what the current/default retention settings are.

Listing 1: Script for retention settings
select table_name, partitions_retained
from em_int_partitioned_tables;

After running the script in Listing 1, your output should tell you if you are configured with the default settings.  The default settings for retention are 7, 32 and 12.  Figure 1 shows you the output from the environment I’m working in.  As you can see, the retention settings have not been changed.

Figure 1: Current retention settings


These numbers may look a bit funny at first glance.  Before changing these settings, it is good to understand what the values mean.  The values for EM_METRIC_VALUES and EM_METRIC_VALUES_HOURLY are expressed in DAYS.  The value for EM_METRIC_VALUES_DAILY is expressed in MONTHS.

For tracking and historical purposes, these settings are a bit low.  Most organizations will want to have data that is retained for longer period of times; mostly due to security and compliance reasons.  Now, this brings up the topic of how to change these values.

Oracle has provided a PL/SQL API to allow these changes to be done.  The SQL statements in Listing 2 shows how this API is used for each of the retention changes.

Listing 2: PL/SQL API to change retention settings

  exec gc_interval_partition_mgr.set_retention('SYSMAN', 'EM_METRIC_VALUES', 10);

  exec gc_interval_partition_mgr.set_retention('SYSMAN', 'EM_METRIC_VALUES_HOURLY', 90);

  exec gc_interval_partition_mgr.set_retention('SYSMAN', 'EM_METRIC_VALUES_DAILY', 36);

With the changes made by the PL/SQL API, the retention settings for the OEM repository have been updated.  Figure 2 shows the updated information that the repository will use.

Figure 2: Updated retention settings


With the updated settings, the OEM repository will now keep raw metric data for 10 days, hourly metric data for 90 days (3 months) and keep all the daily metric data for 36 months (3 years).  This increase in retention should help with reporting on growth of monitored targets over the long term.


twitter: @dbasolved


Filed under: OEM
Categories: DBA Blogs

The Twelve Days of NoSQL: Day Twelve: Concluding Remarks

Iggy Fernandez - Fri, 2014-01-10 10:20
Originally posted on So Many Oracle Manuals, So Little Time:
Day One: Disruptive Innovation
Day Two: Requirements and Assumptions
Day Three: Functional Segmentation
Day Four: Sharding
Day Five: Replication and Eventual Consistency
Day Six: The False Premise of NoSQL
Day Seven: Schemaless Design
Day Eight: Oracle NoSQL Database
Day Nine: NoSQL Taxonomy
Day…
Categories: DBA Blogs

Meet us at NRF14 booth #2233

Pythian Group - Fri, 2014-01-10 10:18
Categories: DBA Blogs

SQL Server 2012: Installing and configuring Distributed Replay

Pythian Group - Fri, 2014-01-10 09:46

Today we are going to discuss one of the powerful new features of SQL Server 2012: Distributed Replay, which can be used to assess the impact of changes, upgrades, or SQL Server tunings by replaying a trace captured from a production SQL Server environment against a test environment.
What are the benefits?

Unlike SQL Server Profiler, Distributed Replay can run the recorded trace against as many as 16 Distributed Replay Clients, all controlled by one Distributed Replay Controller. The Distributed Replay components are installed separately from the regular SQL Server install. The Distributed Replay Controller can act as a replacement for OStress, except for the ability to replay SQL and RML files.

This powerful feature will help customers/DBAs understand the impact of any application/configuration changes on performance across multiple servers. You can customize the trace replay by choosing the appropriate mode for your environment.


We will use a Distributed Replay Controller and a Distributed Replay Client on server SQLNODE1, a second Distributed Replay Client on server SQLNODE2, and a target SQL Server 2008 R2 instance.

For the installation we need to create two Active Directory accounts: CORP\DR_Controller for the Distributed Replay Controller and CORP\DR_Client for the Distributed Replay Client. Below are the screenshots for reference.

Distributed Replay installation

We will start the installation now. Make sure that you install the Management Tools too, as they provide the Distributed Replay administration tool. For this installation I will install only the Distributed Replay Controller and Distributed Replay Client, as I already have the Management Tools installed on my SQL Server instances.

We will start installation on SQLNODE1 as it will act as a controller as well as client.

Step #1: Select the installation type. Here we will add features to existing standalone installation, selecting the same.

Step #2: Setup will install setup support files.

Step #3: You have option to include SQL Server updates.

Step #4: Setup will install setup files.

Step #5: Setup should pass all setup support rules. You may ignore warnings and go ahead on a case-by-case basis.

Step #6: Since I already have the DB Engine installed, I will choose to Add features to existing instance of SQL Server 2012.

Step #7: Since this Server will act as a Controller and Client thus selecting both here.

Step #8: This page will review the disk space requirement.

Step #9: On this page, Provide both the accounts created for Distributed Replay Controller and Distributed Replay Client in Active directory.

Step #10: On Distributed Replay Controller page, add the Active Directory account that we have created.

Step #11: On Distributed Replay Client page, provide the controller name.

Step #12: On Error Reporting Page you can opt for sending information of Error reporting to Microsoft.

Step #13: On this page setup will check if the installation will be blocked by running rules. All green here.

Step #14: We are ready to install. You might like to cross-verify the selected configurations before proceeding.

Step #15: Installation is in progress.

Step #16: Finally the setup is successful. You can close the window now.

Windows Firewall configurations

We will configure the Windows Firewall on the Distributed Replay Controller to allow inbound connections for the DReplayController.exe application.

Go to Windows firewall, Inbound Rules, and add a new rule for a program:

Browse to the DReplayController.exe location:

Allow the Connection.

Specify a name to the rule.

Similarly, create a rule on each Distributed Replay Client to allow the Distributed Replay Clients to connect and register with the Distributed Replay Controller.
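If you prefer scripting the firewall change over clicking through the UI, the same inbound program rule can be created with netsh from an elevated command prompt. This is just a sketch: the rule name is a label I picked, and you should adjust the path if your install location differs.

netsh advfirewall firewall add rule name="SQL DReplay Controller" dir=in action=allow program="C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayController\DReplayController.exe" enable=yes

Run the equivalent command on each client for DReplayClient.exe.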

Start the services and check client registrations. First start the Controller service on the Distributed Replay Controller SQLNODE1: open a command prompt and enter:

NET STOP "SQL Server Distributed Replay Controller"

NET START "SQL Server Distributed Replay Controller"

You should see the result of this command in the log folder of the Distributed Replay Controller under C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayController\log. It should look like this:

2014-01-06 20:30:42:482 OPERATIONAL [Controller Service] Microsoft SQL Server Distributed Replay Controller – 11.0.2100.60.

2014-01-06 20:30:42:482 OPERATIONAL [Controller Service] © Microsoft Corporation.

2014-01-06 20:30:42:482 OPERATIONAL [Controller Service] All rights reserved.

2014-01-06 20:30:42:482 OPERATIONAL [Controller Service] Current edition is: [Enterprise Edition].

2014-01-06 20:30:42:482 OPERATIONAL [Controller Service] The number of maximum supported client is 16.

2014-01-06 20:30:42:482 OPERATIONAL [Controller Service] Windows service “Microsoft SQL Server Distributed Replay Controller” has started under service account “CORP\DR_Controller”. Process ID is 2888.

2014-01-06 20:30:42:498 OPERATIONAL [Controller Service] Time Zone: India Standard Time.

2014-01-06 20:30:42:498 OPERATIONAL [Common] Initializing dump support.

2014-01-06 20:30:42:498 OPERATIONAL [Common] Dump support is ready.

Now start the Distributed Replay Client service on your Distributed Replay Clients SQLNODE1 and SQLNODE2. Verify that they correctly register with your Distributed Replay Controller.

On both servers, open a command prompt screen and run:

NET STOP "SQL Server Distributed Replay Client"

NET START "SQL Server Distributed Replay Client"

Open the latest log file in C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\log and check for the text "Registered with controller":

2014-01-06 20:30:51:000 OPERATIONAL [Client Service] Microsoft SQL Server Distributed Replay Client – 11.0.2100.60.

2014-01-06 20:30:51:000 OPERATIONAL [Client Service] © Microsoft Corporation.

2014-01-06 20:30:51:000 OPERATIONAL [Client Service] All rights reserved.

2014-01-06 20:30:51:000 OPERATIONAL [Client Service] Current edition is: [Enterprise Edition].

2014-01-06 20:30:51:015 OPERATIONAL [Common] Initializing dump support.

2014-01-06 20:30:51:015 OPERATIONAL [Common] Dump support is ready.

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Windows service “Microsoft SQL Server Distributed Replay Client” has started under service account “CORP\DR_Client”. Process ID is 2872.

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Time Zone: India Standard Time.

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Controller name is “SQLNODE1″.

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Working directory is “C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\WorkingDir”.

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Result directory is “C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\ResultDir”.

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Heartbeat Frequency(ms): 3000

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Heartbeats Before Timeout: 3

2014-01-06 20:30:51:078 OPERATIONAL [Client Service] Registered with controller “SQLNODE1″.

You might get the below error message in the client log if you forgot to add the client service account on the controller service access permission page during setup, or if you just changed the Distributed Replay client service account. I reproduced the scenario; below is the log file.

2014-01-06 18:04:54:603 OPERATIONAL [Client Service] Microsoft SQL Server Distributed Replay Client – 11.0.2100.60.

2014-01-06 18:04:54:619 OPERATIONAL [Client Service] © Microsoft Corporation.

2014-01-06 18:04:54:619 OPERATIONAL [Client Service] All rights reserved.

2014-01-06 18:04:54:806 OPERATIONAL [Client Service] Current edition is: [Enterprise Edition].

2014-01-06 18:04:54:806 OPERATIONAL [Common] Initializing dump support.

2014-01-06 18:04:54:853 OPERATIONAL [Common] Dump support is ready.

2014-01-06 18:04:54:884 OPERATIONAL [Client Service] Windows service “Microsoft SQL Server Distributed Replay Client” has started under service account “CORP\DR_Client”. Process ID is 2800.

2014-01-06 18:04:55:196 OPERATIONAL [Client Service] Time Zone: India Standard Time.

2014-01-06 18:04:55:243 OPERATIONAL [Client Service] Controller name is “SQLNODE1″.

2014-01-06 18:04:55:243 OPERATIONAL [Client Service] Working directory is “C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\WorkingDir”.

2014-01-06 18:04:55:258 OPERATIONAL [Client Service] Result directory is “C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\ResultDir”.

2014-01-06 18:04:55:258 OPERATIONAL [Client Service] Heartbeat Frequency(ms): 3000

2014-01-06 18:04:55:258 OPERATIONAL [Client Service] Heartbeats Before Timeout: 3

2014-01-06 18:04:55:648 CRITICAL [Client Service] [0xC8100005 (6)] Failed to connect controller with error code 0x80070005.

The error code 0x80070005 is also described as "ACCESS DENIED." Below are the steps to fix it.

  1. On the server where the controller is installed, go to Start -> Run, type dcomcnfg, and open Component Services.
  2. Navigate to Console Root -> Component Services -> Computers -> My Computer -> DCOM Config -> DReplayController.
  3. Open the properties of DReplayController and select the Security tab.
  4. Edit "Launch and Activation Permissions" and grant the Distributed Replay client service account "Local Activation" and "Remote Activation" permissions.
  5. Edit "Access Permissions" and grant the Distributed Replay client service account "Local Access" and "Remote Access" permissions.
  6. Add the Distributed Replay client service account domain user to the "Distributed COM Users" group.
  7. Restart the controller and client services as below:

NET STOP "SQL Server Distributed Replay Controller"

NET STOP "SQL Server Distributed Replay Client"

NET START "SQL Server Distributed Replay Controller"

NET START "SQL Server Distributed Replay Client"

  8. Check the Distributed Replay Client log file and look for the message "Registered with controller SQLNODE1".


At this point, our Distributed Replay lab is ready to use; the clients and controller are registered together. We will walk through a demonstration of this powerful feature in an upcoming blog post.

Categories: DBA Blogs

Log Buffer #354, A Carnival of the Vanities for DBAs

Pythian Group - Fri, 2014-01-10 09:33

The new year is here with a bang, and the rhythm of everything new is propagating into database technologies, giving database bloggers ample material to write about. This Log Buffer Edition covers all that and more.


Do you know that DB12c sqlldr is a little smarter… finally? Now a *.ctl file is not always necessary.

ODI – High performance data movement using Datapump.

Informatica PowerCenter 9.5.1 Hot Fix 2 is now available in the Business Intelligence (BI) 11g media pack for download and use with BI Applications 7.9.x.

This was tested on Oracle 11.2. If you create a tablespace with a datafile with autoextend on, the file’s maxsize cannot be less than its size. If it is, you get an ORA-02494.

Oracle Exadata X4 (Part 2): The All Flash Database Machine?

SQL Server:

What are companies doing to optimize BYOD practices?

Stairway to SQLCLR Level 1: What is SQLCLR?

Relational database or graph database? Why not have both?

SQL Server 2014 Hybrid Cloud Scenarios: Migrating On-Premises SQL Server to Windows Azure Virtual Machines

An AlwaysOn Bug with Service Broker Transmission Queue


MySQL Enterprise Monitor 2.3.15 Is Now Available.

New MySQL web installer, Have you tried it yet?

One of the most important config for InnoDB is the innodb_buffer_pool_size.

XtraBackup Complains of Missing perl-DBD-MySQL.

High-Velocity Data—AKA Fast Data or Streaming Data—seems to be all the rage these days.

Categories: DBA Blogs

How to Install a Clustered SQL Server 2012 Instance – Step-by-step – Part 4

Pythian Group - Fri, 2014-01-10 09:32

We have reached the last article of this series. To close it out, we will talk about the Distributed Transaction Coordinator, or simply DTC. I’ll try to explain it simply, and then demonstrate how to prepare the DTC for a clustered instance.

What is DTC (MS DTC)?
MS DTC is an OS-level service, which comes automatically installed and running under the Network Service account. Its role is to ensure that a distributed transaction stays consistent, even in the presence of failures.
Those transactions might be initiated when a transaction deals with data on multiple computers over the network, or when it deals with multiple processes on a single computer.

All the participants of a distributed transaction work in sync with the other participants (computers) involved in the transaction, looking for the right moment to commit or abort their work. For this reason, we need to make sure that the computers can reach each other.
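To make that concrete, here is a minimal T-SQL sketch of the kind of transaction MS DTC coordinates. The linked server name [REMOTESRV] and the table names are hypothetical, purely for illustration:

-- Both updates commit or roll back together; MS DTC coordinates
-- the two-phase commit between the two servers.
BEGIN DISTRIBUTED TRANSACTION;

UPDATE dbo.Orders                                -- local server
SET Status = 'Shipped'
WHERE OrderID = 42;

UPDATE [REMOTESRV].SalesDB.dbo.Shipments         -- remote (linked) server
SET Shipped = 1
WHERE OrderID = 42;

COMMIT TRANSACTION;

If the DTC service is unreachable on either machine, the COMMIT fails and both updates are rolled back.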

Do I need to configure MS DTC on my environment?
The answer to this question is the standard one for almost everything involving SQL Server: it depends. You need to understand whether or not you will perform distributed transactions. If you have more than one instance on the same computer, you won’t need the DTC. On the other hand, if you have a two-node cluster with two clustered instances communicating with each other, you will need the DTC, since the instances could be on different nodes. Another possible scenario is when you have the database engine and SSIS installed; in this case you will need to configure the DTC.

For more information, check this link:

How to create a clustered MS DTC?

Since Windows 2008, we have been allowed to have more than one instance of MS DTC on a server/cluster. So, for clustered SQL Server installations, it is a best practice to have a Role exclusively for the DTC and a dedicated DTC for each SQL Server Role.

As documented by Microsoft, SQL Server follows this path to choose the MS DTC instance to use:

  • Use MS DTC installed to the local group, else
    • Use the mapped instance of MS DTC, else
      • Use the cluster’s default instance of MS DTC, else
        • Use the local machine’s installed instance of MS DTC

To configure a DTC in cluster, we will need a disk and a hostname.

To configure a Role exclusively for the DTC, follow the steps:

  1. Right-click on Roles and pick the “Configure Role” option.
  2. A new window will open. Click “Next”.
  3. Choose the option “Distributed Transaction Coordinator (DTC)” from the list. Click “Next”.
  4. Fill in the hostname in the “Name” field and the IP in the “Network” section. Click “Next”.
  5. Pick the disk to be used. Click “Next”.
  6. Review the configuration and click “Next”.
  7. The installation will run, and in the last step you will see a report. Click “Finish”.
  8. Now you will be able to see a new Role created in the cluster, with all the indicated resources.

To add a DTC resource into the SQL Server Role, follow the steps:

  1. Right-click the Role, go to “Add Resource” -> “More Resources” -> “Distributed Transaction Coordinator”.
  2. The resource will be created in the selected Role; now we need to configure it. Right-click the “New Distributed Transaction Coordinator” and click “Properties”.
  3. As mentioned earlier in this article, the DTC needs a hostname and a disk to work. Under dependencies you can pick those items as shown, and click “OK”.
  4. Now, let’s bring it online.

How to configure the network for distributed transactions?

Note: On clustered environments, you just need to perform the following steps one time.

  1. In “Server Manager” go to “Tools” -> “Component Services”, or run the command “dcomcnfg”.
  2. Expand the tree, right-click the desired DTC, and choose “Properties”.
  3. Go to the “Security” tab and check “Network DTC Access” as well as “Allow Inbound” and “Allow Outbound”, as shown below. Click “OK”.
  • Let’s briefly describe some of the options in this window:
    • “Network DTC Access”: Enables/disables network access.
    • “Allow Inbound”: Permits a distributed transaction originated on another computer to run on the current computer.
    • “Allow Outbound”: Permits a distributed transaction initiated on the current computer to run on a remote computer.
    • “Enable XA Transactions” and “Enable SNA LU 6.2 Transactions”: Enable/disable those particular specifications for distributed transactions.

Troubleshooting DTC

There’s a tool called DTC Ping, which can help us verify whether the DTC is working correctly on all the computers that should be involved in a transaction.

You can download this tool here:

I recommend reading this article to learn how to use the tool, as well as to troubleshoot possible errors: Troubleshooting MSDTC issues with the DTCPing tool.

Another great tool is DTC Tester, with which you can simulate a distributed transaction on SQL Server:


To download and get more info about this tool, check this link:

The End

And so we finish this series on how to install a clustered instance. There are still many details to cover, and I will try to create separate articles with best practices, configuration alternatives, etc.

I hope you enjoyed this series and as always, if you have any doubts, contact me!

Thank you for reading!

If you want to check the other parts of this series, here are the links:

Categories: DBA Blogs

A New Year, a New Dancecard

Pythian Group - Fri, 2014-01-10 09:29

First things first: Happy New Year y’all! Health, happiness, and all that jazz to each and everyone of youses!

So the Holidays are over — or almost over, if you are one of the few lucky souls. Which means that we have to get back into the saddle. Considering that, ahem, some of us have just fallen off the grid and into the eggnog for the last two weeks, that’s easier said than done. In order to jog those hacking muscles back into shape, I decided to revisit an idea that I outlined in the Holidays of 2012: dancecard.

To recap: when I start a new Dancer (or Catalyst) application, I always have to drop in the JavaScript libraries I’m using. That’s tedious. Instead, I would like a little tool that invites chosen libraries to come and dance with the app. A kind of Tanzkarte, if you will.

For the 2014 edition of this dancecard, I decided to take a slightly different approach. Instead of keeping a repository of the libraries locally, I wanted to be even lazier and simply have a main registry that would point to the download location or the git repository of the libraries.
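I won’t reproduce the real registry format here, but a minimal entry could look something like the following sketch — the keys, URLs, and dependency wiring are illustrative guesses, not the actual schema:

# hypothetical registry entries -- keys and URLs are illustrative only
backbone:
  url: http://backbonejs.org/backbone-min.js
  documentation: http://backbonejs.org/
  dependencies:
    - underscore

underscore:
  url: http://underscorejs.org/underscore-min.js
  documentation: http://underscorejs.org/

The point being that the registry only needs to know where a library lives and what it depends on; the tool fetches the rest on demand.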

For the moment I won’t go into the details of the code, mostly because I pretty much banged the keyboard like a semi-hungover monkey for the last two hours, and it shows in the quality of the codebase. Let it just be said that the git repo is at the usual place, and that I used MooseX::App to build the app.

So, what did those few hours of hacking give me? Well, with the config file ~/.dancecard/config.yaml, which looks like:

browser_command: firefox %s
            - underscore

I can now get a list of all libraries I have at my disposal:

$ dancecard list
    no lib installed

Of course, I can also install any of those. And… look Ma: dependencies are handled automatically!

$ dancecard install backbone
 backbone installed
 underscore installed

Something I also often do is peruse the online documentation of those libraries. Noticed those “documentation” urls in the yaml registry? It’s there so that I can do,

$ dancecard doc backbone

and have the documentation page opened automatically in Firefox. Or, for maximum laziness, I can also do,

$ dancecard doc --installed

which will open the documentation URL of every library found in the current project.

Next Steps

If anyone shows interest in the thing, I’ll bundle the app for CPAN consumption. As for features, things that I’ll probably tackle for my own use are:

  • Tags (no software is true software without tags nowadays);
  • Virtual libraries (just as a mean to install a group of libraries in a single go);
  • Fish completion (I wuv the fish shell);
  • Cached local documentation (’cause hitting the ‘Net each time I look for a piece of documentation fills me with guilt);
  • An online main registry of libraries;
  • And whatever else I may think of.
Categories: DBA Blogs