DBA Blogs

A Tale of Three Cities: Perspectives on innovation from New York, San Francisco and Sydney

Pythian Group - Tue, 2016-03-22 12:29

Recently, Pythian hosted a number of Velocity of Innovation (Velocity) events. I moderated two of these: one last June in New York, and one in November in San Francisco. Another event in Sydney, Australia was moderated by Tom McCann, senior customer experience analyst with Forrester.

Our Velocity events have given us unique insights into what IT professionals in various regions see as their top priorities or concerns. And although we always framed our discussions with similar questions, it was interesting to see the different directions they took in each location — especially when it came to the topic of innovation.

So what makes a particular region fertile ground for innovation? And can you measure it?

The Global Innovation Index (GII) ranks countries based on a multitude of indicators of innovation. The United States ranks number 2 on the GII, behind Switzerland, while Australia is number 17, out of 141 countries. According to the GII website, the index aims to capture the multi-dimensional facets of innovation and provide the tools to assist in tailoring policies to promote long-term output growth, improved productivity and job growth.

The ideas discussed in the US and Australian locations seemed to align with the GII results, with US panelists expressing more positive attitudes and concrete ideas on how companies can improve agility and become more innovative. And while being at the forefront of technology in the Asia-Pacific region, the Australian panelists and audience members described more cautious approaches to achieving innovation.

Sydney: Cautiously moving forward

Early in the Sydney panel discussion, Chris Mendez, executive consultant big data and analytics from Industrie IT, sparked a lively discussion about innovation by asserting that innovation is lacking in that region.

“I actually don’t think there’s enough innovation in Australia, in particular. There’s a lot of talk about it, people are doing a lot of experiments, and there are some companies who’ve set up business purely based on tool sets that use data to innovate. But there are a few things that seem to be working against innovation, and I think one of those things is that it doesn’t stand on its own,” Mendez said.

According to Francisco Alvarez, vice president, APAC at Pythian, the risks associated with innovation might be holding companies back in Australia. “The main problem for most companies is that innovation equals risk,” Alvarez said.

Alvarez also commented on what it takes to make innovation work. “If you take a step back and look at the companies that are doing well in the market, you can see that there is one factor that differentiates them: they were not afraid to try to innovate. And because of that innovation they are getting their share of the market and gaining ground. Just look at the financial market. CBA was considered crazy a few years ago for all the investment they were making in technology, social media, apps and so on. They got ahead. And now everybody is trying to do the same,” he said.

Mendez thinks that innovation needs to start from the top. “I think there’s also a very big misunderstanding at board levels about innovation because boards are there to actually stop you changing your business. The fundamental tenant is: ‘We’ve got a great business model here, it’s running well, we’ve got to make sure that any change to it doesn’t damage that.’ There’s a natural caution at board levels and it’s totally understandable,” he said.

While cautious, the Sydney panelists expressed that they thought there is hope for more innovation in the future. They expressed a need to proceed slowly, watching what works for innovation leaders.

“The key is to have a balance,” Alvarez said.

New York: Early adopters

If you were to put our New York panelists on Geoffrey Moore’s https://en.wikipedia.org/wiki/Geoffrey_Moore Technology Adoption Lifecycle, you might classify them as early adopters, rather than true innovators. Not surprising, since New York’s competitive industries such as banking and publishing rely on innovative technologies, but they don’t create them.

According to New York panelist, Forrester Analyst Gene Leganza, what makes an enterprise agile is the ability to sense what’s going on in the marketplace and to quickly respond to it. But, he said that innovation comes at a cost. “The flip side of agility is innovation. An interesting aspect of innovation is getting really hot talent into your environment. Getting the right talent and doing smart things and being leading edge are challenges. You have to figure out what level to drop in on, where you are in the industry. You need to determine if you are a startup or a state organization that needs to be a fast follower,” Leganza said.

Otto Toth, CTO at Huffington Post warned that innovating quickly is not always in the best interest of the business, or it may not be the way to do it properly. He asserted that quick innovation can actually work against the business, and that instead of making your business faster, being very agile can slow everything down.

“Too many decision-makers just slow down the process. It’s better to have a few people or a core team who make the decisions and come up with new features,” he added.

Leganza went on to describe what it takes at various levels of the organization. He said that there’s a notion at the engineer level that agility means bureaucracy won’t get in their way. Then there’s agility at the enterprise level, which is about reducing risk and understanding how soon change can be in production.

“The higher up you go, the more people are going to be receptive to what improves the whole portfolio rather than one project. This is where architects come in. They have been hands-on, but have the credibility and knowledge to guide the organization more strategically,” Leganza said.

San Francisco: The innovators

In San Francisco the narratives on innovation were quite different. Although cities don’t have their own GII ranking, you might assume that the West Coast IT leaders are the innovators. And judging by the discussion at the San Francisco event, this assumption seemed to be true.

Cory Isaacson, CTO at RMS was one of our San Francisco panelists. His company runs catastrophe models for some of the world’s largest insurance companies, like scenarios that will tell what a disaster like an earthquake or hurricane might cost them. Isaacson has been working on bringing big data and scalable systems together to create a new cloud-based platform.

“At my company some of the things that we’re trying to do are, honestly, more advanced than most other things I’ve ever seen in my career. But when you’re doing innovation, it is risky. There’s no way around it. There is a lot to evaluate: from different algorithms to the risk models and the catastrophe models,” said Isaacson.

Sean Rich, director of IT at Mozilla added to the San Francisco discussion by talking about some of the concrete innovations his company is working on. They’re taking a partnership approach to enable agility.

“Innovation is doing something new. In an effort toward achieving agility, one of the things that we’re doing is enabling the agility of our business partners, by changing our own operating model. Instead of traditional IT where we run all the services and infrastructure necessary to drive the business, we’re taking more of an enabler or partnership approach,” Rich said.

“We’re now doing things like encouraging shadow IT, encouraging the use of SaaS applications and helping them really do that better through different service offerings like vendor management or change management of user adoption for certain platforms and data integration” he added.

“Overall, we’re looking at ourselves differently, and asking what new capabilities we need to develop, and what processes, tools and skills we need to enable agility for our marketing group or our product lines, as an example,” Rich said.

Aaron Lee, the Chief Data Officer at Pythian, runs a team that specializes in helping clients harness technology to deliver real outcomes. Usually they involve things like big data, DevOps, cloud, advanced analytics — he’s involved in some of the most leading edge initiatives for Pythian customers. He takes a practical approach to innovation with clients, and said that companies could improve innovation by looking at the root of the motivation for it.

“They need to ask: Why are we going down this path, trying to innovate something and what is the value of that thing we’re trying to innovate?

“If the shared goals around innovation opportunities aren’t defined in a way that actually lead to success over time, then the business is just like any other organism: it starts to get more risk averse. Then it becomes harder and harder to execute any kind of change agenda. Planning in a way that is likely to have a good long-term outcome, even at the outset of any sort of initiative, is one key success criteria that we put in place to help ourselves and our customers get to a good place,” Lee said.

Isaacson added that companies like Google have been known to allow an engineer to take a day a week or a day every two weeks to just look at things. “I think though, the challenge is you have to get your organization up to the point where this is an economically viable thing to do. Once we get more ahead of the curve, I think we could do that kind of thing,” he said.

Interested in being a part of a discussion like these? VELOCITY OF INNOVATION is a series of thought-leadership events for senior IT management hosted by Pythian. Pythian invites leading IT innovators to participate in discussions about today’s disruptive technologies: big data, cloud, advanced analytics, DevOps, and more. These events are by invitation only.

If you are interested in attending an upcoming Velocity of Innovation event in a city near you, please contact events@pythian.com. To view our schedule of upcoming events visit our Velocity of Innovation page.

Categories: DBA Blogs

I Am Speaking at OTN Yathra 2016

Oracle in Action - Tue, 2016-03-22 09:19

RSS content

The Oracle ACE directors and Oracle Volunteers  in the region are organizing their third evangelist event called ‘OTNYathra 2016’ from 23rd  April 2016 to 1st May 2016.  This yathra or tour will a series of 6 conferences across 6 major cities (Chennai, Bangalore, Hyderabad,  Pune, Mumbai and Delhi) managed by ACE directors and Oracle Volunteers of the region.

I will be speaking at this year’s OTNYathra  about Oracle Database 12c new feature : Highly Available NFS (HANFS) over ACFS.

HANFS over ACFS enables highly available NFS servers to be configured using Oracle ACFS clusters. The NFS exports are exposed through Highly Available VIPs (HAVIPs), and this allows Oracle’s Clusterware agents to ensure that HAVIPs and NFS exports are always available. If the node hosting the export(s) fails, the corresponding HAVIP and hence its corresponding NFS export(s) will automatically fail over to one of the surviving nodes so that the NFS client continues to receive uninterrupted service of NFS exported paths.

My session will be held on Sunday 1st May, 2016   from 3:00pm to 3:50pm in
Room 1, BirlaSoft, H–9, Sector 63, NOIDA – 201306, NCR Delhi
Hope to meet you there!!



Copyright © ORACLE IN ACTION [I Am Speaking at OTN Yathra 2016], All Right Reserved. 2016.

The post I Am Speaking at OTN Yathra 2016 appeared first on ORACLE IN ACTION.

Categories: DBA Blogs

Apache Cassandra 2.1 Incremental Repair

Pythian Group - Mon, 2016-03-21 15:05

The “incremental repair” feature has been around since Cassandra’s 2.1. Conceptually the idea behind incremental repair is straightforward, but it can get complicated. The official Datastax document describes the procedure for migrating to incremental repair, but in my opinion, it doesn’t give a full picture. This post aims to fill in this gap by summarizing and consolidating the information of Cassandra incremental repair.

Note: this post assumes the reader has a basic understanding of Apache Cassandra, especially the “repair” concept within Cassandra.


1. Introduction

The idea of incremental repair is to mark SSTables that are already repaired with a flag (a timestamp called repairedAt indicating when it was repaired) and when the next run of repair operation begins, only previously unrepaired SSTables are scanned for repair. The goal of an “incremental repair” is two-fold:

1) It aims to reduce the big expense that is involved in a repair operation that sets out to calculate the “merkle tree” on all SSTables of a node;

2) It also makes repair network efficient because only rows that are marked as “inconsistent” will be sent across the network.

2. Impact on Compaction

“Incremental repair” relies on an operation called anticompaction to fulfill its purpose. Basically, anticompaction means splitting an SSTable into two: one contains repaired data and the other contains non-repaired data. With the separation of the two sets of SSTables, the compaction strategy used by Cassandra also needs to be adjusted accordingly. This is because we cannot merge/compact a repaired SSTable with an unrepaired SSTable together. Otherwise, we lose the repaired states.

Please note that when an SSTable is fully covered by a repaired range, no anticompaction will occur. It will just rewrite the repairedAt field in SSTable metadata.

SizeTiered compaction strategy takes a simple strategy. Size-Tiered compaction is executed independently on the two sets of SSTables (repaired and unrepaired), as the result of incremental repair Anticompaction operation.

For Leveled compaction strategy, leveled compaction is executed as usual on repaired set of SSTables, but for unrepaired set of SSTables, SizeTiered compaction will be executed.

For DateTiered compaction strategy, “incremental repair” should NOT be used.

3. Migrating to Incremental Repair

By default, “nodetool repair” of Cassandra 2.1 does a full, sequential repair. We can use “nodetool repair” with “-inc” option to enable incremental repair.

For Leveled compaction strategy, incremental repair actually changes the compaction strategy to SizeTiered compaction strategy for unrepaired SSTables. If a nodetool repair is executed for the first time on Leveled compaction strategy, it will do SizeTiered compaction on all SSTables because until the first incremental repair is done, Cassandra doesn’t know the repaired states. This is a very expensive operation and it is therefore recommended to migrate to incremental repair one node at a time, and follow the following procedure to migrate to incremental repair:

  1. Disable compaction on the node using nodetool disableautocompaction
  2. Run the default full, sequential repair.
  3. Stop the node.
  4. Use the tool sstablerepairedset to mark all the SSTables that were created before you disabled compaction.
  5. Restart cassandra
3.1 Tools for managing SSTable repaired/unrepaired state

Cassandra offers two utilities for SSTable repaired/unrepaired state management:

  • sstablemetadata is used to check repaired/unrepaired state of an SSTable. The syntax is as below:

             sstablemetadata <sstable filenames>

  • sstablerepairedset is used to manually mark if an SSTable is repaired or unrepaired. The syntax is as below. Note that this tool has to be used when Cassandra is stopped.

             sstablerepairedset [–is-repaired | –is-unrepaired] [-f <sstable-list> | <sstables>]

Please note that with utility sstablerepairedset, you can also stop incremental repair on Leveled compaction and restore the data to be leveled again with the “—is-unrepaired” option. Similarly, the node needs to be stopped first.

4. Other Considerations with Incremental Repair

There are some other things to consider when using incremental repair.

  • For Leveled compaction, once an incremental repair is used, it should be done so continuously. Otherwise, only SizeTiered compaction will be executed. It is recommended to run incremental repair daily and run full repairs weekly to monthly.
  • Recovering from missing data or corrupted SSTables require a non-incremental full repair.
  • “nodetool repair” –local option should be only used with full repair, not with incremental repair.
  • In C* 2.1, sequential repair and incremental repair does NOT work together.
  • With SSTable’s repaired states being tracked via it’s metadata, some Cassandra tools can impact the repaired states:
    1. Bulk loading will make loaded SSTables unrepaired, even if was repaired in a different cluster.
    2. If scrubbing causes dropped rows, new SSTables will be marked as unrepaired. Otherwise, SSTables will keep their original repaired state.
Categories: DBA Blogs

Pythian at Collaborate 16

Pythian Group - Mon, 2016-03-21 14:27

Collaborate is a conference for Oracle power users and IT leaders to discuss and find solutions and strategies based on Oracle technologies. This many Oracle experts in one place only happens one per year, and Pythian is excited to be attending. If you are attending this year, make sure to register for some of the sessions featuring Pythian’s speakers, listed below.

Collaborate 16 is on April 10-14, 2016 at the Mandalay Bay Resort and Casino in Las Vegas, Nevada, US.


Pythian Collaborate 16 Speaker List:


Michael Abbey | Consulting Manager | Oracle ACE

Communications – the Good, the Bad, and the Best

Tues April 12 | 9:15 a.m. – 10:15 a.m. | North Convention, Room South Pacific D

Traditional DB to PDB: The Options

Tues April 12 | 2:15 p.m. – 3:15 p.m. | Room Jasmine A

Documentation – A Love/Hate Relationship (For Now)

Wed April 13 | 8:00 a.m. – 9:00 a.m. | Room Palm A


Nelson Caleroa | Database Consultant | Oracle ACE

Exadata Maintenance Tasks 101

Tues April 12 | 10:45 a.m. – 11:45 am | Room Palm C

Evolution of Performance Management: Oracle 12c Adaptive Optimization

Tues April 12 | 3:30 p.m. – 4:30 p.m | Room Jasmine A


Subhajit Das Chaudhuri | Team Manager

Deep Dive Into SSL Implementation Scenarios for Oracle Application E-Business Suite

Wed April 13 | 8:00 a.m. – 9:00 a.m. | Room Breakers E


Alex Gorbachev | CTO | Oracle ACE Director

Oaktable World: TED Talks

Wed April 13 | 12:00 p.m. – 12:30 p.m. | Room Mandalay Bay Ballroom

Oaktable World: Back of a Napkin Guide to Oracle Database in the Cloud

Wed April 13 | 4:15 p.m. – 5:15 p.m. | Room Mandalay Bay Ballroom


Gleb Otochkin | Principal Consultant

Two towers or story about data migration. Story about moving data and upgrading databases.

Mon April 11 | 4:30 p.m. – 5:30 p.m. | Room Jasmine A


Simon Pane | ATCG Principal Consultant | Oracle Certified Expert

Oracle Database Security: Top 10 Things You Could & Should Be Doing Differently

Mon April 11 | 2 p.m. – 3 p.m. | Room Palm A

Time to get Scheduling: Modernizing your DBA scripts with the Oracle Scheduler (goodbye CRON)

Tues April 12 | 10:45 a.m. – 11:45 a.m. | Room Palm A


Roopesh Ramklass | Principal Consultant

Oracle Certification Master Exam Prep Workshop

Sun April 10 | 9:00 a.m. – 3:00 p.m. | Room Jasmine C

Fast Track Your Oracle Database 12c Certification

Wed April 13 | 8:00 a.m. – 9:00 a.m. | Room Jasmine A


Categories: DBA Blogs

Step by Step installation oracle 12c database on Linux 6 (centos)

Learn DB Concepts with me... - Sun, 2016-03-20 16:24
Assumptions :

  • You have a some flavor of Linux operating system installed (I have used centos 6 in this example).
  • If you cant afford a separate machine you can use Virtual box or stemware software to visualize your desktop or laptop.
  • Assuming that you have downloaded oracle 12 software onto linux machine. If not you can download from this link Software-Download
  • You have full/required privileges on you Linux host.

Oracle Installation Prerequisites

In order to perform the installtion of oracle 12c software on Linux box you need to perform some pre-reqs, which can be done automatically or through manual updates.Please follow below instructions.

Automatic Setup

If you plan to use the "oracle-rdbms-server-12cR1-preinstall" package to perform all your prerequisite setup, issue the following command.

# yum install oracle-rdbms-server-12cR1-preinstall -y

It will be a good option to to do an update.

# yum update

************* ***********


************* ***********

If you have not used the "oracle-rdbms-server-12cR1-preinstall" package to perform all prerequisites, you will need to manually perform the following setup tasks.

Add or amend the following lines in the "/etc/sysctl.conf" file.

fs.file-max = 6815744
kernel.sem = 250 32000 100 128
kernel.shmmni = 4096
kernel.shmall = 1073741824
kernel.shmmax = 4398046511104
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
Run the following command to change the current kernel parameters.

/sbin/sysctl -p

Add the following lines to the "/etc/security/limits.conf" file.

oracle   soft   nofile    1024
oracle   hard   nofile    65536
oracle   soft   nproc    16384
oracle   hard   nproc    16384
oracle   soft   stack    10240
oracle   hard   stack    32768


# From Public Yum or ULN
 Yum install binutils,  compat-libcap1,  compat-libstdc++-33,  compat-libstdc++-33.i686,  gcc,  gcc-c++,  glibc,  glibc.i686,  glibc-devel,  glibc-devel.i686,  ksh,  libgcc,  libgcc.i686,  libstdc++,  libstdc++.i686,  libstdc++-devel,  libstdc++-devel.i686,  libaio,  libaio.i686,  libaio-devel,  libaio-devel.i686,  libXext,  libXext.i686,  libXtst,   libXtst.i686,  libX11,   libX11.i686,  libXau,  libXau.i686,   libxcb,  libxcb.i686,  libXi,  libXi.i686,  make,  sysstat,  unixODBC,  unixODBC-devel
Create the new groups and users as per your requirement. For my case just to keep it simple lets use 3 groups & oracle user.

groupadd -g 54321 oinstall
groupadd -g 54322 dba
groupadd -g 54323 oper

useradd -u 54321 -g oinstall -G dba,oper oracle.

Set SELINUX to permissive or diable it if this is test env.
Set secure Linux to permissive by editing the "/etc/selinux/config" file, making sure the SELINUX flag is set as follows.


Create the directories in which the Oracle software will be installed.

mkdir -p /u01/app/oracle/product/12.1/db_1
chown -R oracle:oinstall /u01
chmod -R 775 /u01

Add the following lines at the end of the "/home/oracle/.bash_profile" file.

# Oracle Settings
export TMP=/tmp
export TMPDIR=$TMP

export ORACLE_HOSTNAME=ol6-121.localdomain
export ORACLE_UNQNAME=orcl
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/12.1/db_1
export ORACLE_SID=orcl

export PATH=/usr/sbin:$PATH

export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib
export CLASSPATH=$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib

Change your directory to location where you have downloaded oracle software.

Start the Oracle Universal Installer (OUI) by issuing the following command in the database directory.


I don't want any update to uncheck to receive updates.

Lets create a server class.

To keep this simple I am going to select typical installation.

Make sure you have selected the home path correct. 

After This step you will be notified if you have any pre-req failures. Make sure you have cleared them all. Missing ksh package can be ignored as this is a known bug. Oracle is expecting a specific version of ksh & i have a latest pkg. Assuming that you have all cleared up.

Now select install.

Now you will be prompted to execute shell scripts before the installation of software is complete. I have missed that prompt screen but it will ask you to execute below 2 shell scripts as root user. see below screen

After executing hit OK and it will continue to install oracle DB software.

You will see this screen after installation is complete.

That's it you have completed your oracle 12c database software installation. You can query as below

Please drop your comments below if you found this blog helpful to you.

Categories: DBA Blogs

Links for 2016-03-19 [del.icio.us]

Categories: DBA Blogs

Partner Webcast - Oracle Developers Tools Update

Oracle's developer tools strategy is to offer the best possible developer tools choices to support diverse needs.Oracle offers a complete and integrated set of application development and...

We share our skills to maximize your revenue!
Categories: DBA Blogs


Learn DB Concepts with me... - Sat, 2016-03-19 17:30


-------------------- --------------------

SQL> archive log list;
Database log mode           Archive Mode
Automatic archival           Enabled
Archive destination           /u01/app/oracle/oraarch/
Oldest online log sequence     250
Next log sequence to archive   0
Current log sequence           252

SQL> recover managed standby database using current logfile disconnect from session;
Media recovery complete.
SQL> alter database open read only;

Database altered.


[oracle@Linux01 Desktop]$ sqlplus /"As sysdba"

SQL> select name,open_mode,db_unique_name,PROTECTION_MODE,PROTECTION_LEVEL FROM V$DATABASE;

--------     --------------------    ---------         --------------------         ---------

SQL> conn atoorpu/XXXXX;
SQL> create table abcd (name varchar2(20));

Table created.

SQL> insert into abcd values ('arvind');

1 row created.

SQL> commit;

Commit complete.

SQL> alter system switch logfile;

System altered.

SQL> insert into abcd values ('reddy');

1 row created.

SQL> commit;

Commit complete.

SQL> create table abcds as select * from abcd;

Table created.


[oracle@Linux02 Desktop]$ sqlplus /"As sysdba"

SQL> select name,open_mode,db_unique_name,PROTECTION_MODE,PROTECTION_LEVEL FROM V$DATABASE;

--------     --------------------    ---------         --------------------         ---------

SQL> conn atoorpu/XXXXX;
SQL> select * from abcd;


SQL> select * from abcd;


Categories: DBA Blogs


Learn DB Concepts with me... - Sat, 2016-03-19 10:16

-------------------- --------------------

SQL> select name,db_unique_name,open_mode,DATABASE_ROLE from v$database;

--------- ------------------------------ -------------------- ----------------

 SQL> alter database open read only;

Database altered.

SQL> select name,db_unique_name,open_mode,DATABASE_ROLE from v$database;

--------- ------------------------------ -------------------- ----------------


-------------------- --------------------

SQL> SELECT database_role, open_mode FROM v$database;

---------------- --------------------

SQL> select process,status from v$managed_standby;

--------- ------------

Categories: DBA Blogs

Compression -- 4 : RMAN (BASIC) Compression

Hemant K Chitale - Sat, 2016-03-19 07:34
BASIC Compression in RMAN is free with the Enterprise Edition. Advanced Compression with a specified algorithm requires the Advanced Compression Option Licence.

RMAN> backup as compressed backupset database root;

Starting backup at 19-MAR-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=48 device type=DISK
channel ORA_DISK_1: starting compressed full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00003 name=/u01/app/oracle/oradata/ORCL/datafile/o1_mf_sysaux_9fxmvhl3_.dbf
input datafile file number=00001 name=/u01/app/oracle/oradata/ORCL/datafile/o1_mf_system_9fxmx6s1_.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/ORCL/datafile/o1_mf_undotbs1_9fxn0vgg_.dbf
input datafile file number=00006 name=/u01/app/oracle/oradata/ORCL/datafile/o1_mf_users_9fxn0t8s_.dbf
channel ORA_DISK_1: starting piece 1 at 19-MAR-16
channel ORA_DISK_1: finished piece 1 at 19-MAR-16
piece handle=/u01/app/oracle/fast_recovery_area/ORCL/backupset/2016_03_19/o1_mf_nnndf_TAG20160319T081217_cgtjk2ms_.bkp tag=TAG20160319T081217 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:02:16
Finished backup at 19-MAR-16

Starting Control File and SPFILE Autobackup at 19-MAR-16
piece handle=/u01/app/oracle/fast_recovery_area/ORCL/autobackup/2016_03_19/o1_mf_s_906884075_cgtjofh1_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 19-MAR-16

RMAN> list backup;

List of Backup Sets

BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
1 Full 385.78M DISK 00:02:09 19-MAR-16
BP Key: 1 Status: AVAILABLE Compressed: YES Tag: TAG20160319T081217
Piece Name: /u01/app/oracle/fast_recovery_area/ORCL/backupset/2016_03_19/o1_mf_nnndf_TAG20160319T081217_cgtjk2ms_.bkp
List of Datafiles in backup set 1
File LV Type Ckp SCN Ckp Time Name
---- -- ---- ---------- --------- ----
1 Full 3034354 19-MAR-16 /u01/app/oracle/oradata/ORCL/datafile/o1_mf_system_9fxmx6s1_.dbf
3 Full 3034354 19-MAR-16 /u01/app/oracle/oradata/ORCL/datafile/o1_mf_sysaux_9fxmvhl3_.dbf
4 Full 3034354 19-MAR-16 /u01/app/oracle/oradata/ORCL/datafile/o1_mf_undotbs1_9fxn0vgg_.dbf
6 Full 3034354 19-MAR-16 /u01/app/oracle/oradata/ORCL/datafile/o1_mf_users_9fxn0t8s_.dbf

BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
2 Full 17.20M DISK 00:00:03 19-MAR-16
BP Key: 2 Status: AVAILABLE Compressed: NO Tag: TAG20160319T081435
Piece Name: /u01/app/oracle/fast_recovery_area/ORCL/autobackup/2016_03_19/o1_mf_s_906884075_cgtjofh1_.bkp
SPFILE Included: Modification time: 19-MAR-16
SPFILE db_unique_name: ORCL
Control File Included: Ckp SCN: 3034519 Ckp time: 19-MAR-16

1 select status, command_id, to_char(start_time,'DD-MON HH24:MI') Start_At, to_char(end_time,'DD-MON HH24:MI') End_At,
2 trunc(input_bytes/1048576) Input_MB, trunc(output_bytes/1048576) Output_MB, input_type
3 from v$rman_backup_job_details
4* order by start_time

------------------- ------------------------- ----------------- ------------- ---------- ----------
COMPLETED 2016-03-19T08:12:02 19-MAR 08:12 19-MAR 08:14 1807 402


Compare the INPUT_MB and OUTPUT_MB to see the compression achieved.

Default compression can also be configured with

RMAN> show compression algorithm;

RMAN configuration parameters for database with db_unique_name ORCL are:


new RMAN configuration parameters:
new RMAN configuration parameters are successfully stored

RMAN> backup datafile 1;

Starting backup at 19-MAR-16
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1 device type=DISK
channel ORA_DISK_1: starting compressed full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/ORCL/datafile/o1_mf_system_9fxmx6s1_.dbf
channel ORA_DISK_1: starting piece 1 at 19-MAR-16
channel ORA_DISK_1: finished piece 1 at 19-MAR-16
piece handle=/u01/app/oracle/fast_recovery_area/ORCL/backupset/2016_03_19/o1_mf_nnndf_TAG20160319T082704_cgtkdrdh_.bkp tag=TAG20160319T082704 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:55
Finished backup at 19-MAR-16

Starting Control File and SPFILE Autobackup at 19-MAR-16
piece handle=/u01/app/oracle/fast_recovery_area/ORCL/autobackup/2016_03_19/o1_mf_s_906884879_cgtkgk5d_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 19-MAR-16

RMAN> list backup of datafile 1;

List of Backup Sets

BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
1 Full 385.78M DISK 00:02:09 19-MAR-16
BP Key: 1 Status: AVAILABLE Compressed: YES Tag: TAG20160319T081217
Piece Name: /u01/app/oracle/fast_recovery_area/ORCL/backupset/2016_03_19/o1_mf_nnndf_TAG20160319T081217_cgtjk2ms_.bkp
List of Datafiles in backup set 1
File LV Type Ckp SCN Ckp Time Name
---- -- ---- ---------- --------- ----
1 Full 3034354 19-MAR-16 /u01/app/oracle/oradata/ORCL/datafile/o1_mf_system_9fxmx6s1_.dbf

BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
3 Full 207.82M DISK 00:00:55 19-MAR-16
BP Key: 3 Status: AVAILABLE Compressed: YES Tag: TAG20160319T082704
Piece Name: /u01/app/oracle/fast_recovery_area/ORCL/backupset/2016_03_19/o1_mf_nnndf_TAG20160319T082704_cgtkdrdh_.bkp
List of Datafiles in backup set 3
File LV Type Ckp SCN Ckp Time Name
---- -- ---- ---------- --------- ----
1 Full 3035302 19-MAR-16 /u01/app/oracle/oradata/ORCL/datafile/o1_mf_system_9fxmx6s1_.dbf

1 select status, command_id, to_char(start_time,'DD-MON HH24:MI') Start_At, to_char(end_time,'DD-MON HH24:MI') End_At,
2 trunc(input_bytes/1048576) Input_MB, trunc(output_bytes/1048576) Output_MB, input_type
3 from v$rman_backup_job_details
4* order by start_time

------------------- ------------------------- ----------------- ----------------- --------- ---------
COMPLETED 2016-03-19T08:12:02 19-MAR 08:12 19-MAR 08:14 1807 402

COMPLETED 2016-03-19T08:18:29 19-MAR 08:27 19-MAR 08:28 803 225

CDB$ROOT@ORCL> !du -sh /u01/app/oracle/fast_recovery_area/ORCL/backupset/2016_03_19/o1_mf_nnndf_TAG20160319T082704_cgtkdrdh_.bkp
208M /u01/app/oracle/fast_recovery_area/ORCL/backupset/2016_03_19/o1_mf_nnndf_TAG20160319T082704_cgtkdrdh_.bkp


Note the slight discrepancy in reported backuppiece size.  The LIST BACKUP shows it as 207.82MB,  the query on v$rman_backup_job_details dividing the output_bytes by 1048576 shows the size as 225MB (truncating to units) while the OS level "du -sh" command shows it as 208MB.

Categories: DBA Blogs

Links for 2016-03-18 [del.icio.us]

Categories: DBA Blogs

The 5 Best Things That Will Happen to DBAs When SQL Server Moves to Linux

Pythian Group - Fri, 2016-03-18 14:50


In the second half of 2017, SQL Server will start calling Linux its second home. Azure Data Lake for Ubuntu was the sign that Microsoft was serious about going Linux. Private preview is already available for SQL Server on Linux, and this scribe is not part of it but let me wildly guess what would warm the hearts of those DBAs who have played with Oracle on Linux and SQL Server on Windows:


  1. Cleanup won’t require sifting through registry entries and cleaning it up. The uninstall would be quick, simple, and a breeze.
  2. No confusion about Windows authentication or SQL Server authentication.
  3. A much much better system utilities for monitoring and root cause analysis.
  4. Boasting and bragging rights about managing an enterprise database on Linux.


Probably the most serious thing that will happen with SQL Server on Linux is that more and more Oracle DBAs who prefer to use Linux, will start taking an interest in managing SQL Server.

Pythian is perfectly poised to leverage this change in technology from Microsoft. We have world class SQL Server DBAs, Linux gurus, and some magnificent Oracle DBAs. Existing clients as well as new clients can count on these resources to get their SQL Server databases supported or migrated to Linux.

And of course, it doesn’t matter really whether these database are on cloud or not come 2017, because Pythian has already covered that too.

Categories: DBA Blogs

Pythian Tech Chatter

Pythian Group - Fri, 2016-03-18 13:23

At Pythian, the DevOps, Big Data, and Data Science teams use Slack for our IM system. We’re a diverse group drawn to a wide range of technologies so there’s some interesting and valuable chatter about what folks are reading. Here are some subjects that came across our channels the past couple of weeks:


Pythian engineers support Solr and Elasticsearch for a number of clients. Here’s a great summary of the important tunables in elasticsearch:

We often are called into companies to help manage deployment infrastructure, and it is quite common to encounter an unwieldy monolithic application that has been cobbled together over the years. Our head of DevOps pointed out this book as a great resource for strategies to decompose those into microservices:

Big Data

As Data Lake architectures mature, we’re seeing more comprehensive offerings from vendors. This blog talks about Microsoft’s offerings. I like the integration of active directory for strong security and the U-SQL approach to pulling data from data lakes.  While I’m not a huge fan of C#, the concept of having a library of extractors and outputters is a nice nut and bolt approach:

Another thread from the Data Lakes discussion highlights the critical importance of Data Governance. Waterline’s Data Inventory tool is a strong player for MetaData/Governance automation:

Data Science

Facebook released its implementation of deep learning neural nets last year. The Data Science team has been spending some time with it as they evaluate and build AI tools:

The team has also been using some great Java tools for natural language processing from Stanford:

Categories: DBA Blogs

The Art of Mobilising Oracle Forms

It is one thing to have a vision to mobilise parts of your business to realise benefits, but it is quite another thing to embark on an undertaking. Yes you want an amazing mobile experience, push...

We share our skills to maximize your revenue!
Categories: DBA Blogs

2016 Oracle Digital Transformation EMEA Partner Community Forum 12.-13. April 2016, Budapest

Registration for the FREE exclusive two-day EMEA Digital Transformation partner Community forum, April 12&13, Budapest (Hungary), is now open! This is a must attend event if you are: ...

We share our skills to maximize your revenue!
Categories: DBA Blogs

Math Resources

Bobby Durrett's DBA Blog - Thu, 2016-03-17 17:51

I feel like I have not been posting very much on this blog lately. I have been focused on things outside of Oracle performance so I haven’t had a lot of new scripts to post.  I have been quietly updating my Python source code on GitHub so check that out. I have spent a lot of time educating myself in various ways including through the leadership and communication training program that comes from Toastmasters. My new job title is “Technical Architect” which is a form of technical leadership so I’m trying to expand myself beyond being an Oracle database administrator that specializes in performance tuning.

In addition to developing my leadership and communication skills I have gotten into a general computer science self-education kick. I took two introductory C.S. classes on edX. I also read a book on Linux hacking and a book on computer history. I was thinking of buying one of the Donald Knuth books or going through MIT’s free online algorithms class class 6.006. I have a computer science degree and spent two years in C.S. graduate school but that was a long time ago. It is kind of fun to refresh my memory and catch up with the latest trends. But the catch is that both the Knuth book and MIT’s 6.006 class require math that I either never learned or have forgotten. So, I am working my way through some math resources that I wanted to share with those who read this blog.

The first thing I did was to buy a computer math book, called Concrete Mathematics,  that seemed to cover the needed material. Reviews on Amazon.com recommended this book as good background for the Knuth series and one of the Oracle performance experts that I follow on Twitter recommended it for similar reasons. But, after finishing my second edX class I began exploring the MIT OCW math class that was a prerequisite to MIT’s 6.006 algorithms class. MIT calls the math class 6.042J and I am working through the Fall 2010 version of the class. There is a lot of overlap between the class and the book but they are not a perfect match. The book has some more difficult to follow material than the class. It is probably more advanced.  The class covers some topics, namely graph theory, that the book does not.  The free online class has some very good lecture videos by a top MIT professorTom Leighton. I even had my wife and daughters sit down and watch his first lecture with me on our family television for fun on my birthday.

The book led me to a great free math resource called Maxima. Maxima has all kinds of great math built into it such as solving equations, factoring integers, etc. Plus, it is free. There are other similar and I think more popular programs that are not free but for my use it was great to simply download Maxima and have its functionality at my fingertips.

The last resource that I wanted to mention is the Mathematics section of Stack Exchange. It is a pretty structured online forum with a question and answer format. It is helpful to me since I am going through 6.042J without a professor or teaching assistant to answer my questions. The people on math stack exchange are very helpful if you at least try to follow the etiquette for their forum. For example, they have an easy to use way to format math formulas in your questions and answers and the users of the forum expect you to use it. But it isn’t hard. I had one question from the Concrete Math book where I couldn’t understand the answer key in the back. I asked about it on stack exchange and got a great answer in no time.

Anyway, maybe all of this math and computer science study is a departure from my bread and butter Oracle database work and performance tuning. But the free online resources like the OCW web site, the Maxima program, and the stack exchange forum along with the book that I paid for are a great set of resources. I have already used some of the concepts that I have learned about number theory and its application to RSA encryption. But, at the same time I am enjoying studying these things and mostly see it as something fun to do in my spare time. (I’m weird I know.)

So, I have written this blog post to share the math related things that I am studying and using to those who might benefit from them. I am not a math expert, but I am getting a lot out of these materials. I hope that others find these resources as enjoyable and educational as I have.


Categories: DBA Blogs

Drop all objects in your current schema

Learn DB Concepts with me... - Thu, 2016-03-17 14:49

You can use this simple pl-sql block to drop all objects in your current schema.

VSQL varchar2(4000);
OBJ_NAME varchar2(100);
OBJ_TYPE varchar2(100);
OBJ_OWNER varchar2(100);
cursor c1 is select object_type,object_name from user_objects where object_type in ('TABLE','VIEW');


open c1;
fetch c1 into OBJ_TYPE,OBJ_NAME;
exit when c1%NOTFOUND;
VSQL:=' drop '||OBJ_TYPE||' '||OBJ_NAME||' cascade constraints';
VSQL:=' drop '||OBJ_TYPE||' '||OBJ_NAME;
end loop;
close c1;

Note :

Remove "where object_type in ('TABLE','VIEW')" to drop all objects from current user.
Or you can edit to include only type of objects you want to be dropped.

You can Alter OBJECT TYPE any of below:

Categories: DBA Blogs

GoldenGate 12.2 Big Data Adapters: part 2 – FLUME

Pythian Group - Thu, 2016-03-17 12:36

In this blog post we continue our review of the new Oracle GoldenGate Big Data adapters. In the first part of the series I tested the basic HDFS adapter and checked how it worked with some DML and DDL. In this article I will try the Flume adapter and see how it works.

A quick reminder on what Flume is. It is not a topic about the popular Australian musician. Today we are talking about Apache Flume. In short, it is a pipeline or a kind of streaming system that allows you to move large amount of data. It has simple architecture and, in general, there are three main components:

a) Source: where data enters into Flume from an outside system.
b) Sink: responsible for passing data to the destination system whether it is the final destination, or another flow.
c) Channel: connects the Source and Sink.

I know that it is a rather simplistic description, but the main subject of this article is not what Flume can do, but how we can pass our data from Oracle to Flume using GoldenGate.
My first post discussed how you set up an Oracle source system, and how to start GoldenGate initial load and extract. I am not repeating it here. Let’s assume we have the source system. It is Oracle database, replicating DML and DDL for one particular schema, and GGTEST using Oracle GoldenGate 12.2 to trail files on our box where we already have GoldenGate for Big Data. Have a look at the first part to see how to set up the GoldenGate for Big Data (OGG BD).

So, we have our OGG BD setup and the manager up and running.

GGSCI (sandbox.localdomain) 1> info manager

Manager is running (IP port sandbox.localdomain.7839, Process ID 18521).

GGSCI (sandbox.localdomain) 2>

What we need now is to prepare our Flume agent to accept messages from OGG. I’ve already set up my Flume’s agent-ng service on my Linux box, and now we need to prepare the configuration file for the agent to handle the income stream, and pass it to the destination system. We will set up our source to “avro” and sink will be writing to HDFS. The source can be either avro or thrift. According to oracle documentation the Flume handler can stream data from a trail file to Avro or Thrift RPC Flume sources.

I have to admit that the destination as HDFS looks quite artificial since we have a special adapter for HDFS and don’t need a Flume to write there. But such a configuration can help us compare different adapters and what they can do.
I used Flume version 1.6.0:

[oracle@sandbox flume-ng]$ bin/flume-ng version
Flume 1.6.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 2561a23240a71ba20bf288c7c2cda88f443c2080
Compiled by hshreedharan on Mon May 11 11:15:44 PDT 2015
From source with checksum b29e416802ce9ece3269d34233baf43f
[oracle@sandbox flume-ng]$

Here is my configuration file for the Flume agent:

# Name/aliases for the components on this agent
agent.sources = ogg1
agent.sinks = hdfs1
agent.channels = ch1
#Avro source
agent.sources.ogg1.type = avro
agent.sources.ogg1.bind =
agent.sources.ogg1.port = 4141

# Describe the sink
agent.sinks.hdfs1.type = hdfs
agent.sinks.hdfs1.hdfs.path = hdfs://sandbox/user/oracle/ggflume
#agent.sinks.hdfs1.type = logger

# Use a channel which buffers events in memory
agent.channels.ch1.type = memory
agent.channels.ch1.capacity = 100000
agent.channels.ch1.transactionCapacity = 10000

# Bind the source and sink to the channel
agent.sources.ogg1.channels = ch1
agent.sinks.hdfs1.channel = ch1

I’ve made the configuration simple and clear. You may change agent.sources.ogg1.port and agent.sinks.hdfs1.hdfs.path  depending on your system.

On the target HDFS we have to create directory as it was defined in our sink configuration.

[oracle@sandbox ~]$ hadoop fs -mkdir /user/oracle/ggflume
[oracle@sandbox ~]$ hadoop fs -ls /user/oracle/ggflume
[oracle@sandbox ~]$

We can start our Flume agent now.

[root@sandbox conf]# service flume-ng-agent start
Starting Flume NG agent daemon (flume-ng-agent):           [  OK  ]
[root@sandbox conf]# service flume-ng-agent status
Flume NG agent is running                                  [  OK  ]
[root@sandbox conf]#
[root@sandbox conf]# tail /var/log/flume-ng/flume.log
25 Feb 2016 11:56:37,113 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.register:120)  - Monitored counter group for type: CHANNEL, name: ch1: Successfully registered new MBean.
25 Feb 2016 11:56:37,121 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:96)  - Component type: CHANNEL, name: ch1 started
25 Feb 2016 11:56:37,122 INFO  [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:173)  - Starting Sink hdfs1
25 Feb 2016 11:56:37,123 INFO  [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:184)  - Starting Source ogg1
25 Feb 2016 11:56:37,139 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.source.AvroSource.start:228)  - Starting Avro source ogg1: { bindAddress:, port: 4141 }...
25 Feb 2016 11:56:37,146 INFO  [lifecycleSupervisor-1-2] (org.apache.flume.instrumentation.MonitoredCounterGroup.register:120)  - Monitored counter group for type: SINK, name: hdfs1: Successfully registered new MBean.
25 Feb 2016 11:56:37,147 INFO  [lifecycleSupervisor-1-2] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:96)  - Component type: SINK, name: hdfs1 started
25 Feb 2016 11:56:38,114 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.register:120)  - Monitored counter group for type: SOURCE, name: ogg1: Successfully registered new MBean.
25 Feb 2016 11:56:38,115 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:96)  - Component type: SOURCE, name: ogg1 started
25 Feb 2016 11:56:38,116 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.source.AvroSource.start:253)  - Avro source ogg1 started.
[root@sandbox conf]#

Flume is ready, and we can now prepare our OGG configuration. We have examples for the Flume adapter configuration files in $OGG_HOME/AdapterExamples/big-data/flume/ :

[oracle@sandbox oggbd]$ ll AdapterExamples/big-data/flume/
total 12
-rw-r--r--. 1 oracle oinstall 107 Dec  9 12:56 custom-flume-rpc.properties
-r-xr-xr-x. 1 oracle oinstall 812 Dec  9 12:56 flume.props
-rw-r--r--. 1 oracle oinstall 332 Dec  9 12:56 rflume.prm
[oracle@sandbox oggbd]$

We can copy the examples to our configuration directory and adjust them to our needs:

[oracle@sandbox oggbd]$ cp AdapterExamples/big-data/flume/* dirprm/

Here is configuration file for our adapter:

[oracle@sandbox oggbd]$ cat dirprm/flume.props

gg.handlerlist = flumehandler
#gg.handler.flumehandler.maxGroupSize=100, 1Mb
#gg.handler.flumehandler.minGroupSize=50, 500 Kb





javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar

You will need to adjust your gg.classpath depending on your system, as it has to include Flume java classes and a file with Flume source properties (custom-flume-rpc.properties file).
Here is my example for the custom-flume-rpc.properties file which will be used by OGG adapter to connect to the flume-ng agent. I’ve placed it to the dirprm directory along with other parameters.

[oracle@sandbox oggbd]$ cat dirprm/custom-flume-rpc.properties

As you can see, my flume-ng agent is on the same host as the OGG which may not be the same for you. In your case you may need to provide hostname and port for your running glume-ng agent. We need to prepare the configuration file for our initial load. The OGG trail file is located in the dirdat/ directory and has name initld.

<pre></pre> <pre>Here is our parameter file for initial load using passive replicat: [oracle@sandbox oggbd]$ cat dirprm/irflume.prm --initial REPLICAT irflume -- Command to run REPLICAT in passive mode -- ./replicat paramfile dirprm/irflume.prm reportfile dirrpt/irflume.rpt SPECIALRUN END RUNTIME EXTFILE /u01/oggbd/dirdat/initld --DDLERROR default discard DDL include all TARGETDB LIBFILE libggjava.so SET property=dirprm/flume.props REPORTCOUNT EVERY 1 MINUTES, RATE GROUPTRANSOPS 10000 MAP ggtest.*, TARGET bdtest.*; 

Let’s run the load and see what we get in the end:

[oracle@sandbox oggbd]$ ./replicat paramfile dirprm/irflume.prm reportfile dirrpt/irflume.rpt

The command completed successfully and we got three new files on HDFS. The first 2 files had the schema description and the 3-d one had the data for the replicated tables.

[root@sandbox ~]# hadoop fs -ls /user/oracle/ggflume
Found 12 items
-rw-r--r--   1 flume oracle       1833 2016-03-10 11:17 /user/oracle/ggflume/FlumeData.1457626634620
-rw-r--r--   1 flume oracle       1762 2016-03-10 11:17 /user/oracle/ggflume/FlumeData.1457626634621
-rw-r--r--   1 flume oracle       1106 2016-03-10 11:17 /user/oracle/ggflume/FlumeData.1457626634622
[root@sandbox ~]# hadoop fs -tail /user/oracle/ggflume/FlumeData.1457626634620
        "name" : "PK_ID",
        "type" : [ "null", "string" ],
        "default" : null
      }, {
        "name" : "PK_ID_isMissing",
        "type" : "boolean"
      }, {
        "name" : "RND_STR",
        "type" : [ "null", "string" ],
        "default" : null
      }, {
        "name" : "RND_STR_isMissing",
        "type" : "boolean"

[root@sandbox ~]# hadoop fs -tail /user/oracle/ggflume/FlumeData.1457626634621
  }, {
    "name" : "primary_keys",
    "type" : {
      "type" : "array",
      "items" : "string"
  }, {
    "name" : "tokens",
    "type" : {
      "type" : "map",
      "values" : "string"
[root@sandbox ~]# hadoop fs -tail /user/oracle/ggflume/FlumeData.1457626634622
:?v??8?????	SaQm?"BDTEST.TEST_TAB_1Ñ??
                                          ?"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-03-10T11:17:14.448000(00000000-10000002012
                                                                                   ?"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-03-10T11:17:14.459000(00000000-10000002155
                                                                                   ?"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-03-10T11:17:14.459001(00000000-10000002298
                                                                                   ?"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-03-10T11:17:14.460000(00000000-10000002441
PK_ID4RXZT5VUN&2013-09-04:23:32:56HW82LI73&2014-05-11:05:23:23"BDTEST.TEST_TAB_2?????"BDTEST.TEST_TAB_2I42016-02-16 19:17:40.76289942016-03-10T11:17:14.466000(00000000-10000002926

The initial load has succeeded, and now we can create and start the proper ongoing replication to HDFS through Flume.
Let’s prepare a new parameter file for our permanent Flume replicat and starting it up.

GGSCI (sandbox.localdomain) 2> edit param rflume
-- Trail file for this example is located in "dirdat/" directory
-- Command to add REPLICAT
-- add replicat rflume, exttrail dirdat/or
TARGETDB LIBFILE libggjava.so SET property=dirprm/flume.props
MAP ggtest.*, TARGET bdtest.*;

GGSCI (sandbox.localdomain) 1> add replicat rflume, exttrail dirdat/or, begin now

GGSCI (sandbox.localdomain) 2> start replicat rflume

Sending START request to MANAGER ...

Let’s insert a row and see what we get on the target system.

orclbd> insert into ggtest.test_tab_1
  2  values (7,dbms_random.string('x', 8), sysdate-(7+dbms_random.value(0,1000)),
  3  dbms_random.string('x', 8), sysdate-(6+dbms_random.value(0,1000))) ;

1 row inserted.

orclbd> commit;

Commit complete.


As soon as commit had been executed we received a couple of new files on HDFS where the first had the schema for the changed table, and the second had the data for the transaction or  “payload”.

[root@sandbox ~]# hadoop fs -ls /user/oracle/ggflume
-rw-r--r--   1 flume oracle       1833 2016-03-10 11:17 /user/oracle/ggflume/FlumeData.1457626634620
-rw-r--r--   1 flume oracle       1762 2016-03-10 11:17 /user/oracle/ggflume/FlumeData.1457626634621
-rw-r--r--   1 flume oracle       1106 2016-03-10 11:17 /user/oracle/ggflume/FlumeData.1457626634622
-rw-r--r--   1 flume oracle       1833 2016-03-10 12:43 /user/oracle/ggflume/FlumeData.1457631817021
-rw-r--r--   1 flume oracle        605 2016-03-10 12:43 /user/oracle/ggflume/FlumeData.1457631817022
[root@sandbox ~]#

[root@sandbox ~]# hadoop fs -cat /user/oracle/ggflume/FlumeData.1457631817021
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable???:]B?9?k??	]kTSa?m??{
  "type" : "record",
  "name" : "TEST_TAB_1",
  "namespace" : "BDTEST",
  "fields" : [ {
    "name" : "table",
    "type" : "string"
  }, {
    "name" : "op_type",
    "type" : "string"
  }, {
[root@sandbox ~]# hadoop fs -cat /user/oracle/ggflume/FlumeData.1457631817022
  "type" : "record",
  "name" : "generic_wrapper",
  "namespace" : "oracle.goldengate",
  "fields" : [ {
    "name" : "table_name",
    "type" : "string"
  }, {
    "name" : "schema_hash",
    "type" : "int"
  }, {
    "name" : "payload",
    "type" : "bytes"
  } ]
                            ?"BDTEST.TEST_TAB_1I42016-03-10 17:43:31.00169042016-03-10T12:43:33.464000(00000000080001408270

I prepared and executed a small regression testing of inserts and updates to the table using jmeter, and started to push inserts and updates with a rate about 29 transaction per second.Even with one flume channel and my small Hadoop environment, it had a pretty good response time without trashing any errors. Flume put about 900 transactions per a HDFS file.

-rw-r--r--   1 flume oracle     123919 2016-03-10 14:52 /user/oracle/ggflume/FlumeData.1457639485465
-rw-r--r--   1 flume oracle      35068 2016-03-10 14:52 /user/oracle/ggflume/FlumeData.1457639485466
-rw-r--r--   1 flume oracle     145639 2016-03-10 14:52 /user/oracle/ggflume/FlumeData.1457639485467
-rw-r--r--   1 flume oracle     178943 2016-03-10 14:52 /user/oracle/ggflume/FlumeData.1457639485468
-rw-r--r--   1 flume oracle     103285 2016-03-10 14:52 /user/oracle/ggflume/FlumeData.1457639485469
[oracle@sandbox Downloads]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1457639485467 | wc -l
[oracle@sandbox Downloads]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1457639485468 | wc -l
[oracle@sandbox Downloads]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1457639485469 | wc -l
[oracle@sandbox Downloads]$

I’ve also tried the “thrift” datasource for Flume and it worked well too. To switch from “avro” to “thrift” I changed the value in the parameter agent.sources.ogg1.type in the flume.conf and restarted the flume agent. You also have to change client.type from default to thrift in your custom-flume-rpc.properties file. It worked fine, and I was able to get the information from the trail and write to the hdfs.

[oracle@sandbox oggbd]$ ./replicat paramfile dirprm/irflume.prm reportfile dirrpt/irflume.rpt
[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/ggflume
Found 3 items
-rw-r--r--   1 flume oracle       1833 2016-02-25 16:05 /user/oracle/ggflume/FlumeData.1456434311892
-rw-r--r--   1 flume oracle       1762 2016-02-25 16:05 /user/oracle/ggflume/FlumeData.1456434311893
-rw-r--r--   1 flume oracle       1106 2016-02-25 16:05 /user/oracle/ggflume/FlumeData.1456434311894
[oracle@sandbox oggbd]$
[oracle@sandbox oggbd]$ hadoop fs -cat  /user/oracle/ggflume/FlumeData.1456434311892
  "type" : "record",
  "name" : "TEST_TAB_1",
  "namespace" : "BDTEST",
  "fields" : [ {
    "name" : "table",
    "type" : "string"
  }, {

[oracle@sandbox oggbd]$ hadoop fs -cat  /user/oracle/ggflume/FlumeData.1456434311894
                                                                                                              ?"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-02-25T16:05:11.480000(00000000-10000002012
                                                                                   ?"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-02-25T16:05:11.498000(00000000-10000002155
                                                                                   ?"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-02-25T16:05:11.498001(00000000-10000002298
                                                                                   ?"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-02-25T16:05:11.499000(00000000-10000002441
PK_ID4RXZT5VUN&amp;2013-09-04:23:32:56HW82LI73&amp;2014-05-11:05:23:23"BDTEST.TEST_TAB_2?????"BDTEST.TEST_TAB_2I42016-02-16 19:17:40.76289942016-02-25T16:05:11.505000(00000000-10000002926
PK_IDRND_STR_1ACC_DATE7IJWQRO7T&amp;2013-07-07:08:13:52[oracle@sandbox oggbd]$

You can see from the output that in the FlumeData.1456434311894 file we are getting the schema description and in the FlumeData.1456434311894 we have the data from the tables TEST_TAB_1 and TEST_TAB_2.

Let’s try some simple DDL commands.
If we truncate a table:

orclbd> truncate table ggtest.test_tab_1;

Table GGTEST.TEST_TAB_1 truncated.


It is not going to be replicated. If we are altering the table, we are not seeing it as a separate command, but it is going to be reflected in the new schema definition for any new transaction replicated to HDFS. You will get a file with new schema definition and the transaction itself in a next file.

orclbd> alter table ggtest.test_tab_1 add (new1 varchar2(10));

Table GGTEST.TEST_TAB_1 altered.

orcl> insert into ggtest.test_tab_1
  2  values (7,dbms_random.string('x', 8), sysdate-(7+dbms_random.value(0,1000)),
  3  dbms_random.string('x', 8), sysdate-(6+dbms_random.value(0,1000)),'new_col' );

1 row created.

orcl> commit;

Commit complete.

[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1457117136700
  "type" : "record",
  "name" : "TEST_TAB_1",
  "namespace" : "BDTEST",
  "fields" : [ {
    "name" : "table",
    "type" : "string"
        "name" : "NEW1",
        "type" : [ "null", "string" ],
        "default" : null
      }, {
        "name" : "NEW1_isMissing",
        "type" : "boolean"
[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1457117136701
}SB???"BDTEST.TEST_TAB_1?????"BDTEST.TEST_TAB_1I42016-03-04 18:45:30.00131442016-03-04T13:45:34.156000(00000000000000014363

As I’ve mentioned in my previous post, the deeper investigation of supported DDL changes is going to be the subject of a dedicated blog post. Here we can conclude that the adapter worked as expected, and supported the flow of transactions from our Oracle database down to Flume using Avro and Thrift sources. Of course it is not production implementation, and serves only as a basic functional and elementary regression testing. For a serious production workflow we need to develop appropriate architecture.

In my next few posts I plan to check Kafka and HBASE adapters and see how they work. Stay tuned!

Categories: DBA Blogs

Connection timeout parameters in MySQL

Pythian Group - Thu, 2016-03-17 08:09
  • wait_timeout
  • interactive_timeout
  • net_read_timeout
  • net_write_timeout

What do these timeouts do in MySQL? If you search the web for one or more of these, you may find complaints that no comprehensive explanation exists for all of these timeouts in one place (besides the obvious documentation of dynamic server system variables in MySQL). This blog post seeks to provide a central documentation source for timeouts and provide some practical explanation.

Knowing what timeouts do helps in a troubleshooting effort. It’s good to understand when an issue is timeout related and when it’s not, and to know the right reasons for changing timeout variables, or the right time to ask the developer or ad-hoc user to please tune the variables in the session, instead of asking the DBA to change the global variables.

Before diving into the meat of this topic, here are a few introductory concepts that I will touch on.

I mentioned session variables. The four timeout variables we are discussing: interactive_timeout, wait_timeout, net_read_timeout, net_write_timeout can all be set within the context of the session using the SET command. e.g. “set session wait_timeout=3600;“.  A general rule of thumb that I always try to follow: Once I have what I think are correct timeouts to protect my mission critical application, I do not want to change them unless there is a really good reason. If I’m encountering a client disconnection issue and it’s not coming from my main application, I need to see if the problem can be alleviated by adjusting timeouts on the session level using SET commands.

Second, it’s important to note that timeouts are there to protect your server and critical application. You don’t want too many clients connected to your server doing nothing (they can crowd out connections that need to do something). You don’t want clients that are in a failed state due to a network connection or other interruptive problem to continue to consume resources such as locks on your database. Timeouts should be set very thoughtfully, based on the unique environment. If you’re unsure, leave at defaults and adjust when needed.

Finally, I created a perl script to illustrate wait_timeout situations. I made a great effort to also incorporate net_read_timeout and net_write_timeout into the script and you’ll see empty functions there as place-holders. There are many layers to the “MySQL client/server onion” involving buffers on the system, client, and server layers. In lieu of providing a direct scripting examples of net_read_timeout and net_write_timeout, I hope you’ll find the other methods of explanation included below, useful.


According to the MySQL manual, Interactive_timeout is only used for connections from interactive clients like the command line MySQL clients.

The technical explanation is: “clients utilizing the “CLIENT_INTERACTIVE option to mysql_real_connect().” Check the specific documentation to your favorite client to see whether it fits this description.

If your client fits this behavior, wait_timeout is set to the value in interactive_timeout. The only benefit to this parameter is flexibility. If your command line or other interactive clients have a vastly different requirement than your application, feel free to set this differently than wait_timeout.


Wait timeout is simply there to protect you in the common case of clients sitting there doing nothing but absorbing a connection. You want to set this to the lowest acceptable number of seconds in order to protect your server against an application malfunction or some other event that may cause too many connections to the database to be opened, crowding out other clients. If a client is doing nothing for wait_timeout seconds, the MySQL server will terminate the connection. The proper setting for this variable depends on the particular environment.

An environment I worked in for 3+ years set this to 120 seconds, because jboss connection pooling (and associated parameters) took care of making sure the mission critical application in the environment had available connections.

A common misconception and common misuse is to try to adjust wait_timeout for a situation in which a query has been interrupted. Wait timeout only applies to idle sessions. If the connection is in any other state besides idle, wait_timeout does not apply. This is why utilities like pt-kill exist. It does not help that an idle connection being interrupted by KILL produces the same error message.

Example using my perl script and the MySQL processlist:

jscott@js-trusty1:~$ ./test_timeouts.pl -K
Doing Killable Wait Timeout Test
ATTN: You have 20 seconds to kill the MySQL thread!!

in another shell

jscott@js-trusty1:~$ mysql -e "show processlist"
| Id | User | Host | db | Command | Time | State | Info |
| 36 | js | | NULL | Sleep | 28514 | | NULL |
| 79 | root | localhost:43350 | timeout_test | Sleep | 4 | | NULL |
| 80 | root | localhost | NULL | Query | 0 | NULL | show processlist |
jscott@js-trusty1:~$ mysql -e "kill 79";

back to the perl script output.

DBD::mysql::st execute failed: MySQL server has gone away at ./test_timeouts.pl line 134.
Err: 2006
ErrStr: MySQL server has gone away
State: HY000


Note that we got MySQL error code 2006 “MySQL server has gone away”.

Now we’ll allow wait_timeout to trigger instead of killing a process.

jscott@js-trusty1:~$ ./test_timeouts.pl -W
Doing Wait Timeout Test
DBD::mysql::st execute failed: MySQL server has gone away at ./test_timeouts.pl line 151.
Err: 2006
ErrStr: MySQL server has gone away
State: HY000

The fact that a killed idle process and a process receiving wait_timeout get the same error message and code (2006 “MySQL server has gone away”) causes much confusion. Rest assured, wait_timeout only kills idle connections, even if the error messages are the same.  If there is a running query, you will typically get error code 2013: “Lost connection to MySQL server during query”.

The most common reason a query is killed, in my experience, is a backup. Many backup utilities for MySQL have the default behavior of killing long-running queries.


net_read_timeout is the number of seconds mysql will wait between bytes as you are sending it data. As mentioned above, I tried very hard to synthetically cause a net_read_timeout with a perl script and was unsuccessful.

The best way to explain net_read_timeout is to use the example of an ETL (Extract, Transform, Load) job. At the beginning of an ETL job, you select data from one data source, then transform and load it into another data source, such as a MySQL data warehouse. Tools like “Pentaho Data Integration” (cited below) provide a GUI view of ETL.

Pentaho ETL Process

Pentaho ETL Process

If your ETL job is complex, and you use something similar to Pentaho Kettle, you have very little visibility into what is happening behind the scenes.

If the steps in between the initial query in an ETL job and an output step consume more than net_read_timeout seconds, then an error may occur due to incomplete statements being sent to the “output” steps.  In mid-transaction, if the MySQL server fails to receive data within the timeout interval, the connection will be terminated.

Another way net_read_timeout can occur: Normally all database connections used in a Pentaho ETL transformation are opened at the beginning of the transformation. If you have “use result streaming cursor” turned off on the database connection used for “table input”, the result set may take longer than “net_read_timeout” seconds to receive, causing an error on the database connection being used for the output steps.

Have you ever executed a simple query (with a large result set) on the MySQL command line and wondered why there is a pause of several seconds or even minutes prior to beginning to receive the results? Your client actually has begun receiving the results, but they simply have not been presented to you. The reason for this is, it’s generally better for a MySQL server to be “finished” with a query as soon as possible; therefore, the default behavior of the MySQL client libraries on a read-only query is to receive all the results first, before passing the results onto the program. The default behavior is called “mysql_store_result“.  The MySQL reference manual does a good job of explaining this in detail. As the manual states, you should not change the default behavior unless you are doing a minimal amount of processing on each result row.

Homemade ETL programs in other languages can have the same issue. Take care to use auto_reconnect features and/or streaming cursors / “mysql_use_result”, when appropriate.

ETLs and scripts are good examples of clients which can be modified with SET commands.  “SET SESSION net_read_timeout=<longer value>;” should be considered for jobs outside of your main application.

Lastly, net_read_timeout is also often caused by bad network connections. This is the default explanation offered in many blog posts on this topic.

The Error returned to the client will be (2013) “Lost connection to MySQL server during query”.


net_write_timeout is the number of seconds MySQL will wait for the client to receive additional data before terminating a connection. Using the example of the ETL job above, if steps in the middle of the transformation cause the “input” steps to pause for longer than net_write_timeout seconds, it’s possible to receive an error. I’ve never seen net_write_timeout hit because of an ETL job. The MySQL client has default behavior of buffering result sets.  Also, I try to use the “use result streaming cursor” in Pentaho or the “mysql_use_result” option in the Perl DBI, sparingly.

The most typical case of receiving net_write_timeout is a backup such as a mysqldump. During a long data transfer such as a mysqldump, a myriad of factors could come into play, not the least of which is network connectivity. Remember, if you run into problems with net_write_timeout on a mysqldump backup, mysqldump has its own section of the my.cnf. You can set an option file on a client or a server to specifically set this and other variables for a mysqldump session.

Categories: DBA Blogs

Links for 2016-03-16 [del.icio.us]

Categories: DBA Blogs


Subscribe to Oracle FAQ aggregator - DBA Blogs