
Pythian Group

Love Your Data

How to run OpenTSDB with Google Bigtable

Mon, 2016-03-14 11:49

In a previous post (OpenTSDB and Google Cloud Bigtable) we discussed OpenTSDB, an open source distributed database specifically designed for storing time-series data. We also explained how OpenTSDB relies on Apache HBase for a reliable and scalable data backend. However, deployment and administration of an HBase cluster is not a trivial task, as it requires a full Hadoop setup. This means that it takes a big data engineer (or better, a team of them) to plan the cluster sizing, provision the machines, set up the Hadoop nodes, configure all services and tune them for optimal performance. If this is not enough, operations teams have to constantly monitor the cluster, deal with hardware and service failures, perform upgrades, back up regularly, and handle a ton of other tasks that make maintenance of a Hadoop cluster and OpenTSDB a challenge for most organizations.

With the release of Google Bigtable as a cloud service and its support for the HBase API, it was obvious that if we managed to integrate OpenTSDB with Google Bigtable, we would enable more teams to access the powerful functionality of OpenTSDB by removing the burden of maintaining an HBase cluster.

Nevertheless, integration of OpenTSDB with Bigtable was not as seamless as dropping a few jars into its release directory. This is because the OpenTSDB developers went over and above the standard HBase libraries by implementing their very own asynchbase library. Asynchbase is a fully asynchronous, non-blocking, thread-safe, high-performance HBase API. And no one can put it better than the asynchbase developers themselves, who state that ‘This HBase client differs significantly from HBase’s client. Switching to it is not easy as it requires one to rewrite all the code that was interacting with any HBase API.’

This meant that integration with Google Bigtable required OpenTSDB to switch back to the standard HBase API. We saw the value of such an effort here at Pythian and set about developing this solution.

The asyncbigtable library

Today, we are very happy to announce the release of the asyncbigtable library. The asyncbigtable library is a 100% compatible implementation of the great asynchbase library that can be used as a drop-in replacement, enabling OpenTSDB to use Google Bigtable as a storage backend.

Thanks to support from the OpenTSDB team, the asyncbigtable code is hosted in the OpenTSDB GitHub repository.

Challenges

To create asyncbigtable we had to overcome two great challenges. The first one was that OpenTSDB assumes that the underlying library (until now asynchbase) performs asynchronous and non-blocking operations. On the other hand, the standard HBase API only supports synchronous and blocking calls. As a workaround for this, we used the BufferedMutator implementation, which collects all Mutation operations in a buffer and performs them in batches, allowing mutations to be submitted with very low latency.

The second challenge stemmed from the fact that the OpenTSDB project has a very limited set of jar dependencies that are explicitly defined in Makefiles. Contrary to this spartan approach, the HBase and Bigtable client libraries have a significant number of transitive dependencies. Since adding those dependencies one by one to the OpenTSDB build process would complicate its dependency management, we decided to package all asyncbigtable dependencies in an uber-jar using the Maven assembly plugin. Therefore, building OpenTSDB with asyncbigtable support is now as simple as downloading a single beefy jar.

Build steps

Before you start

Before you build OpenTSDB with Google Bigtable support, you must complete the following required steps:

  1. Create a Google Bigtable cluster (https://cloud.google.com/bigtable/docs/creating-cluster)
  2. Install HBase shell with access to the Google Bigtable cluster (https://cloud.google.com/bigtable/docs/installing-hbase-shell)
  3. Download and install the required tools for compiling OpenTSDB from source (http://opentsdb.net/docs/build/html/installation.html#compiling-from-source)
Build and run OpenTSDB
  1. Clone and build the modified source code from the Pythian github repository:

git clone -b bigtable git@github.com:pythian/opentsdb.git
cd opentsdb
sh build-bigtable.sh

  2. Create OpenTSDB tables

OpenTSDB provides a script that uses the HBase shell to create its tables. To create the tables, run the following command:
env COMPRESSION=NONE HBASE_HOME=/path/to/hbase-1.1.2 \
./src/create_table.sh

  3. Run OpenTSDB

export HBASE_CONF=/path/to/hbase-1.1.2/conf
mkdir -p <tmp_dir>
./build/tsdb tsd --port=4242 --staticroot=build/staticroot \
--cachedir=<tmp_dir>
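
Once the TSD is running, a quick way to confirm it is serving requests is to query its HTTP API (a minimal check, assuming the default port used above):

curl http://localhost:4242/api/version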

Future work

Our work on asyncbigtable by no means stops here. We are putting great effort into improving the library to achieve the high quality standards of the rest of the OpenTSDB code. Our first priority is to test the library against real-world scenarios and achieve the highest quality. In the future, we plan to benchmark the performance of OpenTSDB with Bigtable and compare it against HBase.

We are also working on building a true asynchronous implementation of the asyncbigtable library by integrating deeper with the Google Bigtable API.

Acknowledgements

We would like to thank the OpenTSDB developers (Benoît Sigoure and Chris Larsen) for their brilliant work in building such great software and for embracing the asyncbigtable library. Their insights and code contributions helped us deal with some serious issues. Also, we would like to thank the Google Cloud Bigtable team because they expressed genuine interest in this project and they were very generous in providing us with cloud infrastructure and excellent support.

Categories: DBA Blogs

Bug in Ointment: ORA-600 in Online Datafile Move

Mon, 2016-03-14 08:02

Instead of using ‘fly in the ointment’, I have used ‘bug in the ointment’ because in this prolonged Australian summer my backyard is full of bugs (to the sheer delight of my bug-loving son, while causing much anxiety among the rest of us). When your backyard is full of bugs and you get bugs in a database, it’s only natural to customize the idiom.

Oracle 12c has been warming the hearts of database aficionados in various ways with its features. One of the celebrated features is online datafile moving and renaming. Much has been written about it, and suffice it to say that we no longer need any downtime to move, rename, or copy data files. It's an online operation with zero downtime, incurring only a slight performance overhead.

I was playing with this feature on my test system with Oracle 12.1 on OEL 6, and when moving a datafile in a pluggable database I got this error:

ORA-600 [kpdbGetOperLock-incompatible] from ALTER PLUGGABLE DATABASE .. DATAFILE ALL ONLINE

Well, I tried searching for this error using the ORA-600 lookup tool, but it didn't turn up anything and simply informed me:

An Error document for ORA-600 [kpdbgetoperlock-incompatible] is not registered with the tool.

Digging further in My Oracle Support turned up the following associated bug:

Bug 19329654 – ORA-600 [kpdbGetOperLock-incompatible] from ALTER PLUGGABLE DATABASE .. DATAFILE ALL ONLINE (Doc ID 19329654.8)

The good news was that the bug was fixed in the 12.1.0.2.1 (Oct 2014) Database Patch Set Update. And it's true: after applying this PSU, everything was hunky-dory.
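
For reference, a minimal sanity check after applying a PSU is to list the binary patches and query the 12c SQL patch registry (paths assumed; output omitted):

$ $ORACLE_HOME/OPatch/opatch lspatches
$ sqlplus / as sysdba <<'EOF'
SELECT patch_id, status, action_time FROM dba_registry_sqlpatch;
EOF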

Categories: DBA Blogs

Changes to DBCA Patch Application Behaviour Causes PDB Cloning to Fail

Fri, 2016-03-11 07:23
Background

A test upgrade from 11g to 12c and conversion to a container and pluggable database recently pointed out some important 12c behavior differences with respect to the DBCA and whether or not it automatically applies PSUs installed in the Oracle Home.

The original objective was to take an existing 11.2.0.4 database and upgrade it to 12.1.0.2 and convert it to a PDB.

From a high level the procedure was:

  • Install the Oracle 12.1.0.2 software and apply the latest PSU (in this case the JAN2016 PSU).
  • Create a new CDB to house the upgraded database.
  • Upgrade the 11.2.0.4 database to 12.1.0.2 in-place using the DBUA.
  • Convert the upgraded 12c database to a PDB (via the clone through DB link method).

Seemed pretty straightforward. However, as part of the PDB conversion (running the noncdb_to_pdb.sql script), the following error was encountered:

SQL> DECLARE
  2     threads pls_integer := &&1;
  3  BEGIN
  4     utl_recomp.recomp_parallel(threads);
  5  END;
  6  /
DECLARE
*
ERROR at line 1:
ORA-04045: errors during recompilation/revalidation of SYS.DBMS_SQLPATCH
ORA-00600: internal error code, arguments: [kql_tab_diana:new dep], [0x0CF59D0B8], [0x7F1525B91DE0], [1], [2], [], [], [], [], [], [], []
ORA-06512: at "SYS.DBMS_UTILITY", line 1294
ORA-06512: at line 1

 

The noncdb_to_pdb.sql script can only be run once, so at this point the PDB conversion has failed and must be restarted. But first we must understand what went wrong or what steps we missed.

Root Cause: DBCA no longer automatically applies PSUs

It’s obvious from the ORA-04045 error that the issue is related to patching. But the question remains: what was missed in the process, since the 12c Oracle Home was fully patched before any databases were created or upgraded?

The problem is that DBAs may have become complacent about PSU application after creating databases. With Oracle Database 11g, whenever we created a database via the DBCA, the latest PSU was automatically applied. It didn't matter whether we created the database from a template or used a custom install. Regardless of which DBCA method was used, after DB creation we'd see something similar to:

SQL> select comments, action_time from dba_registry_history
  2  where bundle_series like '%PSU' order by 2;

COMMENTS                       ACTION_TIME
------------------------------ ------------------------------
PSU 11.2.0.4.160119            04-MAR-16 02.43.52.292530 PM

SQL>

 

Clearly the latest PSU (JAN2016 in this case) installed in the Oracle Home was applied automatically by the DBCA. And of course this is reflected in the official README documentation (in this example for DB PSU patch 21948347 [JAN2016] – requires a My Oracle Support login to view) which states:

There are no actions required for databases that have been upgraded or created after installation of PSU 11.2.0.4.160119.

 

However this functionality has completely changed with Oracle Database 12c! The change in behaviour is documented in My Oracle Support (MOS) Note: “12.1:DBCA (Database Creation) does not execute “datapatch” (Doc ID 2084676.1)” which states:

DBCA does not execute datapatch in Oracle 12.1.0.X. The solution is to apply the SQL changes manually after creating a new Database

 

Similarly the 12c JAN2016 DB PSU (patch 21948354) README documentation states:

You must execute the steps in Section 3.3.2, “Loading Modified SQL Files into the Database” for any new or upgraded database.

 

This is a significant change in behaviour and is the root cause of the PDB creation error!
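
In practice, this means that after every database creation or upgrade in a patched 12c home, the DBA must run datapatch manually. A minimal sketch of that post-creation step (the same invocation shown later in this post):

$ cd $ORACLE_HOME/OPatch
$ ./datapatch -verbose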

 

Resolving the “ORA-00600 [kql_tab_diana:new dep]” error

Back to the PDB creation error: the first logical place to check when experiencing plug-in or PDB creation errors is the PDB_PLUG_IN_VIOLATIONS view:

SQL> CREATE PLUGGABLE DATABASE MY_PROD FROM NON$CDB@clone_link FILE_NAME_CONVERT=('/u01/app/oracle/oradata/MY_PROD','/u01/app/oracle/oradata/CPRD1/MY_PROD');

Pluggable database created.

SQL> SELECT name, type, status, message, action FROM pdb_plug_in_violations ORDER BY 1,2;

NAME     TYPE      STATUS    MESSAGE                                  ACTION
-------- --------- --------- ---------------------------------------- ----------------------------------------
MY_PROD  ERROR     PENDING   PDB plugged in is a non-CDB, requires no Run noncdb_to_pdb.sql.
                             ncdb_to_pdb.sql be run.

MY_PROD  WARNING   PENDING   CDB parameter compatible mismatch: Previ Please check the parameter in the curren
                             ous '11.2.0.4.0' Current '12.1.0.2.0'    t CDB

MY_PROD  WARNING   PENDING   Service name or network name of service  Drop the service and recreate it with an
                             MY_PROD in the PDB is invalid or conflic  appropriate name.
                             ts with an existing service name or netw
                             ork name in the CDB.


SQL>

 

Nothing there is really concerning yet. It’s pretty much what we’d expect to see at this point. However, taking the next step in the PDB clone process encounters the error:

SQL> ALTER SESSION SET CONTAINER=MY_PROD;

Session altered.

SQL> @$ORACLE_HOME/rdbms/admin/noncdb_to_pdb.sql
...
...
SQL> DECLARE
  2     threads pls_integer := &&1;
  3  BEGIN
  4     utl_recomp.recomp_parallel(threads);
  5  END;
  6  /
DECLARE
*
ERROR at line 1:
ORA-04045: errors during recompilation/revalidation of SYS.DBMS_SQLPATCH
ORA-00600: internal error code, arguments: [kql_tab_diana:new dep],
[0x062623070], [0x7FB582065DE0], [1], [2], [], [], [], [], [], [], []
ORA-06512: at "SYS.DBMS_UTILITY", line 1294
ORA-06512: at line 1


Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

 

Now looking in the PDB_PLUG_IN_VIOLATIONS view the error is evident:

SQL> SELECT name, type, status, message, action FROM pdb_plug_in_violations ORDER BY 1,2;

NAME     TYPE      STATUS    MESSAGE                                            ACTION
-------- --------- --------- -------------------------------------------------- --------------------------------------------------
MY_PROD  ERROR     PENDING   SQL patch ID/UID 22139226/19729684 (Database PSU 1 Call datapatch to install in the PDB or the CDB
                             2.1.0.2.160119, Oracle JavaVM Component (Jan2016))
                             : Installed in the PDB but not in the CDB.

MY_PROD  ERROR     PENDING   PDB plugged in is a non-CDB, requires noncdb_to_pd Run noncdb_to_pdb.sql.
                             b.sql be run.

MY_PROD  WARNING   RESOLVED  Service name or network name of service MY_PROD in Drop the service and recreate it with an appropria
                              the PDB is invalid or conflicts with an existing  te name.
                             service name or network name in the CDB.

MY_PROD  WARNING   RESOLVED  CDB parameter compatible mismatch: Previous '11.2. Please check the parameter in the current CDB
                             0.4.0' Current '12.1.0.2.0'

MY_PROD  WARNING   PENDING   Database option OLS mismatch: PDB installed versio Fix the database option in the PDB or the CDB
                             n NULL. CDB installed version 12.1.0.2.0.

MY_PROD  WARNING   PENDING   Database option DV mismatch: PDB installed version Fix the database option in the PDB or the CDB
                              NULL. CDB installed version 12.1.0.2.0.


6 rows selected.

SQL>

 

At this point, since the PDB clone has failed and the noncdb_to_pdb.sql script cannot be run twice, the new PDB should be dropped (as sketched below). The root cause of the error must then be resolved by patching before the clone is repeated.
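
A minimal sketch of dropping the failed PDB before retrying (using the PDB name from this example):

$ sqlplus / as sysdba <<'EOF'
ALTER PLUGGABLE DATABASE MY_PROD CLOSE IMMEDIATE;
DROP PLUGGABLE DATABASE MY_PROD INCLUDING DATAFILES;
EOF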

Applying the PSU

Fortunately the fix is conceptually simple: apply the PSU patch to the database. The catch is that I had actually installed the “Combo of 12.1.0.2.160119 OJVM PSU and 12.1.0.2.160119 DB PSU (Jan 2016)” bundle patch (22191659) into the Oracle Home. This combo includes the DB PSU (patch 21948354) plus the OJVM PSU (patch 22139226). And while the DB PSU can be applied without an outage, the OJVM patch cannot. Instead, for the OJVM patch (or the combo), the CDB and the PDBs must all be restarted in UPGRADE mode.

Restarting in UPGRADE mode is fine in this case study where the CDB was just recently created to house the newly upgraded PDB. But if trying to plug the new database into an existing CDB with other applications running in production, shutting down the entire CDB to run datapatch may cause a problem.

Following the README documentation for just the JAN2016 DB PSU (patch 21948354) doesn't help. It states that the patch can be applied with the database and pluggable databases open (section “3.3.2 Loading Modified SQL Files into the Database“). However, because I had installed the combo patch into the Oracle Home, trying to patch with the database open causes the patching to fail:

$ ./datapatch -verbose
SQL Patching tool version 12.1.0.2.0 on Fri Mar  4 15:45:27 2016
Copyright (c) 2015, Oracle.  All rights reserved.

Log file for this invocation: /u01/app/oracle/cfgtoollogs/sqlpatch/sqlpatch_3260_2016_03_04_15_45_27/sqlpatch_invocation.log

Connecting to database...OK
Note:  Datapatch will only apply or rollback SQL fixes for PDBs
       that are in an open state, no patches will be applied to closed PDBs.
       Please refer to Note: Datapatch: Database 12c Post Patch SQL Automation
       (Doc ID 1585822.1)
Bootstrapping registry and package to current versions...done
Determining current state...done

Current state of SQL patches:
Patch 22139226 (Database PSU 12.1.0.2.160119, Oracle JavaVM Component (Jan2016)):
  Installed in the binary registry only
Bundle series PSU:
  ID 160119 in the binary registry and not installed in any PDB

Adding patches to installation queue and performing prereq checks...
Installation queue:
  For the following PDBs: CDB$ROOT PDB$SEED
    Nothing to roll back
    The following patches will be applied:
      22139226 (Database PSU 12.1.0.2.160119, Oracle JavaVM Component (Jan2016))
      21948354 (Database Patch Set Update : 12.1.0.2.160119 (21948354))

Error: prereq checks failed!
  patch 22139226: The pluggable databases that need to be patched must be in upgrade mode
Prereq check failed, exiting without installing any patches.

Please refer to MOS Note 1609718.1 and/or the invocation log
/u01/app/oracle/cfgtoollogs/sqlpatch/sqlpatch_3260_2016_03_04_15_45_27/sqlpatch_invocation.log
for information on how to resolve the above errors.

SQL Patching tool complete on Fri Mar  4 15:45:52 2016
$

 

The solution to this error is to start the CDB and PDBs in UPGRADE mode (as per the OJVM patch documentation) and then re-run datapatch:

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup upgrade;
ORACLE instance started.

Total System Global Area 2097152000 bytes
Fixed Size                  2926320 bytes
Variable Size             603982096 bytes
Database Buffers         1476395008 bytes
Redo Buffers               13848576 bytes
Database mounted.
Database opened.
SQL> alter pluggable database all open upgrade;

Pluggable database altered.

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

$ ./datapatch -verbose
SQL Patching tool version 12.1.0.2.0 on Fri Mar  4 15:50:59 2016
Copyright (c) 2015, Oracle.  All rights reserved.

Log file for this invocation: /u01/app/oracle/cfgtoollogs/sqlpatch/sqlpatch_5137_2016_03_04_15_50_59/sqlpatch_invocation.log

Connecting to database...OK
Note:  Datapatch will only apply or rollback SQL fixes for PDBs
       that are in an open state, no patches will be applied to closed PDBs.
       Please refer to Note: Datapatch: Database 12c Post Patch SQL Automation
       (Doc ID 1585822.1)
Bootstrapping registry and package to current versions...done
Determining current state...done

Current state of SQL patches:
Patch 22139226 (Database PSU 12.1.0.2.160119, Oracle JavaVM Component (Jan2016)):
  Installed in the binary registry only
Bundle series PSU:
  ID 160119 in the binary registry and not installed in any PDB

Adding patches to installation queue and performing prereq checks...
Installation queue:
  For the following PDBs: CDB$ROOT PDB$SEED
    Nothing to roll back
    The following patches will be applied:
      22139226 (Database PSU 12.1.0.2.160119, Oracle JavaVM Component (Jan2016))
      21948354 (Database Patch Set Update : 12.1.0.2.160119 (21948354))

Installing patches...
Patch installation complete.  Total patches installed: 8

Validating logfiles...
Patch 22139226 apply (pdb CDB$ROOT): SUCCESS
  logfile: /u01/app/oracle/cfgtoollogs/sqlpatch/22139226/19729684/22139226_apply_CPRD1_CDBROOT_2016Mar04_15_51_23.log (no errors)
Patch 21948354 apply (pdb CDB$ROOT): SUCCESS
  logfile: /u01/app/oracle/cfgtoollogs/sqlpatch/21948354/19553095/21948354_apply_CPRD1_CDBROOT_2016Mar04_15_51_24.log (no errors)
Patch 22139226 apply (pdb PDB$SEED): SUCCESS
  logfile: /u01/app/oracle/cfgtoollogs/sqlpatch/22139226/19729684/22139226_apply_CPRD1_PDBSEED_2016Mar04_15_51_28.log (no errors)
Patch 21948354 apply (pdb PDB$SEED): SUCCESS
  logfile: /u01/app/oracle/cfgtoollogs/sqlpatch/21948354/19553095/21948354_apply_CPRD1_PDBSEED_2016Mar04_15_51_29.log (no errors)
SQL Patching tool complete on Fri Mar  4 15:51:31 2016
$

 

Now, retrying the PDB cloning process:

SQL> ALTER SESSION SET CONTAINER=MY_PROD;

Session altered.

SQL> @$ORACLE_HOME/rdbms/admin/noncdb_to_pdb.sql
...
...

SQL> alter session set container = "&pdbname";

Session altered.

SQL>
SQL> -- leave the PDB in the same state it was when we started
SQL> BEGIN
  2    execute immediate '&open_sql &restricted_state';
  3  EXCEPTION
  4    WHEN OTHERS THEN
  5    BEGIN
  6      IF (sqlcode <> -900) THEN
  7        RAISE;
  8      END IF;
  9    END;
 10  END;
 11  /

PL/SQL procedure successfully completed.

SQL>
SQL>
SQL> WHENEVER SQLERROR CONTINUE;
SQL> ALTER PLUGGABLE DATABASE MY_PROD OPEN;

Warning: PDB altered with errors.

SQL> connect / as sysdba
Connected.
SQL> SELECT name, type, status, message, action FROM pdb_plug_in_violations ORDER BY 1,2;

NAME     TYPE      STATUS    MESSAGE                                            ACTION
-------- --------- --------- -------------------------------------------------- --------------------------------------------------
MY_PROD  ERROR     RESOLVED  PDB plugged in is a non-CDB, requires noncdb_to_pd Run noncdb_to_pdb.sql.
                             b.sql be run.

MY_PROD  ERROR     PENDING   PSU bundle patch 160119 (Database Patch Set Update Call datapatch to install in the PDB or the CDB
                              : 12.1.0.2.160119 (21948354)): Installed in the C
                             DB but not in the PDB.

MY_PROD  WARNING   RESOLVED  Service name or network name of service MY_PROD in Drop the service and recreate it with an appropria
                              the PDB is invalid or conflicts with an existing  te name.
                             service name or network name in the CDB.

MY_PROD  WARNING   PENDING   Database option OLS mismatch: PDB installed versio Fix the database option in the PDB or the CDB
                             n NULL. CDB installed version 12.1.0.2.0.

MY_PROD  WARNING   PENDING   Database option DV mismatch: PDB installed version Fix the database option in the PDB or the CDB
                              NULL. CDB installed version 12.1.0.2.0.

MY_PROD  WARNING   RESOLVED  CDB parameter compatible mismatch: Previous '11.2. Please check the parameter in the current CDB
                             0.4.0' Current '12.1.0.2.0'


6 rows selected.

SQL>

 

Note that the first time, the error was related to the OJVM PSU patch and stated that the patch was installed in the PDB but not in the CDB. Now, after patching the CDB, the error message states that the DB PSU patch is installed in the CDB but not in the PDB.

Again the solution is to run datapatch one more time. Fortunately, since we're only patching a PDB, we no longer need to worry about starting the CDB and PDBs in UPGRADE mode to apply the OJVM patch; the OJVM patch does not apply to the PDBs. Hence we can patch successfully with both the CDB and PDBs open:

$ ./datapatch -verbose
SQL Patching tool version 12.1.0.2.0 on Fri Mar  4 16:19:06 2016
Copyright (c) 2015, Oracle.  All rights reserved.

Log file for this invocation: /u01/app/oracle/cfgtoollogs/sqlpatch/sqlpatch_9245_2016_03_04_16_19_06/sqlpatch_invocation.log

Connecting to database...OK
Note:  Datapatch will only apply or rollback SQL fixes for PDBs
       that are in an open state, no patches will be applied to closed PDBs.
       Please refer to Note: Datapatch: Database 12c Post Patch SQL Automation
       (Doc ID 1585822.1)
Bootstrapping registry and package to current versions...done
Determining current state...done

Current state of SQL patches:
Patch 22139226 (Database PSU 12.1.0.2.160119, Oracle JavaVM Component (Jan2016)):
  Installed in binary and CDB$ROOT PDB$SEED MY_PROD
Bundle series PSU:
  ID 160119 in the binary registry and ID 160119 in PDB CDB$ROOT, ID 160119 in PDB PDB$SEED

Adding patches to installation queue and performing prereq checks...
Installation queue:
  For the following PDBs: CDB$ROOT PDB$SEED
    Nothing to roll back
    Nothing to apply
  For the following PDBs: MY_PROD
    Nothing to roll back
    The following patches will be applied:
      21948354 (Database Patch Set Update : 12.1.0.2.160119 (21948354))

Installing patches...
Patch installation complete.  Total patches installed: 1

Validating logfiles...
Patch 21948354 apply (pdb MY_PROD): SUCCESS
  logfile: /u01/app/oracle/cfgtoollogs/sqlpatch/21948354/19553095/21948354_apply_CPRD1_MY_PROD_2016Mar04_16_19_31.log (no errors)
SQL Patching tool complete on Fri Mar  4 16:19:32 2016
$

 

And finally the cloned PDB can be opened successfully:

SQL> ALTER PLUGGABLE DATABASE MY_PROD CLOSE;

Pluggable database altered.

SQL> ALTER PLUGGABLE DATABASE MY_PROD OPEN;

Pluggable database altered.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 MY_PROD                        READ WRITE NO

SQL> SELECT name, type, status, message, action FROM pdb_plug_in_violations ORDER BY 1,2;

NAME     TYPE      STATUS    MESSAGE                                            ACTION
-------- --------- --------- -------------------------------------------------- --------------------------------------------------
MY_PROD  ERROR     RESOLVED  PDB plugged in is a non-CDB, requires noncdb_to_pd Run noncdb_to_pdb.sql.
                             b.sql be run.

MY_PROD  ERROR     RESOLVED  PSU bundle patch 160119 (Database Patch Set Update Call datapatch to install in the PDB or the CDB
                              : 12.1.0.2.160119 (21948354)): Installed in the C
                             DB but not in the PDB.

MY_PROD  WARNING   RESOLVED  Service name or network name of service MY_PROD in Drop the service and recreate it with an appropria
                              the PDB is invalid or conflicts with an existing  te name.
                             service name or network name in the CDB.

MY_PROD  WARNING   PENDING   Database option OLS mismatch: PDB installed versio Fix the database option in the PDB or the CDB
                             n NULL. CDB installed version 12.1.0.2.0.

MY_PROD  WARNING   PENDING   Database option DV mismatch: PDB installed version Fix the database option in the PDB or the CDB
                              NULL. CDB installed version 12.1.0.2.0.

MY_PROD  WARNING   RESOLVED  CDB parameter compatible mismatch: Previous '11.2. Please check the parameter in the current CDB
                             0.4.0' Current '12.1.0.2.0'


6 rows selected.

SQL>

The warnings marked as “PENDING” can be safely ignored.

Conclusion

What started out as an issue when cloning a non-CDB into a PDB led to some learning about patching with Oracle Database 12c.

The most important take-away is that Oracle Database 12c introduces a change in behaviour when it comes to patch application through the DBCA. This change is well documented in both the patch and MOS documents, so a DBA who reads through the documentation thoroughly won't have a problem. However, a DBA who is used to doing things the “old way” and only skims the documentation may unexpectedly get caught by errors such as the ORA-00600 encountered when creating a PDB through cloning.

References

My Oracle Support (MOS) Documents:

  • 12.1: DBCA (Database Creation) does not execute “datapatch” (Doc ID 2084676.1)
  • How to Convert Non PDB to PDB Database in 12c – Testcase (Doc ID 2012448.1)
  • How to migrate an existing pre12c database(nonCDB) to 12c CDB database ? (Doc ID 1564657.1)
  • Complete Checklist for Upgrading to Oracle Database 12c Release 1 using DBUA (Doc ID 1516557.1)

Pythian Blogs:

 

Categories: DBA Blogs

Centralized authentication with Red Hat Directory Server for Linux systems

Thu, 2016-03-10 15:14

User management on Linux systems can be tedious, and once you have more than 10 systems, managing user accounts on each of them individually starts to take a significant amount of time.

There are various tools available to overcome this, and all of these use LDAP in some way.

The same goes for Red Hat Directory Server, an LDAP-based directory server from Red Hat that provides centralized user management. Though I primarily demonstrate integration of Red Hat Directory Server with Linux systems, it can be used with any system that supports LDAP authentication.

You can find the official Red Hat Directory Server installation guide here.

For our test scenario I used two RHEL 5 servers: server101, which is the Red Hat Directory Server, and server201, which is the client.

For RHEL-based systems you need to make sure that you are subscribed to the RHDS repo before installing Red Hat Directory Server. If you are using CentOS or other derivatives, you can use 389 Directory Server, which is the upstream project for Red Hat Directory Server (an example install of the upstream packages is sketched below).
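
On CentOS, for example, the upstream packages are typically available from EPEL; a minimal sketch (package names may vary by release):

yum install 389-ds-base 389-admin -y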

Once you have the prerequisites ready, you can start the installation.

Installing Red Hat Directory Server

I have designated server101 as Red Hat Directory Server.

Below is the command to install the packages required for Red Hat Directory Server.

[root@server101 ~]#yum install redhat-ds -y


 

Once the installation is complete we can move to configuring Red Hat Directory Server.

 

Configuring Red Hat Directory Server

 

[root@server101 ~]# setup-ds-admin.pl

Once you run this command, the setup script will prompt you for inputs, which are mostly straightforward.

But there are a few things that need to be taken care of before we proceed with the configuration.

We want to run the LDAP service as the ldap user, so create the ldap user and group if they don't already exist (as sketched below).
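
A minimal sketch of creating the account (exact flags are an assumption; adjust to your own standards):

groupadd ldap
useradd -g ldap -s /sbin/nologin ldap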

Then open the ports below on your firewall/iptables so that the directory server can work properly (an example iptables sketch follows the list).

  • 389 for LDAP service
  • 636 for secure LDAP (LDAPS) service
  • 9830 for directory server admin console connectivity
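
A minimal iptables sketch for the ports above (rule placement and persistence are assumptions; adapt to your firewall policy):

iptables -I INPUT -p tcp --dport 389 -j ACCEPT
iptables -I INPUT -p tcp --dport 636 -j ACCEPT
iptables -I INPUT -p tcp --dport 9830 -j ACCEPT
service iptables save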

You should also increase the number of file descriptors, as this helps Red Hat Directory Server access files more efficiently and supports more simultaneous connections. Editing the maximum number of file descriptors the kernel can allocate can also improve file access speeds. The steps are listed below, followed by a short command sketch.

  • First, check the current limit for file descriptors in /proc/sys/fs/file-max
  • If the setting is lower than 64000, edit the /etc/sysctl.conf file, and reset the fs.file-max parameter and set it to 64000 or up.
  • Then increase the maximum number of open files on the system by editing the /etc/security/limits.conf configuration file. Add the following entry
    *        -        nofile        8192
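
A minimal sketch of those tuning steps (values taken from the list above):

cat /proc/sys/fs/file-max
echo "fs.file-max = 64000" >> /etc/sysctl.conf
sysctl -p
echo "*        -        nofile        8192" >> /etc/security/limits.conf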

 

After this, we can start configuring Red Hat Directory Server with the setup-ds-admin.pl command.

Once it is executed, it will prompt for inputs which are mostly self-explanatory, as shown below. We will mostly accept the default values, as this is a fresh installation. We will only change the system user and group that will run the LDAP service from nobody to the ldap user we created earlier. Don't forget to make a note of the passwords for admin and Directory Manager, as they will be used to log in to the Admin Console.

 

[root@server101 ~]# setup-ds-admin.pl -k

==============================================================================
This program will set up the Red Hat Directory and Administration Servers.

It is recommended that you have "root" privilege to set up the software.
Tips for using this program:
- Press "Enter" to choose the default and go to the next screen
- Type "Control-B" then "Enter" to go back to the previous screen
- Type "Control-C" to cancel the setup program

Would you like to continue with set up? [yes]: yes

==============================================================================
BY SETTING UP AND USING THIS SOFTWARE YOU ARE CONSENTING TO BE BOUND BY
AND ARE BECOMING A PARTY TO THE AGREEMENT FOUND IN THE
LICENSE.TXT FILE. IF YOU DO NOT AGREE TO ALL OF THE TERMS
OF THIS AGREEMENT, PLEASE DO NOT SET UP OR USE THIS SOFTWARE.

Do you agree to the license terms? [no]: yes

==============================================================================
Your system has been scanned for potential problems, missing patches,
etc. The following output is a report of the items found that need to
be addressed before running this software in a production
environment.

Red Hat Directory Server system tuning analysis version 10-AUGUST-2007.

NOTICE : System is i686-unknown-linux2.6.18-308.el5 (1 processor).

WARNING: 502MB of physical memory is available on the system. 1024MB is recommended for best performance on large production system.

NOTICE: The net.ipv4.tcp_keepalive_time is set to 7200000 milliseconds
(120 minutes). This may cause temporary server congestion from lost
client connections.

WARNING: There are only 1024 file descriptors (hard limit) available, which
limit the number of simultaneous connections.

WARNING: There are only 1024 file descriptors (soft limit) available, which
limit the number of simultaneous connections.

Would you like to continue? [no]: yes

==============================================================================
Choose a setup type:

1. Express
Allows you to quickly set up the servers using the most
common options and pre-defined defaults. Useful for quick
evaluation of the products.

2. Typical
Allows you to specify common defaults and options.

3. Custom
Allows you to specify more advanced options. This is
recommended for experienced server administrators only.

To accept the default shown in brackets, press the Enter key.

Choose a setup type [2]:

==============================================================================
Enter the fully qualified domain name of the computer
on which you’re setting up server software. Using the form
<hostname>.<domainname>
Example: eros.example.com.

To accept the default shown in brackets, press the Enter key.

Computer name [server101.suratlug.org]: server101.example.com

==============================================================================
The servers must run as a specific user in a specific group.
It is strongly recommended that this user should have no privileges
on the computer (i.e. a non-root user). The setup procedure
will give this user/group some permissions in specific paths/files
to perform server-specific operations.

If you have not yet created a user and group for the servers,
create this user and group using your native operating
system utilities.

System User [nobody]: ldap
System Group [nobody]: ldap

==============================================================================
Server information is stored in the configuration directory server.
This information is used by the console and administration server to
configure and manage your servers.  If you have already set up a
configuration directory server, you should register any servers you
set up or create with the configuration server. To do so, the
following information about the configuration server is required: the
fully qualified host name of the form
<hostname>.<domainname>(e.g. hostname.example.com), the port number
(default 389), the suffix, the DN and password of a user having
permission to write the configuration information, usually the
configuration directory administrator, and if you are using security
(TLS/SSL). If you are using TLS/SSL, specify the TLS/SSL (LDAPS) port
number (default 636) instead of the regular LDAP port number, and
provide the CA certificate (in PEM/ASCII format).

If you do not yet have a configuration directory server, enter ‘No’ to
be prompted to set up one.

Do you want to register this software with an existing
configuration directory server? [no]:

==============================================================================
Please enter the administrator ID for the configuration directory
server. This is the ID typically used to log in to the console.  You
will also be prompted for the password.

Configuration directory server
administrator ID [admin]:
Password:
Password (confirm):

==============================================================================
The information stored in the configuration directory server can be
separated into different Administration Domains. If you are managing
multiple software releases at the same time, or managing information
about multiple domains, you may use the Administration Domain to keep
them separate.

If you are not using administrative domains, press Enter to select the
default. Otherwise, enter some descriptive, unique name for the
administration domain, such as the name of the organization
responsible for managing the domain.

Administration Domain [example.com]:

==============================================================================
The standard directory server network port number is 389. However, if
you are not logged as the superuser, or port 389 is in use, the
default value will be a random unused port number greater than 1024.
If you want to use port 389, make sure that you are logged in as the
superuser, that port 389 is not in use.

Directory server network port [389]:

==============================================================================
Each instance of a directory server requires a unique identifier.
This identifier is used to name the various
instance specific files and directories in the file system,
as well as for other uses as a server instance identifier.

Directory server identifier [server101]:

==============================================================================
The suffix is the root of your directory tree.  The suffix must be a valid DN.
It is recommended that you use the dc=domaincomponent suffix convention.
For example, if your domain is example.com,
you should use dc=example,dc=com for your suffix.
Setup will create this initial suffix for you,
but you may have more than one suffix.
Use the directory server utilities to create additional suffixes.

Suffix [dc=example, dc=com]:

==============================================================================
Certain directory server operations require an administrative user.
This user is referred to as the Directory Manager and typically has a
bind Distinguished Name (DN) of cn=Directory Manager.
You will also be prompted for the password for this user. The password must
be at least 8 characters long, and contain no spaces.
Press Control-B or type the word “back”, then Enter to back up and start over.

Directory Manager DN [cn=Directory Manager]:
Password:
Password (confirm):

==============================================================================
The Administration Server is separate from any of your web or application
servers since it listens to a different port and access to it is
restricted.

Pick a port number between 1024 and 65535 to run your Administration
Server on. You should NOT use a port number which you plan to
run a web or application server on, rather, select a number which you
will remember and which will not be used for anything else.

Administration port [9830]:

==============================================================================
The interactive phase is complete.  The script will now set up your
servers.  Enter No or go Back if you want to change something.

Are you ready to set up your servers? [yes]:
Creating directory server . . .
Your new DS instance ‘server101’ was successfully created.
Creating the configuration directory server . . .
Beginning Admin Server creation . . .
Creating Admin Server files and directories . . .
Updating adm.conf . . .
Updating admpw . . .
Registering admin server with the configuration directory server . . .
Updating adm.conf with information from configuration directory server . . .
Updating the configuration for the httpd engine . . .
Starting admin server . . .
The admin server was successfully started.
Admin server was successfully created, configured, and started.
Exiting . . .
Log file is ‘/tmp/setupZa3jGe.log’

[root@server101 ~]#


 

Now that we have installed and configured Red Hat Directory Server, note that it is not set to start automatically at system boot.

So we need to set both the directory service (dirsrv) and the directory console admin service (dirsrv-admin) to start at boot.

[root@server101 ~]# chkconfig dirsrv-admin --list
dirsrv-admin   0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@server101 ~]# chkconfig dirsrv --list
dirsrv         0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@server101 ~]# chkconfig dirsrv on
[root@server101 ~]# chkconfig dirsrv-admin on
[root@server101 ~]#

Now that we have our server ready, we need to add a user to it.

We will use the Directory Server admin console GUI to create an LDAP user.

We can invoke the directory server admin console GUI with the redhat-idm-console command, which opens a login window.


The user ID is the Directory Manager created during the directory server setup, which by default is cn=Directory Manager. Enter its password, and set the Administration URL to http://server101:9830.


Once you log in, you will be presented with the console screen.


Now click on the Users and Groups tab, click the Create button, and select User from the menu.


Now select the organizational unit; we will use the default and select People from the list.


It will open the Create User dialog.


Now we will create the ldapuser account. Fill in the required details. Also select the Posix User tab, as we need the account for Unix system logins, and fill in the required Posix account details as well.


Now that we have created the user account, we can start configuring the client.

 

Configuring Linux client for LDAP login

I have created server201 which we will configure for LDAP login.

For that we need to execute authconfig-tui from console.

It will open a terminal UI for configuring authentication to use LDAP.

[root@server201 pam.d]# authconfig-tui


Select Use LDAP for user information.


Select Use LDAP Authentication.


After this, we need to make sure that when a user logs in to the server with LDAP authentication, the home directory is created automatically; this is not enabled by default.

We can do this by executing the below command at the console.

[root@server201 pam.d]# authconfig --enablemkhomedir --update


Once this is done, you can use your LDAP user to log in to the client server.

Now that we have LDAP set up, we can use it to centralize logins for all Linux systems in the environment.

User management is now easy from a single location.

We can also configure TLS and do replication for redundancy.

We can define schema and policies as well but that is for another time.

 

Categories: DBA Blogs

Log Buffer #464: A Carnival of the Vanities for DBAs

Thu, 2016-03-10 13:42

This Log Buffer Edition delves deep into the realms of Oracle, SQL Server and MySQL while gathering up some nifty blog posts for this week.


Oracle:

Speed, Security, and Best Practices in the Cloud: Oracle Releases Market-Leading Retail Demand Forecasting Solution

OBIEE 12c – Your Answers After Upgrading

Using the SQL ACCESS Advisor PL/SQL interface

How has JD Edwards EnterpriseOne 9.2 Transformed your Business?

In the article you will have a look at an example of configuring Fast Start Failover (FSFO).

SQL Server:

How to show Quarters Dynamically in SQL

Azure SQL Data Warehouse is a fully-managed and scalable cloud service. It is still in preview, but solid.

The occasional problems that you can get with POST and GET are typical of the difficulties of separating any command and query operations.

4 Convenient Ways To Run PowerShell Scripts

10 New Features Worth Exploring in SQL Server 2016

MySQL:

Maintaining mission critical databases on our pitchfork wielding brother, the “Daemon” of FreeBSD, seems quite daunting, or even absurd, from the perspective of a die-hard Linux expert, or from someone who has not touched it in a long time.

Planets9s: Sign up for our best practices webinar on how to upgrade to MySQL 5.7

Using jemalloc heap profiling with MySQL

Sometimes a Variety of Databases is THE Database You Need

Taking the new MySQL 5.7 JSON features for a test drive

Categories: DBA Blogs

Debugging Kibana using Chrome developer tools

Tue, 2016-03-08 17:53

Amazon Elasticsearch Service is a managed service for running Elasticsearch in AWS. The underlying instances are managed by AWS, and interaction with the service is available through an API and the AWS GUI.

Kibana is also integrated with Amazon Elasticsearch Service. We came across an issue that caused Kibana 4 to show the following error message when searching for *.

Courier Fetch: 10 of 60 shards failed.

The error is not very descriptive.

As Amazon Elasticsearch Service exposes only an endpoint and we do not have direct access to the underlying instances, we are limited to a few API tools.

We decided to see what could be found from the Chrome browser.

The Chrome Developer Tools (DevTools) contain lots of useful debugging features.

DevTools can be started using several methods.

1. Right click and click Inspect.
2. From Menu -> More Tools -> Developer Tools
3. Press F12

The Network tab in DevTools can be used to debug a wide variety of issues. It records every request made while a web page is loading and captures a wide range of information about each request, such as the HTTP method, the status, and the time taken to complete the request.

By clicking on any of the requested resources, we can get more information about that request.

In this case, the interesting bit was under the Preview tab. The Preview tab captures the data Chrome got back from the search and stores it as objects.

For comparison, a successful query on the Kibana 3 instance of the public website logstash.openstack.org returns response objects with no shard failures.


We checked "_msearch?timeout=3000.." and received the following error messages under the nested values (for example "responses" -> "0" -> "_shards" -> "failures" -> "0"):

{index: "logstash-2016.02.24", shard: 1, status: 500,…} index: "logstash-2016.02.24" reason: "RemoteTransportException[[Leech][inet[/10.212.25.251:9300]][indices:data/read/search[phase/query]]]; nested: ElasticsearchException[org.elasticsearch.common.breaker.CircuitBreakingException: [FIELDDATA] Data too large, data for [@timestamp] would be larger than limit of [5143501209/4.7gb]]; nested: UncheckedExecutionException[org.elasticsearch.common.breaker.CircuitBreakingException: [FIELDDATA] Data too large, data for [@timestamp] would be larger than limit of [5143501209/4.7gb]]; nested: CircuitBreakingException[[FIELDDATA] Data too large, data for [@timestamp] would be larger than limit of [5143501209/4.7gb]]; " shard: 1 status: 500

So the issue is clear: fielddata usage is above the limit.

As per Amazon documentation,

Field Data Breaker –
Percentage of JVM heap memory allowed to load a single data field into memory. The default value is 60%. We recommend raising this limit if you are uploading data with large fields.
indices.breaker.fielddata.limit
For more information, see Field data in the Elasticsearch documentation.

The following URL documents the supported Amazon Elasticsearch operations.

http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-gsg-supported-operations.html

On checking the current heap usage (heap.percent column) of the data nodes, we can see that heap usage is very high:

$ curl -XGET "http://elasticsearch.abc.com/_cat/nodes?v"
host ip heap.percent ram.percent load node.role master name
x.x.x.x   10   85   0.00   –   m   Drax the Destroyer
x.x.x.x   7   85   0.00   –   *   H.E.R.B.I.E.
x.x.x.x   78   64   1.08   d   –   Black Cat
x.x.x.x   80   62   1.41   d   – Leech
x.x.x.x   7   85   0.00   –   m   Alex
x.x.x.x   78   63   0.27   d   –   Saint Anna
x.x.x.x   80   63   0.28   d   –   Martinex
x.x.x.x   78   63   0.59   d   –   Scorpio

The following command can be used to increase the indices.breaker.fielddata.limit value as a workaround.

$ curl -XPUT elasticsearch.abc.com/_cluster/settings -d '{ "persistent" : { "indices.breaker.fielddata.limit" : "89%" } }'

Running the command allowed the Kibana search to run without issues and show the data.

The real solution would be to increase the number of nodes or reduce the amount of fielddata that needs to be loaded, for example by limiting the number of indices.

AWS Lambda can be used to run an index cleanup script as a scheduled event; a minimal sketch of the core of such a cleanup follows.
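
The cleanup itself can be as simple as deleting indices older than the retention window via the standard index delete API (the endpoint and index name below are hypothetical, following the examples above):

$ curl -XDELETE "http://elasticsearch.abc.com/logstash-2016.01.24"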

Categories: DBA Blogs

Can The Public Cloud Meet the Needs of Your Enterprise Applications?

Tue, 2016-03-08 12:19

 

Any applications your company runs on premise can also be run in the public cloud. But does that mean they should be?

While the cloud offers well-documented benefits of flexibility, scalability, and cost efficiency, some applications — and especially business-critical enterprise applications — have specific characteristics that can make them tricky to move into a public cloud environment.

That’s not to say you shouldn’t consider the cloud as an option, but you should be aware of the following enterprise application needs before you make any migration decisions:

1. Highly customized infrastructure

Enterprise applications often rely on software components that are uniquely configured: they may need very specific storage layouts and security settings or tight integration with certain third-party tools. That makes it hard to replace them with generic platform-as-a-service (PaaS) alternatives in the cloud.
The same is true on the infrastructure side: application software components often need particular network configurations and controls that aren’t available from a typical infrastructure-as-a-service (IaaS) offering. (An example would be the way Oracle Real Application Clusters have to allow the cluster software to manipulate network settings, such as controlling IP addresses and network interfaces.)

2. Tightly coupled components

Today’s cloud application architectures are based on “microservices” — collections of services that perform specific tasks. When combined, these answer the whole of the application requirements. With enterprise applications, there are so many interdependencies between the various software components that it can be extremely difficult to change, upgrade, move, or scale an individual component without having a huge impact on the rest of the system.

3. Siloed IT departments

Enterprise applications are usually supported by siloed enterprise IT operations — DBAs, system administrators, storage administrators, network administrators and the like — each with their own responsibilities. Cloud deployment, on the other hand, requires much greater focus on collaboration across the IT environment. This means breaking down traditional silos to create full-stack teams with vertical application ownership. Some teams are likely to resist this change as they could end up with significantly less work and responsibility once the management of application components has shifted to the cloud vendor. So migrating to the cloud isn’t just a technical decision; it has people-process implications, too.

4. Costly infrastructure upgrades

Every company knows upgrading enterprise applications is a major undertaking and can often cause downtime and outages. This is true when the application stays inside your own data center — and doubly so when it moves to a cloud provider due to how long it takes to move massive amounts of data through the Internet and risks associated with unknown issues on the new virtual platform. For these reasons, significant financial commitment is often required to build and maintain an IT team with the right skills to do upgrades quickly and effectively as well as maintain the system.

5. Inflexible licensing models

The components used in enterprise applications are often proprietary products with licensing models that are not compatible with the elasticity of the cloud. For example, many Oracle licenses are for legacy applications and can used only on particular systems. Transferring those licenses to a cloud-based infrastructure is not an easy task.

In addition, perpetual software licenses are often not portable to the typical pay-as-you-go model used by most cloud providers. Plus, most software vendors don’t have any incentive to transition their customers from locked-in perpetual licenses with a steady maintenance revenue stream to a model that allows them to switch to a competitive product at any time.

Even though the nature of enterprise applications makes them difficult to migrate to the cloud, the benefits of doing so — in costs savings, availability, and business agility — still make it a very compelling proposition. In my next blog, I’ll take a look at some of the paths available to you should you decide to move your enterprise applications to the public cloud.

For more on this topic, check out our white paper on Choosing the Right Public Cloud Platform For Your Enterprise Applications Built on Oracle Database.


 

Categories: DBA Blogs

Sources of Inspiration on International Women’s Day

Tue, 2016-03-08 08:58

True inspiration comes in many forms and I consider myself fortunate to be inspired every day by the women around me. International Women’s Day is the perfect opportunity to reflect on the women in our lives who positively influence us.

This post is my heartfelt thank you to the women in my ‘circle’ who have made an indelible mark in my life. They are women who continue to inspire, challenge and motivate me.

The women on Pythian HR’s team: These women continually teach me valuable life lessons. They are mothers, partners, sisters, care providers, aunts, cousins, and friends to many. They are strong, spirited, supportive and have generous natures that are contagious. They demonstrate an unwavering commitment to working hard, they’re incredibly talented and they have a steady focus on doing what’s best for our employees. These women go above and beyond and approach every puzzle with optimism.

My mother:  My mother is the most positive and ‘glass half full’ person that I know. She is a person who never fails to find the bright side to life’s most thought-provoking issues and one of her favourite questions to ask her loved ones is “Are You Happy?” (Spoiler alert: she’s not satisfied unless the answer is a truthful “yes”). Her love, guidance and support have helped sustain me through so much and over the years she has evolved into my BFF.

My friend, Jen:  Jen is a breast cancer survivor who decided to fight back and co-found Vixens Victorious. In October 2015, the dynamic duo of Vixens Victorious successfully launched Lights! Camera! CURE! which showcases female film makers from Canada and the proceeds go to support the Ottawa Regional Cancer Society. Jen’s positive spirit and take charge attitude empowers everyone who meets her.

My friend, Kate:  Kate moved to Canada with her three month old daughter to start a new journey with her husband. She took the initiative to make new friends, develop a network and often navigate a new city on her own when her partner travelled for work. Kate isn’t one to complain about adversities in life; she is courageous and gratefully embraces her adventure.

My fitness trainer, Jules:  Jules gets out of bed every morning, puts on her workout gear and travels across Ottawa to provide the most fun and effective workouts to her clients. She generously shares her own personal health journey and always finds a way to connect with her clients so they can experience the one on one attention they need. She is full of greatness.

Our family physician, Dr. Judy:  Dr. Judy’s medical practice is thriving because of her commitment to patient care. She ensures you are her priority in the moments that you are with her. She makes each of her patients feel important, cared for and heard. Dr. Judy exudes a kind and caring nature that everyone could benefit from.

My neighbor, Anne Marie:  In her late forties, Anne Marie taught herself to swim so she could begin competing in triathlons. She now travels internationally to compete in races. I’m inspired by her hard work, determination and strategic ability to set and meet goals.

The influences (sometimes subtle) of these women make an impact on how I choose to live my life. I am thankful for all of them.

On this International Women’s Day, I encourage you to think about who inspires you and why. Bonus points if you honour them with words of appreciation!

Categories: DBA Blogs

SQL Server on Linux – No April fools

Mon, 2016-03-07 20:58

Years ago, I got burned by an “April Fools” joke published by Steve Jones on sqlservercentral.com. He republished it as one of his favorites here.

Naturally, I had to rub my eyes today when I read that Microsoft announced that SQL Server 2016 would be coming to Linux.

There were mixed reactions among our internal SQL Server teams. I was afraid to respond to the thread, fearing I would get burned again. I quickly checked the date to confirm that the article hadn’t been resurrected.

One of the sentiments expressed in our internal chatter was that some of the DBAs love “Satya’s Microsoft” and I agree. I like what they’re doing, but I am very skeptical about the move to port SQL Server onto Linux.

I doubt this will enable new markets and new customer bases. I don’t think there are any large organizations who will suddenly decide to adopt the product because it will run on Linux.

One theory was that this move was to attract new developers who want to deploy multi-platform tech stacks. That could certainly be right, but I think PaaS already satisfies that need for most startups.

Other articles I read theorized it was a move towards SQL Server on Linux-powered containers.

I’m wondering what this will mean for future features. Will PowerShell or .NET be ported to Linux? What will change in the security model? Will clustering be available? Will a more RAC-like feature be available?

These are very interesting times and while this wasn’t a move that I was pining for, or even expected, I am excited to see where this is going.

I “applied” to test one of the early versions, and you can too.

What do you think? Are you excited about running SQL Server on Linux? When would you choose Linux over Windows?

Categories: DBA Blogs

Pythian at the 2016 Percona Live Data Performance Conference

Mon, 2016-03-07 10:22

The Percona Live Data Performance Conference in Santa Clara is being held April 18-22, 2016. It is quickly approaching, and Pythian is going to show you how we Love Your Data in a big way!

We have an awesome lineup of speakers this year:

Mark these down in your schedule because you are not going to want to miss any of them! Although, you might have a tough time choosing between the tutorials if you can’t clone yourself.

Also, please join us for the 2016 Annual Community Dinner @ Pedro’s. You can register now through EventBrite.

Categories: DBA Blogs

RSA Conference 2016

Mon, 2016-03-07 08:41

RSAConference 2016: Where the world talks security

40,000 attendees, 500+ vendors and 700 sessions

RSAC is my annual check-in to learn new approaches to information security, discover new technology, learn from industry experts and build my community.

In the three years that I have been attending RSAC, I have learned that Pythian is unique and so are our clients. Each year, we continue to improve our security program with our clients in mind.

RSAC Day 1

It’s Day 1 of the RSAConference 2016. Mondays are typically a quiet day, with vendors setting up in the expo halls, conference staff getting organized, attendees registering, and a few press/analysts looking for optimal interview spots. It is the calm before the storm of attendees descends on San Francisco and RSAC.

This Monday was a whirlwind of activity: CSA Summit, DevOps Connect, Information Security Leadership Development and IAPP: Privacy and Security, to name only a few. Chances are you missed some sessions if you weren’t early enough.

Privacy and Security were hot topics given the European General Data Protection Regulation (GDPR) agreement reached December 2015.

For more information on the event visit the RSAC website, or follow the conference on Twitter.

Categories: DBA Blogs

2016 Annual Pythian MySQL Community Dinner

Thu, 2016-03-03 10:26

Once again, Pythian is organizing an event that by now may be considered a tradition: The MySQL community dinner at Pedro’s! This dinner is open to all MySQL community members since many of you will be in town for Percona Live that week. Here are the details:

What: The MySQL Community Dinner

When: Tuesday April 19, 2016 –  7:00 PM at Pedro’s (You are welcome to show up later, too!)

Where: Pedro’s Restaurant and Cantina – 3935 Freedom Circle, Santa Clara, CA 95054

Cost: Tickets are $40 USD and include a Mexican buffet, non-alcoholic drinks, taxes, and gratuities (see menu)

How: Purchase your ticket below or RSVP through Eventbrite

Pythian Attendees:

Derek Downey
Alkin Tezuysal
Okan Buyukyilmaz
Emanuel Calvo
John Schulz
Martin Arrieta
Gabriel Cicilliani
Christos Soulios
Theresa Nova

Categories: DBA Blogs

MySQL on FreeBSD: old genes

Thu, 2016-03-03 10:02

Maintaining mission-critical databases on our pitchfork-wielding brother, the “Daemon” of FreeBSD, seems quite daunting, or even absurd, from the perspective of a die-hard Linux expert, or from someone who has not touched it in a long time. The question we ask when we see FreeBSD these days is “why?”.  Most of my own experience with FreeBSD was obtained 10-15 years ago.  Back then, in the view of the team I was working on, a custom compiled-from-source operating system like FreeBSD 5.x or 6.x was superior to a Linux binary release.

Package managers like YUM and APT were not as good.  They did not always perform MD5 checks and use SSL like today’s versions. RedHat wasn’t releasing security updates 5 minutes after a vulnerability was discovered. Ubuntu didn’t exist. Debian stable would get so very old before receiving a new version upgrade. FreeBSD was a great choice for a maintainable, secure, free open source UNIX-like OS with tight source control and frequent updates.

Most people do not understand why FreeBSD remains a great choice for security and stability. The main reason is that the entire source of the base OS and the kernel (not just the kernel) is tightly maintained and tested as a whole, monolithic distribution.

FreeBSD 10.2 is different than versions I worked on many years ago, in a good way, at least from the standpoint of getting started. First, “pkg” has gotten quite an overhaul, making installing packages on FreeBSD as easy as with YUM or APT.  portsnap and portmaster make port upgrades much easier than they used to be. freebsd-update can take care of wholesale updates of the operating system from trusted binary sources without having to “build the world”. These are welcome changes; ones that make it easier to get to production with FreeBSD, and certainly made the task of rapidly building and updating a couple of “lab” virtual machines easier.

In my effort to get re-acquainted with FreeBSD, I hit some snags. However, once I was finished with this exercise, FreeBSD had re-established itself in my mind as a decent flavor to host a mission critical database on. Open Source enthusiasts should consider embracing it without (much) hesitation. Is there some unfamiliar territory for those who only use MySQL on MacOS and Linux? Sure. But it is important to remember that BSD is one of the oldest UNIX like operating systems. The OSS world owes much heritage to it. It is quite stable and boring, perhaps even comfortable in its own way.

Problem 1: forcing older versions of MySQL

I needed to install MySQL 5.5 first, in order to test a mysql upgrade on FreeBSD.  However, when installing percona-toolkit either via “pkg install” (binary) or /usr/ports (source), the later 5.6 version of the mysql client would inevitably be installed as a dependency. After that point, anything relating to MySQL 5.5 would conflict with the 5.6 client. If I installed in the opposite order, server first, percona-toolkit second, the percona-toolkit installation would ask me if it is OK to go ahead and upgrade both server and client to 5.6.

TIP: don’t forget make.conf

my /etc/make.conf:
MYSQL_DEFAULT?= 5.5

Once I added MYSQL_DEFAULT into make.conf, the installations for MySQL 5.5 became seamless. Note: if you want another flavor of MySQL server such as Percona Server, install the server “pkg install percona55-server” prior to “pkg install percona-toolkit” so that the client dependencies are met prior to installation.

Problem 2: Some tools don’t work

pt-diskstats does not work, because it reads from /proc/diskstats, which does not exist on FreeBSD. Other favorites like htop don’t work right out of the box. So far I have had good luck with the rest of the Percona toolkit besides pt-diskstats, but here’s how you get around the htop issue (and perhaps others).

TIP: Get the linux /proc mounted

dynamic commands:
# kldload linux
# mkdir -p /compat/linux/proc
# mount -t linprocfs linproc /compat/linux/proc

to make permanent:
# vi /boot/loader.conf (and add the following line)
linux_load="YES"
# vi /etc/fstab (and add the following line)
linproc /compat/linux/proc linprocfs rw 0 0

As you may have determined, these commands ensure that the Linux compatibility kernel module is loaded into the kernel, and that the Linux-style /proc is mounted in a different location than you might be used to: “/compat/linux/proc”. The FreeBSD /proc may also be mounted.

Problem 3: I want bash

# pkg install bash
… and once that’s done
# pw user mod root -s /usr/local/bin/bash
…and repeat ^^ for each user you would like to switch. It even comes with a prompt that looks like CentOS/RHEL.
[root@js-bsd1 ~]#

Problem 4: I can’t find stuff

BSD init is much simpler than the SysV and upstart init frameworks, so your typical places to look for start files are /etc/rc.d and /usr/local/etc/rc.d. To make things start on boot, it’s inevitably a line in /etc/rc.conf.

In our case, for MySQL, our start file is /usr/local/etc/rc.d/mysql-server. To have MySQL start on boot, your rc.conf line is:
mysql_enable="YES".

If you do not wish to make MySQL start on boot, you may simply say "/usr/local/etc/rc.d/mysql-server onestart"

Notes on binary replacement

Please note, just like in the Linux world, MariaDB and Percona Server are drop-in replacements for MySQL, so the start files and enable syntax do not change. Your default location for my.cnf is /etc/my.cnf, just like in the rest of the known universe.

This command lists all installed packages.
pkg info -a

Use pkg remove and pkg install to swap in new versions of your favorite MySQL software (see the sketch below for an example swap).

I ran into no greater issues with pkg than I would with yum or apt doing binary removals and installations, and no issues at all with mysql_upgrade. Remember: if you had to alter make.conf like I did earlier, update it to reflect the versions you want to install.
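
For example, here is a minimal, hedged sketch of swapping the stock MySQL 5.5 packages for Percona Server 5.6. The package names below come from the FreeBSD ports tree and may differ slightly on your release, so treat them as assumptions:

# vi /etc/make.conf (and change the line to: MYSQL_DEFAULT?= 5.6)
# /usr/local/etc/rc.d/mysql-server onestop
# pkg remove mysql55-server mysql55-client
# pkg install percona56-server percona56-client
# /usr/local/etc/rc.d/mysql-server onestart
# mysql_upgrade -u root -p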

For those who like ZFS, the FreeBSD handbook has a very detailed chapter on this topic. I for one like plain old UFS. It might be the oldest filesystem that supports snapshots, and it can be implemented very simply for those who like low overhead.

Happy tinkering with FreeBSD and MySQL, and thanks for reading!

Categories: DBA Blogs

Sydney Gets New AWS Availability Zone

Thu, 2016-03-03 09:47

On a scorching November day in 2012, Sydneysiders were bracing themselves for yet another heat wave when all of a sudden they became pleasantly surprised as an elastic cloud occupied the tech skies. On November 12, 2012, Amazon announced  the New Asia Pacific (Sydney) Region in Australia.

Before that, Australian customers had to reach out to Japan or Singapore for their cloud needs. That was not really feasible, as it added up-front expense, long-term commitments, and scaling challenges. Amazon recognized that, and Sydney became another region in the world.

They have now taken it a step further and launched a new Availability Zone (AZ) in Sydney. An Availability Zone is basically an isolated location within a region’s data centres from which public cloud services originate and operate.

The new Availability Zone is ap-southeast-2c. It is all set to provide enhanced performance and scalability to Australian customers, and will enable them to fully leverage the potential of technologies like Lambda, the Elastic File System shared filesystem, and Amazon RDS for MS SQL Server.
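
If you want to confirm which zones your own account can launch into, a quick check with the AWS CLI (shown here for the Sydney region) will list them; note that AZ letters are mapped per account, so your output may differ:

aws ec2 describe-availability-zones --region ap-southeast-2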

Pythian’s established presence in Australia and New Zealand, coupled with round-the-clock, world-class support for AWS, SQL Server, and other cloud technologies, enables it to support Australian and New Zealand customers from the word go.

Categories: DBA Blogs

SQL Injection with MySQL SLEEP()

Wed, 2016-03-02 11:40

Recently we received an alert from one of our clients that running threads were high on one of their servers. Once we logged in, we noticed that all the selects were waiting for a table-level read lock. We scrolled through the process list and found the selects that were causing the problems. After killing them, everything went back to normal.
At first we couldn’t understand why the query took so long, as it looked like all the others. Then we noticed that one of the WHERE clauses was strange. There, we found a SLEEP(3) attached with OR to the query. Obviously, this server was the victim of a SQL injection attack.

What is SQL injection?

I think most of us know what SQL injection is, but as a refresher, SQL injection is when someone provides malicious input (into a WHERE clause, for example) so that their own statements are run as well.
Typically this occurs when you ask a user for input, like a username, but instead of a real name they give you a MySQL statement that will be run by your server without you knowing it.
(Picture: the classic xkcd comic “Exploits of a Mom”, a.k.a. Little Bobby Tables.)
Based on the picture, let’s see a few examples.
We have a simple table:

mysql> describe post;
+-------+------------------+------+-----+---------+----------------+
| Field | Type             | Null | Key | Default | Extra          |
+-------+------------------+------+-----+---------+----------------+
| id    | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| test  | varchar(127)     | YES  |     | NULL    |                |
+-------+------------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)

mysql> select * from post;
+----+--------+
| id | test   |
+----+--------+
|  1 | text1  |
|  2 | text2  |
|  3 | text3  |
|  4 | text4  |
|  5 | text5  |
|  6 | text6  |
|  7 | text7  |
|  8 | text8  |
|  9 | text9  |
| 10 | text10 |
+----+--------+
10 rows in set (0.00 sec)

Let’s run a select with LIKE, which we know for sure won’t have a match:

mysql> select * from post where test like '%nomatch%';
Empty set (0.00 sec)

But what happens if we don’t filter the inputs and someone wants to get all the data?
mysql> select * from post where test like '%nomatch ' || '1==1' && '1%';
+----+--------+
| id | test   |
+----+--------+
|  1 | text1  |
|  2 | text2  |
|  3 | text3  |
|  4 | text4  |
|  5 | text5  |
|  6 | text6  |
|  7 | text7  |
|  8 | text8  |
|  9 | text9  |
| 10 | text10 |
+----+--------+
10 rows in set, 2 warnings (0.00 sec)

That was a very mild injection, but it could be much more malicious: we could drop another table!

mysql> show tables;
+----------------------+
| Tables_in_injecttest |
+----------------------+
| game                 |
| post                 |
+----------------------+
2 rows in set (0.01 sec)

mysql> select * from post where test like '%nomatch'; drop table game;-- %';
Empty set (0.00 sec)

Query OK, 0 rows affected (0.28 sec)

mysql> show tables;
+-----------------------+
| Tables_in_inject_test |
+-----------------------+
| post                  |
+-----------------------+
1 row in set (0.00 sec)

mysql>

If we don’t know the name of the table, we can still cause trouble by blocking access to the database.
If we insert SLEEP() into the WHERE part, it will be executed for every matching row… and if we inject it as “OR SLEEP(n)”, it will be executed for every row in the table!
Okay, this will be “just” a long-running select. It shouldn’t cause much trouble thanks to InnoDB and transaction isolation, unless something needs a table lock.

Some common examples of what causes table locks are:

  • explicit lock table
  • insert/update/delete on MyISAM
  • ALTER table on InnoDB

Once a statement starts waiting for a lock on the table, all subsequent selects will wait for the previous locking statement to finish:

Terminal 1:
mysql> select * from post where test like '%nomatch ' OR sleep(300) AND '1%';
….
Terminal 2:
mysql> alter table post engine=innodb;
...
Terminal 3:
mysql> select SQL_NO_CACHE count(*) from post;
…
Processlist:
+----------+----------------------+-----------+--------------------+---------+-------+---------------------------------+-----------------------------------------------------------------------+
| Id       | User                 | Host      | db                 | Command | Time  | State                           | Info                                                                  |
+----------+----------------------+-----------+--------------------+---------+-------+---------------------------------+-----------------------------------------------------------------------+
| 17170817 | root                 | localhost | janitest           | Query   |    19 | User sleep                      | select * from post where test like '%nomatch ' OR sleep(300) AND '1%' |
| 17170918 | root                 | localhost | janitest           | Query   |    11 | Waiting for table metadata lock | alter table post engine=innodb                                        |
| 17170957 | root                 | localhost | janitest           | Query   |     4 | Waiting for table metadata lock | select * from post                                                    |
+----------+----------------------+-----------+--------------------+---------+-------+---------------------------------+-----------------------------------------------------------------------+
3 rows in set (0.00 sec)

As we see in the example, the ALTER TABLE will wait until it can get a lock on the post table, and from that point on it blocks every other select against the table.
Or, if you are using a MyISAM table, a simple update/insert will block access to the table, because it needs a table-level lock for its duration.

How can we defend ourselves from SQL injection?

There are several ways to secure yourself from SQL injection.

  • First of all, validate the input. If you expect only letters and numbers, filter it with a regexp, for example, to make sure there are no special characters there. Also escape the inputs on the application side; programming languages have built-in functions to do that (e.g. mysql_real_escape_string() in PHP).
  • Use prepared statements! A prepared statement won’t allow a second clause if you specified only one, because the values are transmitted separately as parameters: even if the string is not escaped, it ends up in a single variable, and MySQL treats it as just a longer string (see the sketch after this list).
    (For more details see: http://dev.mysql.com/doc/refman/5.7/en/sql-syntax-prepared-statements.html )
  • Use a tool like MySQL Enterprise Firewall, which is a plugin for MySQL and can filter your statements to make sure there are no things like: || 1==1
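
As a minimal illustration of why this helps, here is a hedged sketch using MySQL’s server-side PREPARE syntax against the post table from the examples above. The injected text stays inside the parameter, so it never becomes part of the SQL: nothing matches, and the select simply returns an empty result instead of sleeping.

mysql> SET @search := '%nomatch\' OR SLEEP(300) AND \'1%';
mysql> PREPARE stmt FROM 'SELECT * FROM post WHERE test LIKE ?';
mysql> EXECUTE stmt USING @search;
mysql> DEALLOCATE PREPARE stmt;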

I would like to start a little discussion about this, so if you have encountered SQL injection before, would you share with us how it was done, or, in general, how you prevent SQL injection in your applications?

 

Categories: DBA Blogs

Log Buffer #463: A Carnival of the Vanities for DBAs

Tue, 2016-03-01 15:27

As the winter in the Northern hemisphere is giving way to spring, slowly but surely, blog posts are blooming in the gardens of Oracle, SQL Server and MySQL. This Log Buffer plucks some of them for your reading pleasure.

Oracle:

Providing A Persistent Data Volume to EMC XtremIO Using ClusterHQ Flocker, Docker And Marathon

There is sliced bread in SQL.

Oracle Cloud – Your service is suspended due to exceeding resource quota !

EM12c Compliance ‘Required Data Available’ flag – Understanding and Troubleshooting

How can I see my invisible columns

SQL Server:

Auto Generate Your Database Documentation

A Lightweight, Self-adjusting, Baseline-less Data Monitor

Keeping POST and GET Separated

How often should I run DBCC CHECKDB?

Disabling SQL Server Optimizer Rules with QUERYRULEOFF

MySQL:

MySQL Contributions status

Planets9s: Building scalable database infrastructures with MariaDB & HAProxy

High availability with asynchronous replication… and transparent R/W split

mysql_real_connect is not thread safe

Now available in swanhart-tools: NATIVE asynchronous query execution for any MySQL client!

Categories: DBA Blogs

GoldenGate 12.2 Big Data Adapters: part 1 – HDFS

Mon, 2016-02-29 10:27

December 2015 brought us a new version of GoldenGate, and a new version of the Big Data adapters for GoldenGate. Let’s have a look at what we have now and how it works. I am going to start with the HDFS adapter.

As a first step, we need to prepare our source database for replication. It becomes easier with every new GoldenGate version. We will need to perform several steps:
a) Enable archive logging on our database. This particular step requires downtime.

orcl> alter database mount;

Database altered.

orcl> alter database archivelog;

Database altered.

orcl> alter database open;

b) Enable force logging and minimal supplemental logging. There is no need to shut down the database for this.

orcl> alter database add supplemental log data;

Database altered.

orcl> alter database force logging;

Database altered.

orcl> SELECT supplemental_log_data_min, force_logging FROM v$database;

SUPPLEME FORCE_LOGGING
-------- ---------------------------------------
YES	 YES

c) Set the parameter “enable_goldengate_replication” to “TRUE”. This can be done online.

orcl> alter system set enable_goldengate_replication=true sid='*' scope=both;

System altered.

orcl>

And we are almost done. Now we can create a schema for a GoldenGate administrator and grant the required privileges. I’ve simply granted the DBA role to the user to simplify the process; in any case you will need it for integrated capture. For a production installation I advise you to have a look at the documentation to verify the necessary privileges and roles.

orcl> create user ogg identified by welcome1 default tablespace users temporary tablespace temp;
orcl> grant connect, dba to ogg;

Let’s create a test schema to be replicated. We will call the schema ggtest on the source, and I will name the destination schema bdtest. This will also allow us to check how mapping works in our replication.

orcl> create tablespace ggtest; -- optional step 
orcl> create user ggtest identified by welcome1 default tablespace ggtest temporary tablespace temp;
orcl> grant connect, resource to ggtest;

Everything is ready on our source database for the replication.
Now we install Oracle GoldenGate for Oracle on our database server. We can get the software from the Oracle site on the download page, in the Middleware section, GoldenGate, Oracle GoldenGate for Oracle databases. We are going to use the 12.2.0.1.1 version of the software. The installation is easy – you need to unzip the software and run the installer, which will guide you through a couple of simple steps. The installer will unpack the software to the destination location, create subdirectories, and register GoldenGate in the Oracle global registry.

[oracle@sandbox distr]$ unzip fbo_ggs_Linux_x64_shiphome.zip
[oracle@sandbox distr]$ cd fbo_ggs_Linux_x64_shiphome/Disk1/
[oracle@sandbox Disk1]$ ./runInstaller

We continue by setting up parameters for the Oracle GoldenGate (OGG) manager and starting it up. You can see that I’ve used the default Blowfish encryption for the password. In a production environment you may consider a stronger algorithm such as AES256. I’ve also used a non-default port for the manager, since I have more than one GoldenGate installation on my test sandbox.

[oracle@sandbox ~]$ export OGG_HOME=/u01/oggora
[oracle@sandbox ~]$ cd $OGG_HOME
[oracle@sandbox oggora]$ ./ggsci

GGSCI (sandbox.localdomain) 1> encrypt password welcome1 BLOWFISH ENCRYPTKEY DEFAULT
Using Blowfish encryption with DEFAULT key.
Encrypted password:  AACAAAAAAAAAAAIARIXFKCQBMFIGFARA
Algorithm used:  BLOWFISH

GGSCI (sandbox.localdomain) 2> edit params mgr
PORT 7829
userid ogg@orcl,password AACAAAAAAAAAAAIARIXFKCQBMFIGFARA, BLOWFISH, ENCRYPTKEY DEFAULT
purgeoldextracts /u01/oggora/dirdat/*, usecheckpoints


GGSCI (sandbox.localdomain) 3> start manager
Manager started.

Let’s prepare everything for the initial load, and later for online replication.
I’ve decided to use a GoldenGate initial-load extract for the initial load, for the sake of consistency of the resulting dataset on Hadoop.
I prepared a parameter file to replicate my ggtest schema and upload all data to a trail file on the remote site. I’ve used a minimum number of options for all my processes, providing only the handful of parameters required for replication. Extract options are a subject deserving a dedicated blog post. Here is my simple initial-load extract:

[oracle@sandbox oggora]$ cat /u01/oggora/dirprm/ini_ext.prm
SOURCEISTABLE
userid ogg@orcl,password AACAAAAAAAAAAAIARIXFKCQBMFIGFARA, BLOWFISH, ENCRYPTKEY DEFAULT
--RMTHOSTOPTIONS
RMTHOST sandbox, MGRPORT 7839
RMTFILE /u01/oggbd/dirdat/initld, MEGABYTES 2, PURGE
--DDL include objname ggtest.*
TABLE ggtest.*;

Then we run the initial-load extract in passive mode, and it will create a trail file with the data. The trail file will be used later for our initial load on the target side.

[oracle@sandbox oggora]$ ./extract paramfile dirprm/ini_ext.prm reportfile dirrpt/ini_ext.rpt
[oracle@sandbox oggora]$ ll /u01/oggbd/dirdat/initld*
-rw-r-----. 1 oracle oinstall 3028 Feb 16 14:17 /u01/oggbd/dirdat/initld
[oracle@sandbox oggora]$

We can also prepare our extract on the source side now. I haven’t used a data pump in my configuration, limiting the topology to the simplest and most straightforward extract-to-replicat configuration. Of course, in any production configuration I would advise using a data pump on the source for staging our data.
Here are my extract parameters, and how I added the extract. I am not starting it yet, because an Oracle GoldenGate Manager must be running on the target and the directory for the trail file has to exist there. You may have guessed that the Big Data GoldenGate will be located in the /u01/oggbd directory.

[oracle@sandbox oggora]$ ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 12.2.0.1.1 OGGCORE_12.2.0.1.0_PLATFORMS_151211.1401_FBO
Linux, x64, 64bit (optimized), Oracle 12c on Dec 12 2015 02:56:48
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2015, Oracle and/or its affiliates. All rights reserved.



GGSCI (sandbox.localdomain) 1> edit params ggext

extract ggext
userid ogg@orcl,password AACAAAAAAAAAAAIARIXFKCQBMFIGFARA, BLOWFISH, ENCRYPTKEY DEFAULT
--RMTHOSTOPTIONS
RMTHOST sandbox, MGRPORT 7839
RMTFILE /u01/oggbd/dirdat/or, MEGABYTES 2, PURGE
DDL include objname ggtest.*
TABLE ggtest.*;


GGSCI (sandbox.localdomain) 2> dblogin userid ogg@orcl,password AACAAAAAAAAAAAIARIXFKCQBMFIGFARA, BLOWFISH, ENCRYPTKEY DEFAULT
Successfully logged into database.

GGSCI (sandbox.localdomain as ogg@orcl) 3> register extract GGEXT database

2016-02-16 15:37:21  INFO    OGG-02003  Extract GGEXT successfully registered with database at SCN 17151616.

GGSCI (sandbox.localdomain as ogg@orcl) 4> add extract ggext, INTEGRATED TRANLOG, BEGIN NOW
EXTRACT (Integrated) added.


Let’s leave our source site for a while and switch to the target. Our target is going to be a box with a Hadoop client and all the required Java classes.
I used the same box just to save resources in my sandbox environment. You may run different GoldenGate versions on the same box, provided that the Manager ports for each of them are different.
Essentially we need a Hadoop client on the box that can connect to HDFS and write data there. Installation of the Hadoop client is out of scope for this article, but you can easily get all the necessary information from the Hadoop home page.

Having all the required Hadoop classes, we continue by installing Oracle GoldenGate for Big Data, then configuring and starting it up. In the past I received several questions from people struggling to find the exact place where all the adapters could be downloaded. The adapters were well “hidden” on Oracle eDelivery, but now it is way simpler. Go to the GoldenGate download page on the Oracle site and find the section “Oracle GoldenGate for Big Data 12.2.0.1.0”, where you can choose the build for Linux x86_64, Windows or Solaris. You will need an Oracle account to get it. We upload the file to our Linux box, then unzip and unpack the tar archive. I created the directory /u01/oggbd as our GoldenGate home and unpacked the tar archive there.
The next step is to create all the necessary directories. We start the command-line utility and create all subdirectories.

[oracle@sandbox ~]$ cd /u01/oggbd/
[oracle@sandbox oggbd]$ ./ggsci

Oracle GoldenGate Command Interpreter
Version 12.2.0.1.0 OGGCORE_12.2.0.1.0_PLATFORMS_151101.1925.2
Linux, x64, 64bit (optimized), Generic on Nov 10 2015 16:18:12
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2015, Oracle and/or its affiliates. All rights reserved.



GGSCI (sandbox.localdomain) 1> create subdirs

Creating subdirectories under current directory /u01/oggbd

Parameter files                /u01/oggbd/dirprm: created
Report files                   /u01/oggbd/dirrpt: created
Checkpoint files               /u01/oggbd/dirchk: created
Process status files           /u01/oggbd/dirpcs: created
SQL script files               /u01/oggbd/dirsql: created
Database definitions files     /u01/oggbd/dirdef: created
Extract data files             /u01/oggbd/dirdat: created
Temporary files                /u01/oggbd/dirtmp: created
Credential store files         /u01/oggbd/dircrd: created
Masterkey wallet files         /u01/oggbd/dirwlt: created
Dump files                     /u01/oggbd/dirdmp: created


GGSCI (sandbox.localdomain) 2>

We change the port for our manager process from the default and start it up. I’ve already mentioned that the port was changed due to the existence of several GoldenGate managers running from different directories.

GGSCI (sandbox.localdomain) 2> edit params mgr
PORT 7839
.....

GGSCI (sandbox.localdomain) 3> start manager
Manager started.

Now we have to prepare parameter files for our replicat processes. Let’s assume the environment variable OGGHOME represents the GoldenGate home; in our case it is going to be /u01/oggbd.
Examples for the parameter files can be taken from the $OGGHOME/AdapterExamples/big-data directories. There you will find examples for Flume, Kafka, HDFS, HBase and for the metadata providers. Today we are going to work with the HDFS adapter.
I copied the files to my parameter file directory ($OGGHOME/dirprm) and modified them accordingly:

oracle@sandbox oggbd]$ cp /u01/oggbd/AdapterExamples/big-data/hdfs/* /u01/oggbd/dirprm/
oracle@sandbox oggbd]$ vi /u01/oggbd/dirprm/hdfs.props

Here are my values for the hdfs.props file:

[oracle@bigdata dirprm]$ cat hdfs.props

gg.handlerlist=hdfs

gg.handler.hdfs.type=hdfs
gg.handler.hdfs.includeTokens=false
gg.handler.hdfs.maxFileSize=1g
gg.handler.hdfs.rootFilePath=/user/oracle/gg
gg.handler.hdfs.fileRollInterval=0
gg.handler.hdfs.inactivityRollInterval=0
gg.handler.hdfs.fileSuffix=.txt
gg.handler.hdfs.partitionByTable=true
gg.handler.hdfs.rollOnMetadataChange=true
gg.handler.hdfs.authType=none
gg.handler.hdfs.format=delimitedtext
#gg.handler.hdfs.format.includeColumnNames=true

gg.handler.hdfs.mode=tx

goldengate.userexit.timestamp=utc
goldengate.userexit.writers=javawriter
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE

gg.log=log4j
gg.log.level=INFO

gg.report.time=30sec

gg.classpath=/usr/lib/hadoop/*:/usr/lib/hadoop/lib/*:/usr/lib/hadoop-hdfs/*:/usr/lib/hadoop/etc/hadoop/:/usr/lib/hadoop/lib/native/*

javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar

You can find information about all those parameters in the Oracle documentation here, but there are a few parameters you will most likely need to change from their defaults:

  • gg.handler.hdfs.rootFilePath – tells the adapter where the directories and files have to be written on HDFS.
  • gg.handler.hdfs.format – you can set up one of the four formats supported by the adapter.
  • goldengate.userexit.timestamp – depends on your preference for the transaction timestamps written to your HDFS files.
  • gg.classpath – depends on the location of your Hadoop jar classes and native libraries.

You can see I’ve mentioned the gg.handler.hdfs.format.includeColumnNames parameter. It is supposed to put the column name before each value in the output file on HDFS. That may be helpful in some cases, but at the same time it makes the file bigger. If you are planning to create an external Hive table, you may consider leaving it commented out, as I have.
The next parameter file is for our data-initialization replicat. You may consider using Sqoop or another tool for the initial load of your tables, but I think it makes sense to use the GoldenGate replicat if the table size is relatively small. It makes the resulting file set more consistent with the subsequent replication, since it uses the same engine and format. So, here is my replicat for the initial load:

[oracle@sandbox dirprm]$ cat /u01/oggbd/dirprm/irhdfs.prm
--passive REPLICAT for initial load irhdfs
-- Trail file for this example is located in "dirdat/initld" 
-- Command to run REPLICAT:
-- ./replicat paramfile dirprm/irhdfs.prm reportfile dirrpt/ini_rhdfs.rpt
SPECIALRUN
END RUNTIME
EXTFILE /u01/oggbd/dirdat/initld
--DDLERROR default discard
setenv HADOOP_COMMON_LIB_NATIVE_DIR=/usr/lib/hadoop/lib/native
DDL include all
TARGETDB LIBFILE libggjava.so SET property=dirprm/hdfs.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP ggtest.*, TARGET bdtest.*;

I was running the initial load in passive mode, without creating a managed process, just running it from the command line. Here is an example:

[oracle@sandbox oggbd]$ ./replicat paramfile dirprm/irhdfs.prm reportfile dirrpt/ini_rhdfs.rpt
[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/gg/
Found 2 items
drwxr-xr-x   - oracle oracle          0 2016-02-16 14:37 /user/oracle/gg/bdtest.test_tab_1
drwxr-xr-x   - oracle oracle          0 2016-02-16 14:37 /user/oracle/gg/bdtest.test_tab_2
[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/gg/bdtest.test_tab_1
Found 1 items
-rw-r--r--   1 oracle oracle        624 2016-02-16 14:37 /user/oracle/gg/bdtest.test_tab_1/bdtest.test_tab_1_2016-02-16_14-37-43.376.txt
[oracle@sandbox oggbd]$ hadoop fs -tail /user/oracle/gg/bdtest.test_tab_1/bdtest.test_tab_1_2016-02-16_14-37-43.376.txt
IBDTEST.TEST_TAB_12016-02-16 19:17:40.7466992016-02-16T14:37:43.37300000000000-100000020121371O62FX2014-01-24:19:09:20RJ68QYM52014-01-22:12:14:30
IBDTEST.TEST_TAB_12016-02-16 19:17:40.7466992016-02-16T14:37:44.89600000000000-100000021552371O62FX2014-01-24:19:09:20HW82LI732014-05-11:05:23:23
IBDTEST.TEST_TAB_12016-02-16 19:17:40.7466992016-02-16T14:37:44.89600100000000-100000022983RXZT5VUN2013-09-04:23:32:56RJ68QYM52014-01-22:12:14:30
IBDTEST.TEST_TAB_12016-02-16 19:17:40.7466992016-02-16T14:37:44.89600200000000-100000024414RXZT5VUN2013-09-04:23:32:56HW82LI732014-05-11:05:23:23
[oracle@sandbox oggbd]$

You can see the Hadoop directories and files created by the initial load.
As soon as the initial load has run, we can start our extract and replicat to keep the destination side updated.
We move to the source and start the extract we prepared earlier.

GGSCI (sandbox.localdomain as ogg@orcl) 6>start extract ggext

Sending START request to MANAGER ...
EXTRACT GGEXT starting

So, we move back to the target and prepare our replicat. I used a replicat with the following parameters:

[oracle@sandbox oggbd]$ cat /u01/oggbd/dirprm/rhdfs.prm
REPLICAT rhdfs
-- Trail file for this example is located in "dirdat/or" directory
-- Command to add REPLICAT
-- add replicat rhdfs, exttrail dirdat/or
--DDLERROR default discard
setenv HADOOP_COMMON_LIB_NATIVE_DIR=/usr/lib/hadoop/lib/native
DDL include all
TARGETDB LIBFILE libggjava.so SET property=dirprm/hdfs.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP ggtest.*, TARGET bdtest.*;

We add the replicat to our configuration, and it will carry on the replication.

GGSCI (sandbox.localdomain) 1> add replicat rhdfs, exttrail dirdat/or
REPLICAT added.


GGSCI (sandbox.localdomain) 2> start replicat rhdfs

Sending START request to MANAGER ...
REPLICAT RHDFS starting


GGSCI (sandbox.localdomain) 3>

Our replication is up and running, the initial load worked fine, and we can test and see what we have on source and target.
Here is the data in one of our source tables:

orcl> select * from ggtest.test_tab_1;

	   PK_ID RND_STR    USE_DATE	      RND_STR_1  ACC_DATE
---------------- ---------- ----------------- ---------- -----------------
	       1 371O62FX   01/24/14 19:09:20 RJ68QYM5	 01/22/14 12:14:30
	       2 371O62FX   01/24/14 19:09:20 HW82LI73	 05/11/14 05:23:23
	       3 RXZT5VUN   09/04/13 23:32:56 RJ68QYM5	 01/22/14 12:14:30
	       4 RXZT5VUN   09/04/13 23:32:56 HW82LI73	 05/11/14 05:23:23

orcl>

I’ve created an external Hive table for the table test_tab_1 to have a better look.

hive> CREATE EXTERNAL TABLE BDTEST.TEST_TAB_1  (tran_flag string, tab_name string, tran_time_utc timestamp, tran_time_loc string,something string, something1 string,
    > PK_ID INT, RND_STR VARCHAR(10),USE_DATE string,RND_STR_1 string, ACC_DATE string)
    > stored as textfile location '/user/oracle/gg/bdtest.test_tab_1';
OK
Time taken: 0.327 seconds
hive> select * from BDTEST.TEST_TAB_1;
OK
I	BDTEST.TEST_TAB_1	2016-02-16 19:17:40.746699	2016-02-16T14:37:43.373000	00000000-10000002012		1	371O62FX	2014-01-24:19:09:20	RJ68QYM5	2014-01-22:12:14:30
I	BDTEST.TEST_TAB_1	2016-02-16 19:17:40.746699	2016-02-16T14:37:44.896000	00000000-10000002155		2	371O62FX	2014-01-24:19:09:20	HW82LI73	2014-05-11:05:23:23
I	BDTEST.TEST_TAB_1	2016-02-16 19:17:40.746699	2016-02-16T14:37:44.896001	00000000-10000002298		3	RXZT5VUN	2013-09-04:23:32:56	RJ68QYM5	2014-01-22:12:14:30
I	BDTEST.TEST_TAB_1	2016-02-16 19:17:40.746699	2016-02-16T14:37:44.896002	00000000-10000002441		4	RXZT5VUN	2013-09-04:23:32:56	HW82LI73	2014-05-11:05:23:23
Time taken: 0.155 seconds, Fetched: 4 row(s)
hive>

You can see the table definition is a bit different from what we have on the source side, and you will see why.
We get additional columns on the destination side. Interestingly, while some of them have a pretty clear purpose, others are not totally clear and contain null values.
The first column is the operation flag; it shows what kind of operation produced the row: “I” for insert, “D” for delete and “U” for update. The second column is the table name. The third one is a UTC timestamp showing when the transaction occurred. The next one is the commit time in the local time zone, and the column after that holds a commit sequence number. Those columns can help you construct the proper data set for any given point in time (see the sketch below).
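
Since every DML operation lands as its own row, a common pattern is to pick, for each primary key, the most recent record at or before the point in time you care about, and then drop the keys whose latest operation is a delete. Here is a rough HiveQL sketch against the external table defined above; the column names come from that table definition, the cutoff timestamp is just an example, and keep in mind that update records may carry only the changed columns, so a complete merge needs a bit more work:

SELECT t.*
FROM bdtest.test_tab_1 t
JOIN (
  SELECT pk_id, MAX(tran_time_utc) AS max_ts
  FROM bdtest.test_tab_1
  WHERE tran_time_utc <= '2016-02-16 21:00:00'
  GROUP BY pk_id
) latest
  ON t.pk_id = latest.pk_id AND t.tran_time_utc = latest.max_ts
WHERE t.tran_flag <> 'D';
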
Let’s insert and update some row(s) on source and see how it will show up on the target:

orcl> insert into ggtest.test_tab_1 values (5,'TEST_1',sysdate,'TEST_1',sysdate);

1 row created.

orcl> commit;

orcl> update ggtest.test_tab_1 set RND_STR='TEST_1_1' where PK_ID=5;

1 row updated.

orcl> commit;

Commit complete.

orcl>

Let’s check how it is reflected on the destination site. We see two new rows, where each row represents a DML operation. One was for the insert and the second one was for the update.

hive> select * from BDTEST.TEST_TAB_1;
OK
I	BDTEST.TEST_TAB_1	2016-02-16 19:17:40.746699	2016-02-16T14:37:43.373000	00000000-10000002012		1	371O62FX	2014-01-24:19:09:20	RJ68QYM5	2014-01-22:12:14:30
I	BDTEST.TEST_TAB_1	2016-02-16 19:17:40.746699	2016-02-16T14:37:44.896000	00000000-10000002155		2	371O62FX	2014-01-24:19:09:20	HW82LI73	2014-05-11:05:23:23
I	BDTEST.TEST_TAB_1	2016-02-16 19:17:40.746699	2016-02-16T14:37:44.896001	00000000-10000002298		3	RXZT5VUN	2013-09-04:23:32:56	RJ68QYM5	2014-01-22:12:14:30
I	BDTEST.TEST_TAB_1	2016-02-16 19:17:40.746699	2016-02-16T14:37:44.896002	00000000-10000002441		4	RXZT5VUN	2013-09-04:23:32:56	HW82LI73	2014-05-11:05:23:23
I	BDTEST.TEST_TAB_1	2016-02-16 20:43:32.000231	2016-02-16T15:43:37.199000	00000000000000002041		5	TEST_1	2016-02-16:15:43:25	TEST_1	2016-02-16:15:43:25
U	BDTEST.TEST_TAB_1	2016-02-16 20:43:53.000233	2016-02-16T15:43:56.056000	00000000000000002243		5	TEST_1_1
Time taken: 2.661 seconds, Fetched: 6 row(s)
hive>

It works for deletes too; the flag will just be “D” instead of “I” for inserts or “U” for updates.

What about DDL support? Let’s truncate the table.

orcl> truncate table ggtest.test_tab_1;

Table truncated.

orcl>

And here, there is nothing in our HDFS files. Maybe I’ve missed something, but it looks like the truncate operation does not create any record. I need to dig a bit more; I will try to write a separate blog post about DDL support for the Big Data adapters.
DDL capture works pretty well when we create a new table and insert new rows.

It also works if we change one of the existing tables, adding or dropping a column. Let’s try to create a new table here:

orcl> create table ggtest.test_tab_4 (pk_id number, rnd_str_1 varchar2(10),acc_date date);

Table created.

orcl> insert into ggtest.test_tab_4 select * from ggtest.test_tab_2;

1 row created.

orcl> commit;

Commit complete.

orcl>

You can see that it has created a new directory and file for the new table. Additionally, if you add a column, a new file will be used for all subsequent DML operations on the altered table. This helps to separate data with different structures.

[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/gg/
Found 3 items
drwxr-xr-x   - oracle oracle          0 2016-02-16 15:43 /user/oracle/gg/bdtest.test_tab_1
drwxr-xr-x   - oracle oracle          0 2016-02-16 14:37 /user/oracle/gg/bdtest.test_tab_2
drwxr-xr-x   - oracle oracle          0 2016-02-16 15:56 /user/oracle/gg/bdtest.test_tab_4
[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/gg/bdtest.test_tab_4/
Found 1 items
-rw-r--r--   1 oracle oracle        127 2016-02-16 15:56 /user/oracle/gg/bdtest.test_tab_4/bdtest.test_tab_4_2016-02-16_15-56-50.373.txt
[oracle@sandbox oggbd]$ hadoop fs -tail /user/oracle/gg/bdtest.test_tab_4/bdtest.test_tab_4_2016-02-16_15-56-50.373.txt
IBDTEST.TEST_TAB_42016-02-16 20:56:47.0009532016-02-16T15:56:50.371000000000000000000068327IJWQRO7T2013-07-07:08:13:52
[oracle@sandbox oggbd]$

At first glance it looks good, but let’s try a create table as select (CTAS).

orcl> create table ggtest.test_tab_3 as select * from ggtest.test_tab_2;

Table created.

orcl>

I am not sure whether this is expected behavior or a bug, but apparently it does not work. Our replicat has abended and complains that “DDL Replication is not supported for this database”.

[oracle@sandbox oggbd]$ tail -5 ggserr.log
2016-02-16 15:43:37  INFO    OGG-02756  Oracle GoldenGate Delivery, rhdfs.prm:  The definition for table GGTEST.TEST_TAB_1 is obtained from the trail file.
2016-02-16 15:43:37  INFO    OGG-06511  Oracle GoldenGate Delivery, rhdfs.prm:  Using following columns in default map by name: PK_ID, RND_STR, USE_DATE, RND_STR_1, ACC_DATE.
2016-02-16 15:43:37  INFO    OGG-06510  Oracle GoldenGate Delivery, rhdfs.prm:  Using the following key columns for target table bdtest.TEST_TAB_1: PK_ID.
2016-02-16 15:48:49  ERROR   OGG-00453  Oracle GoldenGate Delivery, rhdfs.prm:  DDL Replication is not supported for this database.
2016-02-16 15:48:49  ERROR   OGG-01668  Oracle GoldenGate Delivery, rhdfs.prm:  PROCESS ABENDING.
[oracle@sandbox oggbd]$

[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/gg/
Found 2 items
drwxr-xr-x   - oracle oracle          0 2016-02-16 15:43 /user/oracle/gg/bdtest.test_tab_1
drwxr-xr-x   - oracle oracle          0 2016-02-16 14:37 /user/oracle/gg/bdtest.test_tab_2
[oracle@sandbox oggbd]$

What can we say in summary? The replication works, and supports all DML and some DDL commands. You will need to do some extra work to build consistent datasets for any given point in time, using the operation flag and the transaction timestamps. In my next few posts, I will cover the other Big Data adapters for GoldenGate.

Categories: DBA Blogs

A Repeatable Test Environment with Terraform and Chef-solo

Mon, 2016-02-29 08:57

I have been refreshing my Chef skills, and have been looking for ways to incorporate simple strategies for deploying temporary lab environments in the cloud that provide a secure, flexible, repeatable, and cost-effective set of resources for testing code. One of the challenges of working in a Vagrant environment is how to protect our clients’ intellectual property (i.e. the code we wish to test, in our case a Chef cookbook) when it is distributed around the globe on our consultants’ laptops. I wanted to tackle the problem simply, one step at a time.

As I see it, cloud services offer a flexible and secure environment in which policies can be managed, costs can be minimized, and access can be controlled. Three things are needed:

  1. A standard, flexible, easy to use deployment framework and template / example environment library
  2. An access control and environmental separation framework that allows lab resources to be deployed securely
  3. A set of templates and policies to manage and control shared infrastructure dependencies and to retire idle resources

Number one is how I spent my Sunday afternoon last week, with this as the result:

https://github.com/kpeder/theseeker

Following the README should get you started with deploying the host if you are familiar with Terraform already. This template uses AWS services, but the concept is portable to any of the multiple providers supported by Terraform.

What this is good for:

I think that the benefits of this approach to testing deployment code are clear, but to highlight:

  • We can add the cloud provider’s management framework and services to our test framework, providing a secure location and environmental separation as needed for testing of intellectual property (i.e. code).
  • We can store our test data in the cloud securely, and access it when required by keeping it on persistent storage (EBS in this case).
  • We use templated environments to ensure consistent and repeatable deployments for developers and engineers to collaborate on, as we would with a locally deployed Vagrant test environment.
  • We leverage scale and flexibility in the cloud to create practical testing environments that match our needs and those of our customers.
  • We leverage automation and environmental policies to ensure resource usage is minimized to that required for testing activities; systems are shut down and minimal storage costs incurred when the environments are not in use.

Prerequisites:

In preparation, you require an Amazon Web Services account with:
– an EC2 keypair with the private key available on your local host (usually in ~/.ssh/ somewhere).
– an IAM access key and secret access key for the AWS account that will be used to deploy.

To deploy the lab:

  1. Install Terraform (http://terraform.io) from Hashicorp (http://hashicorp.com) and ensure that the ‘terraform’ directory is somewhere in your path.
  2. Create a build directory and change into it. Clone the repository and change to the project directory. I always set up a remote (origin) at this point in git. We’ll be using the master branch.
  3. Per the README, ensure that you create a ‘terraform.tfvars’ file with 600 permissions and add your authentication token data in the format provided. (This file is excepted in the ‘.gitignore’ to prevent accidental upload of authentication tokens to the repository).
  4. Run ‘terraform plan’ and review the output. If no errors, run ‘terraform apply’. Your environment should start to build in AWS, and Terraform will create some local state files in the working project directory.
  5. Provisioning output will be displayed as the build and bootstrap progresses, with output of the commands run in the remote-exec provisioner section of the template file ‘theseeker.tf’. (‘The Seeker’ is a song by The Who, if you are wondering, and IMO the best name ever given to a test host.) Review this file to see what’s happening.
  6. When provisioning completes, we should have a functioning instance running Amazon Linux in US-West/Oregon, by default. If you change the region (in ‘variables.tf’) then don’t forget that you must also update your AMIs (also in ‘variables.tf’) to match the proper region. I’ve used this AMI in my example, specifically the HVM EBS-backed image.
  7. The public DNS address is output by Terraform. Connect to this using your private EC2 key using ‘ssh -i ~/.ssh/myec2key -l ec2-user fully.qualified.name’.

Note:

If you inspect the template noted in step 5, you will notice that the host is configured not to delete the EBS volume on termination, meaning that after instance termination the EBS volume will be preserved and can be booted from again (so we don’t lose our Chef cookbook work between deployments). Do note, however, that for each node that is freshly provisioned using Terraform, a new volume will be created. If you wish to destroy the volume on termination, update the following section and change the value to ‘true’; however, keep in mind that terminating the instance will then destroy the data in your ‘chef-repo’.

root_block_device {
  delete_on_termination = false
}

Chef-solo usage:

Now that we’ve logged into our host, let’s change directory to ‘chef-repo’ and have a look…

  1. From the ‘chef-repo’ directory, knife and knife-solo are already configured. You can see the configuration in ‘.chef/knife.rb’.
  2. The ‘cookbooks’ and ‘site-cookbooks’ directories are in the default cookbook path. No roles are configured. There is one node, ‘localhost’, under the nodes directory.
  3. To see the node configuration, run ‘knife node --local-mode list’… ‘localhost’ is returned.
  4. Let’s have a look at the run list for this node, using ‘knife node --local-mode show localhost’… ‘recipe[ntp::default]’ is returned in the Run List.
  5. We can see from the output that the run list for the node contains the ‘ntp::default’ recipe as configured by the remote-exec provisioner in the template.
  6. If we wanted to add additional recipes to the run list for the node, we can use ‘knife node --local-mode run_list add localhost “recipe[cookbook::recipe]”’.
  7. Finally, we can apply the ntp recipe to the node localhost using ‘knife solo cook ec2-user@localhost -i ~/.ssh/mykey’. Note that a second run of the same command will yield a different result, as the new configuration has already been applied by Chef.
  8. When done with the environment, log out of the host and issue a ‘terraform destroy’ command. Type ‘yes’ in response to the confirmation message and Terraform will dismantle the lab and terminate the instance.

Other notes:

Don’t forget that you can now boot the instance from the EC2 console by selecting the volume we created as the new instance’s root (boot) volume.

Chef-solo also supports Berkshelf for dependency management; just install it to your ‘chef-repo’ directory and chef-solo will act accordingly.

It’s worth noting that in my research I found that the Chef client now supports local mode operation and will eventually replace chef-solo for headless operation.
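
For example, something along these lines should exercise the same run list headlessly with the Chef client itself, run from the ‘chef-repo’ directory (a hedged sketch; it assumes a Chef 12-era client where the --local-mode and --override-runlist options are available):

chef-client --local-mode --override-runlist 'recipe[ntp::default]'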

A great next step to get going with Terraform templates would be to create a new template file to deploy a second AWS instance and then configure it with knife-solo from our first host. To ensure connectivity between nodes, ssh access must be available. To accomplish this simply, we can add the new host to the existing security group, ‘sg_theseeker_access’, which allows ssh login from anywhere by default. Again, refer to the instance template in step 5 of the deployment section above for an example of this. A rough sketch of such a template follows below.
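
Here is a minimal, hedged sketch of what that second instance could look like in the Terraform syntax of the day. The AMI and key variables and the security group reference are illustrative assumptions and will need to match the names actually used in your copy of the repository:

resource "aws_instance" "thesecond" {
  # assumes var.ami and var.key_name are defined in variables.tf, as in the existing template
  ami             = "${var.ami}"
  instance_type   = "t2.micro"
  key_name        = "${var.key_name}"

  # reuse the existing security group so the first host can ssh in
  security_groups = ["sg_theseeker_access"]

  tags {
    Name = "thesecond"
  }
}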

In my next blog I will be describing a similar, templated Terraform environment that uses Ansible and Ambari Blueprints to deploy a 6+ node Hortonworks Hadoop cluster in ~45 minutes, ready to receive data and code for testing. For a sneak preview, see this github repository and dive into the README.

Have fun!

Categories: DBA Blogs

Oracle’s CREATE INDEX command can take a hint

Thu, 2016-02-25 15:39

Here’s something really useful I discovered by accident when playing with Auto DOP (parallel_degree_policy) in 12c.

The “create index” command can take a hint – in particular the PARALLEL hint. The syntax is as you would expect:

create /*+PARALLEL*/ index tab_ind1 on tab(col1);

Of course, you can specify the number of parallel servers to be used by specifying PARALLEL(24), for example, for 24 threads. The really interesting thing about using a hint vs. the documented syntax ("create index tab_ind1 on tab(col1) parallel 24;") is that once created, the index doesn’t have a default degree of parallelism, so you don’t need a second command to make the index noparallel (you can verify this as sketched below).
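
A quick, hedged way to confirm this is to query the data dictionary after each variant (object names are the ones from the examples above); based on the behavior described here, the hint-built index should show DEGREE = 1 while the one built with the "parallel 24" clause shows 24:

select index_name, degree, status
from   user_indexes
where  table_name = 'TAB';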

Note that if you put the hint and use the “noparallel” attribute like so:

create /*+PARALLEL*/ index tab_ind1 on tab(col1) noparallel;

Then no parallelism will be used.

I tried using hints like FULL(t) to force an index create to use a full table scan instead of an existing index – but that doesn’t seem to work.

I discovered this under really interesting circumstances. I was testing some unrelated functionality that required some indexes created on my play table called CKK.

Here’s the SQL for the CKK table, which will create a 40 GB table with 2 rows per block:

create table ckk nologging tablespace ckk as
select rownum id, mod(rownum,5) mod5_id, mod(rownum,5000) mod5000_id, sysdate dt_fixed, sysdate - rownum/24/60 dt_dec, sysdate + rownum/24/60 dt_pos, sysdate + ora_hash(rownum,65,535)/24 dt_rand, sysdate+mod(rownum,10) dt_mod10, rpad('x',3500,'x') filler
from (select rownum r from dual connect by level <= 10000) r1, (select rownum r from dual connect by level <= 1000)
;

Then when I attempted to create an index on the table in parallel, Oracle refused to do so:

create index ckk$id on ckk(id) parallel 24; --DOES NOT run in parallel

Instead it created the index with 1 thread only, and then set the index’s degree of parallelism to 24. I have tracked this problem down to the Auto DOP feature. If I turn it off via parallel_degree_policy=manual, the problem goes away. But I never expected this feature to turn off parallelism for index creation when it was explicitly requested.

Here’s the kicker – once any index is created on the table, future index creations will be automatically done in parallel, regardless if parallel was requested.

For example, this index would be now created in parallel:

create index ckk$mod5_id on ckk(mod5_id);

While before creating the index “ckk$id” – this index would refuse to get created in parallel – when using the parallel attribute.

That’s when I said to myself, “it’s almost like there’s a hint.” I took the hint, and discovered it does work, and it works more consistently than the attribute.

Categories: DBA Blogs

Log Buffer #462: A Carnival of the Vanities for DBAs

Thu, 2016-02-25 15:07

This Log Buffer Edition covers Oracle, SQL Server and MySQL blog posts, listing a few new tricks, tips and workarounds, plus the news.

Oracle:

Displaying CPU Graphs For Different Time Ranges in Enterprise Manager

One of the cool things in 12c is that (finally after all these years) a sequence can be assigned as the default value for a column.

Jonathan Lewis demonstrates connect by after receiving an email.

Oracle 12c – PL/SQL “White List” via ACCESSIBLE BY

Oracle DBA, IT Manager, or Something Else

SQL Server:

Using R Machine Learning Script as a Power BI Desktop Data Source

SQL Authentication Via AD Groups Part II: Who has what access?

Connect to on premise data sources with Power BI Personal Gateway

Exploring SQL Server 2016 Dynamic Data Masking – Part One – Creating a Table that uses Dynamic Data Masking

MySQL:

Data Encryption at Rest

Planets9s: Download the new ClusterControl for MySQL, MongoDB & PostgreSQL

Automate your Database with CCBot: ClusterControl Hubot integration

Loading JSON into MariaDB or even MySQL – mysqljsonimport 2.0 is available

Privileges in MySQL and MariaDB: The Weed Of Crime Bears Bitter Fruit

Categories: DBA Blogs