Feed aggregator

Oracle RDBMS Home Install Using Cloning

Alejandro Vargas - Wed, 2010-11-17 00:40

Using a standard Oracle Home, that is updated to the last patches, as the source to install new Oracle Homes can save a lot of time, compared to installing the same Oracle Home + Patches from scratch.

The procedure to clone an Oracle Home is simple and is well documented on a set of My Oracle Support documents that can be found on Document 1154613.1 ordered by release.

On this post I'm providing a step by step example of cloning a 11g R2 Home: How to clone a 11g R2 Oracle Home

This is nice to have solution if you need to make multiple installs on many servers. Yo do one install + patches, then move that copy over to all other servers.

Categories: DBA Blogs

Oracle RDBMS Home Install Using Cloning

Alejandro Vargas - Wed, 2010-11-17 00:40

Using a standard Oracle Home, that is updated to the last patches, as the source to install new Oracle Homes can save a lot of time, compared to installing the same Oracle Home + Patches from scratch.

The procedure to clone an Oracle Home is simple and is well documented on a set of My Oracle Support documents that can be found on Document 1154613.1 ordered by release.

On this post I'm providing a step by step example of cloning a 11g R2 Home:
How to clone a 11g R2 Oracle Home

This is nice to have solution if you need to make multiple installs on many servers. Yo do one install + patches, then move that copy over to all other servers.

Categories: DBA Blogs

APEX 4.1 Statement of Direction Published

David Peake - Tue, 2010-11-16 17:05
Want to learn more about what we are planning for APEX 4.1?

Then read our new Statement of Direction.
APEX 4.1 is scheduled for release in calendar year 2011.


Simba previews Oracle OLAP MDX Provider connectivity to SAP BusinessObjects Voyager

Keith Laker - Tue, 2010-11-16 15:39
Simba technologies have released a short video to preview 'Using MDX Provider for Oracle OLAP to directly connect SAP BusinessObjects Voyager to Oracle Database OLAP Option'

This will be a great capability for users of both Oracle OLAP and BusinessObjects and will futher extend the reach of Oracle database embedded OLAP cubes.

You can get more details on the Simba website
Categories: BI & Warehousing

Data Guard for Manual Failover, Step by Step

Alejandro Vargas - Sat, 2010-11-13 17:14

about:blankIn this post I'm showing the steps used to implement a manual failover scenario. My customer did not want to enable fast start failover but to leave the decision to failover in case of a major crash to the management team.

In the example I'm providing here I did configure flashback database with a one hour retention time so that the OS team can have this time to solve any issues on the primary, if they succeed to solve the problem in this time then the old primary can be easily reinstated as the new standby, other wise it will need to be recreated from a backup taken from the new primary

All details of this experience can be found on this document "Step by Step Configuration of a Physical Standby Database for Manual Failover"

Categories: DBA Blogs

Data Guard for Manual Failover, Step by Step

Alejandro Vargas - Sat, 2010-11-13 17:14


In this post I'm showing the steps used to implement a manual failover scenario. My customer did not want to enable fast start failover but to leave the decision to failover in case of a major crash to the management team.

In the example I'm providing here I did configure flashback database with a one hour retention time so that the OS team can have this time to solve any issues on the primary, if they succeed to solve the problem in this time then the old primary can be easily reinstated as the new standby, other wise it will need to be recreated from a backup taken from the new primary

All details of this experience can be found on this document "Step by Step Configuration of a Physical Standby Database for Manual Failover"

Categories: DBA Blogs

Database Skills

Kenneth Downs - Sat, 2010-11-13 11:14

It seems strange to me that I've been working on this blog for 3 years or so (with one very long break) and somehow never got around to writing a simple list of skills that all database experts need. So here it is!

Various Job Tiles for Database People

There are three common job titles in the database area, which are Database Administrator (DBA), Database Programmer, and Database Architect. These titles tend to be somewhat variable from shop-to-shop, but generally the "Architect" term indicates the highest level of skill combined with considerable management responsibilities. The "Programmer" term is somewhere below that, but the "DBA" is extremely variable. I have seen shops where a person was called a DBA and filled a relatively constrained role closer to IT or operations (routine tasks, no real programming) and other shops where a person with the DBA title was basically the Architect.

Because of this high variability in what titles mean, I am not going to waste time categorizing skills as belonging to one job title or another, I am simply going to list them all out.

The various levels of skills are these:

  • Before Hello World!: The basics of tables, columns, rows
  • The Hello World! Level: SQL Select
  • Just after Hello World!: Writing Data
  • Commands to create, modify and drop tables, or Data Definition Language (DDL)
  • Knowing how to use a Query Analyzer or optimization tool
  • Understanding Normalization
  • Understanding Denormalization
  • Understanding Primary Keys, Foreign Keys and Constraints
  • Understanding Transactions
  • Understanding ACID
  • Understanding Indexes as optimization tool
  • Views
  • Database Security
  • Upgrades and Installs
  • Efficient access of database from application
  • Bulk operations: loading or exporting large amounts of data
  • Understanding of Contexts and how they lead to different sets of Best Practices
  • Preventing performance degradation through various maintenance tasks
  • Deployment strategies: partitioning, tablespaces
  • Deployment strategies, failure protection, from simple backup to hot standbys
  • Server side coding: stored procedures and functions
  • Server side coding: triggers
  • Temporary tables

As long as that list is, it only covers those of us who use database systems. There is an entire set of skills for those who actually create and maintain these systems, but that is not something that will be treated in this blog.

Before Hello World!: Tables and Columns

If you have never so much as typed a single SQL command, or seen a table diagram, or anything like that, then it is worth a few minutes to go through the basics of what a database does, which is to organize atomic values into tables.

I am going to write an essay on this soon, even though it may seem so obvious as to be completely unnecessary. But I will do it because the most popular essay on this blog is about using GROUP BY, which tells me newer programmers are starving for useful tutorials at the beginner level. So it seems to me, why not put something out there at the very beginning of the beginning?

The Hello World! Level: SQL Select

If you are starting completely from scratch and want to know about database programming, you want to start with the SQL SELECT command. This is the (almost) only command used to extract data from a database, and all of the possible ways to combine, filter and extract data are expressed in the many clauses of this command.

Just after Hello World!: Writing Data

When it comes time to change the data in a database there are three commands, listed below. These commands are based on the tables-and-rows nature of databases, and allow to add a row (or rows), change a row (or rows) and delete a row (or rows).

  • The INSERT command
  • The UPDATE command
  • The DELETE command
Commands to create, modify and drop tables, or Data Definition Language (DDL)

The term "DDL" stands for "Data Definition Language" and includes all of the commands use to build the tables that will hold the data for the INSERT, UPDATE, DELETE and SELECT statements. The basic list of commands to be familiar with is:

  • Understanding Data Types (databases are strongly typed)
  • Commands to add and drop primary keys
  • Commands to add and drop foreign keys
  • Commands to add and drop constraints
  • Commands to add and drop indexes

There are also numerous commands that are specific to different products. Those will not be listed here today, but who knows what the future may bring.

Knowing how to use a Query Analyzer or optimization tool

Database programmers, once they get started with the skills listed above, tend to become more and more obsessed with performance. Every major database has some type of tool that lets you examine how the server is going to process a SQL SELECT, and database programmers depend on these tools to discover where they might alter tables or indexes or the SELECT itself to make the queries go faster.

Understanding Normalization

The term "normalization" refers to the process of analyzing the data that your system is required to store, and organizing it so that every fact is stored in exactly one place. Understanding how to normalize data is an absolute requirement for the database programmer who wants to design databases.

We speak of normalization in "forms" as in "first normal form", "second normal form", and so on. It is a good idea to understand The argument for normalization and then to pursue at very least:

Normalization is a a fascinating topic to study, and it extends all they way up to "Domain-key Normal Form" which is considered the most complete normalization for a database.

Understanding Denormalization

Every database programmer, after fully understanding normalization, realizes that there are severe practical problems with a fully normalized database, such a database solves many problems but generates problems of its own. This has led programmer after programmer down the path of denormalization, the deliberate re-intoduction of redundant values to improve the usability of the database.

There is a surprising lack of material available on the web regarding denormalization strategies. Most of what you find is arguments and flame wars about whether or not to do it, with little to nothing on how to actually do it. For this reason, I provide my own essays on this blog on the strategies and methods I have worked out over the years:

After reviewing The Argument For Denormalization it is worthwhile to follow up with:

The arguments for and against denormalization are heavily affected by the Pay me now or pay me later design tradeoff.

Understanding Primary Keys, Foreign Keys and Constraints

One might argue that this list of skills belongs much higher up the list, up there with the CREATE TABLE command. However, I have it useful to distinguish between simply knowing the commands to make a primary key and actually understanding the tremendous power of keys.

In this author's opinion it is not truly possible to understand how powerful and beneficial Primary keys and Foreign Keys are for an entire application stack until you have learned the commands, built some databases, and worked through the concepts of normalization and denormalization. Only then can you revisit these humble tools and realize how powerful they are.

Understanding Transactions

The word "transaction" has two meanings in common day-to-day database talk. One meaning is very loose and refers to some individual command or set of commands. You might hear somebody using the term loosely when they say, "We're seeing about 10 transactions per second this week."

The more rigorous use of the term refers to a statement or set of statements that must be guaranteed to either complete in their entirety or fail in their entirety. This is a profoundly important concept once you get beyond simply making tables with keys and get into real-world heavy multi-user activity. And this leads us to the next topic...

Understanding ACID

Modern relational databases expect multiple simultaneous users to be writing and reading data all of the time. The term "ACID Compliance" refers to both the philosophy of how to handle this and the actual methods that implement that philosophy. The term ACID refers to:

  • The Atomic nature of each transaction
  • The Consistentcy of the database during and after simultaneous overlapping transactions
  • The Isolation of each transaction
  • The Durability of the results
Understanding Indexes as optimization tool

An index is a special tool databases use to provide very rapid access to large amounts of data. Just like keys, it is not enough to know the commands, it is necessary to understand the subtle power of indexes when used with some craftsmanship. The basic uses of indexes are:

  • A simple index on a column to provide rapid search on that column
  • A "covering index" that includes extra columns that can further speed up certain common access patterns
  • Clustered indexes (MS SQL Server) and what they give and what they take away
  • The cost of indexes on write operations

A view looks like a table to the SQL SELECT command. The view itself is a stored SQL SELECT command that encodes some query that is either used very often or is very compex. In all cases, views are used to present the database data to the application in some simplified convenient or secure form. The two major uses of views are:

  • To simplify the application programmer's job
  • To provide a read-only interface for some applications
Upgrades and Installs

If you are a single programmer or hobbyist working with a database, it is all well and good to just add and drop tables as you wish. But as soon as you get into development with quality control stages and multiple programmers, it becomes evident that you need a strategy for handling the schema changes that come with with new versions of the system. There are multiple essays available on this blog, covering:

Database Security

Databases provide incredible security provisions that are just about completely ignored by modern web developers. Sometimes there is good reason for this, but overall anybody who wants to become a truly accomplished Database Programmer or Database Architect must have a thorough understanding of database security and how it can simplify the entire system stack.

Database security comes down to specifying who is allowed to perform the 4 basic operations of INSERT, UPDATE, DELETE and SELECT against which tables:

My basic introduction to security is here.

  • Understanding roles (we used to say users and groups)
  • Simple table-level security
  • Column-level security (not widely supported)
  • Row-level security (not widely supported)
Efficient access of database from application

Imagine you have the perfectly designed database, with every nuance and subtlety excellently crafted in the ares of keys, indexes, normalization, denormalization and security. At this point your job branches out into several new areas, but one of the most important is knowing how to write application code that efficiently accesses the database.

Bulk operations: loading or exporting large amounts of data

Some database applications involve a large number of small transactions, where each trip to the database writes only a single row or reads only a dozen or so rows.

But in many cases you need to bulk load large amounts of data in one shot, thousands or even millions of rows. In these cases the techniques that govern small transactions are useless and counter-productive, and you need to learn some new commands and strategies to handle the bulk loads.

Understanding Contexts and how they lead to different sets of Best Practices

Not all databases are created for the same purpose. If you have a very large operations then it will likely have multiple independent databases that fill the classical roles, while in a smaller shop the roles may be combined in one database. I like to refer to these roles as "contexts" because they determine how the tables will be designed and how acess to the tables will be governed. The major contexts are:

  • OLTP or Online Transaction Processing, characterized by simultaneous reads and writes, generally assumes little or no periods of inactivity, and generally assumes that the individual transactions are very small. The apps we were all writing in the 80s and 90s to do accounting, ERP, MRP, Job control, payroll, airline reservations and many others fall into this context.
  • Data Warehouse context, characterized by periodic bulk loads of new information with most activity being reads. The Data Warehouse context is largely associated with the "Star Schema" table design. Data in a Warehouse is historical, it never changes after it is loaded.
  • CMS or Content Management System, also characterized by very few writes compared to reads, but more likely to have a normalized structure. Unlike a Data Warehouse, the data is subject to change, just not that often.
  • Any other Read Only Context. I include this category because I spent some time working on Direct Marketing databases, which are like a Data Warehouse in that they are updated periodically and the data does not change, but the Star Schema is completely inappropriate for them.

If you consider a huge online shopping system, you can see that within that application there are at least two contexts. The product catalog is likely to see vastly fewer writes than reads, but the shopping cart tables will be in a constant state of reads and writes.

Preventing performance degradation through various maintenance tasks

Once the database and its application stack is up and running, and the reads and writes and coming through, the laws of thermodynamics come into play and system performance can begin to degrade even if the database stays the same size and the load on the system is steady.

Different vendors have different tools for combatting this, but they tend to come down to reclaiming temporary space and occassionally rebuilding indexes. There are also log files that have to be purged, regular backups to be made, and other operations along those lines.

Deployment strategies: partitioning, tablespaces

When systems become sufficiently large, it is no longer possible to just attach some disks to a box and run a database server. The Database Architect must consider breaking different tables out onto different sets of spindles, which is usually done with "tablespaces", and moving older data onto slower cheaper spindles, which is often done with Partitioning.

Deployment strategies, failure protection, from simple backup to hot standbys

Because a database typically experiences simultaneous reads and writes from multiple sources, and may be expected to be up and running 24/7 indefinitely, the concept of making a backup and recovering from a failure becomes more complicated than simply copying a few files to a safe location.

In the most demanding case, you will need to provide a second complete box that can become fully live within seconds of a disastrous failing of the main box. This is called various things, but Postgres calls it a "hot standby" in version 9 and some MS SQL Server shops call it a "failover cluster."

The ability to come up live on a second box when the first one fails is made possible by the way databases handle ACID compliance, and the fact that they produce something called a Write-Ahead-Log (WAL) that can be fed into a second box that "replays" the log so that its copy of the database is getting the same changes as the master copy.

Server side coding: stored procedures and functions

I really could not figure out where to put this entry in the list, so I just punted and put it near the end. It could really go anywhere.

Stored procedures or functions are procedural routines (not object oriented) that are on the database server and can be invoked directly from an application or embedded inside of SQL commands. Generally speaking they provide various flow-control statements and rudimentary variable support so that you can code multi-step processes on the server itself instead of putting them in application code.

Server side coding: Triggers

Triggers are quite possibly the most elegant and beautiful technology that servers support, absolutely the least understood, and definitely the most reviled by the ignorant. You will find virtually no web content today that will explain why and how to use triggers and what they are good for.

Except of course for my own essay on triggers that discusses them in terms of encapsulation. Temporary tables

Temporary tables are like Stored Procedures inasmuch as I had no idea where to put them in the list, so they just ended up at the end.

As the name implies, a temporary table is a table that you can create on-the-fly, and which usually disappears when your transaction is complete. They are most often found in Stored Procedures. They can impact performance for the worst in many ways, but can be extremely useful when you are doing multi-staged analsysis of data in a Data Warehouse (that's where I use them the most).

Categories: Development

Do not forget to set gpgkey when installing the oracle-validated rpm

Yasin Baskan - Tue, 2010-11-09 03:22
In my previous post I had five lines added to /etc/yum.conf.


The ones except gpgkey are self-explanatory. The parameter gpgkey is used to point to a file that contains the public key for the packages you install so that yum can verify the package's authenticity if needed. The file I use is the key file that contains the public key to verify the oracle-validated rpm.

oracle-validated rpm is used to install the necessary packages for Oracle installations, it also updates the kernel parameters and creates a default oracle user. Using it is an easy way to prepare your server for Oracle installations, the other option is to check the installation prerequisites from the documentation and install the packages, update the kernel parameters and create the user yourself.

MOS Note 579101.1 explains how to install the oracle-validated rpm.

I tried to install this rpm without checking the note and I did not use the gpgkey parameter in /etc/yum.conf initially. This is what you get if you do not set it.

[root@oeltest tmp]# yum install oracle-validated-1.0.0-18.el5.i386.rpm
Loaded plugins: security
Setting up Install Process
Parsing package install arguments
Examining oracle-validated-1.0.0-18.el5.i386.rpm: oracle-validated-1.0.0-18.el5.i386
Marking oracle-validated-1.0.0-18.el5.i386.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package oracle-validated.i386 0:1.0.0-18.el5 set to be updated
--> Processing Dependency: compat-db for package: oracle-validated
--> Processing Dependency: compat-gcc-34 for package: oracle-validated
--> Processing Dependency: compat-gcc-34-c++ for package: oracle-validated
--> Processing Dependency: elfutils-libelf-devel for package: oracle-validated
--> Processing Dependency: gcc for package: oracle-validated
--> Processing Dependency: gcc-c++ for package: oracle-validated
--> Processing Dependency: gdb for package: oracle-validated
--> Processing Dependency: glibc-devel for package: oracle-validated
--> Processing Dependency: glibc-headers for package: oracle-validated
--> Processing Dependency: kernel-headers for package: oracle-validated
--> Processing Dependency: libXp for package: oracle-validated
--> Processing Dependency: libaio-devel for package: oracle-validated
--> Processing Dependency: libstdc++-devel for package: oracle-validated
--> Processing Dependency: sysstat for package: oracle-validated
--> Processing Dependency: unixODBC for package: oracle-validated
--> Processing Dependency: unixODBC-devel for package: oracle-validated
--> Running transaction check
---> Package kernel-headers.i386 0:2.6.18-128.el5 set to be updated
---> Package libaio-devel.i386 0:0.3.106-3.2 set to be updated
---> Package compat-gcc-34-c++.i386 0:3.4.6-4 set to be updated
---> Package gdb.i386 0:6.8-27.el5 set to be updated
---> Package libXp.i386 0:1.0.0-8.1.el5 set to be updated
---> Package compat-db.i386 0:4.2.52-5.1 set to be updated
---> Package unixODBC-devel.i386 0:2.2.11-7.1 set to be updated
---> Package gcc.i386 0:4.1.2-44.el5 set to be updated
--> Processing Dependency: libgomp >= 4.1.2-44.el5 for package: gcc
---> Package compat-gcc-34.i386 0:3.4.6-4 set to be updated
---> Package glibc-headers.i386 0:2.5-34 set to be updated
---> Package sysstat.i386 0:7.0.2-3.el5 set to be updated
---> Package elfutils-libelf-devel.i386 0:0.137-3.el5 set to be updated
--> Processing Dependency: elfutils-libelf-devel-static-i386 = 0.137-3.el5 for package: elfutils-libelf-devel
---> Package unixODBC.i386 0:2.2.11-7.1 set to be updated
---> Package glibc-devel.i386 0:2.5-34 set to be updated
---> Package libstdc++-devel.i386 0:4.1.2-44.el5 set to be updated
---> Package gcc-c++.i386 0:4.1.2-44.el5 set to be updated
--> Running transaction check
---> Package libgomp.i386 0:4.3.2-7.el5 set to be updated
---> Package elfutils-libelf-devel-static.i386 0:0.137-3.el5 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

 Package                                          Arch                     Version                              Repository                                                  Size
 oracle-validated                                 i386                     1.0.0-18.el5                         oracle-validated-1.0.0-18.el5.i386.rpm                      15 k
Installing for dependencies:
 compat-db                                        i386                     4.2.52-5.1                           local                                                      1.7 M
 compat-gcc-34                                    i386                     3.4.6-4                              local                                                      4.1 M
 compat-gcc-34-c++                                i386                     3.4.6-4                              local                                                       11 M
 elfutils-libelf-devel                            i386                     0.137-3.el5                          local                                                       24 k
 elfutils-libelf-devel-static                     i386                     0.137-3.el5                          local                                                       66 k
 gcc                                              i386                     4.1.2-44.el5                         local                                                      5.2 M
 gcc-c++                                          i386                     4.1.2-44.el5                         local                                                      3.4 M
 gdb                                              i386                     6.8-27.el5                           local                                                      3.3 M
 glibc-devel                                      i386                     2.5-34                               local                                                      2.0 M
 glibc-headers                                    i386                     2.5-34                               local                                                      612 k
 kernel-headers                                   i386                     2.6.18-128.el5                       local                                                      926 k
 libXp                                            i386                     1.0.0-8.1.el5                        local                                                       22 k
 libaio-devel                                     i386                     0.3.106-3.2                          local                                                       11 k
 libgomp                                          i386                     4.3.2-7.el5                          local                                                       67 k
 libstdc++-devel                                  i386                     4.1.2-44.el5                         local                                                      2.9 M
 sysstat                                          i386                     7.0.2-3.el5                          local                                                      170 k
 unixODBC                                         i386                     2.2.11-7.1                           local                                                      830 k
 unixODBC-devel                                   i386                     2.2.11-7.1                           local                                                      743 k

Transaction Summary
Install     19 Package(s)
Update       0 Package(s)
Remove       0 Package(s)

Total download size: 37 M
Is this ok [y/N]: y
Downloading Packages:
Total                                                                                                                                            2.6 GB/s |  37 MB     00:00
warning: rpmts_HdrFromFdno: Header V3 DSA signature: NOKEY, key ID 1e5e0159

Public key for oracle-validated-1.0.0-18.el5.i386.rpm is not installed

The error indicates that yum cannot verify this rpm so it does not install it. When you update /etc/yum.conf with the gpgkey parameter the error goes away.

How to use the Oracle Enterprise Linux installation media as the yum repository

Yasin Baskan - Tue, 2010-11-09 02:48
If you are using Oracle Enterprise Linux (OEL) 5, the installation media comes with a yum repository on it. The repository is in the directory /media/Enterprise Linux dvd 20090127/Server/repodata for OEL 5.3 (the location may change).

It is possible to use that repository when installing new packages or components locally without accessing a remote repository or without having the need to copy the rpms to a local directory. If you did a base installation the yum package is already installed, if not you need to install it first. After yum is in place edit /etc/yum.conf to insert lines related to the repository on the installation media.


# Note: yum-RHN-plugin doesn't honor this.

# Default.
# installonly_limit = 3

# PUT YOUR REPOS HERE OR IN separate files named file.repo
# in /etc/yum.repos.d

You need to add only the last 5 lines, the other ones are already there.
After this you can use yum to add packages or components when needed. An example to install the oracle-validated rpm from this repository will be in my next post.

Webinar: Event Processing for Java Developers

Debu Panda - Mon, 2010-11-08 11:00
I'll be speaking at a webinar Event Processing for Java Developers arranged by Starview Technology, Inc tomorrow (Nov 9) at 10am PST. Register here.

Starting a vm in VirtualBox hangs

Yasin Baskan - Mon, 2010-11-08 07:24
When trying to start a new vm in Oracle VM VirtualBox 3.2.10 it hot hung at progress 0%. There is no way to cancel the startup without killing the process from the host OS.

The first thing to look at is the VBox.log file which resides under C:\Documents and Settings\\.VirtualBox\Machines\OELTest\Logs in Windows XP. Or you can access the log using the menu like below.

I had these lines at the bottom of the log file.

00:00:04.228 AIOMgr: Endpoint for file 'E:\vm\OELTest\OELTest_boot.vdi' (flags 000c0723) created successfully
00:00:04.228 AIOMgr: I/O manager 0x3f3f898 encountered a critical error (rc=VERR_INVALID_PARAMETER) during operation. Falling back to failsafe mode. Expect reduced performance
00:00:04.228 AIOMgr: Error happened in D:\tinderbox\win-3.2\src\VBox\VMM\PDMAsyncCompletionFileNormal.cpp:(1631){pdmacFileAioMgrNormal}
00:00:04.228 AIOMgr: Please contact the product vendor

This seems like the problem is related to the virtual disk I created as the root disk. I am using the SATA interface and in the virtual machine storage settings there is an option named "Use host I/O cache" which is unchecked in my case. Checking it and starting up the vm again resolves the issue.

There are lots of Google results when you search for messages in the log file but the issue has been explained in http://www.virtualbox.org/ticket/7363.

Recursive Queries with Common Table Expressions

Kenneth Downs - Sat, 2010-11-06 12:20

This week The Database Programmer returns after almost 18 months with an entry on using Common Table Expressions (CTEs) to do recursive queries. Relational databases were plagued from their inception with a lack of meaningful treatment for recursive operations, and CTEs have finally plugged that hole.

Common Table Expressions appeared in SQL Server 2005, and in PostgreSQL 8.4, and are also available in Oracle. As for mySQL, since I don't use it, I did a quick Google search and looked at the Docs for 5.5, and couldn't really find anything. I generally tend to assume mySQL cannot do the tough stuff.

But First, A Rant

There have always been plenty of people who claimed SQL was a bloated and clumsy language to work with. Most of the time I tend to agree, but I find the advantages of relational/SQL system to be so large that I'm willing to pay that price.

But with Commom Table Expressions (CTEs) I just can't help drifting into conspiracy theories involving the enemies of SQL infiltrating the committees and deliberately suggesting the most twisted, bloated, complicated way they could think of to do what is really a very basic operation. In other words, I am profoundly unimpressed with the syntax of CTEs, but as long as they are here and they work, we'll go along.

The Basic Example

Your basic recursive table contains a foreign key to itself, so that some rows in the table are children of some other row in the table. This recursion can nest to any depth, and the chart below shows a very simple example:

Primary_key   |   Parent_Key  |  Notes  
     A        |     null      |   top level row, no parent
     B        |      A        |   first level child of A
     C        |      B        |   child of B, grandchild
              |               |   of A
     D        |      C        |   child of C, grandchild 
              |               |   of B, great-grandchild 
              |               |   of A
     X        |     null      |   top level row, no parent
     Y        |      X        |   child of X
     Z        |      Y        |   child of Y, grandchild 
              |               |   of X

What we want is a query that can return a given row and all of its children out to any level, including helpful information about the structure of the recursion, something like this:

Primary_key | Level | Top_Key | Immediate_Parent_key 
     A      |   1   |  A      | null
     B      |   2   |  A      | A    
     C      |   3   |  A      | B   
     D      |   4   |  A      | C
     X      |   1   |  X      | null
     Y      |   2   |  X      | X   
     Z      |   3   |  X      | Y   
And Another Rant

At this point the mind boggles at how long this blog entry needs to be to explain this simple operation. But lets get going anyway.

The First Step and Last Step

A Common Table Expression begins with the "WITH" clause and ends with a standard SQL Select:

;WITH myCTEName (primary_key,level,top_key,immediate_parent_key)
as (
  ....we'll get to this below....
select * from myCTEName

The basic idea is that we are going to define a CTE with a name and a list of columns, and then SELECT out of it. Without that final SELECT statement the CTE does not actually do anything. The SELECT can also be arbitrarily complex, doing aggregrations, WHERE clauses and anything else you might need.

The first thing to notice is the leading semi-colon. This is a trick adopted by MS SQL Server users. SQL Server does not require statements to be terminated with a semi-colon, but a SQL Server CTE requires the previous statement to have been terminated with a semi-colon (nice huh?). So SQL Server programmers adopted the strategy of starting the CTE with a semi-colon, which keeps the syntactical requirement with the CTE, where it belongs.

A given CTE sort of has a name. That is, you have to name it something, but think of it as a table alias in a SQL SELECT, such as "Select * from myTable a JOIN otherTable b...", it exists only during the execution of the statement.

The columns listed in the parantheses can have any names (at least in SQL Server). But these column names are what you will refer to in the final SQL SELECT statement.

Coding The Inside of the CTE, Step 1

Now we code the inside of the CTE in two steps. The first step is called the "anchor", and it is a straightforward query to find the top-level rows:

;WITH myCTEName (primary_key,level,top_key,immediate_parent_key)
as (
    select primary_key   as primary_key
         , 1             as level
         , primary_key   as top_key
         , null          as immediate_parent_key
      from myRecursiveTable
     where Parent_key is null
select * from myCTEName

This should be self-explanatory, we are querying only for rows that have no parent (WHERE Parent_key is null) and we are hardcoding the "level" column to 1, and we are also hardcoding the "immediate_parent_key" column to null.

This query alone would return two of the rows from our desired output:

Primary_key | Level | Top_Key | Immediate_Parent_key 
     A      |   1   |  A      | null
     X      |   1   |  X      | null
Coding The Inside of the CTE, Step 2

Now we are going to add the actual recursion. When I first learned CTEs this was the hardest part to figure out, because it turned out my hard-won set-oriented thinking was actually slowing me down, I had to think like a procedural programmer when defining the second half of the query.

;WITH myCTEName (primary_key,level,top_key,immediate_parent_key)
as (
    select primary_key,1,primary_key,null
      from myRecursiveTable
     where Parent_key is null
    select chd.primary_key,par.level+1,par.top_key,chd.parent_key
      FROM myCTEName        par
      JOIN myRecursiveTable chd ON chd.parent_key = par.primary_key
select * from myCTEName

Thinking step-wise, here is what is going on under the hood:

  1. The server executes the "anchor" query, generating a result set called "myCTEName" containing just the top level rows.
  2. The server then executes the second query. At this point the result set "myCTEName" exists and can be referenced, so that you can link children to their parents. (That's why you see the JOIN)
  3. Step 2 is repeated recursively, adding grand-children, great-grand-children, and so on, until no more rows are being added, at which point it stops, and...
  4. The final result set is passed to the trailing SELECT, which pulls results out of "myCTEName" as if it were a table or view.

So when we code the 2nd part of the inside of the CTE, the part after the UNION ALL, act as if the first query has already run and produced a table/view called "myCTEName" that can be referenced. Once you understand that, the query is pretty easy to understand:

  • The "From myCTEName par" clause tells us we are pulling from the previously generated set. I like to use the alias "par" for "parent" to remind myself that the prior result is the parent row.
  • We then join to the original source table and use the alias "chd" to remind ourselves we are pulling child rows from there. The "ON chd.parent_key = par.primary_key" defines how children are joined to parents.
  • Our first column, "chd.primary_key", is the unique key for the results.
  • Our second column, "par.level+1" gives us a nifty automatically incremented "level" column.
  • Our third column, "par.top_key" ensures that all rows contain a reference to their top-most parent.
  • Our final column, "chd.parent_key", makes sure each row contains a reference to its immediate parent.
Finding Various Statistics

Once you have the inside of the CTE coded, the fun part moves to the final SELECT, which is operating on the complete set of results. You do not necessarily have to pull the complete list. For instance, you may want to find out the maximum nesting level for each parent, or the count of children for each parent:

;WITH myCTEName (primary_key,level,top_key,immediate_parent_key)
as (
    select primary_key,1,primary_key,null
      from myRecursiveTable
     where Parent_key is null
    select chd.primary_key,par.level+1,par.top_key,chd.parent_key
      FROM myCTEName        par
      JOIN myRecursiveTable chd ON chd.parent_key = par.primary_key
select top_key
     , max(level) as maxNestingLevel
     , count(*)   as countRows
     , count(*)-1 as countChildren
  from myCTEName

Common Table Expressions give SQL-based databases the (very) long-needed ability to execute recursive queries, albeit with a rather complex syntax. Once you grasp the basics of how to code them, there are many possible uses that go far beyond the simple example shown here.

Categories: Development

Introducing an Event Server Platform

Debu Panda - Thu, 2010-11-04 15:30
After working for about seven years shaping and evangelizing for a Java Application Server platform and J2EE, I decided to move on. For the last year or so I’ve been spending my energy shaping up the next-generation middleware platform for building, deploying, and managing event-processing applications.

Most vendors have focused on providing limited functionality such as windowing, filtering, and pattern matching, often known as Complex Event Processing (CEP).These vendors have also been targeting limited use cases in the Financial Services sector. As CEP did not take off, many vendors have buried their offerings inside their SOA and BPM solutions. However, I think event processing is pervasive inside all businesses. Whenever you tweet or send a text/sms it generates an event. Whenever you plug in your electric vehicle to the grid, whenever you have a power outage, or whenever a machine breaks down in a factory – events are generated. A smart business needs to analyze and exploit these messages to make the right decision to take the right decision at the right time. Many of the new generation of applications are being built using an event-driven paradigm and need a new generation of middleware platform named an Event Server Platform. In this article, I will introduce an event server platform.

What is an Event Server?

Why do you use an application server? Because you do not want to reinvent the wheel and take advantage of several services the application server provides to quickly build your application. An event server provides similar functionality for users to rapidly build and deploy event-processing applications – optimized for event processing. I will discuss why traditional application servers are not suitable for event processing in a future article. One of the key points here is that traditional application servers are optimized for request-response applications and not for event processing.

In all practical senses an event server is an application server optimized for event processing applications. Let us look at an example architecture. The following figure shows the architecture for the Starview Event Server that is built on OSGi:

You have to build an application before you deploy it to an event server. So you need tools and languages to build an application.

Development Tool

You will need to build, test, and debug your event-driven application and hence you will need an IDE. Here is an example of Starview ACE that uses a model-driven approach to build an event-driven application. Starview ACE is an Eclipse plug-in and application models are based on the Eclipse Model Framework:

Connectivity Adapters

You will need to capture an event stream at its source and in-bound adapters provide this connectivity. The event source can be a messaging system, SNMP traps, socket reader, log files, database updates, and so on. An event server provides out–of-the-box adapters to simplify reading event sources without much programming. The adapters also generate outbound events or integrate with third-party systems and resources for correlating events.

Programming Language aka Event Processing Language

In order to process the events you need an Event Processing Language. The CEP vendors often refer to Stream SQL as their EPL. However, as you know SQL is quite limiting in nature and you will need the full semantics of a programming language built for event processing that provides fast and efficient in-memory structures to represent complex data types, andin-stream processing and analytics. The Event Processing Language must provide the ability to maintain state and support the concept of an event-processing agent for implementing complex event-processing rules.

Also, you do not want your event-processing rules to be static in nature: you want to enable your business users to author rules. Hence the programming language must provide a foundation to develop Domain-Specific Languages.

Here is a typical architecture for such an Event Processing Language:

This diagram shows the architecture for the Star language.

You may ask, “where is Java in this equation?” The event servers must integrate with existing Java applications, and provide the ability to build applications using Java. You have to remember, though, that Java has its limits and you have to explore the capabilities provided by Event Processing Languages.

Distributed Application

Many of the event processing applications are distributed in nature and require event processing at the edge. These are prevalent in many use cases such as Quality of Service, Smart Grid optimization, and manufacturing automation, where you want to process events locally and filter out unnecessary events at the edge. The event server platform must provide mechanisms to deploy a lightweight version of the event server at the edge and collaborate with a centralized event server without requiring hundreds of lines of code!

Management Infrastructure

You need a good management infrastructure for managing your event servers and applications. This becomes challenging when applications are distributed in nature. The management infrastructure should provide the ability to deploy, manage, and monitor applications, event servers, and server groups. And the infrastructure must be built using an event-driven paradigm.

The following screen shot shows the management console for Starview Enterprise Hub that provides such a management infrastructure:

These are the basics of an Event Server Platform. You will several advanced features such as high-availability, caching, etc.

We will discuss some of these topics in detail in future blog entries.

References and Suggested Reading

Microsoft Certifies Simba’s MDX Provider for Oracle OLAP as “Compatible with Windows 7”

Keith Laker - Thu, 2010-11-04 15:30
Simba announced today that Microsoft has certified its MDX Provider for Oracle Database OLAP Option for Windows 7.

This is great news for MS Office connectivity to your Database OLAP cubes. Already supported was Excel 2010 with its cool new BI features, as well as 2007 and 2003 versions. With Windows 7 support too, even the most up-to-date PCs are in good shape to exploit the OLAP Option.

Remember - via the native Excel pivot table connectivity, other features of MS-Office can also leverage the power, performance and calculation functionality of Oracle OLAP. So for example, your Powerpoint presentation to the senior management team, or the one you use when talking to suppliers or key customers can now contain live charts onto the latest information available in the Oracle Database cubes. Instead of spending time cutting and pasting static images into your slide shows each day, week or period-end you can be focusing on the news in the data and doing a better analysis of the results.

This is the perfect complement to your use of Oracle BI Enterprise Edition (10g or 11g) onto the exact same cubes for the broader BI use-case.

In addition to Microsoft Excel, Simba is gradually certifying other MDX front ends for the OLAP Option MDX Provider - with SAP-BusinessObjects Voyager already available and others due soon.

You can learn all about it on the Simba web site : http://www.simba.com/MDX-Provider-for-Oracle-OLAP.htm

Connectivity into Oracle Database OLAP 11g is market leading: with the same cubes and calculations being accessible via MDX as well as a huge range of SQL based tools and applications, it has never been easier to deploy multidimensional analytics to the masses.

Categories: BI & Warehousing

Beware of the Byte

Duncan Mein - Wed, 2010-11-03 07:29
Recently our test department raised a bug against one our applications that occurred when trying to insert a record into a table.

The error message encountered was a fairly innocuous "ORA-01704: string literal too long".

Following the test case to the letter, I successfully generated the same error and located the table that the APEX form was inserting into. A quick check of the Data Dictionary confirmed that the column in question was of type VARCHAR2(10). At this stage, I though the obvious cause was that there was no limit on the APEX form item (a Text Area) of 10 characters. Having checked the item in question, not only was there a “maxWidth” value of 10, the text area had been created with a “Character Counter”. Strange then how a form item accepting 10 characters was erroring whilst inserting into a column of VARCHAR2(10).

A little while later...... (after some head scratching and several discussions with our DBA’s and a colleague) the problem was all too clear. Somewhere between Database Character Sets, VARCHAR2 column definitions and non ASCII characters lay the answer.

Please forgive the rather verbose narrative but allow me to delve a little deeper.

Firstly the facts:

1. The character set of our database is set to AL32UTF8

FROM v$nls_parameters

2. Definition of table causing the error:

col1 VARCHAR2(10 BYTE)

3. SQL Statement causing the error:

INSERT INTO nls_test VALUES ('““““““““““');

NB: 10 individual characters.

The character used in this insert is typical of a double quote produced by MS Word (it was in fact a copy and paste from a Word document into our Text Area that caused our error).


The reason we encountered the error was all to do with the attempt to insert a 'Multi Byte' character (a double quote from word in our case) into our table as opposed to typical single byte characters (A-Z, 0-9 etc).

Performing a simple LENGTHB to return the number of bytes this character uses demonstrated this perfectly:

SELECT lengthb('“') from dual;


Because our column definition is of type VARCHAR2(10 BYTE), we are only permitted to store values that do not exceed 10 bytes in length.

Beware, 1 character in our case definitely does not = 1 byte. As already proved, our single character (a Word double quote) occupies 3 bytes so the maximum number of this Multi Byte Character we could possible insert according to our table definition is worked out simply as:

10 bytes (Column Definition) / 3 (length in bytes of our character) = 3

So whilst the APEX form item does not distinguish between single and multi byte characters and will allow you to input the full 10 characters, Oracle Database will bounce it back in our case as the total bytes in our string is 30 hence the error.

One solution suggested was to alter the Data Type to be of type VARCHAR2(10 CHAR) instead of BYTE. This in theory would force the database to respect the actual number of characters entered and not worry too much about single vs. multi byte occupancy. This would allow us resolve our immediate issue of 10 multi byte characters inserting into our table however there are further considerations.

As it turns out, even when you define your columns to use CHAR over BYTE, Oracle still enforces a hard limit of up to 4000 BYTES (given a mixed string of single and multi byte characters, it implicitly works out the total bytes of the string).

So beware that even if your column for example accepts only 3000 CHAR and you supply 2001 multi byte characters in an insert statement, it may still fail as it will convert your sting into BYTES enforcing an upper limit of 4000 BYTES.

Sorry for the really long post but it was a much for my own documentation as anything else.

Oracle IRM and Device Control

Simon Thorpe - Tue, 2010-11-02 04:47

device-stack.pngAnother question from a colleague - what controls and options does Oracle IRM provide over the use of multiple devices? What happens if a user has a laptop and a PC and wants to use sealed content on both?

The Default Configuration

By default, each user can use one device at a time. The IRM Desktop provides the server with some information to uniquely identify the user's device. If the user connects from a different device, the server informs the user that their rights are already in use and declines to issue rights to the second device. Simple.

The Rationale

This device control helps prevent credential sharing. If the user gives their credentials to another user, or is the victim of key-logging or some other exposure of their credentials, the other user cannot simply contact the IRM Server and gain the benefit of the first user's rights.

This is an important control in many deployments, including publishing deployments where users might try to avoid paying for content individually.

Any attempt to share credentials in this way will show up in the audit trail. Some customers tell me that this constraint and auditability for multi-device usage is a key reason for choosing Oracle IRM.

So, Oracle IRM defaults to the most secure configuration - limiting each user to one device at a time.

The Catch with the Default

In many organisations, it is standard to have a desktop PC and a laptop. Users also need to be able to switch devices when, for example, they buy a new laptop.

The default configuration is good for security, but not always so good in usability terms. As always, our goal is to give you options that let you choose the right balance of security, usability, and manageability for your organisation.

Using Multiple Devices Despite the Default Configuration

Before discussing non-default options, what choices do you have with the default state?


  • Wait for the offline period to expire on your first device. The server can issue rights to your second device as soon as the cached rights have expired on the first.


    This is not ideal. In most deployments, the first device is constantly refreshing its offline period by synching regularly with the server. Even where this is not true, you might have to wait a couple of days or more for the offline period to expire.

  • Manually check in your rights from the first device and then use the second device.


    Checking in is easy enough, but it is preferable to avoid users needing to understand such details of the solution.

  • Ask the administrator to check in your rights at the server end.


    This caters for situations where, for example, you have lost your laptop and therefore cannot check the rights in from the desktop end. However, it adds to the management burden.


In all cases, these options enable you to switch from one device to another in a controlled, audited way, but the user is limited to one device at a time. Depending on your deployment, the default configuration could be undesirable, although it does help defend against password theft or sharing.

The Configurable Option

The Device Count parameter enables you, as a matter of service policy, to define how many devices users can use.


The server will issue rights to the specified number of devices per user, such that the above check-in options are rarely necessary - but there is still a limit.

The Benefit

The Device Count parameter enables a customer to define their own balance of security, usability, and manageability. By setting a limit of two or three, you enable legitimate usage of multiple devices and reduce the management burden. There is a slightly increased risk of account sharing, but it is defined by your policy and backed up by the audit trail. As a simple example, the following image shows that the user "mabrahams" is consistently using a device with an obviously corresponding name.


If you see evidence that "mabrahams" is using several different devices - some apparently belonging to other users - you might want to investigate. It would be pretty simple to write a report to flag up such evidence.

By contrast, some solutions offer no device control, or enforce a large, hard-coded device limit such as 25. Either way, you don't get to choose your own level of risk. In addition, audit facilities are sometimes very technical in content, requiring considerable expertise to identify potential abuse.

Why should i upgrade to oam 11g ?

Pankaj Chandiramani - Tue, 2010-11-02 00:37

One of my reader has some questions :

1)I use osso , what is oam ? Why should i upgrade to oam 11g ?
2)Can i upgrade from oam 10g to oam11g ?

So here are the details
1) You should look to upgrade to oam11g as osso will not be supported after 2011 (or will need extended support) . OAM 11g is the supported product that will replace the osso in fusion middle-ware . So if you are running osso 10g , you should start looking at upgrade options .

2)Current rollout of 11g oam is intended for osso customers mostly . So you can wait for further announcements or follow-up with PM team.

Categories: DBA Blogs

Concurrent request error

Bas Klaassen - Fri, 2010-10-29 06:44
A concurrent request was running into an error. The following error was shown in the logfile of the request.APP-FND-00362: Routine afpbep cannot execute request &REQUEST for program &PROGRAM, because the environment variable &BASEPATH is not set for the application to which the concurrent program executable &EXECUTABLE belongs.I noticed the $CUSTOM_TOP variable was not set, but after setting thisBas Klaassenhttp://www.blogger.com/profile/04080547141637579116noreply@blogger.com0
Categories: APPS Blogs

Berkeley DB Java Edition 4.1 Improvements

Charles Lamb - Fri, 2010-10-29 05:16

The new release of Berkeley DB Java Edition (JE), Release 4.1, includes several new features which drastically improve out-of-cache performance. When a JE application has a data set that does not fit entirely in cache, and there is no particular working set of data that fits in cache, the application will see the best performance when as much of the internal btree nodes (the index) are kept in cache. 4.1 includes improvements to make JE's cache management more efficient, and provides statistics to help determine cache efficiency.

It's worth giving a shout-out to the Sun ISV Engineering lab people who were invaluable in this effort. They let us use a lot of their big-iron hardware for 3 months of intense tuning and performance analysis, all before the merger was completed.

The first important new feature is Concurrent Eviction. In past releases, cache eviction has been carried out by JE daemon threads, application threads which call JE operations, and an optional single evictor thread. The actual eviction operation was serialized, and could create a bottleneck where many threads could be seen waiting upon the method com.sleepycat.je.evictor.Evictor.doEvict().

In 4.1.6, cache eviction is no longer serialized and can be executed concurrently. In addition, JE now has a dedicated configurable thread pool which will do cache eviction when memory limits are reached. Eviction is done by this dedicated pool, by other JE daemon threads, and by application threads. JE attempts to skew the eviction workload toward the pool and daemon threads, in order to offload application threads.

The second important feature is in-memory btree Internal Node (IN) compression which is targeted at reducing the memory needed to hold these nodes in cache. One of the optimizations reduces the in-memory footprint of an IN when only a small portion of it has been referenced, as would be the case when data records are accessed in a random order, or when a subset of data is accessed. It does not help if the application is doing (e.g.) a database-wide cursor traversal. A second optimization in this area occurs when a key's length is less than or equal to 16 bytes, which can be true either when the key is naturally small, or when key prefixing is enabled through Database.setKeyPrefix().

A user ran tests comparing JE 4.0.103 and JE 4.1 in a read-only workload and shared the results with us. When the database fits completely in the cache (4 GB of memory), performance is about the same. Dropping the cache to 2 GB (all INs still fit into memory), performance (throughput and latency) improves 5%. When the cache is further reduced to various values between 1 GB and 512 MB (only some of the INs fit in memory), performance improves more than 3x.

One other interesting note about these tests is that the test configuration has enough memory to hold the database in the file system cache (even though they did not allocate enough memory to the JE cache to hold all of the database). The net of this is that there is no "true" IO occurring, but rather all IO is only to the file system cache. By putting the data in the file system cache rather than on the Java heap (and therefore the JE cache), GC overhead is reduced while still maintaining "in-memory" performance (since the data was in the file system cache).

What is also interesting about this is that the existing JE-tuning adage that out-of-cache scenarios should adjust the je.evictor.lruOnly and je.evictor.nodesPerScan parameters is changing. By varying these values in 4.1 from the recommended norms (false and 100, respectively), the user is able to achieve even better performance. We will of course be updating our FAQ entries to state the new recommended values.

Naturally we're very excited about these results and want to share them with you. Stay tuned for more news when we have the results of read/write workloads.


Subscribe to Oracle FAQ aggregator