Antonio Romero


ODI 11g – Faster Files

Thu, 2012-06-07 13:28

Deep in the trenches of ODI development I raised my head above the parapet to read a few odds and ends, and found myself thinking: why don’t they know this? Take this article here – in the past, customers (see the forum) were told to use a staging route, which has a big overhead for large files. This KM is an example of ODI’s great extensibility capabilities; it’s quite simple, just a new KM that:

  1. improves the out-of-the-box experience – just build the mapping and the appropriate KM is used;
  2. improves out-of-the-box performance for file-to-file data movement.

This improvement to the out-of-the-box handling of file-to-file data integration cases (available from the 11.1.1.5.2 companion CD onward) dramatically speeds up file integration. In the past I had seen some consultants write Perl versions of the file-to-file integration case; now Oracle ships this KM to fill the gap. You can find the documentation for the IKM here. The KM uses pure Java to perform the integration, using java.io classes to read and write the files in a pipe, and it uses Java threading to super-charge the file processing; it can also process several source files at once when the datastore's resource name contains a wildcard. This is a big step for regular file processing on the way to super-charging big data files using Hadoop – the KM works with the lightweight agent and regular filesystems.

So in my design below, transforming a bunch of files, the IKM File to File (Java) knowledge module was assigned by default. I pointed the KM at my JDK (since the KM generates and compiles Java), and I also increased the thread count to 2 to take advantage of my 2 processors.

For my illustration I transformed (you can also filter if desired) and moved about 1.3 GB with 2 threads in 140 seconds (with a single thread it took 220 seconds) – and this was by no means on any supercomputer. The great thing here is that it worked well out of the box, from design to execution, without any funky configuration; plus – and it's a big plus – it was much faster than before.

So if you are doing any file to file transformations, check it out!

ODI 11g – How to override SQL at runtime?

Fri, 2012-04-06 11:31

Following on from the earlier posting entitled ‘ODI 11g – Simple, Powerful, Flexible’, here we push the envelope even further. Rather than having the SQL we override defined statically in the interface design, we will have it configurable via a variable… at runtime.

Imagine you have a well-defined interface shape that you want fulfilled, and that shape can be satisfied from a number of different sources – that is what this allows: the ability for one interface to consume data from many different places using variables. The cool thing about ODI’s reference API here is that it can be fantastically flexible and useful.

When I use the variable as the option value and execute the top-level scenario that uses this temporary interface, I get prompted (or can be prompted, to be precise) for the value of the variable.

Note I am using the <@=odiRef.getObjectName("L","EMP", "SCOTT","D")@> notation for the table reference; since this is resolved at runtime, the context will resolve to the correct physical table name.

Each time I execute, I could use a different source provider (obviously with some dependencies on KMs/technologies here). For example, in the following Groovy snippet the first execution queries the EMP datastore in the SCOTT model, and the second queries the OTHERS datastore in the BOB model.

// First execution: the query reads from the EMP datastore in the SCOTT model.
// The outer string is single-quoted so the double quotes in the odiRef call survive.
m = new Properties();
m.put("DEMO.SQLSTR", 'select empno, deptno from <@=odiRef.getObjectName("L","EMP","SCOTT","D")@>');
s = new StartupParams(m);
runtimeAgent.startScenario("TOP", null, s, null, "GLOBAL", 5, null, true);

// Second execution: same scenario, but the query now reads OTHERS in the BOB model.
m2 = new Properties();
m2.put("DEMO.SQLSTR", 'select empno, deptno from <@=odiRef.getObjectName("L","OTHERS","BOB","D")@>');
s2 = new StartupParams(m2);
runtimeAgent.startScenario("TOP", null, s2, null, "GLOBAL", 5, null, true);
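Note that the only thing changing between the two invocations is the string bound to the DEMO.SQLSTR variable; the compiled scenario itself is untouched, which is what makes this such a flexible way to point one interface at many different sources.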

You’ll need a patch to 11.1.1.6 for this type of capability. Thanks to my ole buddy Ron Gonzalez from the Enterprise Management group for helping push the envelope!

OWB – How to update OWB after Database Cloning

Wed, 2012-04-04 05:10

One of the most commonly asked questions about OWB leads to one of the most commonly accessed support documents (strange, that): the document describing how to update the OWB repository details after cloning the Oracle database. The document on the Oracle support site is titled 'How To Update Warehouse Builder After A Database Cloning (Doc ID 434272.1)'. This post is really for me to remember the document id ;-)

ODI 11g – Oracle Multi Table Insert

Tue, 2012-03-27 05:24

With the IKM Oracle Multi Table Insert you can generate Oracle-specific DML for inserting into multiple target tables from a single query result – without reprocessing the query or staging its result.

When designing interfaces to exploit the IKM you must split the problem into reusable parts: the select part goes in one interface (which I named SELECT_PART), then each target goes in a separate interface (INSERT_SPECIAL and INSERT_REGULAR).

So for my statement below…

/*INSERT_SPECIAL interface */ insert  all
when 1=1 And (INCOME_LEVEL > 250000) then
into SCOTT.CUSTOMERS_NEW
(ID, NAME, GENDER, BIRTH_DATE, MARITAL_STATUS, INCOME_LEVEL, CREDIT_LIMIT, EMAIL, USER_CREATED, DATE_CREATED, USER_MODIFIED, DATE_MODIFIED)
values
(ID, NAME, GENDER, BIRTH_DATE, MARITAL_STATUS, INCOME_LEVEL, CREDIT_LIMIT, EMAIL, USER_CREATED, DATE_CREATED, USER_MODIFIED, DATE_MODIFIED)

/* INSERT_REGULAR interface */ when 1=1  then
into SCOTT.CUSTOMERS_SPECIAL
(ID, NAME, GENDER, BIRTH_DATE, MARITAL_STATUS, INCOME_LEVEL, CREDIT_LIMIT, EMAIL, USER_CREATED, DATE_CREATED, USER_MODIFIED, DATE_MODIFIED)
values
(ID, NAME, GENDER, BIRTH_DATE, MARITAL_STATUS, INCOME_LEVEL, CREDIT_LIMIT, EMAIL, USER_CREATED, DATE_CREATED, USER_MODIFIED, DATE_MODIFIED)
/*SELECT_PART interface */ select
    CUSTOMERS.EMAIL EMAIL,
    CUSTOMERS.CREDIT_LIMIT CREDIT_LIMIT,
    UPPER(CUSTOMERS.NAME) NAME,
    CUSTOMERS.USER_MODIFIED USER_MODIFIED,
    CUSTOMERS.DATE_MODIFIED DATE_MODIFIED,
    CUSTOMERS.BIRTH_DATE BIRTH_DATE,
    CUSTOMERS.MARITAL_STATUS MARITAL_STATUS,
    CUSTOMERS.ID ID,
    CUSTOMERS.USER_CREATED USER_CREATED,
    CUSTOMERS.GENDER GENDER,
    CUSTOMERS.DATE_CREATED DATE_CREATED,
    CUSTOMERS.INCOME_LEVEL INCOME_LEVEL
from    SCOTT.CUSTOMERS   CUSTOMERS
where    (1=1)

First I create the SELECT_PART temporary interface for the query to be reused, and in the IKM options I state that it defines the query, is not a target, and should not be executed.

Then in my INSERT_SPECIAL interface, which loads a filtered target, I set the define-query option to false, the target-table option to true and the execute option to false. This interface uses the SELECT_PART query-definition interface as a source.

Finally, in the last interface, which loads another target, I set define query to false again, target table to true and execute to true – this is the ‘go run it’ indicator!

To coordinate the statement construction you will need to create a package with the select and insert interfaces in sequence. With 11g you can now execute the package in simulation mode and preview the generated code, including the SQL statements.

Hopefully this helps shed some light on how you can leverage the Oracle MTI statement. A similar IKM exists for Teradata: the ODI IKM Teradata Multi Statement supports multi-statement requests in 11g. Here is an extract from the paper at www.teradata.com/white-papers/born-to-be-parallel-eb3053/:

Teradata Database offers an SQL extension called a Multi-Statement Request that allows several distinct SQL statements to be bundled together and sent to the optimizer as if they were one. Teradata Database will attempt to execute these SQL statements in parallel. When this feature is used, any sub-expressions that the different SQL statements have in common will be executed once, and the results shared among them.

It works in the same way as the ODI MTI IKM: multiple interfaces are orchestrated in a package, each interface contributes some SQL, and the last interface in the chain executes the multi-statement request.
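As a rough sketch of what such a request looks like (the table names here are hypothetical, and the leading semicolon is the BTEQ convention for bundling statements into a single multi-statement request):

/* Hypothetical multi-statement request: both INSERTs are submitted as one
   request, so the common scan of src_customers is executed only once and
   the statements can be run in parallel. */
INSERT INTO tgt_customers_special
SELECT * FROM src_customers WHERE income_level > 250000
;INSERT INTO tgt_customers_regular
SELECT * FROM src_customers;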

ODI 11g – How to Load Using Partition Exchange

Wed, 2012-03-21 19:38

Here we will look at how to load large volumes of data efficiently into the Oracle database using a mixture of CTAS and partition exchange loading. The example we will leverage was posted by Mark Rittman a couple of years back on interval partitioning; you can find that posting here. The best thing about ODI is that you can encapsulate all those ‘how to’ blog posts and scripts into templates that can be reused – the templates are of course Knowledge Modules.

The interface design to mimic Mark's posting is shown below;

The IKM I have constructed performs a simple series of steps: a CTAS to create the stage table used in the exchange, then a lock of the target partition (to ensure it exists – it will be created if it doesn’t), then an exchange of the partition into the target table. You can find the IKM Oracle PEL.xml file here.

The IKM performs the following steps and is meant to illustrate what can be done; a sketch of the kind of SQL it generates is shown below.
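As a minimal sketch of those steps (assuming a hypothetical interval-partitioned target TARGET_SALES, a stage table STG_SALES_EXCH and a Data Pump external table SRC_SALES_EXT – your names will differ):

-- 1. CTAS: build the stage table from the (transformed) source
CREATE TABLE STG_SALES_EXCH NOLOGGING
AS SELECT * FROM SRC_SALES_EXT;

-- 2. Lock the target partition; with interval partitioning this also
--    forces the partition to be created if it does not yet exist
LOCK TABLE TARGET_SALES
  PARTITION FOR (TO_DATE('01-FEB-2010','dd-MON-yyyy'))
  IN SHARE MODE;

-- 3. Exchange the stage table with the partition – a data dictionary
--    operation, so no data is physically moved
ALTER TABLE TARGET_SALES
  EXCHANGE PARTITION FOR (TO_DATE('01-FEB-2010','dd-MON-yyyy'))
  WITH TABLE STG_SALES_EXCH
  INCLUDING INDEXES WITHOUT VALIDATION;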

So when you use the IKM in an interface you configure the options for hints (for parallelism levels etc.), the initial extent size, the next extent size and the partition variable;

The KM has an option where the name of the partition can be passed in, so if you know the name of the partition, set the variable to that name. If you have interval partitioning you probably don’t know the name, so you can use the FOR clause instead. In my example I set the variable to use the date value of the source data:

FOR (TO_DATE(''01-FEB-2010'',''dd-MON-yyyy''))

Using a variable lets me invoke the scenario many times loading different partitions of the same target table.

Below you can see where this is defined within ODI; I had to double the single quotes since the string is placed inside the execute immediate tasks in the KM.

Note also that this example interface uses the LKM Oracle to Oracle (datapump), so this illustration uses a lot of the high-performing Oracle database capabilities – it uses Data Pump to unload, then a Create Table As Select (CTAS) is executed over an external table built on top of the Data Pump export. This table is then exchanged into the target. The IKM and illustrations above use ODI 11.1.1.6, which was needed to get around some bugs in earlier releases with how the variable is handled... as far as I remember.
