Rittman Mead Consulting
Oracle Endeca Information Discovery v3.0 Integration with the OBIEE 11g BI Server
One of the first moves towards integrating Oracle Endeca Information Discovery (OEID) with other Oracle products is the ability to source data from an Oracle BI Server repository for loading into an Oracle Endeca Server “data domain” (the new name for an Endeca Server datastore). As mentioned in my previous post, this functionality was initially introduced in OEID version 2.4 and works much the same in the new version 3.0.
This feature is based around Oracle Endeca Integrator Designer (“Integrator” for short), the client part of Integrator Suit that is used to design data loading graphs and optionally run them, and uses a “template project” which allows users to connect to an Oracle BI Server, run a select-query and then load repository data into an Endeca Server data domain. Some data domain attribute and indexing settings will also be configured, so that guided navigation and attribute/record search within Endeca Studio are set up for users looking to analyze the data we’re loading.
Two different scenarios come to mind that could make use of this feature. First, you might want to join some modeled data from your Oracle BI Server with other data that you have stored in Endeca Server; for example, allowing you to combine and analysis unstructured and structured data. In other words, you can create dashboards on your unstructured data that has been enriched by some descriptive structured data!
The second use case could be if you wanted to create some quick dashboards against data modeled within the BI Server repository, to find answers to new questions without the need to make extensive changes within the actual BI Repository – in other words, to carry out information discovery!
There are probably other use cases too, but whatever the reason you need to bring data from Oracle BI server to Integrator has made this very easy. In this post, therefore, I’m going to provide a step-by-step guide on how to setup this particular functionality, and load data from the OBI Server repository to the Oracle Endeca Server using the 3.0 release of OEID.
1. Firstly, open Integrator and select File > New > Project from the menu bar.
Note that with the earlier 2.4 release of Integrator, you had to make sure that a .jar plug-in file (found inside the installation package) for connecting to the Oracle BI Server had been installed in order to be able to see the “Load data from Oracle BI Server” wizard under Information Discovery category in the New Project displayed dialog. This step is not required for version 3.0. Click Load Data from OBI Server and press Next to proceed.
2. The Load Data from OBI Server wizard should be displayed. On the first page, enter the project name you are creating (for example, OBIConnection) or alternatively you can Use an existing project. In which case, after pressing Next again, the wizard will add new files to the project you have. I would suggest creating a new project to prevent any conflict, as you can always load the data into an existing data domain.
3. Next, In the Endeca Data Domain Configuration page, provide your up and running Endeca Server details and press Next to continue, for example:
Endeca Server Host: localhost
Endeca Server Port: 7002
Data Domain Name: Sample_Sales
The data domain name that you provide will be used to create a data domain as one of the initial steps of the Integrator project in a graph called “InitDataDomain.grf”.
Note, where the Endeca Server has been installed on a secure mode, you need to provide the ssl-listen port.
4. Now you’ll need to enter the OBI Server connection details such as User, Password, OBI Server host and OBI Server port; For example:
User: Weblogic
Password: Password01
OBI Server host: 10.67.70.117
OBI Server port: 9703
9703 is the port that I could connect to. You should validate the credentials by Connecting to OBI Server and once you see the Connected message on the top of the dialog, proceed to the Next step.
5. You are now able to choose the repository you want and select the relevant objects/tables you need from the list of Tables by checking them. Press Next to continue.
6. The wizard will automatically prepare the required meta-data information for the attributes that you selected. Using this page, you can configure some OEID features such as Search Interface by editing the displayed values (Note that this property is only available for STRING attributes). Check the Edit Finished box (located towards the bottom left of the window) when you are happy with all configurations and then click Finish.
7. Having now finished the wizard, a new project with all the configurations, metadata files, connection, SQL statement and graphs required to connect to the OBI Server will be added to the Navigator pane.
8. You can now see the required database connection, as shown in the screen shot below, where the Oracle JDBC driver has been used.
9. It’s now time to run the Baseline graph in order to load the OBI Server repository data into the Oracle Endeca Data Domain. Expand the Graph folder within Navigator pane and open Baseline.grf. Press the green Run button from the toolbar. Check the Console to see Execution of graph successful.
Some graph components may require SSL configuration, if you are using a secure Endeca Server implementation; For example in the Load_Data graph, the Bulk_Add/Replace component has a SSL Enabled property that should be updated to true. Also, you’ll also need to change all http requests to https.
The “Load Data” graph uses a Query-Statement, which is located in the data-in directory and contains a SELECT statement from OBI Server repository objects.
10. Now its time to log into Endeca Studio and create a Data Source in the Control Panel pointing to the data domain you created in the Integrator project.
Test connection to ensure that it is working properly.
11. One more change in OEID version 3.0 is that the old ‘Liferay portal’ term ‘Community’ has been updated to ‘Application’, which to me, feels more related to the subject of Information Discovery and thus makes sense, but that’s just a personal opinion. An ‘Application’ refers to a subject area where users can create different dashboards on different views of one and only one data-source. Being in the home page of Oracle Endeca Studio, select New Application using the approved, newly created and validated data-source.
12. OEID version 3.0 kindly, generously and automatically generates a new dashboard with all the required components such as a searchbox, a breadcrumb, a guided navigation and of course a results table.
So far we have created an Oracle Endeca Information Discovery dashboard upon our Oracle BI Server repository, but what is the advantage of doing this? Rather than being able to create reports very quickly, Perhaps joining some unstructured data from other data-sources to the Oracle BI Server repository and a bit of Text tagging, enrichment and salience analysis could be done in Integrator and will make the dashboards much more interesting.
Good luck to those of you trying this out. Look out for my next post where I’ll explain how to read from Endeca Server and join its data to those coming from other sources.
Endeca Event in Birmingham
Just a quick note to highlight that we are running an Oracle Endeca Information Discovery (OEID) event at Oracle’s Birmingham office on Wednesday 26th June. Providing a great opportunity to learn how OEID can complement your current BI tools, allowing you to answer previously unanswerable questions through insight from both structured and unstructured data sources (such as social feeds and word documents).
There will also be experienced experts available before, during and after the event, providing a rare opportunity to get your questions answered by people who have been there and done it.
Click here for more information and to register.
Photos and Presentation Downloads from the Rittman Mead BI Forum 2013, Brighton & Atlanta
Well, we’re all back home now after two very successful Rittman Mead BI Forum events in Brighton, and then Atlanta, earlier this month in May 2013. Around 70 OBIEE, ODI, Endeca and Essbase developers from around Europe got together in the first week in Brighton, followed by around 60 in Atlanta, and we were joined by Cary Millsap (Method R Corporation), Alex Gorbachev (Pythian) and Toby Potter (Data Sift) as special guest speakers over the two events. Thank you again to everyone who came along and supported the event, and a special thanks to the speakers without whom, of course, the BI Forum couldn’t take place. In addition, sincere thanks to Mike, Adam, Philippe, Alan, Marty, Jack and Florian from Oracle for coming along and sharing plans and insights around the Oracle product roadmap, and finally; congratulations to Antony Heljula (Peak Indicators Ltd) and Jeremy Harms (CD Group) who won the “Best Speaker” award for Brighton and Atlanta respectively.

Photos from the two events (a selection from Brighton are above, some from Atlanta below this paragraph) are available in these Flickr photo sets:
- “Rittman Mead BI Forum 2013, Brighton” (Flickr photo set)
- “Rittman Mead BI Forum 2013, Atlanta” (Flickr photo set)
As we always do, we’re also making the slides (where allowed by the speaker, and not under NDA) available for download using the links below, including the one-day Oracle Data Integration Masterclass provided by Stewart Bryson, Michael Rainey and myself. Note that Christian Screen’s and Jeremy Harms slides are actually online, so I don’t think you’ll be able to download them from whatever service is hosting them – sorry.
Oracle Data Integration Masterclass (Stewart Bryson, Michael Rainey, Mark Rittman, Rittman Mead)
- “Introduction to Oracle Data Integrator 11g”
- “ODI and the Oracle Reference Architecture for Information Management”
- “ODI and GoldenGate – A Perfect Match…”
- “ODI and Hadoop, MapReduce and Big Data Sources”
- “The Three R’s of ODI Fault Tolerance : Resuming, Restarting and Restoring”
- “Scripting and Automating ODI using Groovy and the ODI SDK”
Brighton RM BI Forum, May 8th – 10th 2013
- “OBIEE SampleApp 11.1.1.7 functional highlights” (Philippe Lions, Oracle Corporation)
- “OBI Performance Tuning – Real Customer Success Stories” (Antony Heljula, Peak Indicators Ltd)
- “Secrets of OBIEE implementation at LGI” (Marco Klaassens, Liberty Global)
- TED Session 1: ”Why I want to be working with Business Intelligence in 5 years time” (Jon Mead, Rittman Mead)
- TED Session 3 : “Incrementally loading Exalytics using Notepad” (Antony Heljula, Peak Indicators Ltd)
- “Oracle Data Integrator 11g Best Practices. Busting your performance, deployment, and scheduling headaches.” (Uli Bethke/Maciek Kocon, Independent)
- “New Developments in BI Multi-tenancy and Cloud” (Adam Bloom, Oracle Corporation
- “The Magic of Aggregates” (Michael Wilcke, sumIT AG)
- “Integrating Oracle BI, BPM and BAM 11g: The complete cycle of information” (Edelweiss Kammermann, Awen Consulting)
- “Endeca – Beyond the Demos” (Adam Seed, Rittman Mead)
Atlanta RM BI Forum, May 15th – 17th 2013
- “It’s all in the genes – The power of Oracle Exadata and the Oracle Database” (Rene Kuipers, VX Company)
- “In Memory Analytics – Times Ten, Essbase 11.1.2.2 – Analysis – A Comparison” (Venkatakrishnan J, Rittman Mead)
- TED Session 3 : “A BI Publisher Beginner’s MacGyver-Hack for Financial Reporting with OBIEE: A Quickie!” (Jeremy Harms, CD Group)
- “Performance Tuning the BI Apps with a Performance Layer” (Jeff McQuigg, KPI Partners Inc)
- “Thinking Clearly about Performance” (Cary Millsap, Method R Corporation) – see also the accompanying technical paper
- ”Forecasting and Time Series Analysis in Oracle BI” (Tim & Dan Vlamis, Vlamis Software Solutions Inc)
- “Hadoop versus the Relational Data Warehouse.” (Alex Gorbachev, Pythian)
- “How to Create a Plug-In for Oracle BI 11g” (Christian Screen, Capgemini)
- “ODI and Hadoop / Big Data” (Alan Lee & Marty Gubar, Oracle Corporation)
- “BI Applications 11g and ODI” (Florian Schouten, Oracle Corporation)
- “OBIA 11G – What You Need To Know: Part 1″ (Kevin McGinley, Accenture)
So once again – thank you to everyone who came along, especially the speakers but also everyone from our Brighton and Atlanta offices who helped set the event up, and made sure it all ran so smoothly. See some of you again in Brighton and Atlanta next year, and our next outing is to ODTUG KScope’13 in New Orleans – another great event with the BI Track organised by Kevin McGinley – make sure you’re there!
Testing aggregate navigation on OBIEE and Exalytics
One of OBIEE’s many great strengths is aggregate navigation; the ability to choose from a list of possible tables the one which will probably give the optimal performance for a given user query. Users are blissfully unaware of which particular table their query is being satisfied from, since aggregate navigation happens on the BI Server once the user’s request comes through from an Analysis or Dashboard.
This seamless nature of aggregate navigation means that testing specific aggregates are working can be fiddly. We want to ensure that the aggregates we’ve built are (i) being used when appropriate and (ii) showing the correct data. This is the particularly the case in Exalytics when aggregates are put into in-memory (TimesTen) by the Summary Advisor and we need to validate them.
Whilst the log file nqquery.log (or Usage Tracking table S_NQ_DB_ACCT) tells us pretty easily which table a query used, it is nice to be able to switch a query easily between possible aggregate sources to be able to compare the data. This blog demonstrates how we can use the INACTIVE_SCHEMAS variable (as described in my previous blog on loading Exalytics incrementally) to do this.
INACTIVE_SCHEMAS is a Logical SQL variable that tells the BI Server to exclude the specified physical schema(s) from consideration for resolving an inbound query. Normally, the BI Server will parse each incoming query through the RPD, and where a Logical Table has multiple Logical Table Sources it will evaluate each one to determine if it (a) can satisfy the query and (b) whether it will be the most efficient one to use. By using INACTIVE_SCHEMAS we can force the BI Server to ignore certain Logical Table Sources (those associated with the physical schema specified), ensuring that it just queries the source(s) we want it to.
In the following example, the data exists on both Oracle database, and TimesTen (in-memory). Whilst the example here is based on an Exalytics architecture, the principle should be exactly the same regardless of where the aggregates reside. This is how the RPD is set up for the Fact table in my example:
The GCBC_SALES schema on Oracle holds the unaggregated sales data, whilst the EXALYTICS schema on TimesTen has an aggregate of this data in it. The very simple report pictured here shows sales by month, and additionally uses a Logical SQL view to show the contents of the query being sent to the BI Server:
Looking at nqquery.log we can see the query by default hits the TimesTen source:
[...]
------------- Sending query to database named TimesTen aggregates
WITH
SAWITH0 AS (select distinct T1528.Sale_Amoun000000AD as c1,
T1514.Month_YYYY000000D0 as c2
from
SA_Month0000011E T1514,
ag_sales_month T1528
[...]
Now, for thoroughness, let’s compare this to what’s in the TimesTen database, using a Direct Database Request:
OK, all looks good. But, is what we’ve aggregated into TimesTen matching what we’ve got in the source data on Oracle? Here was can use INACTIVE_SCHEMAS to force the BI Server to ignore TimesTen entirely. We can see from the nqquery.log that OBI has now gone back to the Oracle source of the data:
[...]
------------- Sending query to database named orcl
WITH
SAWITH0 AS (select sum(T117.FCAST_SAL_AMT) as c1,
T127.MONTH_YYYYMM as c2
from
GCBC_SALES.TIMES T127 /* Dim_TIMES */ ,
GCBC_SALES.SALES T117 /* Fact_SALES */
[...]
and the report shows that actually we have a problem in our data, since what’s on the source doesn’t match the aggregate:
A Direct Database Request against Oracle confirms the data we’re seeing – we have a mismatch between our source and our aggregate:
This is the kind of testing that it is crucial to perform. Without proper testing, problems may only come to light in specific reports or scenarios, because by the very nature of aggregate navigation working silently and hidden from the user.
So this is the feature we can use to perform the testing, but below I demonstrate a much more flexible way that having to build multiple reports.
Implementing INACTIVE_SCHEMASUsing INACTIVE_SCHEMAS in your report is very simple, and doesn’t require modification to your reports. Simply use a Variable Prompt to populate INACTIVE_SCHEMAS as a Request Variable. Disable the Apply button for instantaneous switching when the value is changed.
A Request Variable will be prepended it to any logical SQL sent to the BI Server. Save this prompt in your web catalog, and add it to any dashboard on which you want to test the aggregate:
Even better, if you set the security on the dashboard prompt such that only your admins have access to it, then you could put it on all of your dashboards as a diagnostic tool and only those users with the correct privilege will even see it:



So far this is all negative , in that we are specifying the data source not to use. We can examine nqquery.log etc to confirm which source was used, but it’s hardly convenient to wade through log files each time we execute the report. Ripped off from Inspired by SampleApp is this trick:
- Add a logical column to the fact table
- Hard code the expression for the column in each Logical Table Source
- Bring the column through to the relevant subject area
- Incorporate it in reports as required, for example using a Narrative View.
Bringing it all together gives us this type of diagnostic view of our reports:
Summary
There’s a variety of ways to write bespoke test reports in OBI, but what I’ve demonstrated here is a very minimal way of overlaying a test capability on top of all existing dashboards. Simply create the Request Variable dashboard prompt, set the security so only admins etc can see it, and then add it in to each dashboard page as required.
In addition, the use of a ‘data source’ logical column in a fact table tied to each LTS can help indicate further where the data seen is coming from.
Introduction to the BI Apps 11.1.1.7.1 – Use of ODI11g for ETL
In the two previous postings in this series on the Oracle BI Apps 11.1.1.7.1, we looked at the release at a high-level, and then at the product architecture including the new configuration and functional setup tools. From a technology and developer perspective though probably the most interesting thing about this new release is its use of Oracle Data Integrator as the ETL tool rather than Informatica, and the doing-away with the DAC for load orchestration and monitoring.
This introduction of ODI brings a number of potential benefits to customers and developers and gives Oracle the opportunity to simplify the product architecture, but bear in mind that there’s no migration path from the earlier 7.9.x releases to this version, with Informatica customers instead having to wait until the “patch set 2″ version due in the next twelve months; even then, migration between tools won’t be automatic, with existing Informatica-based installations expected to stay on Informatica unless they choose to re-implement using ODI.
So how does ODI work within this new release, and how has the DAC been replaced? Let’s take a look in this final piece in our short series on Oracle BI Apps 11.1.1.7.1, starting by looking at the overall role that ODI plays in the platform architecture.

Existing ODI developers will know that the tool uses two repositories, known as the Master and Work repositories, to store details of data sources and targets, mappings, data models and other aspects of an ETL project. Within the BI Apps these two repositories are stored in a schema called prefix_ODI_REPO, for example DEV_ODI_REPO, and are accompanied by a new schema called prefix_BIACOMP, again for example DEV_BIACOMP. The BIACOMP schema contains tables used by the various new WebLogic-based BI Apps supporting applications, and contain details of the functional setup of the BI Apps, load plans that have been generated and so forth. There’s also another schema called prefix_BIACOMP_IO which is used for read-write access to the BIACOMP schema, and all of these are held in a repository database alongside the usual schemas used for OBIEE, MDS and so forth.
The major difference in using ODI within this environment is that it’s treated as an “embedded” ETL tool, so that in most circumstances you won’t need to use ODI Studio itself to kick-off load plans, monitor their execution, set up sources and targets and so forth. This was the original vision for Informatica within the original BI Apps, but Oracle are able to do this far more effectively with ODI as they own all parts of the tech stack, can alter ODI to make it easier to embed, they’e got control over ODI’s various metadata APIs and so forth. What this means in practice is that the setup of the ODI topology (to connect to the ERP sources, and the target data warehouse) is done for you via a web-based application called the Oracle BI Applications Configuration Manager, and you can kick-off and then monitor your running ETL jobs from Configuration Manager and from ODI Console, the web-based operator tool that’s been around since the 11g release of ODI. The screenshot below shows Configuration Manager setting up the source database ODI topology entry, with the details that you provide then being pushed through to the ODI master repository:

Setting up a new BI Apps system involves using the Configuration Manager to define the connections through to the various source systems, then select the BI Apps modules (Financial Analytics, for example, and then the various subject areas within it) that you wish to implement. There are then a number of steps you can perform to set up system-wide settings, for example to select default currencies or languages, and then you come to run your first ODI load plan – which in this instance copies settings from your source system into the relevant tables in the BIACOMP schema, performing automatically the task that you had to do via the various domain configuration spreadsheets in the earlier 7.9.x releases – the screenshot below shows this ODI load plan listed out and having run successfully.

You can then view the execution steps and outcome either in ODI Console (embedded within Configuration Manager), or over at ODI Studio, using the Operator navigator.

Moving over to ODI Studio, the folders (or “adapters”) that in Informatica used to hold workflows and mappings for the various source systems, are contained with the BI Apps project within the Work repository and the Designer navigator. In the screenshot below you can also see the Fusion Apps adapter that’s not supported in this particular release, and the ETL Data Lineage adapter that should get enabled in an upcoming patch release.

In the screenshot above you can also see one of the loading tasks, SDE_ORA_APAgingBucketsDimenson, is a package that (were you to expand the Interfaces entry) makes reference to a regular, and also a temporary, interface.

Packages in ODI perform the same role as Informatica workflows in earlier releases of the BI Apps, and each package runs some steps to refresh variables, work out if its doing a full or incremental load, and then call the relevant ODI interface. Interfaces in ODI for the BI Apps typically load from other temporary interfaces, with these temporary interfaces performing the role of maplets in the Informatica version of the BI Apps, as you can see in the screenshot on the left below. On the right, you can see the flow for another mapping, along with one of the custom KMs that come as part of the BI Apps 11.1.1.7.1 package.

Individual packages are then assembled into the equivalent of BI Apps 7.9.x “execution plans” through a new JEE application called the Load Plan Generator, which also gets installed into ODI Studio as a plug-in so you can develop new data loading routines away from the full production setup. As you can see in the final screenshot below, these load plans are then visible from within ODI Studio (whether you generated them there, or from Configuration Manager), and like all ODI 11g load plans you can view the outcome of each load plan instance run, restart it if this feature is enabled, and so forth.

So there you have it – how ODI is used within the BI Apps 11.1.1.7.1. I’m going to take a break now as it’s almost time for the Atlanta run of the Rittman Mead BI Forum 2013, but once I’m back in the UK I’ll try and put something together for the blog on pulling together your first ETL run. Until then – have fun with the release.
Agenda and Details for the Atlanta RM BI Forum 2013
Well, Brighton is now a wrap and we’re all now over in Atlanta, getting ready for the second leg of the 2013 Rittman Mead BI Forum, running from this Wednesday, 15th May 2013 through to Friday, 17th May. Photos from the Brighton event are up on Flickr now, but for anyone who’s coming down to the Georgia Tech Hotel & Conference Center for later this week, this posting contains the detailed agenda for the event, along with a preview of what’s coming in terms of social events, guest speakers and the masterclass.

Wednesday starts with the optional one-day masterclass, this year on Oracle Data Integration and led by myself, Stewart Bryson and Michael Rainey. I previewed the data integration masterclass previously on the blog, and the planned timetable for the masterclass looks like this:
Day 1 : Optional Oracle Data Integration Masterclass, followed by Registration, Drinks and Keynote/Meal
10.00 – 11.00 : Welcome, and Introduction to Oracle Data Integrator 11g (Stewart Bryson)
11.00 – 11.15 : Morning Coffee
11.15 – 11.45 : ODI and the Oracle Reference Architecture for Information Management (Stewart Bryson)
11.45 – 12.45 : ODI and GoldenGate – A Perfect Match… (Michael Rainey)
12.45 – 13.30 : Lunch
13.30 – 14.30 : ODI and Hadoop, MapReduce and Big Data Sources (Mark Rittman)
14.30 – 15.30 : The Three R’s of ODI Fault Tolerance : Resuming, Restarting and Restoring (Stewart Bryson)
15.30 – 16.30 : Scripting and Automating ODI using Groovy and the ODI SDK (Michael Rainey)
The event itself officially opens at 4pm on Wednesday, May 15th 2013 with registration taking place then, and a drinks reception in the hotel bar from 5pm to 6pm. At 6pm we have the Oracle keynote led by Jack Berkowitz and Philippe Lions, and then an informal meal in the hotel restaurant from 7pm – 10pm.

The main conference then opens at 8am on the Thursday morning, with registration open from 8am – 8.45am, opening remarks from myself at 8.45am and the first session starting at 9am. Here’s the timetable as planned for Thursday:
Day 2 : Main Conference Sessions, Guest Speaker and Gala Meal
8.45am – 9.00am : Opening Remarks Mark Rittman, Rittman Mead
9.00am – 10.am : Rene Kuipers, VX Company, “It’s all in the genes – The power of Oracle Exadata and the Oracle Database”
10.00am – 10.30am : Morning coffee
10.30am – 11.30am : Jack Berkowitz, Oracle : “OBI Presentation, Interaction and Mobility”
11.30am – 12.30am : Venkatakrishnan J, Rittman Mead, “In Memory Analytics – Times Ten, Essbase 11.1.2.2 – Analysis – A Comparison”
12.30pm – 1.15pm : Lunch
1.15pm – 1.30pm : TED Session 1 : Kevin McGinley – “OBIEE and OEID: What if…?”
1.30pm – 1.45pm : TED Session 2 : Jon Mead, Rittman Mead, “Why I want to be working with Business Intelligence in 5 years time”
1.45pm – 2.00pm : TED Session 3 : Jeremy Harms – “A BI Publisher Beginner’s MacGyver-Hack for Financial Reporting with OBIEE: A Quickie!”
2.15pm – 3.15pm : Alan Lee, Oracle, “Update on BI Metadata Architecture and Design Tool”
3.15pm – 3.45pm : Afternoon coffee and beers
3.45pm – 4.45pm : Jeff McQuigg, KPI Partners Inc, “Performance Tuning the BI Apps with a Performance Layer”
After the first day’s presentations we’ll take a short break, and then convene again back in the conference room at 5pm for our special guest speaker session, this year being provided by Method R’s Cary Millsap, who many of you will know from his Optimizing Oracle Performance book and his “response time” approach to performance tuning. Just after Cary’s session at around 6.30pm we’ll then be taken by coach to “4th and Swift”, the venue for the gala meal, where we’ll be from around 7pm through to around 10pm.
5.00pm – 6.00pm : Guest Keynote: Cary Millsap– “Thinking Clearly about Performance”
6.30pm – 7.00pm : Depart for Restaurant
7.00pm – 10.00pm : Gala Meal – 4th and Swift, Atlanta
Day 3 : Main Conference Sessions, and Close
The final day of the BI Forum is all about big data, and the BI Apps, with a special session from Pythian’s Alex Gorbachev on Hadoop and Oracle Data Warehousing, sessions by Oracle on Big Data and OBIEE, a big data debate, an an extended session by Oracle’s Florian Schouten and Accenture’s Kevin McGinley on the BI Apps 11.1.1.7.1.
We also have sessions on Endeca, OBIEE time-series analysis and extending OBIEE using plug-ins, so hopefully everyone will be able to stay until 5pm when the event will close.
8.30am – 9.30am : Tim Vlamis, Vlamis Software Solutions Inc,”Forecasting and Time Series Analysis in Oracle BI”
9.30am – 10.30am : Special Guest: Alex Gorbachev, Pythian – “Hadoop versus the Relational Data Warehouse.”
10.30am – 11.00am : Morning Coffee
11.00am – 12.00pm : Christian Screen, Capgemini, “How to Create a Plug-In for Oracle BI 11g”
12.00pm – 1pm : Marty Gubar and Alan Lee – OBIEE and Hadoop/Big Data
1.00pm – 1.45pm : Lunch
1.45pm – 2.45pm : Debate – “Big Data – Hype, or the Future or Oracle BI/DW?”
2.45pm – 4.15pm : Florian Schouten (Oracle) and Kevin McGinley (Accenture) – Oracle BI Apps 11g and ODI
4.15pm – 5.00pm : Adam Seed, Rittman Mead – “Endeca – Looking beyond the general demos”
You’ll notice we’ve brought back the popular “debate” section this year, with this year’s topic being “Big Data – Hype, or the Future of BI/DW?”. I’ll be looking for volunteers to argue the case for either of the two sides in the debate, so if you’ve got a view on whether big data is going to be the salvation of BI, whether it’ll turn us in to the COBOL programmers of the future, or whether its just a load of hot air (or you just like having an argument), let me know when you arrive and we’ll pull the debating teams together.
Other than that – have a safe journey over, and see at least some of you in Atlanta later in the week!
Introduction to the BI Apps 11.1.1.7.1 – Product Architecture & New Configuration Tools
In my previous posting in this series, I looked at the new 11.1.1..7.1 release of the Oracle BI Applications at a high-level, and talked about how this new release uses ODI as the embedded ETL tool instead of Informatica PowerCenter. Support for Informatica will come with patch set 2 (PS2) of BI Apps 11.1.1.7.x giving customers the choice of which ETL to use (with the caveat that customers upgrading from 7.9.x will typically have to stick with Informatica unless they want to completely re-implement using ODI), but for this initial release at least, ODI and some new Fusion Middleware tools take over from Informatica and the DAC, giving us what could well be a much simpler architecture for supplying the underlying data for the BI Apps dashboards.
In this posting then, I’m going to take a closer look at this new product architecture, and I’ll follow it with a more detailed look at how the various bits of ODI functionality replace the workflows, mappings, transformation operators and execution plans provided in earlier releases by Informatica and the DAC. For anyone familiar with the previous, 7.9.x versions of the BI Applications, the architecture diagram below shows the five tiers that this product typically implemented; tiers for the source data and data warehouse/repository databases, an ETL tier for Informatica and the DAC server, then two more tiers for the OBIEE application server and the client web browser.

Communication between the tiers was – to put it politely – “loosely coupled”, with DAC task names corresponding with Informatica workflow names, each workflow containing a single mapping, and all of the connections and sources having to be named “just so”, so that every part of the stack could communicate with all the others. It worked, but it was a lot of work to implement and configure, and once it was up and running in most cases customers were scared to then change it, in case a name or a connection got out of sync and everything then stopped working. Plus – Informatica skills are scarce in the Oracle world, and the DAC is an extra piece of technology that few DBAs really understood properly.
The 11.1.1.7.1 release of the BI Apps simplifies this architecture by removing the separate ETL tier, and instead using Oracle Data Integrator as the embedded ETL tool, with its server functions running as JEE applications within the same WebLogic domain as OBIEE 11g, giving us the overall architecture in the diagram below.

Now anyone who read my series of posts back in 2009 on the 7.9.5.2 release of the BI Apps, which also used ODI as the embedded ETL tool, will know that whilst ODI 10g could do the job of loading data into the BI Apps data warehouse, it lacked the load orchestration capabilities of Informatica and the DAC and wasn’t really set up to dynamically generate what have become, in ODI 11g, load plans. BI Apps 7.9.5.2 turned-out to be a one-off release and in the intervening years Oracle have added the aforementioned load plans along with other functionality aimed at better supporting the BI Apps, along with two new JEE applications that run in WebLogic to replace the old DAC. These new applications, along with the ODI JEE agent, ODI Console and the ODI SDK, are shown in the more detailed BI Applications 11.1.1.7.1 logical architecture diagram shown below.

Oracle BI Applications 11.1.1.7.1 has two main product tiers to it, made up of the following components:
- The Middleware (BI and ETL) tier; a WebLogic domain and associated system components, comprising BI components delivered as part of OBIEE 11.1.1.7 (including Essbase and related applications) as one managed server, and another managed server containing ODI Java components, including three new BI Apps-related ones; Configuration Manager, Functional Setup Manager, and ODI Load Plan Generator
- The Database (DW and Repositories) tier; for the time-being, Oracle only, and comprising a data warehouse schema (staging + performance layer), and a repository database containing the OBIEE repository schemas plus new ones to hold the ODI repository and other ETL/configuration metadata used for configuring your system.
Essbase at this stage is installed, but not used for the main BI applications, and all of it uses Fusion Middleware security (application roles and policies) along with the WebLogic Embedded LDAP server to hold users and groups. A special version of RCU is used to set up the new BI Apps-related schemas, and import data into them using Oracle database export files, so that the ODI repository, metadata tables and so forth are all populated prior to the first load taking place. Enterprise Manager Fusion Middleware Control is still used to manage and monitor the overall platform, but there’s now an entry for ODI along with Essbase, the latter of course being delivered as part of the 11.1.1.7 OBIEE platform release.

In the next posting in the series we’ll take a closer look at how ODI uses its JEE agent and mappings imported into its repository to load the BI Apps data warehouse, but what about the two new web-based configuration tools, Oracle BI Applications Configuration Manager (BIACM) and Oracle BI Applications Functional Setup Manager (FSM) – what do they do?
After you install OBIEE 11.1.1.7 and then the BI Applications 11.1.1.7.1, the BI Apps installer extends the BI domain to include FSM, BIACM and the ODI Load Plan Generator, along with some other supporting applications and libraries required for the full product. Load Plan Generator works behind the scenes to build new load plans in a similar way to the Execution Plan “Build” feature in the DAC, and the two web-based tools perform the following functions:
- Oracle BI Applications Configuration Manager performs system-wide setup tasks such as defining sources, selecting BI Apps modules and performing other, “one-only” tasks similar to the Setup feature in the DAC Console.
- Oracle BI Applications Functional Setup Manager is then used to list out, and track progress against, the various tasks required to configure the BI Applications modules, or “Offerings”, that you selected in the Configuration Manager
Most importantly though, these tools connect directly through to the ODI repository, so data sources you set up here will get pushed down to ODI as data servers in the ODI master repository; load plans you set up to, as in the screenshot below, load configuration tables, are ODI load plans and you can track their progress either from within ODI, or from within these applications themselves.

I haven’t had a chance to properly “diff” the RPD used in BI Apps 11.1.1.7.1 with the previous 7.9.x ones, or do a similar exercise for the underlying database data model, but on first glance the new RPD is at least recognisable, albeit with new sources and subject areas for the Fusion Apps, Oracle Transactional BI (OTBI), Real-Time Decisions and the like. The web catalog also looks familiar, but also has new content around the new applications along with additional content for the existing ones.

So, we’re at the point now where can start to think about loading data into the BI Apps data warehouse, and in tomorrows post we’ll take a look at what’s involved in a BI Apps 11.1.1.7.1 ETL load, and also look into how GoldenGate can now be used to extract and stage data prior to loading via ODI. Back tomorrow…
Introduction to the BI Apps 11.1.1.7.1 – Release Overview
The long-awaited and anticipated 11.1.1.7.1 (PS1) release of the Oracle BI Applications became available early last week, with the software and documentation available for download on OTN. Over the next few blog posts, I’ll be taking an in-depth look at this new release, starting today with an overview of what’s new and any limitations in his initial version, and then over the next few posts taking a look at the product architecture, how it uses Oracle Data Integrator instead of Informatica to do the data loads, and what new content the 11g dashboards contain. For a bit of background into this release you’re best off taking a look at a series of posts I put together towards the end of last year on the BI Apps product roadmap, and I’ll recap on those posts a bit in this one as I go through at a high level what’s in this release.
Although the focus in BI Apps 11.1.1.7.1 is on ODI as the ETL tool, this new release actually delivers a whole new product architecture along with new dashboards, new content, and a new security framework. In addition, there’s now an option to use Oracle GoldenGate to create a new layer in the BI Apps data warehouse data architecture that replicates source data into the warehouse environment, giving you the ability to run the more large-scale ETL processes when you like, rather than when there’s an ETL window for the source systems.
Let’s start off though with a summary of what’s new from a functional perspective, and also what limitations there are for this first release in terms of sources, scope and so forth. BI Apps 11.1.1.7.1 delivers the following set of new features and capabilities:
- Oracle Data Integrator as the embedded ETL tool, along with a whole new FMW11g-centric architecture and set of utilities
- Two new analytic applications - Student Information Analytics, and Indirect Spend Planning
- New content for existing analytic applications including Financial Analytics, HR, Projects, CRM and Procurement & Spend
- Dashboards that are now written for OBIEE 11g rather than 10g, including 11.1.1.7 visualisations such as performance tiles
Now although, in general terms, BI Apps 11.1.1.7.1 covers all (or most…) of the existing analytic application modules along with all of the 7.9.x-era sources (EBS, PeopleSoft, JDE and Siebel), there are some important restrictions that you’ll need to be aware of when making any plans to use this new release, starting with upgrade paths (or lack of them):
- There’s no automatic upgrade path from BI Apps 7.9.x, and no automated migration routine to take you from Informatica to ODI; if you want BI Apps 11.1.1.7.1 now, you’ll have to reimplement rather than upgrade, or you can wait for BI Apps 11.1 PS2 which will support upgrades from earlier releases, but (important to note) keeps you on Informatica – any move from Informatica to ODI will need to be done yourself, as a re-implementation
- Only Oracle database sources and targets are supported in this initial release, in practice not a real issue for new implementations, but worth bearing in mind if you planned to use Teradata, for example, as your target data warehouse platform
- Oracle Fusion Applications aren’t supported as a source either, yet, so anyone using this will need to stay on BI Apps 11.1.1.6.x until an upgrade version becomes available
- A few edge-case analytic applications and sources aren’t supported in this release yet – Enterprise Asset Management, for example, is not yet supported for any source, whereas some other applications only support more recent PeopleSoft versions and not JDE, for example. As always, get the most up-to-date supported sources and applications list from Oracle before making any major investment in an implementation or upgrade project.

From a technical perspective though the major difference in this release, compared to the 7.9.6.x versions that preceded it, is the use of Oracle Data Integrator 11g as the embedded ETL tool rather than Informatica. To be clear, Informatica will still be supported as an ETL option for the BI Apps going well into the foreseeable future, but Informatica users will need to wait for the PS2 release due in the next twelve months or so before they can upgrade to the new 11g platform.
In addition and perhaps more importantly, it’s not envisaged that Informatica customer will move over to ODI unless they use the upgrade as an opportunity to re-implement their system now on ODI, moving across customisations themselves and essentially starting with a clean sheet of paper (which may not be a bad thing, if you’re thinking of tidying up your system following years of upgrades, customisations and so forth). What this does mean though is no DAC, no Informatica server and client tools, a new (and hopefully simpler) way of setting up and configuring your system, and in-theory a more closely-integrated set of tools all based around the modern, standards-based Fusion Middleware 11g architecture.

In this new world of ODI and the BI Apps, ODI load plans replace Informatica Workflows, whilst ODI packages and interfaces equate to Informatica mappings and maplets. The DAC is no more and is replaced by metadata within the ODI repository and other supporting schemas, with setup and configuration of the warehouse and ETL processes now carried out by two web-based tools, BI Applications Configuration Manager and Functional Setup Manager. The closer integration between these tool along with a chance for Oracle to re-think the BI Apps setup process should lead to easier configuration and customisations, but if you’re an Informatica developer it’s a whole new world, and the 11g platform makes a lot more use of Fusion Middleware platform functionality particularly around security and user provisioning.
So – all very exciting but quite daunting in terms of what needs to be learnt, and new processes that need to be thought through and put together before you can start making use of the new 11.1.1.7.1 feature set. We’ll start tomorrow then by taking a closer look at the BI Apps 11.1.1.7.1 technical architecture including the new configuration tools, and where ODI sits in the new product architecture, based on our first impressions of the product.
Agenda and Details for the Brighton RM BI Forum 2013
It’s Tuesday morning, and later today delegates will be starting to arrive in Brighton for the 5th Annual Rittman Mead BI Forum, running again at the Hotel Seattle down at Brighton Marina. Around seventy of Europe’s most experienced OBIEE, ODI and Essbase developers will be getting together to discuss techniques, share tips and take part in sessions led by some of the OBIEE world’s best speakers, all in our home town of Brighton. We’re also very pleased to be joined by several of the OBIEE and ODI product team members, as well as our special guest speaker, Toby Potter from Datasift. In this posting I’ll be setting out the detailed agenda for the three days, and don’t forget places are still (just) available for the Atlanta event, running the week after at the Georgia Tech Hotel & Conference Center.

Registration for the Brighton BI Forum officially opens at 5pm on Wednesday evening, but before that we’re running an optional one-day Oracle Data Integration Masterclass at the hotel venue led by myself, Stewart Bryson and Michael Rainey. The masterclass will start at 10am and go through until about 4.30pm, with the following estimated timings:
Day 1 : Optional Oracle Data Integration Masterclass, followed by Registration, Drinks and Keynote/Meal
10.00 – 11.00 : Welcome, and Introduction to Oracle Data Integrator 11g (Stewart Bryson)
11.00 – 11.15 : Morning Coffee
11.15 – 11.45 : ODI and the Oracle Reference Architecture for Information Management (Stewart Bryson)
11.45 – 12.45 : ODI and GoldenGate – A Perfect Match… (Michael Rainey)
12.45 – 13.30 : Lunch
13.30 – 14.30 : ODI and Hadoop, MapReduce and Big Data Sources (Mark Rittman)
14.30 – 15.30 : The Three R’s of ODI Fault Tolerance : Resuming, Restarting and Restoring (Stewart Bryson)
15.30 – 16.30 : Scripting and Automating ODI using Groovy and the ODI SDK (Michael Rainey)
Registration will then open at 5pm (17.00), with a drinks reception in the hotel bar between 6pm and 7pm (18.00 – 19.00), followed by the Oracle keynote and opening night meal in the hotel restaurant.

The main conference then opens at 8.45am on the Thursday morning, with the agenda for the first day looking like this:
Day 2 : Main Conference Sessions, Guest Speaker and Gala Meal
08.45 – 09.00 : Opening Remarks Mark Rittman, Rittman Mead
09.00 – 10.00 : “OBIEE SampleApp 11.1.1.7 functional highlights” (Philippe Lions, Oracle Corporation)
10.00 – 10.30 : Morning Coffee
10.30 – 11.30 : “OBI Performance Tuning – Real Customer Success Stories” (Antony Heljula, Peak Indicators Ltd)
11.30 – 12.30 : “Secrets of OBIEE implementation at LGI” (Marco Klaassens, Liberty Global)
12.30 – 13.15 : Lunch
13.15 – 13.30 : TED Session 1: ”Why I want to be working with Business Intelligence in 5 years time” (Jon Mead, Rittman Mead)
13.30 – 13.45 : TED Session 2 : “HA, DR, CFC, WTF?” (Mike Durran, Oracle Corporation)
13.45 – 14.00 : TED Session 3 : “Incrementally loading Exalytics using Notepad” (Antony Heljula, Peak Indicators Ltd)
14.15 – 15.15 : “Oracle Data Integrator 11g Best Practices. Busting your performance, deployment, and scheduling headaches.” (Uli Bethke/Maciek Kocon, Independent)
15.15 – 15.45 : Afternoon Tea/Coffee/Beers
15.45 – 16.45 : “New Developments in BI Multi-tenancy and Cloud” (Adam Bloom, Oracle Corporation)
After the first day’s presentations we’ll take a short break, and then convene again back in the conference room for our special guest speaker session, this year being provided by Toby Potter from Datasift, who’ll talk to us about social media and how it can be used within BI and analytic applications. Then, we’ll be picked-up by coach from outside the Hotel Seattle and taken to the venue for our gala meal, The Ginger Pig pub and restaurant in Hove, just down the road from where Jon and I live (though don’t all expect to come back afterwards, whatever Borkur and Ragnar tell you when you leave the pub.)

18:00 – 18.45 Guest Keynote Toby Potter, Datasift – “Social Data and Business Intelligence”
19:00 – 19.30 Depart for Restaurant
20:00 – 22:00 Gala Meal : The Ginger Pig, Hove
Don’t go too crazy on Thursday evening though, as we’re starting again early on Friday with sessions starting at 9am, going through until around 4.45 when we’ll close. Here’s Friday’s agenda:
Day 3 : Main Conference Sessions, and Close
9.00 – 10.00 : “The magic of aggregates” (Michael Wilcke, sumIT AG)
10.00 – 10.30 : Morning Coffee
10.30 – 11.30 : “Update on BI Metadata Modeler and Metadata Architecture” (Philippe Lions, Oracle)
11.30 – 12.30 : “Integrating Oracle BI, BPM and BAM 11g: The complete cycle of information” (Edelweiss Kammermann, Awen Consulting)
12.30 – 13.15 : Lunch
13.15 – 14.15 : Debate “Big Data – Hype, or the Future or Oracle BI/DW?”
14.15 – 15.15 : “Endeca – Looking beyond the general demos” (Adam Seed, Rittman Mead)
15.15 – 15.30 : Afternoon Tea/Coffee/Beers
15.30 – 16.30 : “Virtualizing Exalytics” (Mike Durran, Oracle Corporation)
16.30 – 16.45 : Closing Remarks (Mark Rittman)
You’ll notice we’ve brought back the popular “debate” section this year, with this year’s topic being “Big Data – Hype, or the Future of BI/DW?”. I’ll be looking for volunteers to argue the case for either of the two sides in the debate, so if you’ve got a view on whether big data is going to be the salvation of BI, whether it’ll turn us in to the COBOL programmers of the future, or whether its just a load of hot air (or you just like having an argument), let me know when you arrive and we’ll pull the debating teams together.
Other than that – have a safe journey over, bring something warm and waterproof as the weather is typically England in May, and see at least some of you in Brighton today and tomorrow!
Previewing the BI Forum 2013 Data Integration Masterclass
I guess it’s a British thing to not blow our own trumpet (does that translate the same over in the US?), but something I’m particularly proud about with the upcoming Rittman Mead BI Forum 2013 events is our Oracle Data Integration Masterclass, running on the Wednesday before each event properly starts, and put together by myself, Stewart Bryson and Michael Rainey. Although the main theme for the BI Forum is OBIEE, virtually every BI system that we all work with has a data warehouse of some sort underneath it, and most OBIEE professionals to one extent or another have to understand data warehousing principles, and how Oracle’s data integration tools work. So this year, we thought we’d take a deep-dive into Oracle Data Integrator and the wider Oracle Data Integration Suite, and in this preview positing I’ll be giving you a bit of a preview of what’s coming in the session – and places are still available for the US BI Forum event, and for the masterclass itself if you’ve only registered for just the main conference event.
The masterclass is made up of six sections, delivered by myself, Stewart and Michael, assumes a basic understanding of data warehousing and ETL tools but otherwise gets down into the detail of what we’ve found works well “in the field”. Stewart Bryson, Oracle ACE and Managing Director for Rittman Mead America, will open the session with an overview of ODI and the Oracle Data Integration Suite, taking a look at the product history and walking the audience through the major elements of the ODI product architecture. If you’ve ever wondered what agents do within ODI, why there are two repositories and where WebLogic comes into it, Stewart’s session will make everything clear before we get into the rest of the details.

Then, after coffee, Stewart will carry on and talk about what’s called the Oracle Information Management Reference Architecture, Oracle’s next-generation blueprint for data warehousing and information management that combines the best of Kimball and Inmon with new thinking around “big data” and “data discovery”. ODI and Oracle Data Integration Suite is the enabling technology for this new framework. At Rittman Mead, we use this framework for the majority of our DW customer engagements and we’ll be talking later on in the masterclass about how big data sources, for example, can be leveraged by ODI and brought into your BI environment in the same way as any other regular, relational datasource.

The third section of the masterclass sees Michael Rainey take over the stage and talk to us about ODI’s integration with Oracle GoldenGate, Oracle’s data integration product for real-time analysis and data loading. Michael has taken part in several ODI & GoldenGate customer engagements over in the States, and has worked with Stewart in producing a number of custom ODI knowledge modules to better make use of this powerful new data integration tool. If you’ve read through any of Michael’s blog posts on ODI and Golden Gate and are interested in hearing a bit more detail on how it all works, as well as some real-world practical tips and tricks, this will be an invaluable session for you.

So far I’ve got away with just making the tea, but straight-after Michael is my session, where I’ll be talking about ODI and its new integration with Hadoop, NoSQL and the wider “big data” technology area. I’ve been covering ODI and Hadoop in some blog posts over the past week, but there’s only so much that I can get into a blog post and this session will be the first airing of this new material, where I’ll be demoing all the main integration points and talking about what works well, and where the main value is, with this very interesting new feature.

Then it’s back to Stewart again, where he’ll be talking about creating highly-resilient ETL code that’s also resumable, using features such as ODI 11g’s load plans and the Oracle Database’s resumable space allocation feature. Stewart and I were particularly keen to put together this session as it brings together work Stewart did a few years ago on fault-tolerant ETL in the Oracle Database, with some blog posts I put together over the 2012 Christmas break around highly-resilient ETL with ODI11g. What this session does is explain the background to the ETL resilience features in the Oracle Database, and ODI’s use of WebLogic JEE agents, and demonstrates through some custom knowledge modules how they can be brought together for your project.

Finally, Michael concludes the masterclass with a look at a feature you’re probably vaguely aware of, intend to learn something about, but sounds a bit complex; Groovy scripting and the ODI SDK. In fact, like WLST scripting for OBIEE, learning Groovy and the SDK is the key to automating tedious tasks such as mass-importing and reverse-engineering tables and files, as well as making it possible to add functionality to ODI or integrate it with other standards-based products. In a session almost entirely made-up of live demos, Michael will take us through the basics of Groovy and the SDK, and show us a few examples of where this could add value to your data integration projects.
So there we have it – Brighton is now fully-booked up, but if you’ve already registered for the main event but want to come to the masterclass now too, you can log back into the registration site and update your booking to include the additional masterclass fee. Atlanta is running a week later and so still has a few main event passes left, and again if you’ve already registered for the main conference, use the link in your registration confirmation to go back in and add the masterclass to your booking. And – hopefully we’ll see you all in Brighton or Atlanta for the Rittman Mead BI Forum 2013 in the next two weeks!
New Oracle Magazine Article on Integrating OBIEE 11g with Oracle WebCenter Portal: Spaces
Just a short note to say that my latest Oracle Magazine article is now online, and talks about integrating OBIEE 11g with Oracle WebCenter Portal: Spaces, the “spiritual successor” to Oracle Portal. “New Spaces for BI” introduces Portal: Spaces talks about the high-level benefits of integration, and then goes through a step-by-step example where BI content gets added to a Portal : Spaces team website.
Writing this article brought back a few memories, as I wrote the article for OTN back in 2005 on integrating Oracle Discoverer with Oracle Portal, with the same end-objective of creating a composite intranet-style web application that placed BI content alongside more “line-of-business” web portlets. The technology has come on a lot since the time of that article – apart from Discoverer, we were using Oracle Application Server 10g (pre-BEA and WebLogic), Portal itself and Oracle Database 10g, and whilst the approach in the other article worked, you didn’t have a lot of control over the layout and there was a weird mechanism in the background that showed you cached data, rather than live data in the Discoverer port lets. But the basic principal was the same though – register the Discoverer End-User Layer as a Portlet provider, as we do with OBIEE 11g’s Presentation Services catalog in the Oracle Magazine article, and then users can drag and drop their favourite BI content into their corporate portal.
Also, if you’re interested, I covered OBIEE 11g and WebCenter integration in a lot more detail back towards the end of 2012, with a series of postings on the blog starting with an introduction to Oracle WebCenter and WebCenter Portal, a more detailed look at OBIEE and WebCenter Portal : Spaces integration, and then finally a posting on leveraging the WebCenter Portal : Framework ADF toolkit with OBIEE 11g to programmatically bring together BI content and a portal framework as a custom ADF application.
An Oracle BI “Blast from the Past”: Ten Products from Oracle’s BI History You May Not Have Heard Of…
With Oracle BI Enterprise Edition, the BI Apps and Hyperion EPM Suite, Oracle have a set of reporting and analysis tools that can be considered “best of breed” and compete with any other tool on the market. Coupled with the Oracle database, the engineered systems strategy and the Fusion Apps/Apps Unlimited ERP suites, as a BI developer it’s pretty clear we “bet on the right horse” in terms of which company we backed as developers. But it’s not always been as plain sailing as this, and like every other software company Oracle have released a few products over the years that didn’t take-off so well, never really came together or just got eclipsed by other acquisitions Oracle made over time. So lets take a light-hearted count-down through the Oracle BI, DW and ETL products released over the years that you may not have heard of, but at least some of us bear the scars from trying to implement ….
10. Oracle Warehouse Builder 10gR2′s Data Quality and Enterprise ETL Options
Back in the days before Oracle Data Integrator, Oracle Warehouse Builder was Oracle’s strategic data integration tool, initially sold on a per-developer basis but in time, bundled in with the Oracle database to match similar product packaging from Microsoft.

LIke the old saying about technology reaching the point of perfection before then becoming obsolete, in retrospect OWB10gR2 pretty much got it right in terms of Oracle-centric data integration, and the Data Quality option in particular has yet (to my mind) to be surpassed in term of data quality integration with an Oracle ETL tool. Of course, with the acquisition of Sunopsis Oracle went with ODI as their strategic ETL tool, driven mostly by the fact that it was cross-platform and had a bit more of a SOA/middleware angle than did OWB, but it’s still a shame to see OWB being de-emphasised over the years and we still miss its dimensional data modelling capabilities, integration with Oracle OLAP, and of course the data quality features that were introduced with OWB10gR2′s Data Quality Option.
9. Oracle Browser, Oracle Data Query, and Oracle Discoverer
Again, products made obsolete by newer and better ones coming through, rather than failing in themselves, Oracle Browser was Oracle’s first proper ad-hoc query tool, which in turn begat Oracle Data Query, which then begat Oracle Discoverer, still in use across many EBS sites and still with some features yet to be incorporated into Oracle BI Enterprise Edition.

But its easy to get rose-tinted-spectacles about Discoverer; having to dig out and maintain ancient Oracle JInitiator Java browser plug-ins to get Discoverer Plus to work; Discoverer Viewer only having a fraction of the functionality of Discoverer Plus; the web-based version of Discoverer first appearing with Oracle 9iAS, possibly the worst piece of Oracle software ever released, and so on. But for getting the job done with minimal fuss, Discoverer is still hard-to-beat as an ad-hoc query tool for the Oracle database, and of course its’ still available and runs now as part of the Fusion Middleware 11g setup, and it’ll still be maintained and developer for as long as there’s EBS customers out there wanting to do some simple exploration of their ERP data.
8. Oracle Data Mart Suite
Here’s one for an Oracle BI Trivial Pursuit quiz – what was the name of Oracle’s first combined ETL, data warehousing and reporting product suite, based around Oracle 8i and using ETL code licensed from Sagent? The answer is Oracle Data Mart Suite, a largely-forgotten precursor to Oracle Warehouse Builder that combined Oracle Discoverer, Oracle Designer and a tool called Oracle Data Mart Builder along with Oracle 8i to create Oracle’s first end-to-end BI & data warehousing tool.

Some of the concepts are very familiar to us now – a central repository, agents to collect metadata and run code, graphical tools to create data models and define ETL processes – but Data Mart Builder was Windows-only when Oracle were just about to move wholesale into Linux with Oracle 9i, and of course the ETL functionality pre-dates the inbuilt SQL ETL that came with Oracle 9i. Oracle Warehouse Builder 2.1 came along towards the start of the 2000s and replaced Data Mart Builder and Data Mart Designer, but Discoverer lived on and Oracle still really haven’t got a single install, single metadata store solution to replace it. Data Mart Suite even pre-dates my involvement with Oracle, but I’d be interested if anyone reading this (my guess – Kent Graziano ;-)) has any first-hand experience in working with it.
7. Oracle Darwin
Just around the same time as Oracle 9i was released, Oracle made the acquisition of Thinking Machines, a data-mining and supercomputer company based out in the States who sold a product called Darwin, a Clementine-like GUI analytical workbench that Oracle later rebranded as “Oracle Data Mining”.

Darwin never really saw the light of day with Oracle but the internal algorithms and technologies went on to form the core of the Data Mining Option for the Oracle Database (now part of the wider database Advanced Analytics Option), which now has GUI elements of its own but does all of the processing in the Oracle database. Technology derived from Darwin can also be found today in products like Oracle Spend Classification, working under the covers to classify corporate spend using data mining classification algorithms.
6. Oracle BI Applications 7.9.5.2
Now we’re getting into the interesting ones. Four weeks of my life I’ll never get back were spent back in 2009 getting to grips with the first version of the Oracle BI Apps that used ODI, rather than Informatica, as the ETL tool. Sources and targets in this initial release were limited to just EBS 11.5.10 on Oracle Database 10gR2, but other than that it was fully-functional, with a method for doing customisations, new configuration tools that did away with the need for the DAC, and all of the relevant SDE and SIL mappings re-implemented as ODI interfaces and packages.

But this was back in the days of ODI10g, and there were no load plans or any other features since introduced to ease the move to ODI with the BI Apps, and the customisation approach was slightly scary and complex, to say the least. In the end, only one release of BI Apps on ODI10g ever came out, but of course we’re now just nearing the point where BI Apps 11g gets full support for ODI as an alternative to Informatica as the ETL tool, and seeing the various workarounds Oracle had to do with ODI to get it to work as an Informatica alternative back in 2009 made it obvious to me why features such as load plans were introduced over the past few years.
5. The Oracle BI Spreadsheet Add-in for Excel
A spreadsheet add-in with less useful functionality than Oracle BI Office, the GUI performance of Warehouse Builder and an OLAP server back-end that nobody used. One of a number of entries in this chart based around Oracle 9i OLAP.

This, and No.1 on our list were probably the single biggest reason Oracle ended-up buying Hyperion – the chances of an accountant actually using this Excel add-in, as opposed to say Hyperion’s Essbase add-in, were about as close to zero as you could get, assuming you could find anyone still using Oracle OLAP after the bodged migration from Express Server. But – Oracle 9i OLAP going so badly paved the way, in time and several years later, for the Hyperion acquisition, and now Oracle OLAP sensibly focuses on the Simba MDX Provider for Oracle OLAP along with Microsoft Excel pivot tables, the descendants from this product release are actually pretty darn good.
4. Oracle Business Intelligence Essbase Integrator
Another one for Oracle BI Trivial Pursuit – which product from Oracle integrates OBIEE, Essbase and the BI Apps, so that you can click on links on your BI Apps dashboard and launch Smarview, preserving the POV context from the related analysis on the dashboard?

The answer is the Oracle Business Intelligence Essbase Integrator, a product launched by Oracle back in 2010 and which appeared to me, at the time, as wildly ambitious but solved a real problem – how do you combine the capabilities of Essbase and OBIEE whilst in the background, keeping their metadata in-sync.

Rather predictably, we didn’t really hear much about this product again which was a shame, as the concept was superb (albeit a bit “Heath Robinson”, or “Rube Goldberg” as the Americans would say). I suspect we’ll see something around this same concept going into the 12c timeline as it’s such an obvious “win” for Oracle, but for now, it’s a product best known as the challenge we set Stewart Bryson back at the Brighton BI Forum 2011 when Venkat couldn’t make it over – present Venkat’s OBI Essbase Integrator session as if it were his, and without seeing the content of each slide until it came up on the projector.
3. Oracle Daily Business Intelligence
Prior to the BI Apps and around the time of Oracle E-Business Suite 11i, reporting against EBS data was done either through Discoverer, or through Oracle Reports and FSGs. Oracle Discoverer could integrate with Oracle Portal, but it was fairly primitive integration and Portal wasn’t really cut-out to be a BI Portal, more being used for intranet-style applications like data entry and staff directories.
Oracle Daily Business Intelligence (DBI) improved on this situation in two ways; first, it was based off-off Oracle Database materialised views, speeding up report response times and, in theory, taking the load off of your underlying ERP system; second, it had its own custom portal and dashboard framework that was faster, more responsive and “tighter” than Oracle Portal and its Discoverer portlets.

DBI may well have got more traction over time but probably was eclipsed by the major acquisition spree that Oracle went on in the early 2000′s, buying Peoplesoft (and thereby JD Edwards) and Siebel, and then using Siebel Analytics along with BEA’s WebLogic tooling to create the successor BI-on-ERP platform, OBIEE and the BI Apps. Which was probably a good thing in the end, as I never met anybody actually able to customise Daily Business Intelligence, a task that makes customising the BI Apps seem like plugging your iPod into iTunes running on a Mac.
2. Oracle 9i OLAP, and Oracle Business Intelligence Beans
On to the final two, and they’re both based around Oracle 9i OLAP, Oracle’s (in retrospect) disasterous replacement for Oracle Express Server which lost them the OLAP market they owned back in the late 90′s. I’ve got mixed feelings on 9i OLAP as I know many of the people involved in its development and marketing, and nowadays in its database OLAP option form it’s an excellent product, I think technically and architecturally better than Essbase. But the introduction of Oracle 9i OLAP was a masterclass in suicidal product marketing; first, there was no backward compatibility with Express Server tools, so all of the users of Oracle Financial Analyzer and Oracle Sales Analyzer had to wait years for the product at #1 in our list, with of course most of them decamping to Microsoft OLAP Services or Arbor Essbase instead. The first version of Oracle 9i OLAP was ROLAP only, with terrible, buggy performance and a wait of a year or so before the MOLAP version came out, again without backwards compatibility with Express Server tools (due to the removal of the key SNAPI and XCA communications APIs that Express Server applications used)

All of this made great technical sense, and if the MOLAP version of 9i OLAP had come out at the same time as the ROLAP version, and if Oracle somehow managed to keep SNAPI support so that OFA and OSA customers could technically migrate their OLAP cube to Oracle 9i without loosing their tool access. It may have all worked out. But the “icing on the cake” was the lack of any proper ad-hoc or OLAP query tool support right at the start (Discoverer “Drake” came a few years later), with customers expected to – get this -write their own BI tool using a Java component technology called Oracle Business Intelligence Beans.

A few parters, including myself in this bit of history from around ten years ago, gamely tried to generate interest around 9i OLAP and BI Beans, but combined with Microsoft’s entry into the OLAP market and Arbor (and then Hyperion’s) focus on the finance department, rather than DBAs and IT who never actually buy OLAP servers, Oracle OLAP never regained the market share that Express Server had, even though as I said earlier it’s arguably a better, more scalable and easier-to-manage OLAP Server than Essbase.
The last laugh is on the BI Beans product development team though, as the BI Beans query builder became the inspiration for OBIEE 11g’s “Selection Steps” feature, whilst its data visualisation components found their spiritual successor in ADF’s Data Visualization Tools (DVT) feature, which provides the BI visuals behind OBIEE, the latest version of Endeca Information Discovery, and of course the Oracle Fusion Apps.
1. Oracle Enterprise Planning & Budgeting
Number one in our list of Oracle’s slightly crazy BI tools from the past was Enterprise Planning & Budgeting, the long-awaited replacement for Oracle Financial Analyzer and Oracle Sales Analyzer based around the Oracle 9i OLAP platform. More akin to the Stone Roses’ “Second Coming” and about as well critically received, EPB was the “aircraft carrier” to OFA’s “motor torpedo boat”, had a list as long as your arm of critical patches you had to apply before you could use it, and required installation along with EBS (and knowledge of a set of arcane setup steps) before you could use it.

Coupled with a painfully-slow user interface for users typically used to split-second OFA response-times, EPB was long in the coming but quickly despatched when Oracle first adopted Siebel Analytics and the BI Apps as their new BI Platform, and then bought Hyperion and made Essbase and Hyperion Planning the centrepiece of their performance management strategy, something that carries on to this day.
So there we have it – a light-hearted look through some of the Oracle BI products that didn’t make it to the big time, and a bit of history to explain why OBIEE and EPM Suite are the tools we use today. Most of the Oracle PMs who looked after these tools are still with us, working on OBIEE and its related technologies, so apologies if I’ve inadvertently offended anyone by picking on one of the products they looked after – it was all fun at the time and most of the products would have stayed with us, and gone on to be successes were it not for the massive strategic shift Oracle made back at the turn of the century towards cross-platform, and away from the Oracle database begin the centre of everything. Let me know if you’ve had any experiences with these tools, or if you’ve got any corrections or additions to their stories.
OBIEE, ODI and Hadoop Part 4: Hive Data Transformation & Integration via ODI 11g
In the previous three articles in this series (and well done for everyone that’s still with us), we looked at how OBIEE 11.1.1.7 and ODI 11.1.1.6+ can access Hadoop data sources via a related technology called Hive, looking in the second article in more detail at the OBIEE aspect including how the OBIEE metadata import was set up, and how new HiveODBC drivers supplied by Oracle make this all possible. In the last posting in the series, we saw how ODI can be used to populate the Hive “warehouse” tables, stored on Hadoop’s HDFS (Hadoop Distributed File System) storage grid, that are then analysed via HiveQL and MapReduce, and in this final post, we’ll take a look at how ODI can go beyond simple loading from file into Hive tables and start to do some of the data transformation, and integration tasks that we regularly perform with more traditional data sources and targets.
In the previous ODI and Hive/Hadoop example, we used an ODI interface to load data from a local file into a Hive table, using the IKM File to Hive knowledge module. In this first example columns in the source file were mapped 1:1 into “columns” in the Hive table source, like this:

whilst in the second example, we used a Hadoop “SerDe” serializer-deserializer transformation to parse incoming weblog rows into the target Hive column format, again using this same IKM File to Hive knowledge module.

Which of course is great if you’re loading data directly from files into corresponding Hive warehouse tables, but what if you’ve already got data in Hive but you want to “reshape” or transform it, creating the equivalent of a star schema, say, of a set of “normalised” tables?
Hive, compared to Oracle at least, is pretty simplistic in terms of the data manipulation you can do with it, and is more akin to working with Oracle external tables than a full insert-update-delete-ACID-compliant database (but then again, it was never intended to be that). You can only insert new data into Hive tables, not (ordinarily) update rows or delete them, with the only way to delete data in a Hive table being to drop it, then re-create it new. HiveQL has syntax for joins, group by and some functions, and you can drop in your own custom MapReduce scripts for the more complex stuff – which is not bad going and probably more than enough for your average Hadoop/Big Data analyst.
But the scenarios we’re considering probably won’t have a Hadoop expert around, and are typically something like:
- We’ve got some marketing or analyst-type users who want to access some behavioural, weblog or activity data sitting in Hadoop/HDFS/Hive, they’re happy users of OBIEE, and we want to be able to connect OBIEE to this data so that they can report on it – which probably means “reshaping” it somewhat to fit OBIEE’s preference for star-schema (or at least simplified, denormalized) source data models, or
- We’ve got some data sitting in Hadoop/HDFS that we’d like to add in as a new source into our data warehouse, and ODI is the preferred tool for bringing in new sources
In neither case do we have much in the way of Hadoop or MapReduce skills, so we can either (a) use OBIEE’s BI Repository to do some source data reshaping, or even better (b) do the job properly, maybe as part of a much wider ETL process, using ODI. The key thing is fitting Hadoop into ODI’s way of doing things, and giving ODI the ability to do Hadoop and Hive-specific tasks through its extensible “knowledge module” framework.
In the example above that I used to illustrate ODI’s ability to work with Hive, I used the IKM File to Hadoop knowledge module that comes as part of the ODI Application Adaptor for Hadoop (ODIAAH); ODIAAH is one of a number of Fusion Middleware “application adapters”, and is licensed separately (but in conjunction with) ODI EE costing around the same as GoldenGate (around $17000/processor, according to the current tech price list). For some more background into ODIAAH this self-study training session goes into the basics, but at a high-level this adapter actually ships four mainly Hive-based ODI KMs that enables data loading, transformation and integration between Hadoop/Hive/HDFS and the Oracle database (amongst other sources/targets);
- IKM File to Hive (Load Data) : what we’ve been using so far, used to load Hive tables from local and HDFS-stored files
- IKM Hive Control Append : used for loading data into Hive tables, with data sourced from one or more other Hive tables – the equivalent of loading one relational table by joining, filtering and transforming other tables
- IKM Hive Transform : a variation on the previous KM that allows incoming data to be transformed via Python or Perl scripts
- IKM File-Hive to Oracle : load data from either a Hive table, or from files stored on HDFS, into an Oracle database using the separately-licensed Oracle Loader for Hadoop
- CKM Hive : a check knowledge module for Hive (which because of its “schema on read” rather than “schema on write” approach, doesn’t natively support keys or constraints)
- RKM Hive : a “reverse-engineering” KM that allows ODI to read table definitions from the Hive metastore, and use them to create ODI datastore definitions
A typical Oracle BI&DW project is going to use these KMs for a couple of main reasons; one, as stated before, is because their might be information sitting in Hadoop that the BI/DW system wants access to, and would otherwise have to start writing MapReduce, Pig, Sqoop etc code to get into their data warehouse. The other might be to leverage Hadoop’s ability to crunch and count large sets of data massively parallel, at relatively low cost, with ODI then initiating and monitoring the process, then loading the results into a more traditional Oracle data store. Other technologies, some implemented or licensed by Oracle, also surround this process – Oracle NoSQL database for example – but for now lets concentrate on Hadoop and Hive, and see what else these KMs can do for us.
Starting off with IKM Hive Control Append, you might have a situation where you’ve already got some data in Hive tables (perhaps through loading them up using ODI’s File to Hive (Load Data) KM, but the data needs joining, transforming, filtering or otherwise reshaping before you can connect a tool such as OBIEE to it. In this case it doesn’t make sense to use ODI File to Hive (Load Data) as this KM is for when the data sits outside Hive in source files, so instead we can use IKM Hive Control Append to truncate/insert append new rows into an existing target Hive table.
This process works very similar to regular table loading IKMs (except of course you can’t insert/update into the target, only insert append); for example, in the screenshot below, two Hive tables, one for customer information and one for salesperson information, are joined and the results transformed using HiveQL with the results loaded into another Hive table.

Looking at the loading code generated by ODI for this interface, you can see that the HiveQL used to extract and load the data looks very similar to Oracle SQL, and in fact this KM is about the closest one to “regular” relational database ones out of the set of ODIAAH knowledge modules.

Things get more interesting with the IKM Hive Transform, a knowledge module that takes data from any source and loads it, via custom shell scripts, into a target Hive table. These transformation shell scripts are typically written in Python or Perl, and give you the ability to write your own custom pre-processing or transformation code that (via ODI’s temporary interfaces feature) can then be used as “inline views” or multi-step processes when performing more complex data loading processes around Hive. In the example below, a three-step data loading process first creates and then loads an Apache Weblog file using IKM File to Hive (Load Data), then sessionizes (tags the log file with IDs to identify all activity within a particular browser session) using IKM Hive Transform:

Taking a look at the final “Sessionize Weblog” step in more detail, you can see in the interface Flow tab that this step uses the IKM Hive Transform module, and a script called “sessionize.pl” to do the transform work.

The perl script itself then parses through the log file information and works out the start, and stop points for each individual user session, outputting the results which are then transformed into the correct target columns by the PRE_TRANSFORM_DISTRIBUTION settings in the KM options.

Finally, the IKM File-Hive to Oracle knowledge module takes things in the other direction, extracting from Hive tables or HDFS files into an Oracle database, via the Oracle Loaded for Hadoop big data connector, but that’s really a topic in itself and for another day, when I’d like to look in more detail at the big data connectors in general, and how you can leverage Hadoop and HFDS from within SQL and PL/SQL commands. For now though, this concludes my look at Hadoop connectivity from within OBIEE 11.1.1.7 and ODI.11.1.1.6, but if you’re like me this brings up as many questions as it answers; for example:
- How would I go about setting up my own Hadoop/Hive/HDFS development environment, and can I use Windows or does only Linux make sense?
- Just what could I do with Hadoop as a data source that I can’t do with regular Oracle and file sources; in terms of scale, and also complexity/speed?
- Can ODI also make use of Hadoop data loading/transformation tools like Pig, or Sqoop?
- Do any of the related/commercialized Hive/Hadoop technologies add anything to these scenarios – for example, Cloudera’s Impala (for BI metadata/reporting) or Cloudera Manager (for Hadoop administration)?
- And – is there a time/volume threshold where Hadoop makes more sense as a data processing platform than an Oracle database?
If these sound interesting to you, they’re exactly what I’ll be covering during my part of the Data Integration Masterclass at the Rittman Mead BI Forum 2013 events in Brighton and Atlanta, running in just a few weeks time. We’ve still got a few places left, so if you’re interested and want to see all this technology in action, sign-up now and I’ll hopefully see you soon.
OBIEE, ODI and Hadoop Part 3: A Closer Look at Hive, HFDS and Cloudera CDH3
In the first two parts in this series, I looked at the recently-added support for Apache Hadoop as a data source for OBIEE 11.1.1.7 and ODI 11.1.1.6, and explained how the Hadoop support was really enabled through a related technology called Hive. In the second part in the series I showed how OBIEE 11.1.1.7 could report against “big data” sources using Hadoop and this Hive technology, but this all of course pre-supposes that we have data in Hive in the first place. So what actually is Hive, how do you load data into it, and can ODI help with this process?
To take a few steps back, Apache Hive is a Hadoop-family project that provides a “data warehouse” layer over Hadoop, through a metadata layer not unlike OBIEE’s RPD together with a SQL-like language called HiveQL. Coupled with ODBC and JDBC database drivers, BI tools like OBIEE use Hive to get access to big data sources, as the HiveQL language that Hive uses is very similar to SQL used to access databases such as Oracle, SQL Server or mySQL. Delving a bit deeper in the Hive product architecture, as shown in the diagram below Hive has a number of components including a “database engine”, a metadata store, APIs for client access, and a link through to Hadoop to actually load, process and retrieve data in HDFS (Hadoop Distributed File System).

So what’s HDFS then? HFDS is a fault-tolerant, distributed filesystem that’s a core part of Apache Hadoop, and stores the data that MapReduce jobs then process via job trackers, task trackers and all the other Hadoop paraphernalia. HDFS is accessed through a URI (URL) rather than through your Linux filesystem browser, but distributions such as Cloudera’s CDH3 and CDH4 ship with tools such as Hue, shown below, that provide a web-based interface into HDFS so that you can browse HDFS like a regular OS-level filesystem.

Notice how there’s a “user” folder like we’d get with Linux, and within that folder there’s a home folder for Hive? With Hive, generally the data you manage using Hive is actually loaded into a directory structure under the “hive” user, either using data taken from another directory area in HDFS or from external files. Hive’s data is still in file form and accessed via MapReduce and Hadoop, but it’s in a directory area away from everything else. You can, however, tell Hive to create tables using data held elsewhere in HDFS, analogous to Oracle’s external tables feature, which then skips the data loading process and just maps table structures onto files held elsewhere in the Hadoop filesystem.

In most cases when we’re considering OBIEE accessing Hadoop data via Hive, the data would have been loaded into Hive-mananged tables tables beforehand, though it’s possible that Hive table metadata could have been mapped onto other data in HDFS. In your own particular Hive implementation and assuming you’ve got Hue installed, and Beeswax, a table browser for Hive that usually comes with Hue, you can see where each individual table within your Hive metastore is actually held; in the examples below, the dwh_customer Hive table is a managed table and has its data stored within the /user/hive/warehouse/ HDFS directory, whilst the ratings table has its data stored outside of Hive’s directory structure, but still within the HDFS managed filesystem.

So how does one create a Hive table, load data into it and get it ready for OBIEE access, and can ODI help with this, as we asked earlier? Before we get into ODI then, let’s take a look at how a Hive table is created and loaded, and then we’ll see how ODI does the same job.
With thanks to the ODI product development team’s David Allan, who put together some great Hive and ODI examples in this blog post, let’s start by creating a Hive table against the same movie ratings data in the right-hand screenshot below, but this time with the data actually loaded into Hive’s directory structure (i.e. a “managed” table). From the Hive command-shell, I type in the following commands to create the managed table, after SSH’ing into the VM running Hive:
officeimac:~ markrittman$ ssh oracle@bigdatalite
Warning: Permanently added the RSA host key for IP address '192.168.2.35' to the list of known hosts.
oracle@bigdatalite's password:
Last login: Mon Apr 22 10:59:07 2013 from 192.168.2.47
=====================================================
=====================================================
Welcome to BigDataLite
run startx at the command line for X-Windows console
=====================================================
=====================================================
Host: bigdatalite.us.oracle.com [192.168.2.35]
[oracle@bigdatalite ~]$ hive
Hive history file=/tmp/oracle/hive_job_log_oracle_201304250732_1523047910.txt
hive> create table movie_ratings (user_id string
> , movie_id string
> , rating float
> , tmstmp string)
> row format delimited fields terminated by '\t';
OK
Time taken: 3.809 seconds
hive>
At this point the table is created but there’s no data in it; that part comes in a moment. I can see the table structure and its empty state from the Hive command-line:
hive> describe movie_ratings;
OK
user_id string
movie_id string
rating float
tmstmp string
Time taken: 0.168 seconds
hive> select count(*) from movie_ratings;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=
In order to set a constant number of reducers:
set mapred.reduce.tasks=
Starting Job = job_201303171815_0021, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201303171815_0021
Kill Command = /usr/lib/hadoop-0.20/bin/hadoop job -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201303171815_0021
2013-04-25 07:40:51,581 Stage-1 map = 0%, reduce = 0%
2013-04-25 07:40:56,617 Stage-1 map = 0%, reduce = 100%
2013-04-25 07:40:58,640 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201303171815_0021
OK
0
Time taken: 12.931 seconds
hive>
and also from the Beeswax web UI:

So how do we get the data into this table, without any tools such as ODI? I can either load data straight from files on my local workstation, or I can upload them, for example using Hue, into the HDFS filesystem first.

Now I can use the HiveQL LOAD DATA command to load from one of these HDFS tables into Hive, and then count how many rows have been loaded, like this:
hive> load data inpath '/user/oracle/movielens_src/u.data'
> overwrite into table movie_ratings;
Loading data to table default.movie_ratings
Deleted hdfs://localhost.localdomain/user/hive/warehouse/movie_ratings
OK
Time taken: 0.341 seconds
hive> select count(*) from movie_ratings;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=
In order to set a constant number of reducers:
set mapred.reduce.tasks=
Starting Job = job_201303171815_0022, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201303171815_0022
Kill Command = /usr/lib/hadoop-0.20/bin/hadoop job -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201303171815_0022
2013-04-25 08:14:24,159 Stage-1 map = 0%, reduce = 0%
2013-04-25 08:14:32,340 Stage-1 map = 100%, reduce = 0%
2013-04-25 08:14:42,420 Stage-1 map = 100%, reduce = 33%
2013-04-25 08:14:43,428 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201303171815_0022
OK
100000
Time taken: 26.32 seconds
hive>
So how does this process look when using ODI to do the Hive data loading? Let’s start with importing the Hive table metadata for the movie_ratings table I just created from the Hive command-line shell, by going over to the Topology navigator in ODI 11.1.1.6 – note that you’ll need to configure ODI to connect to your Hive, HDFS and Hadoop environment beforehand, using the Oracle Data Integrator for Hadoop documentation as a guide, with this adapter being an extra-cost license option on top of base ODI Enterprise Edition.
Hive has its own technology type within the Topology navigator, and you create the connection through to Hive using the HiveJDBC driver, first adding the connection to the Hive server and then specifying the particular Hive database / namespace, in this case selecting the “default” database for my Hive system.

Now I can reverse-engineer the Hive table structures into a Designer navigator model, just like any other relational table structure.

Within the ODI Topology navigator you can then create File technology connections either to files held in HFDS, or more likely with ODI to files on your workstation, or server, filesystem, like this:

and then add the filedata stores to the Designer Navigator Model list, entering the correct delimiter information and reversing the column definitions into the datastore definition.

Now it’s a case of creating an interface to load the Hive table. In this instance, I map each of the source file “columns” into the Hive table’s columns, as the source file is delimited with an easily-usable structure.

Then, over in the Flows tab for the interface, I make sure the IKM File to Hive knowledge module is selected, keep the default values for the KM options (more on these in a moment), and then save the interface.

Now it’s a case of running the interface, and checking the results. Notice in the Operator navigator code panel, the LOAD DATA command that ODI is generating dynamically, similar to the one I wrote manually earlier on in the article.

Going back to my Hive command-line session, I can see that there’s now 100,000 rows in the movie_ratings Hive table.
hive> select count(*) from movie_ratings;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=
In order to set a constant number of reducers:
set mapred.reduce.tasks=
Starting Job = job_201303171815_0024, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201303171815_0024
Kill Command = /usr/lib/hadoop-0.20/bin/hadoop job -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201303171815_0024
2013-04-25 16:59:12,275 Stage-1 map = 0%, reduce = 0%
2013-04-25 16:59:18,346 Stage-1 map = 100%, reduce = 0%
2013-04-25 16:59:29,467 Stage-1 map = 100%, reduce = 33%
2013-04-25 16:59:30,475 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201303171815_0024
OK
100000
Time taken: 27.251 seconds
Now in many cases the data going into a Hive table isn’t neatly arranged into columns within delimited files; it could be, for example, web log data that you’ll need to parse using regular expressions or other APIs or standard parsers. When that’s the case, you can use an option with the IKM File to Hive knowledge module to override the normal column-to-column mappings and instead use an expression, something Oracle have done in their demo environment for parsing these types of log files.

“ROW FORMAT SERDE” is a reference to Hive’s “Serializer – Deserializer”, or row-formatting feature, that gives you the ability to use regular expressions and other data manipulation techniques to, in this case, allocate incoming file data to the proper columns in the target hive table.
So now we’re at the point where we can use ODI to populate the Hive tables that OBIEE in turn uses to access Hadoop data sources. But what if the data we want to load into Hive isn’t in the format or shape we need, and we need to join, filter or otherwise work with Hive data and tables before we can report on it. And what if we want to get data out of Hive and into regular tables if a relational data store makes more sense than Hadoop, for a particular reporting requirement? Check back tomorrow for the final part in this series, where we’ll answer these remaining questions.
SmartView as the Replacement for BI Office with OBIEE 11.1.1.7
Apart from system-wide improvements to Essbase integration across OBIEE 11.1.1.7, the other Essbase-related improvement that came with this latest release was the (re-)introduction of SmartView as the replacement for Oracle BI Add-in for Microsoft Office (“BI Office”), OBIEE’s previous MS Office solution. As a reminder, BI Office appeared with OBIEE 10.1.3.3 back in 2007/8 and supported integration with Microsoft Excel and Powerpoint, allowing you to download analysis views from the BI Catalog and then view them within Excel and Powerpoint.

What you couldn’t do with BI Office though was use it to create new analyses, or upload what you’d created back to the BI Catalog. There was also no integration with Microsoft Word or Outlook, which meant it was a feature meant more for viewing and copying analyses into Excel and Powerpoint rather than as a “first class” report authoring environment.
Then when OBIEE 10.1.3.4 was released, a number of EPM Suite products were integrated with OBIEE, including Workspace (now resurrected with OBIEE 11.1.1.7), SmartSpace (where did that go?) and SmartView, the long-term replacement for Essbase’s somewhat minimalist Excel Add-in. This was all good stuff except that, in terms of OBIEE support, this version of SmartView was essentially unusable, rendering OBIEE data in an Essbase-like way that made little sense for an OBIEE user.

“The UI takes a bit of getting used to” was my comment at the time, which in retrospect was a bit of an understatement and this version of SmartView had little to no take-up within the OBIEE world, with BI Office carrying on until now as the only viable MS Office integration approach. Now though, the new improved version of SmartView is with us, so how well does it work with OBIEE data?
SmartView can be download from the BI Presentation Services homepage, but note that this is the 32-bit version and you’ll need to go to My Oracle Support for the 64-bit version, available using patch ID 16238382 (at the time of writing, for SmartView version 11.1.2.2.310). Once its installed, select SmartView > Options > Advanced and enter your general EPM Suite Smartview Provider Services URL into the Shared Connections URL setting (in the format http://[machine_name:port}/workspace/SmartViewProviders), like this:

This setting only covers SmartView connecting to Essbase and Financial Reporting, so to connect to OBIEE's Presentation Services Catalog you'll need to create what's called a Private Connection (or define a shared connection for OBIEE within an XML file, as detailed in the SmartView 11.1.2.2.310 docs), by pressing the Panel button in the menu ribbon, selecting Private Connections from the Smart View menu, then clicking on the Create new connection button.

Then, when prompted for the SmartView connection type, select Oracle BI EE, then type in the OBIEE SmartView URL in the format http://[machine_name:port]/analytics/jbips, and press Finish to complete this part of the process.

Then, when prompted enter the username and password for your OBIEE system, and then save the connection as a private connection to your workstation.

Now you should be able to browse the BI Catalog and select a SmartView report, for example, to view within Excel.

Or you can select any view from a regular analysis, and add that to Excel just as you did with BI Office.

More importantly though, the View Designer feature allows you to create a new report from scratch, selecting from any subject area in the BI Catalog and creating a report from right within Excel.

This report can then be manipulated either as an Excel pivot table (pictured below) or an OBIEE pivot table, giving you an OBIEE-within-Excel experience far more intuitive and usable than the earlier incarnation of SmartView.

Additional calculated fields can be added, in what is arguably a more obvious way than you’d do so in the Analysis Editor…
… and charts can be developed as well, using a similar set of of chart types to the ones provided by the Analysis Editor.

Then, once you’re done, you can either save the Excel (or Word, or Powerpoint, or whatever) document to your workstation’s filesystem, or you can upload to the BI Presentation Catalog using the Publish View button…

… and then – get this – open the report in the Analysis Editor, just like any other analysis in the catalog. Impressive stuff (although the calculation defined in Excel didn’t make it through to OBIEE, and the upload feature only seems to bring a single view at a time, but this is version 1.0)

There’s tons more to Smartview and in reality, presumably some of the new OBIEE stuff won’t work properly in this first release, but it’s a huge improvement over the old OBIEE MS Office plug-in, and it’s also useful being able to use the same MS Office plugin for all Oracle’s BI & EPM tools, with full 32 and 64-bit support for all the modern MS Office versions.
Essbase and EPM Integration Improvements in OBIEE 11.1.1.7
One of the major new feature areas in OBIEE 11.1.1.7, but which has so far got very little attention, is the significant improvement in integration between Essbase, the Hyperion EPM Suite, and OBIEE 11g. The integration between EPM Workspace and OBIEE’s Presentation Services which disappeared when 11g came along is now back, along with installation and security integration, a new version of SmartView that (properly) supports OBIEE as a data source, and the ability to spin-off aggregates from the RPD into Essbase ASO cubes.
Now some of these features of course made an appearance in the earlier, 11.1.1.6.2 BP1 release, and integration between OBIEE 11g and EPM Suite has been happening on-and-off right back from the OBIEE 10g days, but where we’re at now with OBIEE 11.1.1.7 is the delivery of a number of things that customers have long been asking for, including:
- The ability to run OBIEE from within EPM Workspace, with single sign-on between the two
- Shared security provisioning and organisation between Essbase and OBIEE, through application roles and policies
- The ability to install Essbase and the other EPM tools into the same WebLogic domain as OBIEE, using a single installer
- A proper Excel (and Word, Powerpoint, Outlook) add-in for OBIEE, with the ability to author reports as well as run existing Answers-authored ones
This is actually one of a number of new feature areas that came with 11.1.1.7 that have had little publicity; as well as better Essbase integration, there’s actually now support for multi-tenancy in the RPD and catalog, Hadoop integration (which we covered in a blog post last week), the View Suggestion Engine, the inevitable changes to MUD, and quite a few others, some of which I’ll try and cover in the next few days and weeks, but for now let’s look at these new Essbase/EPM integration improvements, starting with installation of Essbase and its related tools into the OBIEE WebLogic domain.
As I mentioned back in my OBIEE 11.1.1.7 New Features posting a few weeks ago, the OBIEE product installer now offers Essbase as an installation option alongside OBIEE, Real-Time Decisions (RTD) and BI Publisher. As with RTD, Essbase isn’t included in the base OBIEE+ license, but it is included in Oracle BI Foundation Suite, the product package that Oracle encourage new customers to take out an includes OBIEE, Scorecard & Strategy Management, Essbase and BI Mobile. Selecting Essbase during the install process installs it, and the other EPM Suite tools, in the same WebLogic domain as OBIEE, and you can see Essbase within Fusion Middleware Control as a product – separate from OBIEE – that you can manage and monitor.

Essbase Server, and Essbase Studio (the client/server tool used to design and build Essbase cubes) are also now controlled and monitored through OPMN, something that’s been a feature of EPM Suite for several releases now but which is, of course, new for OBIEE.
[oracle@obiee11117 ~]$ cd /home/oracle/obiee/instances/instance1/bin
[oracle@obiee11117 bin]$ ./opmnctl status
Processes in Instance: instance1
---------------------------------+--------------------+---------+---------
ias-component | process-type | pid | status
---------------------------------+--------------------+---------+---------
essbasestudio1 | EssbaseStudio | 12682 | Alive
essbaseserver1 | Essbase | 12685 | Alive
coreapplication_obiccs1 | OracleBIClusterCo~ | 12686 | Alive
coreapplication_obisch1 | OracleBIScheduler~ | 12687 | Alive
coreapplication_obijh1 | OracleBIJavaHostC~ | 12683 | Alive
coreapplication_obips1 | OracleBIPresentat~ | 12684 | Alive
coreapplication_obis1 | OracleBIServerCom~ | 12689 | Alive
[oracle@obiee11117 bin]$
So something that’s been an issue for EPM customers upgrading from OBIEE 10g to 11g was the removal, at the time, of the ability to integrate OBIEE’s Presentation Services within EPM Workspace, and the SSO link between the two products. Back with OBIEE 10.1.3.4 there was an admittedly complicated but supported and working process to integrate the two products together, allowing EPM Workspace customers to “skin” OBIEE to look like Workspace and run the two products together, albeit with separate report catalogs, security models and so forth.

This, coupled with the removal of OBIEE’s Hyperion custom authenticator for the RPD left many EPM Suite customers upgrading to OBIEE 11g in the lurch, leading to workarounds such as this one that we put together recently for one of our customers. Well this integration (mostly…) is back with OBIEE 11.1.1.7, so let’s see what it does, and what functionality is still missing compared to OBIEE 10g.
First off, Essbase and EPM Suite as installed as part of an OBIEE installation isn’t quite the same as EPM Suite installed standalone; most importantly, Essbase in this OBIEE incarnation has a different security model than “standalone” EPM Suite, in that it uses the same system of application roles and policies that the Fusion Middleware 11g-centric OBIEE 11g does, rather than the Shared Services and groups that standalone EPM Suite does. Also, the OBIEE 11.1.1.7 install installs just the following EPM Suite products:
- Essbase Server, including Essbase Agent, Essbase Studio, Essbase Administration Services, Provider Services
- Financial Reporting
- Calculation Manager
Therefore you don’t get Planning, Web Analysis and so forth, and you can’t subsequently install them into the domain and Fusion Middleware Control afterwards – so think of Essbase and the EPM Suite tools in this context as an add-on and complement to OBIEE, not a full installation of EPM Suite in their own right. Moreover, the majority of Essbase administration tasks which for standalone EPM Suite installs are performed through MaxL, Shared Services and EAS are performed through Fusion Middleware Control, and Essbase high-availability and clustering works different within this context, for example. The standard product architecture diagram for OBIEE and Essbase combined within the 11.1.1.7 release therefore gets updated, with a number of products added to the Java components, and System components part of the diagram, like this:
Now, when installed as part of OBIEE 11.1.1.7′s WebLogic domain, EPM Workspace is available at http://[machine_name:port]/workspace, and when you launch it you’re presented with a view into the BI Catalog, and menu options to administer the various EPM and BI tools from one place.

Within this catalog are both OBIEE objects such as analyses, dashboards and agents, and EPM objects such as Financial Reporting and SmartView reports.

There are limits to this EPM/BI Catalog integration though – FR reports, for example, can only be opened using the File > Open dialog in EPM Workspace, with an error message showing if you just click on the report itself in the BI Catalog view within EPM Workspace. But SSO between Workspace and OBIEE seems to work (as in, you don’t need to re-enter your BI password when clicking on an analysis in the Workspace Catalog view) as both OBIEE and EPM are working off of the same Fusion Middleware security model, which (the lack of) explains why the feature disappeared for so long after OBIEE 11g was introduced.
Now that OBIEE and Essbase share the same security, the need for the old HSS Custom Authenticator has now gone away, though of course this will only be of use if a customer has moved their Essbase installation into the OBIEE domain, with standalone EPM Suite installations still needing the security workaround mentioned earlier in this article. There’s no upgrade path from standalone EPM Suite installations to this integrated arrangement, so most probably any users of Essbase within this new 11.1.1.7 context will be installing it “net-new”, with the main objective being to enhance their existing BI setup rather than merging their separate BI and EPM platforms into one.
As you’ve probably picked-up by now, much of this new integration ability is down to security harmonised across both Essbase and OBIEE, or more accurately Essbase now having an option to use Fusion Middleware 11g security rather than Hyperion Shared Services. So what does Essbase and FMW11g security look like in practice? Let’s head over to Fusion Middleware Control, in particular the Application Policies administration screen, to take a look.

The big difference when Essbase runs as part of an Oracle BI domain is that authentication, and authorization for Essbase use Fusion MIddleware security rather than Shared Services or Native Essbase security. Although Essbase Administration Services ships with OBIEE 11.1.1.7, you should use Fusion Middleware Control to enable access to particular Essbase databases, and give permission to access tools such as Financial Reporting or Administration Services; the only security role for EAS and MaxL in this setup is to create the Essbase data and metadata filters; these filters are then assigned to users through FMW security resource permissions and application policies, which then are then granted to application roles and thereby to users.
Whilst this probably seems like an over-complicated nightmare to traditional Essbase users, it does have the major advantage that one set of application roles granted to users within a Fusion Middleware system can cover both OBIEE and Essbase permissions, and there’s no need to link to Shared Services or support Native Essbase security. We’ll cover the implications of this more in some future blog posts, but this is the enabling technology that makes the rest of this integration make sense.
With Essbase integrated into the OBIEE BI Domain, you can also now use Essbase as an aggregate persistence target, though this feature comes with the same (slightly strange) approach and limitations that we first encountered when it was first introduced with OBIEE 11.1.1.6.2 BP1; although there’s not the same requirement for the Essbase server only to be used for aggregate persistence, you still have to name the Essbase database in a particular way, it’s ASO-only, and the Aggregate Persistence Wizard still creates a separate ASO database for each aggregation (similar to Oracle Database materialised views) rather than one single cube covering all aggregations. In practical terms – I’m not sure how much you’d use this vs. creating your own Essbase cube in Studio against the whole RPD business area – but it might be useful for OBIEE developers who otherwise don’t know Essbase.
So finally, the other major Essbase-related new feature in OBIEE 11.1.1.7 is SmartView, the successor to Oracle BI Office. But that’s a topic in itself, so I’ll cover that this in the next posting.
Previewing the New OBIEE 11.1.1.7 SampleApp
OBIEE 11.1.1.7 came out a few weeks ago, and recently we looked at Hadoop integration, one of the more interesting new features in this release. Over the next week I’ll be looking in more detail at the changes around Essbase and EPM Suite integration, but today I’ll be looking at a preview release of the upcoming OBIEE 11.1.1.7 SampleApp, provided courtesy of Philippe Lions and the BI Tech Demos team within Oracle. Philippe will be going through the new SampleApp release at the upcoming Rittman Mead BI Forum 2013 events in Atlanta and Brighton, but in the meantime lets take at what’s likely to appear in this new OBIEE 11.1.1.7 demo showcase.
As with previous OBIEE 11g SampleApps, the dashboard front page lists out all of the content, and highlights in bright blue those areas that are new to this release. The 11.1.1.7 SampleApp is largely based on earlier releases to support the 11.1.1.5 and 11.1.1.6 OBIEE versions, with any new content either showing off 11.1.1.7 new features, or adding new functional areas to the SampleApp demo.

The best place to start looking is the New Features Demo dashboard, which highlights new 11.1.1.7 features such as performance tiles, 100% stacked bar charts and waterfall charts, on this first dashboard page:

Totals within tables, pivot tables and other visualisations can now have action links associated with them, to display a financial summary report for example:

Another page on this dashboard shows of the new layout capabilities within dashboard pages; object containers can now have fixed (absolute) width and height sizes, whilst dashboard columns, and rows/columns within table and pivot table views, can also be frozen whilst other areas scroll by.

The new 11.1.1.7 SampleApp is likely to ship with Endeca Information Discovery pre-installed, and configured to provide the catalog search for OBIEE’s BI Presentation Services (a new feature in OBIEE 11.1.1.7). The SampleApp 11.1.1.7 screenshot below shows a typical “faceted search” against the web catalog, displaying key attributes and an attribute search box via an Endeca Information Discovery-style guided navigation pane. The benefit of Endeca providing catalog search vs. the Presentation Server’s regular search feature is that it looks much deeper into the catalog metadata, allows searching across many more attributes, and because it uses an in-memory index, it’s really fast.

There’s also some nice new Oracle R Enterprise content and demos, including an example where R scripts can be selected via the dashboard, parameter values supplied and the scripts then run; ORE using OBIEE’s BI Repository as a data source, and some more examples of ORE analysing the Flight Delays dataset to predict delays on future flights, for example.

11.1.17 comes with a lot of “fit and finish” tweaks to standard visualisations, and one of the dashboard pages shows of new table and pivot table features such as tooltips over row and column values, as well as features such as rollover headers, removable pivot table corners and the like – none of these are world-changing, but they’re often the sort of thing that particular customers want for their systems, and in the past we’ve had to hack-around in Javascript files and the like to meet similar requirements.

If you’re an Endeca Information Discovery developer who’s also interested in the state-of-play around OBIEE integration, there’s a whole dashboard setting out current examples around OBIEE / Endeca integration, including examples of parameter passing between OBIEE and Endeca Studio, Endeca using OBIEE’s BI Repository as its data source, and BI Publisher reporting against Endeca Server data via web service calls.

Finally, the dashboard pages added in to support DBA and developer tasks have been added to, with a new dashboard page for example displaying a list of all the physical SQL queries sent to the database.

Thanks to Philippe and the BI Tech Demos team for the early preview. Check back tomorrow when we’ll continue the look at what’s new with OBIEE 11.1.1.7, by taking a closer look at what’s changed, and dramatically improved, in the area of integration with Essbase and the Oracle EPM (Hyperion) Suite.
Using SampleApp Scripts to run a Simple OBIEE 11g Load Test
If you’ve not done so already, I’d advise anyone interested in OBIEE 11g to subscribe to the Oracle BI Tech Demos channel on Youtube, where Philippe Lions and his team showcase upcoming SampleApp releases, and highlight new features like SmartView integration (to be covered later this week on this blog), integration with Oracle R Enterprise, what’s coming in the new 11.1.1.7 SampleApp, and OBIEE 11.1.1.7 leveraging the Advanced Analytics Option in Oracle Database 11gR2. One demo that I’ve been aware of for some time via the Exalytics program, and that’s also featured in the Tech Demos channel, is a load test demo that uses scripts and internal OBIEE features to run the test, and is used by Oracle to show how many concurrent users an Exalytics server can handle.
What’s particularly interesting about this load test though is that it doesn’t require any external tools such as LoadRunner or JMeter, and the scripts it uses are actually shipped with the full v207 SampleApp VirtualBox VM that is downloadable from OTN. On a recent customer engagement a need came up for a “quick and dirty” load test for their system, so I thought I’d go through how this load test example works, and how it can be adapted for use with any other generic OBIEE 11g (11.1.1.6+) environment.
In the example used in the Youtube video, a report (which actually looks like a dashboard page, but is actually an single analysis compound layout containing two graph views, and a pivot table view) is set up with a special set of filter values; when requested, this analysis will use “randomised” filter values so that response times aren’t skewed by the same values being used each time, and a controlling process outside of the dashboard ramps up 10, 100, 200 and so on separate sessions up to a maximum of 2,000, to simulate the sort of user numbers that an Exalytics server might be required to support.

Then, when the load test is running, the metric display within Fusion Middleware Control is used to show how the server copes with the load (in terms of # of sessions, average response time per query etc), as well as a dashboard page based off of the usage tracking data that shows a similar set of information.

Now of course the reason this sort of test data is important (apart from selling Exalytics servers) is that a report that takes 10 seconds to run, on a system otherwise unused and with only you running queries, might take considerably longer to run when all your users are on the system, due to factors such as disk contention, queuing on database server and mid-tier server CPUs, parallel query getting scaled-back when more than a few users try to run reports at the same time, and so on – so you need to do this sort of load test before unleashing your new dashboards onto your user community. But performing a load test is hard – just ask our Robin Moffatt – so having a ready-made system shipped with SampleApp, that doesn’t require additional software, certainly sounds interesting. So how does it work?
The scripts that control the load test process are contained within the /home/oracle/scripts/loadtest folder on SampleApp, and look like this within the Linux file manager:

The folder actually contains three scripts, and a Java JAR archive file:
- runtest actually runs the load test
- users_list.txt is a set of usernames, that are central to the load test process (more on this in a moment)
- Loadtest_README.txt is the instruction file, and
- LoadTest.jar is a Java program that is called by runtest to log into OBIEE and request the report
Looking through the readme file, the way the process works is that you need to create a set of users with a common password within the OBIEE LDAP directory, and put their usernames in the users_list.txt file. Then, the LoadTest.jar file is called by the runtest script, passing the hostname and port number of the WebLogic server hosting Presentation Services, the path to the analysis that you wish to test against, and the common password, and the script will then initiate a session for each user and then run the report.
Looking at the list of names in the users_list.txt file is interesting, because they all appear to be airport three-letter codes; for example:
SAN
SAT
SAV
SBA
SBN
SBP
MKC
MKE
MKG
MLB
MLI
MLU
MMH
The reason for this becomes clear when you look at the filters behind the analysis that the runtest script calls; to provide the filter predicate randomisation, each run of the report uses the username to filter the origin airport selection, and the other filter values are generated through MOD and RAND functions that in essence, generate random values for each call of the report. So given that we’re not all going to want to test reports based on airport codes, and how the overall testing process works, this presents two challenges to us:
- How we generate a very large number of user accounts with a common password, given that the test process runs the report just once for each user, and how we get rid of these accounts once we’ve finished the testing.
- How we configure the report we want to test to generate “random” filter values – the approach Oracle took with this example is quite “clever”, but we’ll need to come up with something equally clever if we want to do this for our report.
Question 1 seems extricably linked to question 2, so let’s create an example report that we can easily randomise the values for, create a number of views that we can include in a compound layout as Oracle did in the load test demo, and give it a go.
Taking the SampleApp dataset and the A – Sample Sales subject area, let’s create an analysis that has the following columns in the analysis criteria:
- Products.P2 Product Type
- Time.T03 Per Name Qtr
- Time.T02 Per Name Month
- Customer.C3 Customer Type
- Ship To Regions.R50 Region
- Base Facts.1 – Revenue
- Base Facts.1 – Discount Amount
For good measure, create another derived measure, called Base Facts.1 – Gross Revenue, which uses the formula:
- “Base Facts”.”1- Revenue”+”Base Facts”.”3- Discount Amount”
and then create some views off of this criteria so that your analysis looks something along these lines:

Now comes the tricky part of randomising it. We could take the approach that Oracle took with the Airlines load test example and create, for example, a user for each country in the dataset, but instead let’s use Logical SQL’s RAND function to pick a region and calendar quarter at random, and then three of the five customer types, to use as the analysis filters. To do this, we create a filter against this column in the analysis criteria and then convert the filter to SQL, using something like the following SQL clause to filter the quarter randomly:
"Time"."T03 Per Name Qtr" in (
SELECT a.s_1 from
(SELECT
0 s_0,
"A - Sample Sales"."Time"."T03 Per Name Qtr" s_1,
RAND()*100 s_2
FROM "A - Sample Sales"
WHERE
(BOTTOMN(RAND()*100,1) <= 1)
ORDER BY 1, 2 ASC NULLS LAST, 3 ASC NULLS LAST
FETCH FIRST 5000001 ROWS ONLY) a
)
The same goes for the region filter, which we define as:
"Ship To Regions"."R50 Region" in (
SELECT a.s_1 from
(
SELECT
0 s_0,
"A - Sample Sales"."Ship To Regions"."R50 Region" s_1,
RAND()*100 s_2
FROM "A - Sample Sales"
WHERE
(BOTTOMN(RAND()*100,1) <= 1)
ORDER BY 1, 2 ASC NULLS LAST, 3 ASC NULLS LAST
FETCH FIRST 5000001 ROWS ONLY) a )
whereas for the customer type filter, we return the top 3 ordered (random) values, not just the first one:
"A - Sample Sales"."Customers"."C3 Customer Type" in
( SELECT a.s_1
FROM
(SELECT
0 s_0,
"A - Sample Sales"."Customers"."C3 Customer Type" s_1,
DESCRIPTOR_IDOF("A - Sample Sales"."Customers"."C3 Customer Type") s_2,
RAND()*100 s_3
FROM "A - Sample Sales"
WHERE
(BOTTOMN(RAND()*100,3) <= 3)
ORDER BY 1, 2 ASC NULLS LAST, 4 ASC NULLS LAST
FETCH FIRST 5000001 ROWS ONLY) a )
Now when you run the report you should see different filter selections being used each time you run it, similar to what's shown in the preview screenshot below.

One thing I noticed at this stage is, whilst the customer type filtering returned three values, only one would ever be used in the graph prompt, because that's how prompts in a view work vs. the multi-select prompts you get as dashboard prompts. So I then needed to move the customer type column from the prompts are to the Pies and Slices > Pies part of the graph layout (so I then got one pie chart per customer type, not just the one type I was seeing via the graph prompt before), so that my final report looked like this:

and my analysis criteria, including these special filters, looked like this:

Next we need to create an initial set of users so that we can perform the concurrency test. I do this by using the WebLogic Scripting Tool (WLST) script shown below which creates 30 users, assigns them to an LDAP group and then adds that group to the BIConsumers LDAP group, so that they can run the analysis in question (if you're new to WLST or are interested in reading a bit more about it, take a look at this Oracle Magazine of mine that explains the feature).
serverConfig()
password = 'welcome1'
atnr=cmo.getSecurityConfiguration().getDefaultRealm().lookupAuthenticationProvider('DefaultAuthenticator')
group = 'Loadtest-Users'
atnr.createGroup(group,group)
atnr.addMemberToGroup('BIConsumers','Loadtest-Users')
users = ['user1','user2','user3','user4','user5','user6','user7','user8','user9','user10','user11','user12',
'user13','user14','user15','user16','user17','user18','user19','user20','user21','user22','user23','user24',
'user25','user26','user27','user28','user29','user30']
for user in users:
atnr.createUser(user,password,user)
atnr.addMemberToGroup(group,user)
After saving the WLST script to the /home/oracle/scripts/loadtest folder as create_users.py, I then go back to my Mac workstation and SSH into the SampleApp VirtualBox VM to run the script:
Last login: Sat Apr 20 12:58:38 on ttys000
markmacbookpro:~ markrittman$ ssh oracle@obieesampleapp.rittmandev.com
oracle@obieesampleapp.rittmandev.com's password:
Last login: Sun Apr 21 17:13:37 2013 from 192.168.2.31
[oracle@obieesampleapp ~]$ cd obiee/Oracle_BI1/common/bin
[oracle@obieesampleapp bin]$ ./wlst.sh
wls:/offline> connect('weblogic','Admin123','localhost:7001')
Connecting to t3://localhost:7001 with userid weblogic ...
Successfully connected to Admin Server 'AdminServer' that belongs to domain 'bifoundation_domain'.
Warning: An insecure protocol was used to connect to the
server. To ensure on-the-wire security, the SSL port or
Admin port should be used instead.
wls:/bifoundation_domain/serverConfig> execfile('/home/oracle/scripts/loadtest/create_users.py')
Already in Config Runtime
wls:/bifoundation_domain/serverConfig> exit()
Exiting WebLogic Scripting Tool.
Then using the same SSH session I create a new users_list.txt file containing the usernames of these 30 users (use CTRL-D in a Unix session to send the EOF signal to CAT, and stop copying text into the users_list.txt.new file)
[oracle@obieesampleapp bin]$ cd /home/oracle/scripts/loadtest/
[oracle@obieesampleapp loadtest]$ cat > users_list.txt.new
user1
user2
user3
user4
user5
user6
user7
user8
user9
user10
user11
user12
user13
user14
user15
user16
user17
user18
user19
user20
user21
user22
user23
user24
user25
user26
user27
user28
user29
user30
[oracle@obieesampleapp loadtest]$ mv ./users_list.txt users_list.txt.original
[oracle@obieesampleapp loadtest]$ mv ./users_list.txt.new users_list.txt
Finally, I then edit the runtest script to change the path to point to the analysis I created earlier, update the password setting for the users:
[oracle@obieesampleapp loadtest]$ vi ./runtest
so that the final runtest file looks like this:
[oracle@obieesampleapp loadtest]$ cat ./runtest
export JAVA_HOME=/home/oracle/obiee/Oracle_BI1/jdk
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib
echo "Start time: `date`"
echo "Load Test Starting..."
java -jar LoadTest.jar "localhost" "7001" "/shared/loadtest/SampleAnalysis" "welcome1"
echo "Load Test Completed..."
echo "End time: `date`"
Now, we've got everything we need for the initial test; an analysis to run, a set of users to run it with, and the JAR file to perform the test. So let's give it a go...
[oracle@obieesampleapp loadtest]$ chmod a+x runtest
[oracle@obieesampleapp loadtest]$ ./runtest
Start time: Sun Apr 21 18:21:39 PDT 2013
Load Test Starting...
----------------------------------------------
Creating User Sessions for Concurrency Test..
Total active sessions: 30
Initiating Queries..
Total queries initiated: 30
Cleaning up User Sessions created for Concurrency Test..
- Remaining Active Sessions: 30
Completed User Sessions Cleanup
----------------------------------------------
Load Test Completed...
End time: Sun Apr 21 18:21:54 PDT 2013
Where it gets interesting though is when you go over to Fusion Middleware Control, and view the DMS metrics graphs at Capacity Management > Metrics > View the full set of system metrics, where you can then see various metrics such as # of active sessions, request processing time (i.e. how long the analysis took to run), and # requests per minute.

But of course, our current test only runs thirty queries through our thirty users, so its not much of a concurrency test; also, I've got caching enabled, so I'd expect the figure to look fairly good (though this may be what we use in real-life, so the key thing is to make the test as close a reflection of your actual system as possible).
To create a more realistic test in terms of user numbers, there's a couple of options you can use; one option, and the one I use, is to copy the same set of users over and over again into the users_list.txt file, to the point where there are hundreds of rows in the file to simulate hundreds of sessions. Another approach, and perhaps the more purist, is to create many more user accounts and have each one only run one or two reports, but that involves creating the required amount of users and then deleting them afterwards from the LDAP server. I'll go for the first option, using the following Unix commands within my SSH session to copy the file back onto itself many times, giving me at the end around 1600 users to use in my concurrency test:
[oracle@obieesampleapp loadtest]$ wc -l users_list.txt
30 users_list.txt
[oracle@obieesampleapp loadtest]$ cat users_list.txt users_list.txt users_list.txt users_list.txt users_list.txt users_list.txt users_list.txt users_list.txt users_list.txt users_list.txt >> big_users_list.txt
[oracle@obieesampleapp loadtest]$ wc -l big_users_list.txt
420 big_users_list.txt
[oracle@obieesampleapp loadtest]$ cat big_users_list.txt big_users_list.txt big_users_list.txt big_users_list.txt > users_list.txt
[oracle@obieesampleapp loadtest]$ wc -l users_list.txt
1680 users_list.txt
FInally I run the test again, to simulate 1600 users running queries at once:
[oracle@obieesampleapp loadtest]$ ./runtest
Start time: Sun Apr 21 18:42:42 PDT 2013
Load Test Starting...
----------------------------------------------
Creating User Sessions for Concurrency Test..
- Active Sessions: 100
- Active Sessions: 200
- Active Sessions: 300
- Active Sessions: 400
- Active Sessions: 500
- Active Sessions: 600
- Active Sessions: 700
- Active Sessions: 800
- Active Sessions: 900
- Active Sessions: 1000
- Active Sessions: 1100
- Active Sessions: 1200
- Active Sessions: 1300
- Active Sessions: 1400
- Active Sessions: 1500
- Active Sessions: 1600
Total active sessions: 1680
Initiating Queries..
- Queries initiated: 100
- Queries initiated: 200
- Queries initiated: 300
- Queries initiated: 400
- Queries initiated: 500
- Queries initiated: 600
- Queries initiated: 700
- Queries initiated: 800
- Queries initiated: 900
- Queries initiated: 1000
- Queries initiated: 1100
- Queries initiated: 1200
- Queries initiated: 1300
- Queries initiated: 1400
- Queries initiated: 1500
- Queries initiated: 1600
Total queries initiated: 1680
Cleaning up User Sessions created for Concurrency Test..
- Remaining Active Sessions: 1680
- Remaining Active Sessions: 1600
- Remaining Active Sessions: 1500
- Remaining Active Sessions: 1400
- Remaining Active Sessions: 1300
- Remaining Active Sessions: 1200
- Remaining Active Sessions: 1100
- Remaining Active Sessions: 1000
- Remaining Active Sessions: 900
- Remaining Active Sessions: 800
- Remaining Active Sessions: 700
- Remaining Active Sessions: 600
- Remaining Active Sessions: 500
- Remaining Active Sessions: 400
- Remaining Active Sessions: 300
- Remaining Active Sessions: 200
- Remaining Active Sessions: 100
Completed User Sessions Cleanup
----------------------------------------------
Load Test Completed...
End time: Sun Apr 21 18:45:34 PDT 2013
Going back over to EM, I can see the load building up on the server and the response time increasing.

Notice though how the response time actually starts to fall as more queries run? That's most probably caching kicking in, so next time I'll disable caching completely and run the test again. But for now though, this is the Oracle load test script running, and the steps I've outlined here should allow you to run a similar test yourself. Thanks to Phillipe and the Oracle BI Tech Demos team for this, and on a similar topic I'll be previewing the new v303 11.1.1.7 SampleApp in a posting tomorrow.
Rittman Mead BI Forum Atlanta Special Guest: Alex Gorbachev
A few days back, I introduced our special guests for the Rittman Mead BI Forum in Atlanta, focusing first on Cary Millsap. Today I’d like to talk about our other special guest: Oracle ACE Director Alex Gorbachev. Alex was an inspiration for me back in the Oracle Database 10g and early 11g days when I was administering Oracle RAC for several data warehouse customers, and wondering whether RAC was the right platform for BI. Of course it was… and every time I read one of Alex’s blogs (he was quite a prolific blogger back then… we all were once upon a time) or saw him speak, I felt empowered to go take on Cache Fusion.
Alex joined Pythian in Canada as a DBA team lead in 2006. Just two years later, he moved to Australia to successfully startup Pythian Australia. In 2009, he returned to Canada and took up the mantle of Chief Technology Officer, a title he still holds today. He is a member of the distinguished OakTable Network (as is Cary Millsap… something I forgot to mention yesterday), and is a member of the Board of Directors of the Independent Oracle Users Group (IOUG). Alex founded the Battle Against Any Guess Party, a movement promoting scientific troubleshooting techniques. During his time in Australia he also founded Sydney Oracle Meetup, a vibrant local community of passionate Oracle professionals.
Its fortuitous that Mark blogged yesterday on Hadoop… as this is exactly what Alex is speaking on at the BI Forum. His presentation is titled “Hadoop versus the Relational Data Warehouse.” He’ll discuss some of the technical design principles of Hadoop and the reasons for it’s rise in popularity. We’ll get to see the position that Hadoop currently occupies in the enterprise data center, it’s possible future trajectory, and how that trajectory compares with the more traditional relational data warehouse. For the BI developers in the crowd who have perhaps never seen Alex speak… you’re definitely in for a treat. He’s set to speak first thing Friday morning to kick off the last day of the Forum. If you know Alex, you’re obviously aware that he’s an excellent technologist, but you also likely know how much fun he is to be around, so it will be good to have him at the social meet-ups in and around the conference.
I’d really like to thank our friend and business partner Pythian for always supporting Rittman Mead and ensuring that Alex would speak at the Forum. And of course… I’d be remiss if I didn’t say: Love Your Data!
OBIEE, ODI and Hadoop Part 2: Connecting OBIEE 11.1.1.7 to Hadoop Data Sources
In yesterday’s post I looked at the key enabling technologies behind OBIEE and ODI’s connectivity to Hadoop, and today I’ll look at how OBIEE 11.1.1.7 can now access Hadoop data sources through two related technologies; Hive, and MapReduce.
In my introduction to the topic I said that whilst writing MapReduce routines in Java, and then orchestrating them through other tools in the Apache Hadoop family could be quite complex technically, another tool called “Hive” provided an SQL-like query layer over Hadoop and MapReduce so that tools like OBIEE could access them. Rather than you having to write your own MapReduce routines in Java, for instance, Hive writes them for you, returning data to OBIEE and ODI via ODBC and JDBC drivers. The diagram below, also from yesterday’s post, shows the data layers used in such an arrangement.

Under the covers, Hive has its own metadata layer, server engine and data store, with developers “loading” data into Hive “tables” which are then generally stored on the HDFS file system, just like any other data processed through MapReduce. Then, when a query is issued through Hive, the Hive Server dynamically generates MapReduce routines to query the underlying data, returning data to users in a similar way to an interactive database SQL session, like this:
markmacbookpro:~ markrittman$ ssh oracle@bigdatalite
oracle@bigdatalite's password:
Last login: Wed Apr 17 04:02:59 2013 from 192.168.2.200
=====================================================
=====================================================
Welcome to BigDataLite
run startx at the command line for X-Windows console
=====================================================
=====================================================
Host: bigdatalite.us.oracle.com [192.168.2.35]
[oracle@bigdatalite ~]$ hive
Hive history file=/tmp/oracle/hive_job_log_oracle_201304170403_1991392312.txt
hive> show tables;
OK
dwh_customer
dwh_customer_tmp
i_dwh_customer
ratings
src_customer
src_sales_person
weblog
weblog_preprocessed
weblog_sessionized
Time taken: 2.925 seconds
hive> select count(*) from src_customer;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=
In order to set a constant number of reducers:
set mapred.reduce.tasks=
Starting Job = job_201303171815_0003, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201303171815_0003
Kill Command = /usr/lib/hadoop-0.20/bin/hadoop job -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201303171815_0003
2013-04-17 04:06:59,867 Stage-1 map = 0%, reduce = 0%
2013-04-17 04:07:03,926 Stage-1 map = 100%, reduce = 0%
2013-04-17 04:07:14,040 Stage-1 map = 100%, reduce = 33%
2013-04-17 04:07:15,049 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201303171815_0003
OK
25
Time taken: 22.21 seconds
hive>
In the example above, I connected to the Hive environment, listed out the “tables” available to me, and then ran a count of “rows” in the src_customers table which in the background, caused a MapReduce routine to be written and executed in the background by the Hive server. Hive has been described as the “Hadoop Data Warehouse”, but it’s not really a data warehouse as you and I would know it – you wouldn’t typically use Hadoop and Hive to store customer transaction data, for example, but you might use it as a store of Facebook interactions, for example, or most popular pages or interaction paths through your website, and someone working in Web Analytics might want to interactively query that set of data in a more user-friendly manner than writing their own Java routines. So how does OBIEE gain access to this data, and what extra software or configuration pieces do you need to put in-place to make it happen?
If you want to have OBIEE 11g access Hadoop data, you’re best going with the 11.1.1.7+ release as this is where it’s most tested and stable. You’ll need to configure drivers at two points; firstly at the server level (Hadoop access is only supported with Linux server installations of OBIEE 11.1.1.7) and then at the Windows-based Administration tool level. Let’s start with the BI Administration tool first, based on the instructions in the 11.1.1.7 Metadata Repository Builder’s Guide.
To have the BI Administration tool connect to a Hadoop/Hive data source, you’ll need to download some ODBC drivers for Hadoop via a My Oracle Support download, DocID 1520733.1. This gives you a set of HiveODBC drivers along with a PDF explaining the installation process, and once you’ve installed the drivers, you’ll need to open up the ODBC Data Source Administrator applet and create a new HiveODBC data source. In this instance, I call the datasource “bihdatalite” after the server name, and go with the default values for the other settings. Note that “default” is the name of the “database” within Hive, and the port number is the port that the Hive server is running on.

Now I can create a new repository offline, and connect to the Hive server via the HiveODBC connection to start importing table metadata into the RPD. Note that with the current implementation of this connectivity, whilst you can import tables from multiple Hive databases into the RPD, queries you issue can’t span more than a single Hive database (i.e. you can’t specify a schema name prefix for the table name, therefore can’t join across two schemas).
Then, once you’ve imported the Hive table metadata into the RPD, change the physical database type to “Apache Hadoop”, from the default ODBC 3.5 setting that would have been added automatically by the metadata import process. Leave the connection pool call interface at ODBC2.0, put in any old username and password into the shared login details (or a valid username/password if Hive security is enabled), and then save the repository.

You should then be able to use the View Data feature in the BI Administration tool to view data in a particular Hive table, like this:

Now you need to move over to the server part of OBIEE, and configure the ODBC connection to Hive there too. OBIEE 11.1.1.7 comes with DataDirect drivers already installed that will connect to Hive, so it’s just a case then of configuring a connection of the same name to the Hive datasource using OBIEE’s odbi.ini file, like this:
[ODBC Data Sources]
AnalyticsWeb=Oracle BI Server
Cluster=Oracle BI Server
SSL_Sample=Oracle BI Server
bigdatalite=Oracle 7.1 Apache Hive Wire Protocol
[bigdatalite]
Driver=/u01/app/Middleware/Oracle_BI1/common/ODBC/Merant/7.0.1/lib/ARhive27.so
Description=Oracle 7.1 Apache Hive Wire Protocol
ArraySize=16384
Database=default
DefaultLongDataBuffLen=1024
EnableLongDataBuffLen=1024
EnableDescribeParam=0
Hostname=bigdatalite
LoginTimeout=30
MaxVarcharSize=2000
PortNumber=10000
RemoveColumnQualifiers=0
StringDescribeType=12
TransactionMode=0
UseCurrentSchema=0
Note that you also need to configure OBIEE’s OPMN feature to use the DataDirect 7.1 drivers rather than the default, older ones – see the docs for full details on this step. Then, as far as the RPD is concerned, you just need to make a business model out of the Hive table sources, upload it using EM so that its running online on your OBIEE server installation, and your RPD in the end should look similar to this:

Then finally, you can create an OBIEE analysis using this data, and analyse it just like any other data source – except, of course, that there’s quite a lot of lag and latency at the start of the query, as Hive spins up its Java environment, writes the MapReduce query, and then send the data back to OBIEE’s BI Server.

So how do we get data into Hive in the first place, to create these tables that in the background, are access through Hadoop and MapReduce? Check back tomorrow, when I’ll look at how Oracle Data Integrator can be used to load data into Hive, as well as perform other data integration tasks using Hadoop and other big data technologies.




















