BI & Warehousing
Rittman Mead Presentations at UKOUG Tech and EBS Conference
Just a quick remind for anyone who’ll be attending the UKOUG Technology and EBS Conference at the Birmingham ICC next week. We’ll be exhibiting at Stand 50, and will also be delivering the following sessions:
- “Oracle BI EE and Essbase Integration Step-by-Step”, Monday 14:45 – 15:45, presented by myself and Venkat Janakiraman
- “Oracle Warehouse Builder 11gR2 New Features”, Monday 16:15 – 17:00, presented by myself
- “Under the covers of ODI and the Oracle BI Applications”, Tuesday 14.30 – 15.30, presented by myself
- “Oracle BI Enterprise Edition Masterclass”, Wednesday 13.50 – 15.50, presented by myself, Venkat, Adrian Ward and Borkur
This year at the Masterclass, as well as tackling the basics of modeling and reporting, we’ll be tackling four of the questions that most often come up from our customers:
- How do we model non-dimensional (OLTP) sources using the BI Administration tool;
- How do we join together fact tables that do not have the same dimensionality;
- How do we model single-table sources, and
- How do we (safely) migrate OBIEE projects from one environment to the other.
Although OBIEE 10g is a fairly mature technology now, these questions always come up and it seems that no-one is too sure about how to approach them. We’ll deliver a session on this (together with a general update on the product) and try and tackle them for you.
We’re also running a BI “Fringe Event” on the Tuesday night at the Pitcher and Piano from 6pm – 9pm, where we’ll put some money behind the bar, lay on some nibbles and try and get all the BI speakers, delegates and users together for an informal get together. We’ll be inviting all those that came to the BI Forum in May and the Training Days in October as a kind of “reunion”, but we’d be happy to have anyone along with something interesting to say on Oracle BI. If you want to come along, please send me an email or come along to Stand 50 so we can get an idea on numbers. Other than that, see you all in Birmingham in a few weeks time.
Fun with DBMS_Xplan.Display
APEX: Manage Applications
This is the section where you manage your instance wide application attributes. The only functionality that exists in this section, other than mere reporting, is the ability to change an application from Run and Build Application to Run Application Only.

Application Attributes
In the section you can view the following application attributes:
- Workspace
- Application - the number assigned which will be part of the URL
- Parsing Schema
- Application Name
- Last Updated - The user (workspace admin or developer) who last updated this application.
- Date - Time since the last update
- Pages - Number of pages in the application
- Language

Build Status
From here you can view the Build Status of your applications. It will either be Run and Build Application, which means it can be edited by developers and workspace administrators AND viewed by end users. This is typically the status in a non-production environment; or, it will be Run Application Only which is what the status should be in a production environment. It can not be modified in any way but it can be viewed by the end users.

Edit Build Status
The page where you change the Build Status.

Parsing Schemas
Finally, this tells you what the parsing schema is for the application.

Sequences — Not The Only Choice
Oracle BI EE 10.1.3.4.1 & Essbase Connectivity – Enriching Essbase reports with relational attributes
If you had attended our training days event or one of our Open World sessions, you will have noticed that we had covered the various aspects of integration between Essbase and relational sources using BI EE. The presentations are available here. One of the scenarios that we had not covered was the ability to display relational attributes along with Essbase data. It is very similar to the Horizontal Fragmentation technique we had covered in the training. But there are significant differences which i thought made sense to cover in a separate blog entry.
For example, lets assume that we have the entire SH schema loaded into Essbase. In many cases, not all the attributes of a dimension are loaded into Essbase for a couple of reasons.
1. Load & Retrieval Performance
2. Cube size
If you look at the Essbase outline below, for the product dimension we have just loaded the primary hierarchy.

As you see this outline does not have all the Product attributes like Product Price, Product Pack Size etc. In many cases, such attributes might be maintained in relational sources but may never be loaded inside an Essbase outline. But from a reporting standpoint, we might still need to display these attributes in some cases as shown below

If you notice, Quantity is an Essbase measure and PROD_LIST_PRICE is a relational attribute of a product. The question is how do we model this in the repository in such a way that there is least impact on the queries generated. There are couple of approaches that we can take to model this. I will cover one such approach today.
This modeling technique leverages the concept of conforming dimensions. We start first by including the relational product source table in the Essbase Product dimension as shown below


Then we create another custom logical table called Product Attributes. In this logical table we we can include all those attributes that do not exist in Essbase. This logical table will be modeled as a separate dimension.

Ensure that the source of this new logical table has the same relational table source as the source that we added in the Product dimension. Now create a dummy Fact logical table called Fact-Products which will enforce the conforming dimension join.


The source for this fact logical table will be the same Products relational source. Now, create complex joins as shown below.

For each fact measure in the Essbase source map it to the total level of the Product – Attributes dimension as shown below. This will ensure that BI Server will combine both the sources together in its own memory.

Now while creating a report we need to ensure that we bring in measures from Essbase as well as our dummy fact. The dummy fact enforces BI Server level join at the product level.

If you look at the SQL queries, you will notice that BI Server will fire 2 separate queries. But the joins across the sources will be through the conforming product dimension.
T2273.PROD_CATEGORY as c2,
T2273.PROD_SUBCATEGORY as c3,
T2273.PROD_NAME as c4,
T2273.PROD_LIST_PRICE as c5,
sum(1) as c6
from
PRODUCTS T2273
where ( T2273.PROD_SUBCATEGORY = 'Accessories' )
group by T2273.PROD_CATEGORY, T2273.PROD_LIST_PRICE, T2273.PROD_NAME, T2273.PROD_SUBCATEGORY, T2273.PROD_TOTAL
order by c1, c2, c3, c4
With
set [Product4] as '{[Product].[Accessories]}'
set [Product5] as 'Generate({[Product4]}, Descendants([Product].currentmember, [Product].Generations(5),SELF), ALL)'
select
{ [Measures].[Quantity]
} on columns,
NON EMPTY {{[Product5]}} properties ANCESTOR_NAMES, GEN_NUMBER on rows
from [SH.SH]
This shows the capability of BI Server in modeling multiple data source scenarios. If you require other dimensions to be part of the report then the dummy fact will have to assigned to the Total level of all the other dimensions (effectively Total levels need to be created in each dimension). For example, the Promotion dimension is shown below


Playing With PARALLEL
I had a DBA strongly discourage me from using PARALLELism on reporting tables (i.e. facts or dimensions). I think he even tried to explain it to me...but it was even more over my head at that time. I think it had something to do with all the processes kicked off then multiplied by the number of users running queries on those tables. "It will bring the database to its knees" he use to say. I never fought much on that because 1, he explained it to me 2, I trusted him and 3, he showed me once all the processes spawned during a typical data load operation. (I thought that was really cool by the way, all those neat little processes doing little chunks of work).
He did however let me (us) use them on staging tables. This is typically where our transformations took place. INSERTs and/or partition swapping.
Anyway, I was reading the 10gR2 docs, the Data Warehousing Guide to be exact on Using Parallel Execution. Lots of neat pictures...which is good for me.

Then I decided to see if I could see what the run times on a SELECT COUNT(*) would be given different levels of parallelism. If you want to know if a table or index has PARALLEL enabled, you need to look at the DEGREE column in %_TABLES and %_INDEXES. I spent about 20 minutes looking for PARALLEL in those tables before I remembered that.
CREATE TABLE big_tableI know have 74,431 records to play with. It's not a lot, I know, I just want to see what happens. I then created an anonymous block to:
AS
SELECT *
FROM dba_objects;
1. loop through and change the PARALLELism starting with 1 up to 8
2. get the start time from DBMS_UTILITY.GET_TIME
3. loop doing 20 SELECT COUNT(*)s on table table
4. get the end time
5. get the total time
6. print it out
7. put it in a spreadsheet to compare the results side by side
DECLAREAnd here are my results:
l_start NUMBER;
l_end NUMBER;
l_total NUMBER;
l_dummy INTEGER;
BEGIN
FOR j IN 1..8 LOOP
EXECUTE IMMEDIATE 'ALTER TABLE big_table PARALLEL ' || j;
dbms_output.put_line( 'PARALLEL (DEGREE): ' || j );
FOR i IN 1..20 LOOP
l_start := dbms_utility.get_time;
SELECT COUNT(*)
INTO l_dummy
FROM big_table;
l_end := dbms_utility.get_time;
l_total := l_end - l_start;
dbms_output.put_line( 'Run ' || LPAD( i, 2, '0' ) || ': ' || LPAD( l_total, 3, '0' ) );
END LOOP;
END LOOP;
END;
/

Sorry that it's an image...I can't seem to get decently formatted text into Blogger lately.
For this silly, unscientific test, PARALLEL set to 1 (i.e. NOPARALLEL) was the fastest all the way down. I don't know what it means or if it's even valid, but hey, I'm trying. I figured that if I posted something way above my head all you nice folks out there would happily correct me if I mistated something. I am avoiding any statements, if you couldn't tell.
ORACLENERD, Inc.
If you read the previous post, you probably gathered that little fact.
All of this is related to the fact that I have a new job. I'm not quite ready to talk about the details yet but I believe there is quite a bit of opportunity here. For one, I'll be working with OBIEE again. It's been about 9 months since I last worked with and I was just getting started. There may also be some work with APEX, which would be great. Finally, there's a very outside shot that I may get to work with Exadata v2 in the somewhat near future.
Maybe you're wondering why I incorporated? I don't know, just seemed like fun. Actually, I'll be working as a consultant. This is my first real trek down that road. I have to learn about billable hours, successful travel planning...and probably most importantly, budgeting.
I'm very excited about this opportunity. I will have to travel a bit, but I'll also get to work from home quite a bit so I believe in the long run, family time will probably be greater than when I was going to that "9 to 5" type job where I was actually away from home more than 60 hours a week. We'll see.
How To Apply For an EIN
You can find the start page here at irs.gov.
This is the second time I have done this, well, second time I have had an EIN. I'm pretty sure the lawyer did it the first time around. If I remember correctly, that cost me a whopping $500, 5 years ago. Of course that included incorporation as well.
I remember how daunting it was back then, starting your own business. I felt like I had to hire a lawyer to do it. The truth is, it's very easy to do. Especially if you consider the sheer volume of documentation that exists on the Oracle database...this is a piece of cake.
It took 10 minutes to complete, but most of that time was spent doing screen captures. Twenty steps in all.
Landing page from the link above:

Begin Application:

Choose the type of business:

"Sub" Corporations:

Confirm your selection:

Why do you need an EIN?

Principal Officer?

Individual details:

Location of corporation:

Verify physical location:

"Accept Database Version" is funny. Do most people understand what that would mean? I guessed that's the USPS version of my address...but I tend to work with these types of scenarios.
Corporation details:

More about the corporation:

What does your corporation do?

I selected other because there was no IT type category on there.
What kind of Other?

What kind of Consulting?

How do you want this delivered?

Summary:

EIN assigned!

Additional information:

Thank you!

Pretty easy all in all. I bet is was a bit harder 5 years ago (i.e. not automated).
Amusingly, this form is only available at certain times of the day. I tried this morning and was told the online assistant didn't work until 7 PM on Sundays. WTF?

APEX: Create Workspace
I don't think it ever truly registered what this means. In talking to my friend and (current) colleague Daniel McGhan (APEX nerd extraordinaire), he reminded me that when you create a new workspace in APEX that it creates a new tablespace as well (if you don't set "Re-user existing schema?" to "Yes") which means a datafile is also created.
Of course I had to see if for myself:
CJUSTICE@TESTING>SELECT file#, name FROM v$datafile;That's what I currently have. I'll create a new workspace now to demonstrate. Starting from the Manage Workspace page I select Create Workspace:
FILE# NAME
---------- ------------------------------------------------------------
1 /u01/app/oracle/oradata/testing/system01.dbf
2 /u01/app/oracle/oradata/testing/sysaux01.dbf
3 /u01/app/oracle/oradata/testing/undotbs01.dbf
4 /u01/app/oracle/oradata/testing/users01.dbf
5 /u01/app/oracle/oradata/testing/example01.dbf
5 rows selected.

I give it a name, APEX_TEST:

On Step 2, I set "Re-user existing schema?" to "No"

Step 3:

Step 4, confirming details:

Confirmation page:

CJUSTICE@TESTING>SELECT file#, name FROM v$datafile;As you can see, the datafile FLOW_1172420773490155.dbf was created which is mapped to the like named FLOW_1172420773490155 tablespace.
FILE# NAME
---------- ------------------------------------------------------------
1 /u01/app/oracle/oradata/testing/system01.dbf
2 /u01/app/oracle/oradata/testing/sysaux01.dbf
3 /u01/app/oracle/oradata/testing/undotbs01.dbf
4 /u01/app/oracle/oradata/testing/users01.dbf
5 /u01/app/oracle/oradata/testing/example01.dbf
6 /u01/app/oracle/oradata/testing/FLOW_1172420773490155.dbf
6 rows selected.
This can obviously be controlled by selecting "Yes" for "Re-user existing schema?", but I just thought it important enough to point out.
Old Skool Crosstabbing
Dodeca Patent Progress
Ease of use means different things to different people. For end users, ease of use means they can explore their data and get the information they need in very obvious ways with only the slightest bit of training. For administrators, it means they can configure the system easily and quickly deliver applications to their users. Finally, for IT departments, the deployment and on-going maintenance of an application must be painless.
One key feature of the Dodeca architecture is the modular design that allows us to very easily plug in new functionality. This modular concept runs deep within Dodeca. In fact, Essbase functionality in Dodeca is a plug-in module. The plug-in concept also allows customers to remove functionality from their distribution. For example, if a customer wants to use only relational databases with Dodeca, they can simply delete the Essbase module files in their deployment.
A second key design aspect of Dodeca is its metadata driven nature. Everything that can be configured within Dodeca is stored as metadata, in a relational database, on our server. When a piece of metadata is required at runtime, it is requested by the Dodeca client via web services and returned, as XML, over the network via HTTP. Metadata transmitted this way includes report definitions, spreadsheet templates, Essbase connection information and toolbars. Similarly, relational data and Essbase data, including outline information, is transmitted via web services.
The use of XML adds significant flexibility to Dodeca. As an added benefit, it makes in-place upgrades easy. We have customers with thousands of users who have upgraded their Dodeca version in about 15 minutes. XML also gives Dodeca flexibility on the Essbase side. A single version of Dodeca works with every version of Essbase from 6.5.3 to the latest 11.1.1.3; we even support multiple retrievals, from any supported version, into the same report.
The modular architecture and the metadata driven aspects of Dodeca also work together. Some customers and partners, such as greenEPM, have leveraged our modular architecture to build extended functionality on the Dodeca Framework. Extensions are built in Visual Studio using C# or VB.Net and packaged using XML descriptors. Custom modules in Dodeca are treated as metadata and are delivered to the client via web-services. We believe this is a unique aspect of our product and is the subject of one of our pending patents.
This week, we received notice from our Intellectual Property attorneys that our patent application has been published; it will be quite a while before it is rewarded. It was a large time investment to complete the patent application, but, hopefully we will be patent-holders in the near future.
ODTUG Board - Elections
- Jeff Jacobs
- John King
- Monty Latiolais
- Bambi Price
- Mark Rittman
- John Jeunette
- Barbara Morris
- Marc de Oliveira
- Mike Riley
Tim Tow will serve out the rest of his term through December of this year. I know I speak for not only myself but the entire Hyperion community when I say, thank you, Tim, for all the hard work you've put into ODTUG. Hyperion developers, administrators, and users finally have a place to call home again, and it was your tireless service that helped get us there. Sincerely, we appreciate it.
If you want to say thank you to Tim as well, nothing says you appreciate his service like buying a copy of Dodeca from AppliedOLAP.
BIP on ADF
Applying Predicates and Partition Pruning to MERGE Target Tables
APEX: Manage Service > Manage Environment Settings
As I mentioned previously, this is probably the most important area for administering your APEX instance.

Messages
There are 2 types of messages you can create, one is the Login message and one is the System Message

The Login message appears at the top of the Workspace Login screen:

The System messages appears after you have logged in.

Feature Configuration
- Application Development

- Allow PL/SQL Program Unit Editing - This pertains to the SQL Workshop. Will you allow end-users (Developers/Administrators), who have access to SQL Workshop, to edit PL/SQL.
- Create demonstration objects in new workspaces - By default, when you create a new workspace, the sample application is automatically installed in the schema that it is mapped to. In my case, CJUSTICE. There are database objects included with this object as well. If you would like to see those database objects and need a script to remove them, read here.
I don't like that it gets created automatically so I typically turn it off.
- SQL Workshop

- SQL Commands Maximum Inactivity in minutes - "Identify the maximum amount of time a transactional command in the SQL Command Processor waits before timing out."
- SQL Scripts Maximum Script Output Size in bytes - "Identify the maximum amount of output a single SQL Script can generate. SQL scripts are run from the SQL Workshop."
- SQL Scripts Maximum Workspace Output Size in bytes - "Identify the maximum amount of space all scripts within a workspace may consume. SQL Script results are the output generated when running SQL Scripts from the SQL command processor."
- SQL Scripts Maximum Script Size in bytes - "Identify the maximum size of a SQL script used within the SQL Workshop."
- Enable Transactional SQL Commands - "Enable Transactional SQL Commands for the entire instance. Enabling this feature permits users of SQL Commands to issue multiple SQL commands within the same physical database transaction."
- Monitoring

- Enable Database Monitoring - "Enable or disable monitoring within the SQL Workshop."
Honestly, I have no idea what kind of monitoring is done here. I haven't been able to find any related reports...which doesn't mean they don't exist, just that I can't find them.
- Security

- Set Workspace Cookie
- Disable Administrator Login - Defaults to No (otherwise, how would you log in?)
- Disable Workspace Login - Defaults to No
- Allow Public File Upload
- Restrict Access by IP Address
- HTTPS

- Require HTTPS - Use a comma delimited list of IP Addresses to restrict, can be either 1 or up to 4 values
- Session Timeout

- Maximum Session Length in Seconds
- Maximum Session Idle Time in Seconds
- Excluded Domains

- Domain Must Not Contain - "Enter domains (not including the port), separated by colons, that should not be allowed in regions of type URL or accessed as a Web service. "
- Account Login Control

- Require User Account Expiration and Locking - "Select Yes to enable Application Express user account expiration and locking features across all workspaces. This selection prevents the same feature from being disabled at the workspace level.
Select No to allow individual workspaces to enable or disable this feature independently.
Application Express user account expiration and locking features apply to end-user accounts created using the Application Express end-user account management interface." - Maximum Login Failures Allowed - "This setting applies to accounts used to access the Application Express administration and development environment only. It does not apply to end-user accounts used by applications developed in Application Express.
However, this setting is used as the default workspace-level "Maximum Login Failures Allowed" preference, which workspace administrators can change. The workspace-level preference applies to the end-user accounts within that workspace." - Account Password Lifetime (days) - Self Explanatory
- Workspace Password Policy

- Minimum Password Length - Self Explanatory
- Minimum Password Differences - "Enter the number of differences required between old and new passwords. The passwords are compared character by character, and each difference that occurs in any position counts toward the required minimum difference.
This setting applies to accounts for workspace administrators, developers, and end use" - Must Contain At Least One Alphabetic Character - Self Explanatory
- Must Contain At Least One Numeric Character - Self Explanatory
- Must Contain At Least One Punctuation Character - Self Explanatory
- Must Contain At Least One Upper Case Character - Self Explanatory
- Must Contain At Least One Lower Case Character - Self Explanatory
- Must Not Contain Username - Self Explanatory
- Must Not Contain Workspace Name - Self Explanatory
- Must Not Contain - colon seperated list of words that cannot be used in the password, default words are: "oracle:hello:welcome:guest:user:database"
- Alphabetic Characters - default: "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
- Punctuation Characters - default: "!"#$%&()``*+,-/:;<=>?_"
- Service Administrator Password Policy

- Policy - You can either define the Workspace Password Policy (above) or just use the default strong password policy. For my sandbox versions, I tend to make it as lax as possible so I do not have to remember complex passwords when putting things like this together.
For most organizations, the only 3 areas you need to worry about here are Email, Report Printing and Wallet.
- Self Service

- Provisioning Status - 3 options here: Manual, Request and Email Verification. For most organizations Manual will do just fine. That will allow your end-users to follow the normal protocols set forth in your organization. For hosted sites, like apex.oracle.com, Email Verification is the way to go.
- Development Service URL - Used in conjunction with Request or Email Verification above.
- Email Provisioning

- Email Provisioning - By default, is is enabled, but without the other settings it is useless. I turn this off...but it doesn't really matter for me since I use a sandbox.
- Schema

- Require New Schema - Do you want to require the user to obtain a new schema or reuse an existing one?
- Email

- SMTP Host Address - This is where you set up your internal mail server. All APEX mail calls will use this server by default.
- SMTP Host Port - Self Explanatory
- Administration Email Address - Self Explanatory
- Notification Email Address - Self Explanatory
- Wallet

- Wallet Path - Path to the wallet on the database server.
- Wallet Password - Self Explanatory
- Report Printing

- Print Server - Standard is the Default. To enable pdf printing you need BI Publisher and this is where you enter the server information.
- Print Server Protocol
- Print Server Host Address
- Print Server Port
- Print Server Script
For a How To on setting up APEX and BI Publisher, check out the RittmanMead blog entry Oracle ApEx and BI Publisher. - New Workspace Request Size

If using the Self Service module, this is where you'll set the sizes (and the default) of the workspaces to choose from. You can enable them all or just a few. - Workspace Change Request Size

Just like the New Workspace Request Size section, you can modify the size of workspaces you are offering.
BI Meetup and Appreciation Drinks, UKOUG Tech & EBS Conference, Birmingham
We’re organizing a drinks reception at the Pitcher and Piano Bar on the Tuesday evening of the UKOUG Tech and EBS Conference, which is open to all BI & DW delegates, speakers and partners who’d like to come along. We’ll be there from 6pm to 9pm on Tuesday, 1st December, and we thought it’d be a good chance for anyone interested in BI and data warehousing to get together and share a few stories.
Attendance is free but it’d be good if you can let us know if you’re going to come along. The Pitcher and Piano is in Brindley Place, just across the road from the ICC, and we’ve reserved an area so just look out for the signs. Drop me a line if you plan to come along, or leave a comment on this blog post, and hopefully we’ll see you in Birmingham.
APEX: Drop Sample Application Database Objects
There are 30 database objects created by default along with the Sample Application:
OBJECT_TYPE OBJECT_NAMEIf you have nothing else in your schema, you can run this:
------------------- ----------------------------------------
FUNCTION CUSTOM_AUTH
CUSTOM_HASH
INDEX DEMO_CUSTOMERS_PK
DEMO_CUST_NAME_IX
DEMO_ORDER_ITEMS_PK
DEMO_ORDER_PK
DEMO_ORD_CUSTOMER_IX
DEMO_PAGE_HIERARCHY_PK
DEMO_PRODUCT_INFO_PK
DEMO_USERS_PK
LOB SYS_LOB0000077614C00007$$
SEQUENCE DEMO_CUST_SEQ
DEMO_ORDER_ITEMS_SEQ
DEMO_ORD_SEQ
DEMO_PROD_SEQ
DEMO_USERS_SEQ
TABLE DEMO_CUSTOMERS
DEMO_ORDERS
DEMO_ORDER_ITEMS
DEMO_PAGE_HIERARCHY
DEMO_PRODUCT_INFO
DEMO_STATES
DEMO_USERS
TRIGGER BI_DEMO_USERS
DEMO_ORDER_ITEMS_GET_PRICE
DEMO_USERS_T1
INSERT_DEMO_CUST
INSERT_DEMO_ORDER_ITEMS
INSERT_DEMO_PROD
UPDATE_ORDER_TOTAL
30 rows selected.
DECLAREIf however, you do have other items in there, use this:
l_sql VARCHAR2(32767);
BEGIN
FOR i IN ( SELECT object_name, object_type
FROM user_objects
WHERE object_type NOT IN ( 'LOB', 'INDEX', 'TRIGGER' )
ORDER BY object_type )
LOOP
CASE
WHEN i.object_type = 'TABLE' THEN
l_sql := 'DROP TABLE ' || i.object_name || ' CASCADE CONSTRAINTS PURGE';
ELSE
l_sql := 'DROP ' || i.object_type || ' ' || i.object_name;
END CASE;
EXECUTE IMMEDIATE l_sql;
END LOOP;
END;
/
DROP FUNCTION CUSTOM_AUTH;Triggers, LOBs and Indexes will be dropped along with their associated objects.
DROP FUNCTION CUSTOM_HASH;
DROP SEQUENCE DEMO_ORDER_ITEMS_SEQ;
DROP SEQUENCE DEMO_PROD_SEQ;
DROP SEQUENCE DEMO_ORD_SEQ;
DROP SEQUENCE DEMO_USERS_SEQ;
DROP SEQUENCE DEMO_CUST_SEQ;
DROP TABLE DEMO_STATES CASCADE CONSTRAINTS PURGE;
DROP TABLE DEMO_PAGE_HIERARCHY CASCADE CONSTRAINTS PURGE;
DROP TABLE DEMO_USERS CASCADE CONSTRAINTS PURGE;
DROP TABLE DEMO_ORDER_ITEMS CASCADE CONSTRAINTS PURGE;
DROP TABLE DEMO_ORDERS CASCADE CONSTRAINTS PURGE;
DROP TABLE DEMO_PRODUCT_INFO CASCADE CONSTRAINTS PURGE;
DROP TABLE DEMO_CUSTOMERS CASCADE CONSTRAINTS PURGE;
Oracle BI EE 10.1.3.4.1 – Sub-Totals & Pivot Calculations
One intriguing or not-so obvious feature of BI EE is the way it does its Sub-Totals & Pivot Table calculations. As with any reporting tool, we expect almost every part of a report to be pushed back to the database in the form of native SQL. But there are certain cases wherein we cannot push an entire report in the form of SQL (for example Sub-Totals & Pivot table aggregations). In such cases, BI Server does these calculations in its own memory. Lets look at what this “in-memory calculation” means and how BI Server treats totals & Pivot table calculations. Since this is not documented anywhere, this behavior can be different across different releases (though i think it will work the same in all 10.1.3.* releases)
Sub-Totals:
To demonstrate how Sub-Totals work, lets start with a very simple report shown below

As you see, this table view report has a Grand Total. The question is how does BI Server create this Total. Does it bring the entire data into BI Server memory and then does the sub-total or does it push that back in SQL or does it write to any temporary file.
To understand this, lets start with looking at the logical SQL in the advanced tab of this report.
SELECT Products.PROD_CATEGORY saw_0, Channels.CHANNEL_DESC saw_1, Sales.AMOUNT_SOLD saw_2 FROM "SH - Training Days - Relational" ORDER BY saw_0, saw_1
As you see this logical SQL does not actually represent the true logical SQL since the aggregation or the total part of the query is not anywhere in this SQL. So, to get the exact SQL we have to go to the Manage->Sessions & extract the logical SQL

SELECT Products.PROD_CATEGORY saw_0, Channels.CHANNEL_DESC saw_1, Sales.AMOUNT_SOLD saw_2, REPORT_SUM(saw_2 BY ) FROM "SH - Training Days - Relational" ORDER BY saw_0, saw_1
If you notice, there will be a new function called REPORT_SUM which will be passed to the logical SQL. REPORT_SUM is an aggregation function that BI Server uses to calculate the total (since we had chosen Report Based Total whenever applicable in the table view). One good thing is we can take the exact SQL and fire it in the logical SQL window

As you see BI Server calculates the total as a column based total. There are quite a few functions like REPORT_SUM. I will list some of the common ones here
1. SUM
2. AGGREGATE
3. MIN
4. MAX
5. COUNT
6. COUNTDISTINCT
After calculating the column based total, BI Server will get the first record value in the Report Sum column and display it as Sub-Total. Now that we know what BI Server does for calculating the sub-totals in a table view, lets look at how the BI Server pushes the REPORT_SUM function. If you look at the physical query for the above logical SQL, you will notice that the report level total is not pushed back to the database. So, the question is how BI EE does this total. To determine this lets look at the {OracleBIData}/tmp folder. You will notice that for every query that has a sub-total (with database cache and presentation services cache disabled) BI EE will create temporary files in this tmp directory. The writing of temp files to this directory is governed by 4 settings in the NQSConfig.ini
WORK_DIRECTORY_PATHS = "C:\Oracle\OracleBIData\tmp"; SORT_MEMORY_SIZE = 4 MB ; SORT_BUFFER_INCREMENT_SIZE = 256 KB ; VIRTUAL_TABLE_PAGE_SIZE = 128 KB ;
So, when we created the above report, BI Server had created a set of temp files in the above directory. These tmp files are actually similar to the cache files but are stored seperately. But how they are handled is unfortunately not documented (though we can arrive at certain conclusions by trying various scenarios). In some complex reports, when the sub-total calculation involves a lot of intermediate rows, you will notice a sizeable increase in the size of the tmp files

Pivot Tables:
Pivot table is probably the most widely used Answers component. Though widely used, many a times it can eventually lead to performance bottlenecks due to a lot of in-memory calculations. Most Pivot calculations are done in the BI Server memory and are not pushed back to the database(though we can control where it is executed to an extent). As a generic rule, in any implementation, if the repository is correctly designed, almost 90 to 95% of the reports can be achieved using normal table views(unless all the reports require a specific feature like Sub-Total at the top etc). The most common reason why pivot table is quite popular is the fact that it offers flexibility to end users in slicing/dicing the data. It is treated as an alternative to a multi-dimensional reporting solution.Pivot table can sometimes hog the temp space so much that even cache writing process might fail due to memory allocation errors. For example, lets take the above report and convert it into a pivot table as shown below

If you notice, all i have done is i have moved a couple of columns to the Excluded section of the Pivot table. If you look at the Physical query, the SQL will still contain Product Id and Channel Desc even though they are excluded.
select T24112.PROD_ID as c2,
T24112.PROD_CATEGORY as c4,
T26412.CHANNEL_ID as c6,
sum(T24170.AMOUNT_SOLD) as c8,
sum(T24170.QUANTITY_SOLD) as c9
from
CHANNELS T26412,
PRODUCTS T24112,
SALES T24170
where ( T24112.PROD_ID = T24170.PROD_ID and T24170.CHANNEL_ID = T26412.CHANNEL_ID )
group by T24112.PROD_CATEGORY, T24112.PROD_ID, T24170.CHANNEL_ID, T26412.CHANNEL_ID
order by c6
So what this means is BI Server will do all the aggregations in its own memory or by writing to tmp files. If we look at the tmp file directory

As you see, a sudden 59KB file has come into the tmp file directory due to the amount of custom calculations that the BI Server has to do in its memory. The size of the files directly are related to the amount of calculations & the number of cells retrieved. So, whenever you face an issue like a Write to Tmp directory error etc, then the first place to look at is how much of calculations are being done by the BI Server. Also, as much as possible try to reduce the amount of calculations to be done by the pivot table. As more users start accessing the system, more will the number of tmp files and hence more will be the I/O. This is something to take notice of while creating Pivot Table reports.


