Feed aggregator

What are the types of Data Scientist?

Steve Jones - Tue, 2014-03-11 11:00
There are various views going around on what a Data Scientist is, what their value is to an organisation, and the salaries they command.  To me, however, asking 'what is a Data Scientist?' is like asking 'what is a Physicist?' Sure, 'someone who studies Physics' might be a factually accurate definition, but it is a pointless one.  How does that separate someone who did Physics in High School from Albert
Categories: Fusion Middleware

The Impact of Change

Antony Reynolds - Sun, 2014-03-09 16:01
Measuring Impact of Change in SOA Suite
Mormon prophet Thomas S. Monson once said:

When performance is measured, performance improves. When performance is measured and reported, the rate of performance accelerates.

(LDS Conference Report, October 1970, p107)

Like everything in life, a SOA Suite installation that is monitored and tracked has a much better chance of performing well than one that is not measured.  With that in mind I came up with a tool to measure the impact of configuration changes on database usage in SOA Suite.  This tool can be used to assess the impact of different configurations on both database growth and database performance, helping to decide which optimizations offer real benefit to the composite under test.

Basic Approach

The basic approach of the tool is to take a snapshot of the number of rows in the SOA tables before executing a composite.  The composite is then executed.  After the composite has completed another snapshot is taken of the SOA tables.  This is illustrated in the diagram below:

An example of the data collected by the tool is shown below:

Test Name    Total Tables Changed  Total Rows Added  Notes
AsyncTest1   13                    15                Async interaction with simple SOA composite, one retry to send response.
AsyncTest2   12                    13                Async interaction with simple SOA composite, no retries on sending response.
AsyncTest3   12                    13                Async interaction with simple SOA composite, no callback address provided.
OneWayTest1  12                    13                One-Way interaction with simple SOA composite.
SyncTest1    7                     7                 Sync interaction with simple SOA composite.

Note that the first three columns are provided by the tool; the fourth column is just an aide-mémoire to identify what the test actually did. The tool also allows us to drill into the data to get a better look at what is actually changing, as shown in the table below:

Test Name   Table Name            Rows Added
AsyncTest1  AUDIT_COUNTER         1
AsyncTest1  AUDIT_DETAILS         1
AsyncTest1  AUDIT_TRAIL           2
AsyncTest1  COMPOSITE_INSTANCE    1
AsyncTest1  CUBE_INSTANCE         1
AsyncTest1  CUBE_SCOPE            1
AsyncTest1  DLV_MESSAGE           1
AsyncTest1  DOCUMENT_CI_REF       1
AsyncTest1  DOCUMENT_DLV_MSG_REF  1
AsyncTest1  HEADERS_PROPERTIES    1
AsyncTest1  INSTANCE_PAYLOAD      1
AsyncTest1  WORK_ITEM             1
AsyncTest1  XML_DOCUMENT          2

Here we have drilled into the test case with the retry of the callback to see what tables are actually being written to.

Finally we can compare two tests to see the difference in the number of rows written and the tables updated, as shown below:

Test Name   Base Test Name  Table Name   Row Difference
AsyncTest1  AsyncTest2      AUDIT_TRAIL  1

Here are the additional tables referenced by this test:

Test Name   Base Test Name  Additional Table Name  Rows Added
AsyncTest1  AsyncTest2      WORK_ITEM              1

How it Works

I created a database stored procedure, soa_snapshot.take_soa_snapshot(test_name, phase), that queries all the SOA tables and records the number of rows in each table.  By running the stored procedure before and after the execution of a composite we can capture the number of rows in the SOA database before and after the composite executes.  I then created a view that shows the difference in the number of rows before and after composite execution.  This view has a number of sub-views that allow us to query specific items.  The schema is shown below:
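
The stored procedure itself isn't listed in this post, but the idea is easy to sketch. The following is a rough sketch only, shown as a standalone procedure rather than as part of the soa_snapshot package; the SOA schema name DEV_SOAINFRA and the CHANGE_TABLE column names are my assumptions:

create or replace procedure take_soa_snapshot
  (i_test_name in varchar2, i_phase in number)
is
  v_rows number;
begin
  -- count the rows in every table in the SOA schema
  for t in (select table_name from all_tables where owner = 'DEV_SOAINFRA')
  loop
    execute immediate
      'select count(*) from DEV_SOAINFRA."'||t.table_name||'"' into v_rows;
    insert into change_table (test_name, phase, table_name, row_count)
    values (i_test_name, i_phase, t.table_name, v_rows);
  end loop;
  commit; -- safe here: snapshots are taken from a separate monitoring session
end take_soa_snapshot;
/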

The different tables and views are:

  • CHANGE_TABLE
    • Used to track the number of rows in the SOA schema; each test case has two or more phases.  Usually phase 1 is before execution and phase 2 is after execution.
    • This is only used by the stored procedure and the views.
  • DELTA_VIEW
    • Used to track changes in the number of rows in the SOA database between phases of a test case.  This is a view on CHANGE_TABLE, and all other views are based off this view (a sketch follows this list).
  • SIMPLE_DELTA_VIEW
    • Provides number of rows changed in each table.
  • SUMMARY_DELTA_VIEW
    • Provides a summary of total rows and tables changed.
  • DIFFERENT_ROWS_VIEW
    • Provides a summary of differences in rows updated between test cases
  • EXTRA_TABLES_VIEW
    • Provides a summary of the extra tables and rows used by a test case.
    • This view makes use of a session context, soa_ctx, which holds the test case name and the baseline test case name.  This context is initialized by calling the stored procedure soa_ctx_pkg.set(testCase, baseTestCase).
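
A rough sketch of how DELTA_VIEW can be built, reusing the assumed CHANGE_TABLE columns from the sketch above:

create or replace view delta_view as
select a.test_name,
       a.table_name,
       b.row_count - a.row_count as deltarows
from   change_table a
join   change_table b
  on   b.test_name  = a.test_name
 and   b.table_name = a.table_name
 and   b.phase      = a.phase + 1;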

I created a web service wrapper to the take_soa_snapshot procedure so that I could use SoapUI to perform the tests.
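
Stripped of the web service plumbing, a test run simply brackets the composite execution with two snapshot calls, along these lines:

-- Phase 1: before the composite runs
exec soa_snapshot.take_soa_snapshot('AsyncTest1', 1);
-- ... invoke the composite, e.g. from SoapUI ...
-- Phase 2: after the composite completes
exec soa_snapshot.take_soa_snapshot('AsyncTest1', 2);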

Sample Output

How many rows and tables did a particular test use?

Here we can see how many rows in how many tables changed as a result of running a test:

-- Display the total number of rows and tables changed for each test
select * from summary_delta_view
order by test_name;

TEST_NAME            TOTALDELTAROWS TOTALDELTASIZE TOTALTABLES
-------------------- -------------- -------------- -----------
AsyncTest1                   15              0          13
AsyncTest1noCCIS             15              0          13
AsyncTest1off                 8              0           8
AsyncTest1prod               13              0          12
AsyncTest2                   13              0          12
AsyncTest2noCCIS             13              0          12
AsyncTest2off                 7              0           7
AsyncTest2prod               11              0          11
AsyncTest3                   13              0          12
AsyncTest3noCCIS             13          65536          12
AsyncTest3off                 7              0           7
AsyncTest3prod               11              0          11
OneWayTest1                  13              0          12
OneWayTest1noCCI             13          65536          12
OneWayTest1off                7              0           7
OneWayTest1prod              11              0          11
SyncTest1                     7              0           7
SyncTest1noCCIS               7              0           7
SyncTest1off                  2              0           2
SyncTest1prod                 5              0           5

20 rows selected

Which tables grew during a test?

Here for a given test we can see which tables had rows inserted.

-- Display the tables which grew and show the number of rows they grew by
select * from simple_delta_view
where test_name='AsyncTest1'
order by table_name;
TEST_NAME            TABLE_NAME                      DELTAROWS  DELTASIZE
-------------------- ------------------------------ ---------- ----------
AsyncTest1       AUDIT_COUNTER                           1          0
AsyncTest1       AUDIT_DETAILS                           1          0
AsyncTest1       AUDIT_TRAIL                             2          0
AsyncTest1       COMPOSITE_INSTANCE                      1          0
AsyncTest1       CUBE_INSTANCE                           1          0
AsyncTest1       CUBE_SCOPE                              1          0
AsyncTest1       DLV_MESSAGE                             1          0
AsyncTest1       DOCUMENT_CI_REF                         1          0
AsyncTest1       DOCUMENT_DLV_MSG_REF                    1          0
AsyncTest1       HEADERS_PROPERTIES                      1          0
AsyncTest1       INSTANCE_PAYLOAD                        1          0
AsyncTest1       WORK_ITEM                               1          0
AsyncTest1       XML_DOCUMENT                            2          0
13 rows selected

Which tables grew more in test1 than in test2?

Here we can see the differences in rows for two tests.

-- Return difference in rows updated (test1)
select * from different_rows_view
where test1='AsyncTest1' and test2='AsyncTest2';

TEST1                TEST2                TABLE_NAME                          DELTA
-------------------- -------------------- ------------------------------ ----------
AsyncTest1       AsyncTest2       AUDIT_TRAIL                             1

Which tables were used by test1 but not by test2?

Here we can see tables that were used by one test but not by the other test.

-- Register base test case for use in extra_tables_view
-- First parameter (test1) is test we expect to have extra rows/tables
begin soa_ctx_pkg.set('AsyncTest1', 'AsyncTest2'); end;
/
anonymous block completed
-- Return additional tables used by test1
column TEST2 FORMAT A20
select * from extra_tables_view;
TEST1                TEST2                TABLE_NAME                      DELTAROWS
-------------------- -------------------- ------------------------------ ----------
AsyncTest1       AsyncTest2       WORK_ITEM                               1

Results

I used the tool to find out the following.  All tests were run using SOA Suite 11.1.1.7.

The following is based on a very simple composite as shown below:

Each BPEL process is basically the same as the one shown below:

Impact of Fault Policy Retry Being Executed Once

Setting    Total Rows Written  Total Tables Updated
No Retry   13                  12
One Retry  15                  13

When a fault policy causes a retry then the following additional database rows are written:

Table Name   Number of Rows
AUDIT_TRAIL  1
WORK_ITEM    1

Impact of Setting Audit Level = Development Instead of Production

Setting      Total Rows Written  Total Tables Updated
Development  13                  12
Production   11                  11

When the audit level is set at development instead of production then the following additional database rows are written:

Table Name   Number of Rows
AUDIT_TRAIL  1
WORK_ITEM    1

Impact of Setting Audit Level = Production Instead of Off

Setting     Total Rows Written  Total Tables Updated
Production  11                  11
Off         7                   7

When the audit level is set at production rather than off then the following additional database rows are written:

Table Name          Number of Rows
AUDIT_COUNTER       1
AUDIT_DETAILS       1
AUDIT_TRAIL         1
COMPOSITE_INSTANCE  1

Impact of Setting Capture Composite Instance State

Setting  Total Rows Written  Total Tables Updated
On       13                  12
Off      13                  12

When capture composite instance state is on rather than off, no additional database rows are written.  Note, however, that other activities occur when composite instance state is captured.

Impact of Setting oneWayDeliveryPolicy = async.cache or sync

Setting        Total Rows Written  Total Tables Updated
async.persist  13                  12
async.cache    7                   7
sync           7                   7

When choosing async.persist (the default) instead of sync or async.cache then the following additional database rows are written:

Table Name            Number of Rows
AUDIT_DETAILS         1
DLV_MESSAGE           1
DOCUMENT_CI_REF       1
DOCUMENT_DLV_MSG_REF  1
HEADERS_PROPERTIES    1
XML_DOCUMENT          1

As you would expect, the sync mode behaves just like a regular synchronous (request/reply) interaction and creates the same number of rows in the database.  The async.cache setting also creates the same number of rows as a sync interaction because it stores state in memory and provides no restart guarantee.

Caveats & Warnings

The results above are based on a trivial test case.  The numbers will be different for bigger and more complex composites.  However, by taking snapshots of different configurations you can produce the numbers that apply to your own composites.

The capture procedure supports multiple steps in a test case, but the views only support two snapshots per test case.

Code Download

The sample project I used is available here.

The scripts used to create the user (createUser.sql), create the schema (createSchema.sql) and sample queries (TableCardinality.sql) are available here.

The Web Service wrapper to the capture state stored procedure is available here.

The sample SoapUI project that I used to take a snapshot, perform the test and take a second snapshot is available here.

Pre-digested authentication

Gary Myers - Sun, 2014-03-09 04:03
A bit of a follow-up to my previous post on Digest authentication.

The fun thing about doing the hard yards to code up the algorithm is that you get a deeper level of understanding about what's going on. Take these lines:

    v_in_raw := utl_raw.cast_to_raw(i_username||':'||i_realm||':'||i_password);
    v_ha1 := lower(DBMS_OBFUSCATION_TOOLKIT.md5(input => v_in_raw));

Every time we build the "who we are" component for this site, we start with exactly the same hash made up of the username, realm (site) and password. This is a batch routine, which means somewhere we would store the username and password for the site - whether that is a parameter in a scheduling tool, coded into a shell script or OS file, or somewhere in the database. If you've got the security option for Oracle, you can use the Wallet, with its own security layers.

But digest authentication gives us another option. Since we actually use the hashed value of the user/site/password, we can store that instead. The receiving site has no idea the code doesn't actually know the REAL password.

Now turn that over in your head. We can call the web service as this user WITHOUT knowing the password, just by knowing the hash. I don't know about you, but it makes me a little bit more worried when I hear of user details being leaked or hacked from sites. It's all very well reassuring us the passwords are hashed and can't be reverse engineered (assuming your own password can't be brute-forced). But depending on the security mechanism, a leak of those hashes can be dangerous. If a hacked provider advises people to change their passwords, take their advice. 

'Basic' authentication doesn't have the same weakness. In that environment the provider can store the password hash after applying their own 'secret sauce' mechanism (mostly a salt). When you authenticate, you send the password, they apply the secret sauce and compare the result. You can't get away without knowing the password, because all the work is done at their end.

There's no secret sauce for digest authentication, and there can't be. Even if the provider had the password in the clear, there's no way they can be sure the client has the password since all the client needs is the result of the hash. The provider must store, or be able to work out, the result of that hash because they need to replicate the final hash result using both the client and server nonces. They can store either that same user/realm/password hash as is, or they can encrypt it in a reversible manner, but a one-way hash wouldn't be usable.
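
To make that concrete, here is a minimal sketch of the final calculation, mirroring the digest_auth_md5_calcs function from my previous post but starting from a stored HA1 instead of the password (the function name and parameters are mine):

  function response_from_ha1
    (i_ha1          in varchar2, -- stored lower(md5(username||':'||realm||':'||password))
    i_server_nonce  in varchar2, i_client_nonce in varchar2,
    i_req_path      in varchar2, i_qop          in varchar2 default 'auth',
    i_req_type      in varchar2 default 'GET',  i_req_cnt  in number default 1)
  return varchar2 is
    v_ha2 varchar2(40);
  begin
    -- HA2 covers only the request, so no secret is involved here
    v_ha2 := lower(DBMS_OBFUSCATION_TOOLKIT.md5(
               input => utl_raw.cast_to_raw(i_req_type||':'||i_req_path)));
    -- the password only ever participates via the pre-computed HA1
    return lower(DBMS_OBFUSCATION_TOOLKIT.md5(
             input => utl_raw.cast_to_raw(
               i_ha1||':'||i_server_nonce||':'||lpad(i_req_cnt,8,'0')||':'||
               i_client_nonce||':'||i_qop||':'||v_ha2)));
  end response_from_ha1;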

In short, digest authentication means that our batch routine doesn't need to 'know' the actual password, just a hash. But it also makes those hashes a lot more dangerous.

I'm an amateur in this field. I checked around and it does seem this is a recognized limitation of digest authentication. EG: This Q&A and this comparison of Digest and Basic.

PL/SQL, UTL_HTTP and Digest Authentication

Gary Myers - Fri, 2014-03-07 17:28
For the first time in what seems like ages, I've actually put together a piece of code worth sharing. It's not that I haven't been working, but just that it has all been very 'in-house' specific.

However I had a recent requirement to use a web service that makes use of Digest Authentication. If you have a look at the UTL_HTTP SET_AUTHENTICATION subprogram, it only addresses Basic authentication (and, apparently, Amazon S3, which looks intriguing).

In Basic authentication, the username and password get sent across as part of the request. Going through SSL, that doesn't seem too bad, as it is encrypted over the transfer and the certificates should ensure you are talking to the legitimate destination. However if that destination has been compromised, you've handed over your username and password. In an ideal world, the server shouldn't need to know your password, which is why databases should only hold hashed versions of passwords.

Outside of SSL, you might as well just print the username and password on the back of a postcard.


In Digest authentication, you get a more complex interaction that keeps the password secret. You ask for a page, the server responds with an "Authentication Required" plus some bits of information including a nonce. You come up with a hashed value based on the server nonce, your own nonce and a hash of your username and password and send it back with the next request. The server has its own record of your username/password hash and can duplicate the calculations. If everyone is happy, the server can fulfill your request and nobody ever actually needs to know the password.

Our server used SSL, and thanks to Tim's article on SSL and UTL_HTTP, it was a simple set up. I've done it before, but that was in the days when it seemed a lot harder to get certificates OUT of a browser to put them in your Oracle Wallet.

The Interwebs were a lot less forthcoming on a PL/SQL implementation of Digest authentication though. The closest I got was this discussion, which can be summed up as "This may be complex, but I do not see these offhand as being impossible to do in PL/SQL....No Digest configured web server nearby or I would definitely have had a bash at this"

A read through the Wikipedia article, and I came up with the code below:

Firstly, after the initial request, go through the header to get the 'WWW-Authenticate' item. Take the value associated with that header, and pass it to the "auth_digest" procedure. 


    l_max := UTL_HTTP.GET_HEADER_COUNT(l_http_response);
    l_ind := 1;
    l_name := '-';
    while l_ind <= l_max AND l_name != 'WWW-Authenticate' LOOP
      UTL_HTTP.GET_HEADER(l_http_response, l_ind, l_name, l_value);
      IF  l_name = 'WWW-Authenticate'
      AND l_http_response.status_code = UTL_HTTP.HTTP_UNAUTHORIZED THEN
        --
        -- Unauthorized. Using the Authorization response header, we can come up with the
        -- required values to allow a re-request with the authentication/authorisation details
        --
        dbms_application_info.set_action('auth:'||$$PLSQL_LINE);
        UTL_HTTP.END_RESPONSE(l_http_response);
        --
        dbms_application_info.set_action('auth_req:'||$$PLSQL_LINE);
        l_http_request := UTL_HTTP.BEGIN_REQUEST(l_server||l_method);
        auth_digest (io_http_request => l_http_request, i_auth_value => l_value,
          i_username => nvl(i_username,'xxxx'), i_password => nvl(i_password,'xxxx'), 
          i_req_path => l_method, i_client_nonce => null);
        dbms_output.put_line($$PLSQL_LINE||':Get Response from authenticated request');
        dbms_application_info.set_action('auth_resp:'||$$PLSQL_LINE);
        l_http_response := UTL_HTTP.GET_RESPONSE(l_http_request);
        dump_resp (l_http_response);
        dump_hdr (l_http_response);
      END IF;
      l_ind := l_ind + 1;

    END LOOP;

The auth_digest procedure starts with an extraction of the 'valuables' from that value string. I've used regular expressions here; I spent time working with grep, awk and perl, and regexes are habit forming.

  procedure extract_auth_items
    (i_text in varchar2,
    o_realm out varchar2, o_qop out varchar2, o_nonce out varchar2, o_opaque out varchar2) is
  begin
    o_realm   := substr(regexp_substr(i_text, 'realm="[^"]+' ),8);
    o_qop     := substr(regexp_substr(i_text, 'qop="[^"]+'   ),6);
    o_nonce   := substr(regexp_substr(i_text, 'nonce="[^"]+' ),8);
    o_opaque  := substr(regexp_substr(i_text, 'opaque="[^"]+'),9);

  end extract_auth_items;

Next is the 'meat', where the values are combined in the various hashes. Yes, there's a hard-coded default client nonce in there that, by a strange coincidence, matches the one in the Wikipedia article. That's how this stuff gets developed: by following through a worked example. Just like school.

  function digest_auth_md5_calcs
      (i_username     in varchar2, i_password     in varchar2, i_req_path      in varchar2,
      i_realm         in varchar2, i_server_nonce in varchar2,
      i_qop           in varchar2 default 'auth',
      i_client_nonce  in varchar2 default '0a4f113b',
      i_req_type      in varchar2 default 'GET',  i_req_cnt IN NUMBER default 1)
  return varchar2 is
    --
    v_in_str    varchar2(2000);
    v_in_raw    raw(2000);
    v_out       varchar2(60);
    --
    v_ha1       varchar2(40);
    v_ha2       varchar2(40);
    v_response  varchar2(40);
    --
  begin
    --
    v_in_str := i_username||':'||i_realm||':'||i_password;
    v_in_raw := utl_raw.cast_to_raw(v_in_str);
    v_out := DBMS_OBFUSCATION_TOOLKIT.md5(input => v_in_raw);
    v_ha1 := lower(v_out);
    --
    v_in_str := i_req_type||':'||i_req_path;
    v_in_raw := utl_raw.cast_to_raw(v_in_str);
    v_out := DBMS_OBFUSCATION_TOOLKIT.md5(input => v_in_raw);
    v_ha2 := lower(v_out);
    --
    v_in_str := v_ha1||':'||i_server_nonce||':'||lpad(i_req_cnt,8,0)||':'||
                   i_client_nonce||':'||i_qop||':'||v_ha2;
    v_in_raw := utl_raw.cast_to_raw(v_in_str);
    v_out := DBMS_OBFUSCATION_TOOLKIT.md5(input => v_in_raw);
    v_response := lower(v_out);
    --
    return v_response;
  end digest_auth_md5_calcs;

And this is the full auth_digest bit

  procedure auth_digest
    (io_http_request  in out UTL_HTTP.REQ,  i_auth_value    in varchar2,
    i_username        in varchar2,          i_password      in varchar2,
    i_req_path        in varchar2,          i_qop           in varchar2 default 'auth',
    i_req_cnt         in number default 1,  i_client_nonce  in varchar2 default null)
  is
    l_realm         varchar2(400);
    l_qop           varchar2(30);
    l_server_nonce  VARCHAR2(400);
    l_opaque        varchar2(100);
    --
    l_response      varchar2(40);
    l_value         VARCHAR2(1024);
    --
    l_client_nonce  varchar2(40); -- the generated nonce is a 32-character hex string
    --
  begin
    --
    -- Apply the username / password for Digest authentication
    --
    extract_auth_items (i_auth_value,
                    l_realm, l_qop, l_server_nonce, l_opaque);
    --
    IF i_client_nonce is not null then
      l_client_nonce := i_client_nonce;
    ELSE
      l_client_nonce := lower(utl_raw.cast_to_raw(DBMS_OBFUSCATION_TOOLKIT.md5(
                            input_string=>dbms_random.value)));
    END IF;
    --
    l_response := digest_auth_md5_calcs
      (i_username => i_username, i_password    => i_password,     i_req_path => i_req_path,
      i_realm     => l_realm,    i_server_nonce => l_server_nonce,
      i_client_nonce => l_client_nonce);
    --i_qop default to auth, i_req_type default to GET and i_req_cnt default to 1
    --
    l_value := 'Digest username="' ||i_username          ||'",'||
               ' realm="'          ||l_realm             ||'",'||
               ' nonce="'          ||l_server_nonce      ||'",'||
               ' uri="'            ||i_req_path          ||'",'||
               ' response="'       ||l_response          ||'",'||
               ' qop='             ||i_qop               ||',' ||
               ' nc='              ||lpad(i_req_cnt,8,0) ||',' ||
               ' cnonce="'         ||i_client_nonce      ||'"'
               ;
    --
    IF l_opaque is not null then
      l_value := l_value||',opaque="'||l_opaque||'"';
    END IF;
    dbms_output.put_line(l_value);
    UTL_HTTP.SET_HEADER(io_http_request, 'Authorization', l_value);
    --

  end auth_digest;

A package with the code is available from my CodeSpace page, or directly here. There's a lot of debug 'stuff' in there. The code I'm using is still tailored to my single specific need, and I've stripped specific values from this published variant. You'll need to hard-code or parameterize it for any real use. I may be able to do a 'cleaned-up' version in the future, but don't hold your breath.

BI change is coming, time to get over it and get on with the job

Steve Jones - Fri, 2014-03-07 11:15
One of the things that always stuns me in IT is how people don't appear to like change.  Whether it was the EAI folks pushing back on Web Services in 2000 in favour of their old-school approaches, the package guys pushing back against SaaS, or now the BI guys pushing back against the new wave of BI technologies and approaches, the message is always the same: we are happy doing what we are doing,
Categories: Fusion Middleware

Finally...the official sizing guide for Oracle Application Express

Joel Kallman - Thu, 2014-03-06 06:29
The following question was recently posted on an internal mailing list:
"Is there a sizing/capacity/scalability guide available for APEX?"
I'm always fascinated by this question.  I appreciate the fact that this is a standard, acceptable practice in the industry, and people come to expect it.  How else could architects and planners appropriately allocate resources without some form of estimate?  This impacts capital expenditures and budgets and rack space and energy costs and support costs and human capital.  People seem to be looking for some simple formula like:
(X number of pages in an APEX application) * (Y number of concurrent users) = (W number of processors) + (Z number of GB of RAM)
Voila!  Plug that formula into your favorite spreadsheet and away you go.  Well....if I lured you in with the title of this blog post, I have to be honest - it's all fiction.  There is no such thing.  But why not?  There are a number of reasons.

  1. There is no such thing as a representative, typical application.  As I've often bloviated in the past, Oracle Application Express is as fast or as slow as you, the developer, make it.  The overhead associated with the APEX engine itself is fairly static (measured in hundredths of a second). If you have a query that takes 30 seconds to execute and you put this query in a report in an APEX application, you can expect the execution of that page to take just over 30 seconds per page view.

  2. What does "concurrent" mean?  Is that the total number of users in an hour?  Total number of users in a 5-minute interval?  Or is that the high-water mark of number of users all clicking the mouse or hitting the Enter key, all at the same time?

  3. What is the typical "think time" of an end user?  Effectively, resources are only being consumed when there is a request actively being processed by the APEX engine.  So while the end user is interpreting the results of a report or keying in data in a form, they aren't (typically) making any requests to the APEX engine.

  4. How much memory will be consumed by the typical page view?  Does your application allocate GB's of in-memory LOBs, per user per page view?  This would have a definite impact on scalability.
The total number of pages in an application has close to zero correlation to scalability and throughput.  You can have a 1,000-page application, each page with sub-second performance, which will be far more scalable than a 1-page application that consumes 15 seconds per page view.

As the Oracle Database Performance and Tuning Guide states, there are many variables involved in workload estimation, and it's typically done via either benchmarking or extrapolation from a similar system.  But what is "a similar system" for an APEX application?  Does a call-center application at one enterprise approximate the back-office order processing system at another company?

I can understand how a formula can be prepared for a COTS application.  If you're deploying Fusion Applications or the eBusiness Suite or JD Edwards or SAP, those applications are created, the business logic is written, the queries and transactions are crafted, and concurrency has been measured on representative systems for a given workload.  But I don't understand how someone can produce a sizing guide for any application development framework - Application Express, ADF, .NET, Java.  It's like asking "how scalable is C?"

An application that our team wrote and runs for Oracle is quite scalable (the oft-mentioned Aria People employee directory).  Yesterday (05-MAR), there were 2.1M page views on this system with a median page rendering time of 0.03 seconds from 45,314 distinct users.  The busiest hour saw 129,284 page views through the APEX engine (35.9 page views/second).  If another team within Oracle wrote this same system but didn't tune the SQL like we did, is that a reflection on the scalability of APEX?  And if the answer to that question is "no", then is the hardware configuration all that relevant?

Back in 2007, my manager Mike Hichwa took a draft note that I wrote and published an article for  Oracle Magazine entitled "Sizing up Performance".  There is a very simple formula which can be used to estimate the throughput of an APEX application.  This isn't going to help you determine how much hardware to buy or how to estimate the size of your VM, but it will help estimate (in back-of-the-napkin form) how scalable an existing APEX application will be on an existing system.

With all this said, we, on the Oracle Application Express team, have been deficient.  At a minimum, we should have a list of systems developed by our customers, with specific information about the hardware configuration, purpose of the system, and number of end-users served.  Maybe we should also obtain the level of expertise of the developers.  We will gather this information and publish it online (without specific customer names).  If nothing else, this can serve as the foundation for extrapolation by architects and designers.


Unintended, but interesting consequences

Nuno Souto - Thu, 2014-03-06 02:56
It's interesting how from time to time something happens that makes sense and seems logical afterwards, but at the time it causes a bit of a surprise.  Part of the fun of working with this type of software! A few days ago we had an incident in an Oracle DW database when a developer tried to load an infinitely big file from a very large source.  Yeah, you got it: big-data-ish!  Suffice to say:

Internal Links

Tim Dexter - Wed, 2014-03-05 20:06

Another great question today, this time, from friend and colleague, Jerry the master house re-fitter. I think we are competing on who can completely rip and replace their entire house in the shortest time on their own. Every conversation we have starts with 'so what are you working on?' He's in the midst of a kitchen re-fit, Im finishing off odds and ends before I re-build our stair well and start work on my hidden man cave under said stairs. Anyhoo, his question!

Can you create a PDF document that shows a summary on the first page and provides links to more detailed sections further down in the document?

Why yes you can, Jerry. Something like this? Click on the department names in the first table and the 'return to top' links in the detail sections. Pretty neat, huh? Dynamic internal links based on the data, in this case the department names.

It's not that hard to do either. Here's the template, RTF only right now.


The important fields in this case are the ones in red; here are their contents.

TopLink

<fo:block id="doctop" />

Just think of it as an anchor to the top of the page called doctop

Back to Top

<fo:basic-link internal-destination="doctop" text-decoration="underline">Back to Top</fo:basic-link>

Just a live link 'Back to Top' if you will, that takes the user to the doc top location i.e. to the top of the page.

DeptLink

<fo:block id="{DEPARTMENT_NAME}"/>

Just like the TopLink above, this just creates an anchor in the document. The neat thing here is that we dynamically name it the actual value of the DEPARTMENT_NAME. Note that this link is inside the for-each:G_DEPT loop so the {DEPARTMENT_NAME} is evaluated each time the loop iterates. The curly braces force the engine to fetch the DEPARTMENT_NAME value before creating the anchor.

DEPARTMENT_NAME

<fo:basic-link  internal-destination="{DEPARTMENT_NAME}" ><?DEPARTMENT_NAME?></fo:basic-link>

This is the link that lets the user navigate to the detail for that department. It does not use a regular MSWord URL; we have to create a field in the template to hold the department name value and apply the link. Note, no text decoration this time, i.e. no underline.

You can add a dynamic link on to anything in the summary section. You just need to remember to keep link 'names' as unique as needed for source and destination. You can combine multiple data values into the link name using the concat function.
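
For example, with a hypothetical DEPARTMENT_ID element alongside DEPARTMENT_NAME, the anchor and its matching link might become:

<fo:block id="{concat(DEPARTMENT_NAME,'_',DEPARTMENT_ID)}"/>

<fo:basic-link internal-destination="{concat(DEPARTMENT_NAME,'_',DEPARTMENT_ID)}"><?DEPARTMENT_NAME?></fo:basic-link>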

Template and data available here. Tested with 10 and 11g, will work with all BIP flavors.

Categories: BI & Warehousing

Oracle Direct NFS and Infiniband: A Less-Than-Perfect Match

Don Seiler - Wed, 2014-03-05 19:54
Readers of an earlier post on this blog will know about my latest forays into the world of Direct NFS. Part of that means stumbling over configuration hiccups or slamming into brick walls when you find new bugs.

To quickly re-set the table, my organization purchased the Oracle ZFS Storage Appliance (ZFSSA) 7420. Oracle sold us on the Infiniband connectivity as a way to make a possible future transition to Exadata easier. However the pre-sales POC testing was done over 10gb Ethernet (10gigE). So it was that everything (including their Infiniband switches and cables) arrived at the datacenter and was installed and connected by the Oracle technicians. There were a few initial hiccups and frustrating inconsistencies with their installation and configuration, but those are outside the scope of this post.

We decided to put a copy of our standby database on the ZFSSA and have it run as a second standby. The performance problems were quick to appear, and they weren't pretty.



Configuring Shares

We configured the ZFS project shares according to the common Oracle best practices for ZFS recordsize and write bias. For example, datafile shares were set to an 8k recordsize (to match the db_block_size) and throughput write bias, whereas redo log shares were set to a 128k recordsize and latency bias. Note that with Oracle Database 12c, Direct NFS over NFSv4, and the more recent ZFSSA firmware, you gain the benefit of Oracle Intelligent Storage Protocol (OISP), which will determine the recordsize and write bias automatically based on the type of file it recognizes.

Copying the Database

To start out we needed to get a copy of the database onto the ZFSSA shares. This was easily done with RMAN's backup as copy database command, specifying the ZFSSA mount as the format destination. We were fairly impressed with the Direct NFS transfer speed during the copy, and so we were optimistic about how it would stand up to our production load.
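
The copy command was along these lines (the format path is illustrative, not our actual mount point):

RMAN> backup as copy database format '/zfssa/oradata/%U';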

Starting Recovery!
Once everything was set, we started managed recovery on the standby. Our earlier excitement gave way to a sort of soul-crushing disappointment as the recovery performance basically ground to a standstill and traffic to the ZFSSA went from hundreds of Mbps to barely a trickle. We could stop recovery and copy a big file with great speed, but something in managed recovery was not playing nicely.

We found that we could disable Direct NFS (requires a database restart and software relinking), and managed recovery would actually perform better over the kernel NFS, although still not nearly as well as we would need.

This started a blizzard of SR creations, including SRs being spawned from other SRs. We had SRs open for the ZFSSA team, the Direct NFS team, the Data Guard team, and even the Oracle Linux and Solaris teams, even though we were not on Oracle Linux or Solaris (we use RHEL). It came to a point where I had to tell our account manager to have support stop creating new SRs, since every new SR meant I had to explain the situation to a new technician all over again.

At this point we were having twice-daily conference calls with our account manager and technical leads from the various departments. Their minions were working hard on their end to replicate the problem and find a solution, but we were running into a 4th week of this craziness.

The Infiniband Bandit

After many frustrating weeks of changing configurations, cables, and cards, and just generally grasping at straws, it was finally narrowed down to the Infiniband. Or rather, to a bug in the OpenFabrics (OFA) Linux kernel module that deals with Infiniband, triggered when Direct NFS fires off a whole lot of connections, like when Data Guard managed recovery fires up 80 parallel slaves. We tested out the 10gigE channel we had for the management UI and performance was like night and day with just the one channel.

Oracle Support suggested it might be related to bug 15824316, which also deals with dramatic performance loss with Direct NFS over Infiniband. The bug in the OFA kernel module was fixed in recent versions of Oracle Enterprise Linux (OEL) (specifically the UEK kernel), but Oracle is not sharing this fix with Red Hat (or anyone else, presumably). Since we're on RHEL, we had little choice but to send all the Infiniband networking hardware back and order up some 10gigE replacements.

We're still in the process of getting the 10gigE switches and cables all in place for the final production setup. If you're curious, it's 4 10gigE cards per server, bonded to a single IP to a 10gigE switch into the ZFSSA heads. This 10gigE network is dedicated exclusively to ZFSSA traffic.

So, in the end, if you're on (a recent version of) of OEL/UEK, you should have nothing to worry about. But if you're on RHEL and planning to use Direct NFS, you're going to want to use 10gigE and NOT Infiniband.

Update - 6 Mar 2014

Some have asked, and I want to re-iterate: Oracle have claimed that the OFA module was entirely re-written, and their fix is specific to OEL and is not covered by the GPL or any similar license. We were told that they have no plans to share their code with RHEL. Also there is no MOS bug number for the OFA issue; it was apparently re-written from scratch with no bug to track the issue. If this all sounds rather dubious to you, join the club. But it's what our account manager told us at the end of last year.

Another Update - 6 Mar 2014

Bjoern Rost and I discussed this privately and after quite a bit of research and analysis he shared this conclusion:

Oracle Support suggested that this issue would not occur with the OFA module used in Oracle Linux with the UEK kernel. RedHat changed their support in RHEL6 from shipping the whole openfabrics stack to just including the drivers that were also present in the upstream mainline kernel. This is RedHat’s policy to ensure stability in the version of the kernel they ship. Oracle offers an OFA package with some additional patches (all GPL/BSD license) for the UEKr1 and UEKr2 kernels. Unfortunately, these two different approaches make it very hard to pinpoint specific patches or create backports for the RedHat kernel version.
Categories: DBA Blogs

The OLAP Extension is now available in SQL Developer 4.0

Keith Laker - Tue, 2014-03-04 14:57


The OLAP Extension is now in SQL Developer 4.0.

See
http://www.oracle.com/technetwork/developer-tools/sql-developer/downloads/sqldev-releasenotes-v4-1925251.html for the details.

The OLAP functionality is mentioned toward the bottom of the web page.
You will still need AWM 12.1.0.1.0 to
  • Manage and enable cube and dimension MV's.
  • Manage data security.
  • Create and edit nested measure folders (i.e. measure folders that are children of other measure folders)
  • Create and edit Maintenance Scripts
  • Manage multilingual support for OLAP Metadata objects
  • Use the OBIEE plugin or the Data Validation plugin
What is new or improved:
  • New Calculation Expression editor for calculated measures.  This allows the user to nest different types of calculated measures easily.  For instance a user can now create a Moving Total of a Prior Period as one calculated measure.  In AWM, it would have required a user to create a Prior Period first and then create a Moving Total calculated measure which referred to the Prior Period measure.  Also the new Calculation Expression editor displays hypertext helper templates when the user selects the OLAP API syntax in the editor.
  • Support for OLAP DML command execution in the SQL Worksheet.  Simply prefix OLAP DML commands with a '~' and then select the execute button to execute them on the SQL Worksheet (see the example after this list).  The output of the command will appear in the DBMS Output Window if it is opened, or the Script Output Window if the user has executed 'set serveroutput on' before executing the DML command.
  • Improved OLAP DML Program Editor integrated within the SQL Developer framework.
  • New diagnostic reports in the SQL Developer Report navigator.
  • Ability to create a fact view with a measure dimension (i.e. "pivot cube").  This functionality is accessible from the SQL Developer Tools-OLAP menu option.
  • Cube scripts have been renamed to Build Specifications and are now accessible within the Create/Edit Cube dialog.  The Build Specifications editor there, is similar to the calculation expression editor as far as functionality.
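
As a quick illustration of the '~' prefix (a hypothetical worksheet session; SHOW is a basic OLAP DML command that displays the value of an expression):

set serveroutput on
~show 'Hello from OLAP DML'
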
Categories: BI & Warehousing

Latest for the folks who have to deal with Peoplesoft

Nuno Souto - Tue, 2014-03-04 05:51
Dang, been a while since the last posts!  A lot of water under the bridge since then. We've ditched a few people that were not really helping anything, and are now actively looking at cloud solutions, "big data" use, etcetc. Meanwhile, there is the small detail that business as usual has to continue: it's very easy to parrot about the latest gimmick/feature/funtastic technology that will

The next big wave of IT is Software Development

Steve Jones - Mon, 2014-03-03 10:25
I can smell a change coming. The last few years have seen cloud and SaaS on the rise, a fragmentation in application development (thanks in large part to the appalling stewardship of Java), and a real focus of budgets around BI and 'vanilla' package approaches.  Now this is a good thing, both because I jumped out of the Java boat onto the BI boat a few years ago but also because its
Categories: Fusion Middleware

Software Development Wave 4: back to the package

Steve Jones - Mon, 2014-03-03 10:20
The end of the next Software Development wave will be when software development again 'eats itself', as it did with technologies like Hadoop showing a new value in information, with platforms like SFDC showing new pre-built services, and where people like GoodData have turned BI into SaaS.  So we will see the same evolution again and a new generation of commoditisation which drives
Categories: Fusion Middleware

Fun with global temporary tables in Oracle 12c

Mihajlo Tekic - Sun, 2014-03-02 22:07

A few months ago I wrote a post about 12c session specific statistics for global temporary tables (link). A long awaited feature, no matter what.

Recently I had some discussions on the same subject with members of my team.

One interesting observation was the behavior of transaction specific GTTs with session specific statistics enabled. What attracted our interest was the fact that data in global temporary tables is not deleted after the DBMS_STATS package is invoked.

Prior to 12c, a call to DBMS_STATS would result in an implicit commit. This would wipe out the content of a transaction specific global temporary table.

I’ll digress here a bit. Yes, I know, who would call DBMS_STATS to collect statistics on a transaction specific GTT knowing the data in the table will be lost? Well, things change a bit in 12c.

In Oracle 12c, no implicit commit is invoked when DBMS_STATS.GATHER_TABLE_STATS is invoked on a transaction specific GTT with session specific statistics enabled, thus letting users take advantage of session specific statistics for this type of GTTs.

This behavior is documented in Oracle documentation.

I’ll try to put some more light on this behavior through couple of examples:

For this purpose I’ll start with three tables. T1 and T2 are transaction specific temporary tables. T3 is a regular table. By default, in 12c, session specific statistics are used.



CREATE GLOBAL TEMPORARY TABLE t1 (id NUMBER);

CREATE GLOBAL TEMPORARY TABLE t2 (id NUMBER);

CREATE TABLE t3 (id NUMBER);



Scenario #1 – Insert 5 rows into each of the three tables and observe the state of the data after DBMS_STATS is invoked on a transaction specific GTT.



SQL> INSERT INTO t1 (SELECT rownum FROM dual CONNECT BY rownum<=5);
5 rows created.

SQL> INSERT INTO t2 (SELECT rownum FROM dual CONNECT BY rownum<=5);
5 rows created.

SQL> INSERT INTO t3 (SELECT rownum FROM dual CONNECT BY rownum<=5);
5 rows created.

SQL> exec DBMS_STATS.GATHER_TABLE_STATS(user,'T1');
PL/SQL procedure successfully completed.

SQL> SELECT count(1) FROM t1;
COUNT(1)
----------
5


As you can see, the data in T1 is still present. Furthermore, if you open another session you can also see that T3 has no rows. This means no commit was invoked when session specific statistics were collected for T1.
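
As an extra check you can confirm that the statistics gathered on T1 are session specific: in 12c the *_TAB_STATISTICS views expose a SCOPE column (SHARED vs. SESSION) for exactly this purpose.

SQL> SELECT table_name, num_rows, scope FROM user_tab_statistics WHERE table_name='T1';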

Scenario #2 – Insert 5 rows into each of the three tables and collect statistics only on the regular table, T3.


SQL> INSERT INTO t1 (SELECT rownum FROM dual CONNECT BY rownum<=5);
5 rows created.

SQL> INSERT INTO t2 (SELECT rownum FROM dual CONNECT BY rownum<=5);
5 rows created.

SQL> INSERT INTO t3 (SELECT rownum FROM dual CONNECT BY rownum<=5);
5 rows created.

SQL> exec DBMS_STATS.GATHER_TABLE_STATS(user,'T3');
PL/SQL procedure successfully completed.

SQL> SELECT count(1) FROM t1;
COUNT(1)
----------
0


As you can see, in this scenario an implicit commit was invoked, which resulted in the data in T1 being purged.
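A quick way to verify which flavour of statistics you ended up with is the SCOPE column that 12c added to the *_TAB_STATISTICS views. A minimal sketch, assuming that column is available:

-- SCOPE shows SESSION for session-specific statistics, SHARED otherwise
SELECT table_name, num_rows, scope
FROM user_tab_statistics
WHERE table_name IN ('T1', 'T2', 'T3')
ORDER BY table_name;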

Hope this helps … :-)

Cheers!

“How did you learn so much stuff about Oracle?”

Cary Millsap - Fri, 2014-02-28 22:35
On LinkedIn, a new connection asked me a very nice question. He asked, “I know this might sound stupid, but how did you learn so much stuff about Oracle. :)”

Good one. I like the presumption that I know a lot of stuff about Oracle. I suppose that I do, at least about some aspects of it, although I often feel like I don’t know enough. It occurred to me that answering publicly might also be helpful to anyone trying to figure out how to prepare for a career. Here’s my answer.

I took a job with the young consulting division of Oracle Corporation in September 1989, about two weeks after the very first time I had heard the word “Oracle” used as the name of a company. My background had been mathematics and computer science in school. I had two post-graduate degrees: a Master of Science in Computer Science with a focus on language design and compilers, and a Master of Business Administration with a focus in finance.

My first “career job” was as a software engineer, which I started before the MBA. I designed languages and wrote compilers to implement those languages. Yes, people actually pay good money for that, and it’s possibly still the most fun I’ve ever had at work. I wrote software in C, lex, and yacc, and I taught my colleagues how to do it, too. In particular, I spent a lot of time teaching my colleagues how to make their C code faster and more portable (so it would run on more computers than just the one on which you wrote it).

Even though I loved my job, I didn’t see a lot of future in it. At least not in Colorado Springs in the late 1980s. So I took a year off to get the MBA at SMU in Dallas. I went for the MBA because I thought I needed to learn more about money and business. It was the most difficult academic year of my life, because I was not particularly connected to or even interested in most of the subject matter. I hated a lot of my classes, which made it difficult to do as well as I had been accustomed to. But I kept grinding away, and finished my degree in the year it was supposed to take. Of course I learned many, many things that year that have been vital to my career.

A couple of weeks after I got my MBA, I went to work for Oracle in Dallas, with a salary that was 168% of what I had earned as a compiler designer. My job was to visit Oracle customers and help them with their problems.

It took a while for me to get into a good rhythm at Oracle. My boss was sending me to these local customers that were having problems with the Oracle Financial Applications (the “Finapps,” as we usually called them, which would many years later become the E-Business Suite) on version 6.0.26 of the ORACLE database (it was all caps back then). At first, I couldn’t help them nearly as much as I had wanted to. It was frustrating.

That actually became my rhythm: week after week, I visited these people who were having horrific problems with ORACLE and the Finapps. The database in 1990, although it had some pretty big bugs, was still pretty good. It was the applications that caused most of the problems I saw. There were a lot of problems, both with the software and with how it was sold. My job was to fix the problems. Some of those problems were technical. Many were not.

A lot of the problems were performance problems: the software running “too slowly.” I found those problems particularly interesting. For those, I had some experience and tools at my disposal. I knew a good bit about operating systems and compilers and profilers and linkers and debuggers and all that, and so learning about Oracle indexes and rollback segments (two good examples, continual sources of customer frustration) wasn’t that scary of a step for me.

I hadn’t learned anything about Oracle or relational databases in school. I learned how the database worked at Oracle by reading the documentation, beginning with the excellent Oracle® Database Concepts. Oracle sped me along a bit with a couple of the standard DBA courses.

My real learning came from being in the field. The problems my customers had were immediately interesting by virtue of being important. The resources available to me for solving such problems back in the early 1990s were really just books, email, and the telephone. The Internet didn’t exist yet. (Can you imagine?) The Oracle books available back then, for the most part, were absolutely horrible. Just garbage. Just about the only thing they were good for was creating problems that you could bill lots of consulting hours to fix. The only thing that was left was email and the telephone.

The problem with email and telephones, however, is that there has to be someone on the other end. Fortunately, I had that. The people on the other end of my email and phone calls were my saviors and heroes. In my early Oracle years, those saviors and heroes included people like Darryl Presley, Laurel Jamtgaard, Tom Kemp, Charlene Feldkamp, David Ensor, Willis Ranney, Lyn Pratt, Lawrence To, Roderick Mañalac, Greg Doherty, Juan Loaiza, Bill Bridge, Brom Mahbod, Alex Ho, Jonathan Klein, Graham Wood, Mark Farnham (who didn’t even work for Oracle, but who could cheerfully introduce me to anyone I needed), Anjo Kolk, and Mogens Nørgaard. I could never repay these people, and many more, for what they did for me, in some cases at all hours of the night.

So, how did I learn so much stuff about Oracle? It started by immersing myself into a universe where every working day I had to solve somebody’s real Oracle problems. Uncomfortable, but effective. I survived because I was persistent and because I had a great company behind me, filled with spectacularly intelligent people who loved helping each other. Could I have done that on my own, today, with the advent of the Internet and lots and lots of great and reliable books out there to draw upon? I doubt it. I sincerely do. But maybe if I were young again...

I tell my children, there’s only one place where money comes from: other people. Money comes only from other people. So many things in life are that way.

I’m a natural introvert. I naturally withdraw from group interactions whenever I don’t feel like I’m helping other people. Thankfully, my work and my family draw me out into the world. If you put me into a situation where I need to solve a technical problem that I can’t solve by myself, then I’ll seek help from the wonderful friends I’ve made.

I can never pay it back, but I can try to pay it forward.

(Oddly, as I’m writing this, I realize that I don’t take the same healthy approach to solving business problems. Perhaps it’s because I naturally assume that my friends would have fun helping solve a technical problem, but that solving a business problem would not be fun and therefore I would be imposing upon them if I were to ask for help solving one. I need to work on that.)

So, to my new LinkedIn friend, here’s my advice. Here’s what worked for me:
  • Educate yourself. Read, study, experiment. Educate yourself especially well in the fundamentals. So many people don’t. Being fantastic at the fundamentals is a competitive advantage, no matter what you do. If it’s Oracle you’re interested in learning about, that’s software, so learn about software: about operating systems, and C, and linkers, and profilers, and debuggers, .... Read the Oracle Database Concepts guide and all the other free Oracle documentation. Read every book there is by Tom Kyte and Christian Antognini and Jonathan Lewis and Tanel Põder and Kerry Osborne and Karen Morton and James Morle, and all the other great authors out there today. And read their blogs.
  • Find a way to hook yourself into a network of people that are willing and able to help you. You can do that online these days. You can earn your way into a community by doing things like asking thoughtful questions, treating people respectfully (even the ones who don’t treat you respectfully), and finding ways to teach others what you’ve learned. Write. Write what you know, for other people to use and improve. And for God’s sake, if you don’t know something, don’t act like you do. That just makes everyone think you’re an asshole, which isn’t helpful.
  • Immerse yourself into some real problems. Read Scuttle Your Ships Before Advancing if you don’t understand why. You can solve real problems online these days, too (e.g., StackExchange and even Oracle.com), although I think that it’s better to work on real live problems at real live customer sites. Stick with it. Fix things. Help people.
Help people.

That’s my advice.

Waterfall Charts

Tim Dexter - Fri, 2014-02-28 19:35

A great question came through the ether from Holger last night, about waterfall charts.

"I know that Answers supports waterfall charts and BI Publisher does not.
Do you have a different solution approach for waterfall charts with BI Publisher (perhaps stacked bars with white areas)?
Maybe you have already implemented something similar in the past and you can send me an example."

I didn't have one to hand, but I do now. Little known fact: the Publisher chart engine is based on the Oracle Reports chart engine. Therefore, this document came straight to mind. It's awesome for chart tips and tricks. Will you have to get your hands dirty in the chart code? Yep. Will you get the chart you want with a little effort? Yep. Now, I know, I know, in this day and age you should get waterfalls with no effort, but then you'd be bored, right?

First things first, for the uninitiated, what is a waterfall chart? From some kind person at Wikipedia, "The waterfall chart is normally used for understanding how an initial value is affected by a series of intermediate positive or negative values. Usually the initial and the final values are represented by whole columns, while the intermediate values are denoted by floating columns. The columns are color-coded for distinguishing between positive and negative values."

We'll get back to that last sentence later; for now let's get the basic chart working.

Checking out the Oracle Reports charting doc, search for 'floating', their term for 'waterfall', and it will get you to the section on building a 'floating column chart' or, in more modern parlance, a waterfall chart. If you have already got your feet wet in the dark arts world of Publisher chart XML, get on with it and get your waterfall working.

If not, read on.

When I first started looking at this chart, I decided to ignore the 'negative values' in the definition above. Being a glass-half-full kind of guy, I don't see negatives, right? :)

Without them it's a pretty simple job of rendering a stacked bar chart with four series for the colors: one for the starting value, one for the ending value, one for the diffs (steps) and one for the base values. The base values' color could be set to white, but that obscures any tick lines in the chart. Better to use the transparency option from the Oracle Reports doc.

<Series id="0" borderTransparent="true" transparent="true"/> 

Pretty simple; even the data structure is reasonably easy to get working. But the negative values were nagging at me, and Holger, whom I had pointed at the Oracle Reports doc, had come back and could not get negative values to show correctly. So I took another look. What a pain in the butt!

In the chart above (that's my first BIP waterfall, maybe the first ever BIP waterfall) I have lime green start and finish bars, red for negative and green for positive values. Look a little closer at the hidden bar values where we transition from red to green: ah man, royal pain in the butt! Not because of anything tough in the chart definition, that's pretty straightforward. I just need the following columns: START, BASE, DOWN, UP and FINISH.

         START  BASE  UP  DOWN  FINISH
Bar 1      200     0   0     0       0   (Start Value)
Bar 2        0   180   0    20       0   (PROD1)
Bar 3        0   150   0    30       0   (PROD2)

and so on. The start, up, down and finish values are reasonably easy to get. The real trick is calculating that hidden BASE value correctly for the transition from -ve to +ve and vice versa. Hitting Google, I found the key to that calculation in a great page on building a waterfall chart in Excel from the folks at Contextures. Excel is great at referencing previous cell values to create complex calculations, and I guess I could have fudged this article and used an Excel sheet as my data source. I could even have used an Excel template against my database table to create the data for the chart and fed the resulting Excel output back into the report as the data source for the chart. But, I digress; that would be très cool though, gotta look at that.
On that page is the formula to get the hidden base bar values, and I adapted that into some SQL to get the same result.

Let's assume I have the following data in a table:

PRODUCT_NAME  SALES
PROD1           -20
PROD2           -30
PROD3            50
PROD4            60

The sales values are versus the same period last year, i.e. delta values. I have a starting value of 200 total sales; let's assume this is pulled from another table.
I have spent the majority of my time on generating the data; the actual chart definition is pretty straightforward. Getting that BASE value has been most tricksy!

I need to generate the following for each column:

PRODUCT_NAME  STRT  BASE_VAL  DOWN  UP  END_TOTAL
START          200         0     0   0          0
PROD1            0       180    20   0          0
PROD2            0       150    30   0          0
PROD3            0       150     0  50          0
PROD4            0       200     0  60          0
END              0         0     0   0        260

Ignoring the START and END values for a second, here's the query for the PRODx columns:

SELECT 2 SORT_KEY
, PRODUCT_NAME
, STRT
, SALES
, UP
, DOWN
, 0 END_TOTAL
-- start value plus the running total of (previous row's UP - this row's DOWN)
, 200 + (SUM(LAG_UP - DOWN) OVER (ORDER BY PRODUCT_NAME)) AS BASE_VAL
FROM
(SELECT P.PRODUCT_NAME
,  0 AS STRT
, P.SALES
-- split the signed SALES delta into separate UP and DOWN columns
, CASE WHEN P.SALES > 0 THEN P.SALES ELSE 0 END AS UP
, CASE WHEN P.SALES < 0 THEN ABS(P.SALES) ELSE 0 END AS DOWN
-- LAG fetches the UP value from the previous row
, LAG(CASE WHEN P.SALES > 0 THEN P.SALES ELSE 0 END,1,0)
      OVER (ORDER BY P.PRODUCT_NAME) AS LAG_UP
FROM PRODUCTS P
)

The inner query is breaking the UP and DOWN values into their own columns based on the SALES value. The LAG function is the cool bit to fetch the UP value in the previous row. That column is the key to getting the BASE values correctly.

The outer query just has a calculation for the BASE_VAL.

200 + (SUM(LAG_UP - DOWN) OVER (ORDER BY PRODUCT_NAME))

The SUM...OVER allows me to iterate over the rows to get the calculation I need, i.e. the starting value (200) plus the running sum of LAG_UP - DOWN. Remember, the LAG_UP value is fetching the value from the previous row.
Is there a neater way to do this? I'm sure there is; the inner query could probably be eliminated with a little effort (see the sketch below), but for the purposes of this post it's quite handy to be able to break things down.
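For what it's worth, here is one hedged sketch of a LAG-free version, assuming the same PRODUCTS table and the hard-coded 200 start: the base of each bar is the start value plus the running total of SALES, minus the current bar's positive portion.

-- base = start + running SUM(sales) - this bar's UP portion;
-- the inline view and the LAG() drop out entirely
SELECT PRODUCT_NAME
, CASE WHEN SALES > 0 THEN SALES ELSE 0 END AS UP
, CASE WHEN SALES < 0 THEN -SALES ELSE 0 END AS DOWN
, 200 + SUM(SALES) OVER (ORDER BY PRODUCT_NAME)
      - GREATEST(SALES, 0) AS BASE_VAL
FROM PRODUCTS;

Against the sample data this returns the same 180, 150, 150, 200 base values as the query above.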

For the start and end values I used more queries and then just UNIONed the three together. One note on that union: the sorting. For the chart to work, I need START, PRODx, FINISH, in that order. The easiest way to get that was to add a SORT_KEY value to each query and then sort by it. So my total query for the chart was:

-- START bar: constants only (the UNION collapses the identical rows from PRODUCTS)
SELECT 1 SORT_KEY
, 'START' PRODUCT_NAME
, 200 STRT
, 0 SALES
, 0 UP
, 0 DOWN
, 0 END_TOTAL
, 0 BASE_VAL
FROM PRODUCTS
UNION
-- PRODx bars: the query from above
SELECT 2 SORT_KEY
, PRODUCT_NAME
, STRT
, SALES
, UP
, DOWN
, 0 END_TOTAL
, 200 + (SUM(LAG_UP - DOWN)
      OVER (ORDER BY PRODUCT_NAME)) AS BASE_VAL
FROM
(SELECT P.PRODUCT_NAME
,  0 AS STRT
, P.SALES
, CASE WHEN P.SALES > 0 THEN P.SALES ELSE 0 END AS UP
, CASE WHEN P.SALES < 0 THEN ABS(P.SALES) ELSE 0 END AS DOWN
, LAG(CASE WHEN P.SALES > 0 THEN P.SALES ELSE 0 END,1,0)
       OVER (ORDER BY P.PRODUCT_NAME) AS LAG_UP
FROM PRODUCTS P
)
UNION
-- END bar: grand total (grouping by the literal columns collapses PRODUCTS to one row)
SELECT 3 SORT_KEY
, 'END' PRODUCT_NAME
, 0 STRT
, 0 SALES
, 0 UP
, 0 DOWN
, SUM(SALES) + 200 END_TOTAL
, 0 BASE_VAL
FROM PRODUCTS
GROUP BY 1,2,3,4,6
ORDER BY 1

A lot of effort for a dinky chart, but now it's done once, doing it again will be easier. Of course no one will want just a single chart in their report; there will be other data, tables, charts, etc. I think if I was doing this in anger I would just break out this query as a separate item in the data model, i.e. a query just for the chart. It will make life much simpler.
Another option that I considered was to build a sub-template in XSL to generate the XML tree to support the chart and assign that to a variable. I'm sure it can be done with a little effort; I'll save it for another time.

On the last leg, we have the data; now to build the chart. This is actually the easy bit. Sadly, I have found an issue in the online template builder that precludes using the chart builder in those templates. However, RTF templates to the rescue!

Insert a chart and set up the data in the chart dialog (the original post includes a full-scale screenshot of the settings).

It's just a vertical stacked bar with the BASE_VAL color set to white. You can still see the 'hidden' bars and they are overwriting the tick lines, but if you are happy with it, leave it as is. You can double-click the chart and the dialog box can read it, no problem. If, however, you want those 'hidden' bars truly hidden, then click on the Advanced tab of the chart dialog and replace:

<Series id="1" color="#FFFFFF" />

with

<Series id="1" borderTransparent="true" transparent="true" />

and the bars will become completely transparent. You can do the 3D and gradient thang if you want and play with colors and themes. You'll then be done with your waterfall masterpiece!

A lot of work? Not really, more than out of the box for sure, but hopefully I have given you enough to decipher the data needs and how to do it, at least with an Oracle db. If you need all my files, including the table definition, sample XML, BIP DM, report and templates, you can get them here.

Categories: BI & Warehousing

New APEX Certification Exam BETA Now Running

David Peake - Fri, 2014-02-28 18:40
How does a prospective employer or customer know that you are any good at developing with Oracle Application Express?
One of the best credentials, to prove you have APEX chops, is to obtain Oracle Application Express Certification!

The APEX Certification Exam has undergone a major upgrade, with much of the content rewritten or improved as part of this effort.
The new exam is currently running a BETA program until 10-May-2014.
Testing centers are available worldwide.



Help us by taking the BETA exam, and help yourself by saving significant cost over sitting for the exam once it is published.
The published exam will draw on the exact same questions you take in the BETA exam; any question that ranks poorly during the beta may be rewritten or removed.


