Feed aggregator

Oracle Financials Technical Interview Questions and Score Sheet

Richard Byrom - Thu, 2008-03-13 11:22

Over the last few years I've been involved in interviewing a number of Oracle Financials Consultants and have also, in the past, sat on the other side of the desk as an interviewee. The last time I conducted an interview it was more technical in nature, and I decided to come up with a list of questions to ask the interviewee. These questions were focused along the lines of people, process and technology, and I thought I would write a blog entry with the questions as I'm sure it will be useful for interviewers and interviewees alike. In addition to the questions I also developed a score sheet utilising the Kepner Tregoe decision making methodology, an approach I learnt when doing my Masters in Business Leadership (MBL). This score sheet allows the interviewer to develop an objective means of evaluating the answers to interview questions rather than a subjective one, and can be modified to suit a number of different scenarios. I haven't included any answers to the questions as in many cases there is no right answer; I prefer to see whether someone understands the concepts and can explain them well. Having said that, if anyone wants to come up with a suggested answer sheet I will gladly append it to the questions listing. Hopefully this question listing and score sheet will provide readers with a baseline/guideline that can be used in any type of interview.

Technology Questions
General Ledger
  • Explain the organisational concepts of business group, set of books, legal entity, operating unit and inventory organization and the hierarchical structure relating to these.
  • What are the three C’s in relation to a set of books? 
  • Explain the difference between a key flexfield and a descriptive flexfield.
  • What is the difference between a cross validation rule and a security rule? 
  • What is a value set and what are the different types of value sets that you will find within Oracle?
Accounts Payable
  • Describe the Key Configuration decisions you would need to undertake when implementing the Accounts Payable module.
  • Briefly describe the Supplier Invoice to Payment process within Oracle.
Accounts Receivable
  • Describe the Key Configuration decisions you would need to undertake when implementing the Accounts Receivable module.
  • Briefly describe/draw the Customer Invoice to Cash process within Oracle.
  • Describe the key difference in setting up accounting for Accounts Receivable and Accounts Payable.
  • What happens when you have a receipt in Oracle that can’t be applied to an invoice?
Fixed Assets
  • Describe the key configuration decisions you would need to undertake when implementing the fixed assets module.
  • Briefly describe/draw the asset additions process within Oracle.
Cash Management
  • Describe the key configuration decisions you would need to undertake when implementing the cash management module.
  • Outline the Statement to Reconciliation process within Oracle with particular reference to the different options you might want to consider when reconciling.
All Modules

Describe/draw the sequence of events you would undertake in closing the various financials modules.

People Questions
  • Draw a typical structure relating to project implementation and describe the role each person would play within a project.
  • Describe how to implement an effective training strategy within an implementation – include the timing of training relative to the project lifecycle.
  • Within an implementation describe a Change Management strategy that you would put in place.
Process Questions
  • Describe/draw the main phases of an implementation lifecycle and the key milestones and deliverables within each of these phases.
  • Describe the key differences between a process and a work instruction.
  • Within a project describe the different test phases that you might go through and what level of testing would be undertaken.
  • Describe a typical process you would implement for managing Change Requests and Bug requests within your organization.

Web 2.0 Expo + APEX = Me!

Carl Backstrom - Wed, 2008-03-12 14:18
Well, if you didn't know, Oracle is a Platinum Sponsor of Web 2.0 Expo; here's the page on the Oracle Wiki.

The interesting part of this story, for me, is that Oracle has a booth that will be showcasing different ways that Oracle is leveraging Web 2.0 technologies/techniques/philosophies, and Application Express is one of those technologies, and I get to run the Application Express part of the booth!

There will be unconference-type discussions at the booth, and considering how the OOW unconference saved me before, I'm looking forward to doing another one.

Between the people behind AppsLab and mix.oracle.com, XML DB, Semantic Web and more, it looks like it will be a very interesting conference, even if I just circle the Oracle booth all day. Which of course I won't; how else will I be able to ......... borrow some good new ideas :).

If you're going, stop by and say hello.

Fair warning to people that don't know about my booth skills: don't slow down by me and look interested, look confused or ask for directions. If you do, you're going to get a 5 minute APEX demo, and then there will be another lifelong APEX developer.

Plus, since it is in SF, at some point I get to go to one of my favorite restaurants in the world, House of Nanking.

Role Based Access Control (RBAC), Oracle User Management (UMX) and Release 12

Solution Beacon - Tue, 2008-03-11 10:21
There are three dynamics that come together in Release 12 to produce inflection points where increased levels of access may not have adequate controls surrounding them. First and foremost is the emergence of OAFramework applications within the Release 12 footprint, which is depicted in the chart to the left. One of the reasons this is so important is that OAFramework applications look to

Troubleshooting done to make root.sh work after a 10gR2 CRS (10.2.0.1) installation on HP-UX PA RISC 64-bit OS

Gaurav Verma - Mon, 2008-03-10 08:48

Preface
This posting describes the troubleshooting done to make root.sh work after a 10gR2 CRS (10.2.0.1) installation on HP-UX PA-RISC 64-bit OS. We were upgrading from 9i (9.2.0.6) to 10gR2.

The problem
(root@testdb):/stage11i/software# /u01/oracle/testdb/CRS/root.sh
WARNING: directory '/u01/oracle/testdb' is not owned by root
WARNING: directory '/u01/oracle' is not owned by root
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured
Checking to see if any 9i GSD is up

Setting the permissions on OCR backup directory
Setting up NS directories
PROT-3: Failed to create export file
Failed to upgrade Oracle Cluster Registry configuration


The fix was mentioned in Bug 5487374 -- TST&PERF:CRS UPGRADE NEEDS TO CREATE OCR BACKUP DIR DURING OUT OF PLACE UPGRADE

We had to create the sub-directory "crs" under /u01/oracle/testdb/CRS/cdata/crs10g and re-run root.sh; root.sh expects to find a subdirectory named after the previous RAC cluster.
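A minimal sketch of that workaround, run as root (the paths and the old cluster name "crs" are specific to this environment):

# mkdir -p /u01/oracle/testdb/CRS/cdata/crs10g/crs
# /u01/oracle/testdb/CRS/root.sh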

More trouble...
There were some more errors while running root.sh:
...
...
Format of 1 voting devices complete.

/u01/oracle/testdb/CRS/install/rootconfig[938]: /sbin/init.d/init.crs: Execute permission denied.
/u01/oracle/testdb/CRS/install/rootconfig[941]: /sbin/init.d/init.cssd: Execute permission denied.
CRS daemons not set to start. See /var/adm/messages for details.


(root@testdb):/u01/oracle/testdb/CRS/cdata/crs10g# echo $?
1


Fix:

As per SR 6807470.992, we performed the following action plan.

Log in as root and set the following permissions:

# cd /sbin/init.d
# chmod 755 init.crs
# chmod 755 init.crsd
# chmod 755 init.cssd
# chmod 555 init.evmd


Then check that the permissions are correct:

(root@testdb):/sbin/init.d# ls -rlt init*
-rwxr-xr-x 1 root sys 2226 Mar 3 18:32 init.crs
-rwxr-xr-x 1 root sys 4845 Mar 3 18:32 init.crsd
-rwxr-xr-x 1 root sys 36798 Mar 3 18:32 init.cssd
-rwxr-xr-x 1 root sys 3185 Mar 3 18:32 init.evmd


When we tried to execute root.sh, we still got the following warnings:

(root@testdb):/stage11i/software# /u01/oracle/testdb/CRS/root.sh
WARNING: directory '/u01/oracle/testdb' is not owned by root
WARNING: directory '/u01/oracle' is not owned by root
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured


At this point, we edited the file /var/opt/oracle/scls_scr/testdb/oratest1/cssfatal from disable to enable on both nodes (the file contains only one line):

(root@testdb):more /var/opt/oracle/scls_scr/testdb/oratest1/cssfatal   
enable
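A sketch of that edit, run as root on each node (the directory under scls_scr is node-specific, so the path shown here applies to the first node only):

# cp /var/opt/oracle/scls_scr/testdb/oratest1/cssfatal /var/opt/oracle/scls_scr/testdb/oratest1/cssfatal.bak
# echo enable > /var/opt/oracle/scls_scr/testdb/oratest1/cssfatal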


We also opened up permissions as follows:

(root@testdb):/var/opt/oracle# chmod -R 777 /var/opt/oracle/scls_scr

Then we started CRS:

# $ORA_CRS_HOME/bin/crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly
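To confirm that the stack really did come up on each node, a quick check along these lines can be used (a sketch using the standard 10gR2 CRS utilities):

# $ORA_CRS_HOME/bin/crsctl check crs
# $ORA_CRS_HOME/bin/crs_stat -t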

Backups, cheaper and faster?

Hampus Linden - Sun, 2008-03-09 18:21
When did backups become easy and affordable?
I've been sort of slacking off in the backup game in the last year or so; our backups have been managed by other parts of the company, and some backups have been outsourced to BSPs. I recently specced and deployed a basic Netbackup solution for our office servers at work and was amazed at how extremely simple the whole thing was. The first backup system I worked with was Legato Networker (now owned by EMC), and it was at the time a pretty great product, but it was quite difficult to manage: lots of things needed to be scripted, and setting up a new tape library required quite a lot of tweaking.
Secondly, the budget was at first glance fairly limited, but when we specced up what we needed and got a couple of quotes in, I was amazed at how much we actually got for our money.
I wanted basic disk staging followed by duplication to tape. The obvious choice for tape was LTO4: pretty good price/performance, market-leading capacity, and a solid standard.
Most suppliers actually quoted the same tape library, just branded by different vendors: HP, Dell, IBM. All were IBM 3573 tape libraries; it's an excellent and affordable library. Dell came in best on price as usual. We opted for SAS attached. The backup server, my first-choice server these days, is the Dell PowerEdge 2950; I buy it for pretty much 75% of all requirements. It was specced out with a pair of quad-core 1.86GHz processors, 4GB RAM, 6x500GB SATA drives (in RAID 5) for the disk staging pool and a SAS5 to attach the library.
Software-wise, the server is running CentOS 5.1 and Symantec/Veritas Netbackup 6.5 "Linux Starter Edition (LSE)". The LSE is a pretty good deal: you get a server license, a tape library license and 5 client licenses (with bare metal, open file etc.) for a fairly low cost.
Installing and configuring Netbackup was as easy as it gets: the tape library was detected on boot, with no sysctl.conf changes needed. I stepped through the Netbackup installation process, added the tape library (less than 5 clicks!!), defined a couple of tape label policies, and created a 2TB disk staging pool with my new tape group as the target; just add water (ehm, clients actually).

The total cost was under £13,000 (about $22,000), installation time was less than one day, and included were:

  • Veritas Netbackup 6.5 Linux Starter Edition
  • Netbackup clients 5-pack
  • Netbackup Exchange agent
  • Dell PowerEdge 2950 server
  • Dell PowerVault TL2000 library w/ one LTO4 SAS drive
  • 25x LTO4 tapes
  • 3x LTO cleaning tapes
  • CentOS Linux 5.1 (free)

Job Vs Position

RameshKumar Shanmugam - Sun, 2008-03-09 16:23
As a functional consultant, the first thing we should decide before we can design the solution for a customer is whether the system is going to be a single Business Group or a multi-Business Group setup.

The second important thing we need to decide is whether the customer is going to use Jobs or Positions.

When we put this question to the customer, the first question we can expect from the customer's side is: what is the difference between a Job and a Position?

The content in this blog is more my own view and the simplistic approach I always like; review the documentation before you decide on the approach you want to take.

To explain it at a very high level:

Jobs are generic titles or roles within a Business Group, independent of any single organization. They are required, and are usually more specific if Positions are not used.

Positions are specific occurrences of one Job, fixed within an organization. They are not required.

If you are under US legislation, your Job will drive your FLSA and EEO reporting. Personally, I feel maintaining Positions is hard in an unstructured organization; a Position hierarchy will suit universities, colleges, schools and government organizations.

Maintenance is higher with a Position hierarchy than with Jobs. If your customer feels they need less maintenance activity, then you should recommend Jobs, not Positions.
Categories: APPS Blogs

Horrible practice of storing users passwords

Pawel Barut - Sun, 2008-03-09 13:24
Written by Paweł Barut
Today I read a terrifying article about the unethical storage of user passwords. It is common practice for user passwords to be stored unencrypted in a database, but this case is even more dreadful. It is about a piece of software, G-Archiver, that is available to everyone. If you have ever used this software, you should change your Gmail password right now.
So what is the problem:
  • G-Archiver has a built-in user name and password for one of the author's Gmail accounts.
  • Whenever someone uses G-Archiver and provides his Gmail credentials, the user name and password are sent to the author of this software.
  • Additionally, anyone who finds out the user name and password stored in G-Archiver can get the passwords of thousands of previous users.
Go and read. I hope there is not too much software that works like this one. It is really terrifying that some people are so irresponsible. It could be a well-planned action to collect Gmail user names and passwords, or it could be just a lack of imagination about the consequences. I would like to believe it was the second one.

But big corporations do not care enough about security either. It is common practice for a user's password to be stored internally as plain text. You can easily find this out by using the "I forgot my password" feature: if in return you get an email with your original password, then it is a crappy web site, and you should never reuse a password given to such a site, as it is a big risk for you. Take a look at Password Security: It's Not That Hard (But You Still Can't Get It Right) for further explanation and examples.
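The fix on the application side is to store only a salted hash and compare hashes at login. A minimal command-line illustration (openssl's salted MD5 scheme is used here purely for demonstration; a deliberately slow algorithm such as bcrypt or PBKDF2 is preferable, and the salt shown is made up):

$ openssl passwd -1 -salt 8ZxQ1T 'my-secret-password'
$1$8ZxQ1T$<hash>

Only that string goes into the database; at login the same command is re-run with the stored salt and the results are compared, so the clear text never needs to be kept, and a "forgot my password" feature can only ever issue a reset rather than echo the original back.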

Cheers Paweł

Categories: Development

Troubleshooting case study for 9i RAC ..PRKC-1021 : Problem in the clusterware

Gaurav Verma - Sat, 2008-03-08 09:20

Preface
This is a troubleshooting case study for a 9i RAC environment on HP-UX PA-RISC 64-bit, which was using the HP ServiceGuard solution (third party) for cluster maintenance.

The problem
The srvctl command was giving the following error:

(oracle@rac2):  gsdctl stat
PRKC-1021 : Problem in the clusterware
Failed to get list of active nodes from clusterware

(oracle@rac1):  srvctl config
PRKC-1021 : Problem in the clusterware

(oracle@rac1):  pwd
/u01/oracle/uatdb/9.2/bin
(oracle@rac1):  ./lsnodes
rac1
rac2
app1
app2


Troubleshooting Approach
We followed Note 178683.1 to edit $ORACLE_HOME/bin/gsd.sh as shown below and traced gsd.sh:

# added -DTRACING.ENABLED=true -DTRACING.LEVEL=2 before -classpath
exec $JRE -DPROGRAM=gsd -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath $CLASSPATH oracle.ops.mgmt.daemon.OPSMDaemon $MY_OHOME


$ $ORACLE_HOME/bin/gsd.sh
...
...
[main] [13:15:42:852] [line# N/A]  my property portnum=null
[main] [13:15:42:942] [NativeSystem.<init>:123]  Detected Cluster
[main] [13:15:42:946] [NativeSystem.<init>:124]  Cluster existence = true
[main] [13:15:42:947] [UnixSystem.<init>:118]  Going to load SRVM library
[main] [13:15:42:950] [UnixSystem.<init>:118]  loaded libraries
[main] [13:15:42:950] [OPSMDaemon.main:726]  Initializing the daemon ...
[main] [13:15:42:951] [OPSMDaemon.<init>:188]  NativeDebug is set to true
[main] [13:15:42:952] [OPSMDaemon.<init>:188]  UnixSystem.initializeDaemon: groupName is opsm
[main] [13:15:42:953] [OPSMDaemon.<init>:188]  Unsatisfied Link Error caught. Could not initialize the cluster
[main] [13:15:42:954] [OPSMDaemon.main:726]  initializeDaemon status = false
[main] [13:15:42:955] [OPSMDaemon.main:726]  Failed to initialize and register with clusterware
[main] [13:15:42:955] [OPSMDaemon.main:726]  OPSMErrCode = 1003
[main] [13:15:42:958] [OPSMDaemon.main:729]  java.rmi.RemoteException: Unable to initialize with the clusterware
java.rmi.RemoteException: Unable to initialize with the clusterware
        at oracle.ops.mgmt.daemon.OPSMDaemon.<init>(OPSMDaemon.java:195)
        at oracle.ops.mgmt.daemon.OPSMDaemon.main(OPSMDaemon.java:726)

[main] [13:15:42:958] [line# N/A]  Exiting from main..no point trying to start the daemon


At this point, one option was to initialize the srvconfig raw device (OCR) and then add the RAC instances etc. manually using the srvctl add command:

(oracle@rac1):  srvconfig -init
java.lang.UnsatisfiedLinkError: readRawObject
        at oracle.ops.mgmt.nativesystem.UnixSystem.readRawObject(UnixSystem.java:410)
        at oracle.ops.mgmt.rawdevice.RawDevice.readMagicString(RawDevice.java:187)
        at oracle.ops.mgmt.rawdevice.RawDeviceVersion.readVersionString(RawDeviceVersion.java:175)
        at oracle.ops.mgmt.rawdevice.RawDeviceVersion.isValidConfigDevice(RawDeviceVersion.java:75)
        at oracle.ops.mgmt.rawdevice.RawDeviceUtil.<init>(RawDeviceUtil.java:147)
        at oracle.ops.mgmt.rawdevice.RawDeviceUtil.main(Compiled Code)
Exception in thread "main" (oracle@rac1): 

(oracle@rac1):  srvconfig  -version  
java.lang.UnsatisfiedLinkError: readRawObject
        at oracle.ops.mgmt.nativesystem.UnixSystem.readRawObject(UnixSystem.java:410)
        at oracle.ops.mgmt.rawdevice.RawDevice.readMagicString(RawDevice.java:187)
        at oracle.ops.mgmt.rawdevice.RawDeviceVersion.readVersionString(RawDeviceVersion.java:175)
        at oracle.ops.mgmt.rawdevice.RawDeviceVersion.isValidConfigDevice(RawDeviceVersion.java:75)
        at oracle.ops.mgmt.rawdevice.RawDeviceUtil.<init>(RawDeviceUtil.java:147)
        at oracle.ops.mgmt.rawdevice.RawDeviceUtil.main(Compiled Code)

If the config file is pointing to a raw device, the following type of output should be returned:

     $ raw device version "9.0.0.0.0"

Since we were not getting that output, there was either a problem accessing the OCR raw device, or the soft link was not working due to a permissions issue. Dumping the contents of the OCR to standard output using dd if=/dev/orar1/rrawuat.conf bs=1500 showed that the OCR device was readable all right.
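A slightly quieter form of that readability check (a sketch) reads a few blocks and discards them, so only an error, if any, reaches the terminal:

$ dd if=/dev/orar1/rrawuat.conf of=/dev/null bs=1500 count=10

A permissions or device problem would show up here as an I/O error or a non-zero exit status, rather than as the SRVM library error seen above.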

$  more /var/opt/oracle/srvConfig.loc
srvconfig_loc=/dev/orar1/rrawuat.conf

$ ls -l /dev/orar1/rrawuat.conf
crw-rw----   1 oracle    dba         64 0x110004 Apr 11  2007 /dev/orar1/rrawuat.conf


Then one idea was to relink the srv* binaries using the make command, but that also resulted in an error:

(oracle@rac2):  cd $ORACLE_HOME/srvm/lib  
(oracle@rac2):  make -f ins_srvm.mk install
nm -gp /u01/oracle/uatdb/9.2/srvm/lib/libsrvm.a | grep T  | grep Java | awk '{ print "-u " $3 }' >  /u01/oracle/uatdb/9.2/srvm/lib/libsrvm.def;
/u01/oracle/uatdb/9.2/bin/echodo ld +s -G -b -o libsrvm.sl -c
/u01/oracle/uatdb/9.2/srvm/lib/libsrvm.def /u01/oracle/uatdb/9.2/srvm/lib/libsrvm.a                           -L/u01/oracle/uatdb/9.2/lib32/ -L/u01/oracle/uatdb/9.2/srvm/lib/  -L/usr/lib -lc  -lclntsh -lwtc9 -lnls9  -lcore9 -lnls9 -lcore9  -lnls9 -lxml9 -lcore9
-lunls9 -lnls9 /opt/nmapi/nmapi2/lib/libnmapi2.sl -lm `cat /u01/oracle/uatdb/9.2/lib/sysliblist` ;
rm /u01/oracle/uatdb/9.2/srvm/lib/libsrvm.def
ld +s -G -b -o libsrvm.sl -c /u01/oracle/uatdb/9.2/srvm/lib/libsrvm.def /u01/oracle/uatdb/9.2/srvm/lib/libsrvm.a -L/u01/oracle/uatdb/9.2/lib32/ -L/u01/oracle/uatdb/9.2/srvm/lib/
-L/usr/lib -lc -lclntsh -lwtc9 -lnls9 -lcore9 -lnls9 -lcore9 -lnls9 -lxml9 -lcore9 -lunls9 -lnls9 /opt/nmapi/nmapi2/lib/libnmapi2.sl -lm -l:libcl.sl -l:librt.sl -lpthread -l:libnss_dns.1 -l:libdld.sl
ld: Mismatched ABI (not an ELF file) for -lclntsh, found /u01/oracle/uatdb/9.2/lib32//libclntsh.sl
Fatal error.
*** Error exit code 1

Stop.

The turning point
We saw bug 6281672 and got a hint from there.

We looked at the $ORACLE_HOME/lib32/libsrvm* files on another RAC system (duat) for comparison:

(oracle@duat1):  ls -l $ORACLE_HOME/lib*/libsrvm*
-rwxr-xr-x   1 oracle    dba          57344 Nov  7 20:04 /u01/oracle/uatdb/9.2/lib32/libsrvm.sl
-rwxr-xr-x   1 oracle    dba          57344 Nov  7 08:11 /u01/oracle/uatdb/9.2/lib32/libsrvm.sl0
-rwxr-xr-x   1 oracle    dba          36864 Nov  7 20:04 /u01/oracle/uatdb/9.2/lib32/libsrvmocr.sl
-rwxr-xr-x   1 oracle    dba          36864 Nov  7 08:11 /u01/oracle/uatdb/9.2/lib32/libsrvmocr.sl0
(oracle@duat1): 


On rac1/2, the /u01/oracle/uatdb/9.2/lib32/libsrvm.sl file was missing. On duat, the .sl and .sl0 files were copies of each other.

So on rac1/2, we did the following:

$ cd /u01/oracle/uatdb/9.2/lib32
$ cp libsrvm.sl0 libsrvm.sl


Before:

(oracle@rac2):  ls -l $ORACLE_HOME/lib32/libsrvm*
-r-xr-xr-x   1 oracle    dba          57344 Oct 17  2004 /u01/oracle/uatdb/9.2/lib32/libsrvm.sl0
-rwxr-xr-x   1 oracle    dba          36864 Feb  9 06:49 /u01/oracle/uatdb/9.2/lib32/libsrvmocr.sl
-r-xr-xr-x   1 oracle    dba         106496 Nov 30  2004 /u01/oracle/uatdb/9.2/lib32/libsrvmocr.sl0


After:

(oracle@rac2):  ls -l $ORACLE_HOME/lib32/libsrvm*
-r-xr-xr-x   1 oracle    dba          57344 Mar  5 15:14 /u01/oracle/uatdb/9.2/lib32/libsrvm.sl
-r-xr-xr-x   1 oracle    dba          57344 Oct 17  2004 /u01/oracle/uatdb/9.2/lib32/libsrvm.sl0
-rwxr-xr-x   1 oracle    dba          36864 Feb  9 06:49 /u01/oracle/uatdb/9.2/lib32/libsrvmocr.sl
-r-xr-xr-x   1 oracle    dba         106496 Nov 30  2004 /u01/oracle/uatdb/9.2/lib32/libsrvmocr.sl0

The fact that the size of libsrvmocr.sl0 did not match libsrvmocr.sl did not seem to be a showstopper; Oracle appeared to use libsrvm.sl for the srv*-related commands.

After this, the gsdctl and srvctl commands started working:

(oracle@rac1):  gsdctl stat                                             
GSD is running on the local node

(oracle@rac1):  srvctl config
uat

(oracle@rac2):  gsdctl stat
GSD is not running on the local node
(oracle@rac2): 

(oracle@rac2):  gsdctl start
Successfully started GSD on local node

(oracle@rac2):  srvctl status database -d uat
Instance uat1 is not running on node rac1
Instance uat2 is not running on node rac2
But there was "just" one more problem...
Interestingly, srvctl stop database/instance worked fine at this point, but srvctl start did not.

(oracle@rac2):  srvctl start instance -d uat -i uat1
PRKP-1001 : Error starting instance uat1 on node rac1
ORA-00119: invalid specification for system parameter local_listener
ORA-00132: syntax error or unresolved network name 'uat1' reserved.
ORA-01078: failure in processing system parameters local_listener
ORA-00132: syntax error or unresolved network name 'uat1' reserved.

(oracle@rac2):  srvctl start instance -d uat -i uat2
PRKP-1001 : Error starting instance uat2 on node rac2
ORA-00119: invalid specification for system parameter local_listener
ORA-00132: syntax error or unresolved network name 'uat2' reserved.
ORA-01078: failure in processing system parameters
SQL> Disconnected
ORA-00132: syntax error or unresolved network name 'uat2' reserved.

initUAT2.ora:
-------------
uat1.local_listener='uat1'
uat2.local_listener='uat2'
uat1.remote_listener='uat2'
uat2.remote_listener='uat1'

(oracle@rac1):  strings spfileuat1.ora | grep listener
uat1.local_listener='uat1'
uat2.local_listener='uat2'
uat1.remote_listener='uat2'
uat2.remote_listener='uat1'

The tnsping utility was working for both the UAT1 and UAT2 service names:

(oracle@rac2):  tnsping uat1
TNS Ping Utility for HPUX: Version 9.2.0.6.0 - Production on 05-MAR-2008 16:19:36

Copyright (c) 1997 Oracle Corporation.  All rights reserved.

Used parameter files:
/u01/oracle/uatdb/9.2/network/admin/uat2_rac2/sqlnet_ifile.ora

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (ADDRESS=(PROTOCOL=tcp)(HOST=rac1.test.com)(PORT=1522)) (CONNECT_DATA= (SERVICE_NAME=uat) (INSTANCE_NAME=uat1)))
OK (0 msec)

(oracle@rac2):  tnsping uat2
TNS Ping Utility for HPUX: Version 9.2.0.6.0 - Production on 05-MAR-2008 16:19:39

Copyright (c) 1997 Oracle Corporation.  All rights reserved.

Used parameter files:
/u01/oracle/uatdb/9.2/network/admin/uat2_rac2/sqlnet_ifile.ora

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (ADDRESS=(PROTOCOL=tcp)(HOST=rac2.test.com)(PORT=1522)) (CONNECT_DATA= (SERVICE_NAME=uat) (INSTANCE_NAME=uat2)))
OK (0 msec)

We added entries for local_listener and remote_listener to $TNS_ADMIN/tnsnames.ora on rac1 (and rac2 respectively) and made sure that tnsping to them was working:


local_listener=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=rac2.test.com)(PORT=1522))
        )
remote_listener=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=rac1.test.com)(PORT=1522))
        )

(oracle@rac2):  tnsping local_listener
TNS Ping Utility for HPUX: Version 9.2.0.6.0 - Production on 05-MAR-2008 16:44:05

Copyright (c) 1997 Oracle Corporation.  All rights reserved.

Used parameter files:
/u01/oracle/uatdb/9.2/network/admin/uat2_rac2/sqlnet_ifile.ora

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (ADDRESS=(PROTOCOL=tcp)(HOST=rac2.test.com)(PORT=1522)))
OK (10 msec)


(oracle@rac2):  tnsping remote_listener
TNS Ping Utility for HPUX: Version 9.2.0.6.0 - Production on 05-MAR-2008 16:44:13
Copyright (c) 1997 Oracle Corporation.  All rights reserved.

Used parameter files:
/u01/oracle/uatdb/9.2/network/admin/uat2_rac2/sqlnet_ifile.ora

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (ADDRESS=(PROTOCOL=tcp)(HOST=rac1.test.com)(PORT=1522)))
OK (10 msec)

The final clincher
I saw the light when I realized that the database ORACLE_HOME was using AutoConfig, which meant that the $TNS_ADMIN variable had $CONTEXT_NAME in it and was not just plain vanilla $ORACLE_HOME/network/admin. Therefore, the listener.ora and tnsnames.ora files in $TNS_ADMIN were not the same as those in $ORACLE_HOME/network/admin.
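A quick way to see that kind of drift (a sketch; the TNS_ADMIN value shown is the one set via srvctl setenv below):

$ echo $TNS_ADMIN
/u01/oracle/uatdb/9.2/network/admin/uat1_rac1
$ diff $TNS_ADMIN/tnsnames.ora $ORACLE_HOME/network/admin/tnsnames.ora

Any output from the diff means the two copies have diverged.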

To get around this issue gracefully, the srvctl setenv command could be used:

$ srvctl setenv instance -d uat -i uat1 -t TNS_ADMIN=/u01/oracle/uatdb/9.2/network/admin/uat1_rac1


$ srvctl getenv instance -d uat -i uat1
TNS_ADMIN=/u01/oracle/uatdb/9.2/network/admin/uat1_rac1

$ srvctl setenv instance -d uat -i uat2 -t TNS_ADMIN=/u01/oracle/uatdb/9.2/network/admin/uat2_rac2

$ srvctl getenv instance -d uat -i uat2
TNS_ADMIN=/u01/oracle/uatdb/9.2/network/admin/uat2_rac2


(oracle@rac1):  srvctl start instance -d uat -i uat1

(oracle@rac1):  srvctl status instance -d uat -i uat1
Instance uat1 is running on node rac1

(oracle@rac1):  srvctl start instance -d uat -i uat2

(oracle@rac1):  srvctl status instance -d uat -i uat2
Instance uat2 is running on node rac2
Some pending issues...
Some issues were still pending, pointing towards some more misconfiguration, but at least we were able to get over the initial error.

(oracle@rac1):  srvconfig  -version
java.lang.NullPointerException
        at java.text.MessageFormat.format(Compiled Code)
        at java.text.MessageFormat.format(MessageFormat.java)
        at java.text.MessageFormat.format(MessageFormat.java)
        at oracle.ops.mgmt.nls.MessageBundle.getMessage(MessageBundle.java:225)
        at oracle.ops.mgmt.rawdevice.RawDeviceUtil.main(Compiled Code)


SOA on SOA !! - Bring the discipline of SOA to service development and creation in your organization.

Arvind Jain - Fri, 2008-03-07 18:22

SOA on SOA!!

It was difficult to put the most appropriate words to my thoughts, but what I am trying to bring out is that an SOA implementation should not burden service providers and consumers with learning all the latest standards, tools & technologies.

They should just worry about their business logic, and there should be a framework which transparently takes care of making that business logic a service, in the SOA sense, while adhering to their company's enterprise policies, processes and standards.

How to enable this? Enterprise architects should closely watch two upcoming standards: SCA and JBI.

JBI is JSR 208, Java Business Integration. SCA is Service Component Architecture.

JBI is used by system integrators during physical deployment (customers and end users do not see this). It helps in management & interoperability of your SOA infrastructure.

SCA has a design and composition perspective. It is used by developers to annotate their code to describe services and their dependencies.

The aim is to create a virtual container for hosting services. This way services can be plugged into an ESB or into an existing policy manager. It will be independent of language and will serve as a framework for exposing business logic as a service.

The other significant benefits I see are:
- Consistent deployment & management
- Location Transparency (Virtualization)
- Policy Enforcement
- Consistent Security Model
- SOA does not mean every developer needs to know about WSDL, WS-* or other standards; they only need to know the core business logic.
- It might possibly help in transaction coordination.

So let us try to use our own methodology, SOA, to help in the implementation and adoption of SOA.

Arvind

Upgrade from Jinitiator 1.3 to Java Plugin 1.6.0.x

Aviad Elbaz - Fri, 2008-03-07 05:51

Oracle recently announced the end of Error Correction Support for Jinitiator 1.3 for E-Business Suite 11i, effective July 2009.

This is the sign it’s about time to upgrade to the native Java Plug-in… :-)

Among other things, one of the main advantages of upgrading from Jinitiator to the native Java Plug-in is the prevention of conflicts between them.

This upgrade is great news for all who are working with Oracle Discoverer Plus (with the Java Plug-in) and Oracle EBS 11i (with Jinitiator) and experiencing those conflicts.

I'll skip all the other advantages and disadvantages of upgrading to the Java Plug-in, as they are well described in Steven Chan's post - Jinitiator 1.1.8 To Be Desupported for Apps 11i - and in Metalink Note 290807.1 - Upgrading Sun JRE with Oracle Applications 11i.

So I will focus on the upgrade process itself - step by step.

I tested the upgrade on the following test environment:

  • EBS 11.5.10.2
  • Database 10.2.0.3
  • ATG Rollup 5
  • Developer 6i patchset 18
  • OS RHEL4.

Be aware that before upgrading to the Java Plug-in you must upgrade to Developer 6i patchset 18 or later (currently the latest patchset is 19).
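If you are not sure which patchset a given Developer 6i ORACLE_HOME is on, the Forms compiler banner is one quick indicator. A sketch, under the assumption that version 6.0.8.27.x corresponds to patchset 18 and 6.0.8.28.x to patchset 19 (the exact invocation and banner wording may vary by platform):

$ f60gen help=y | grep -i version
Forms 6.0 (Form Compiler) Version 6.0.8.27.0 (Production)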

* You can use my previous post, Upgrading Developer 6i with Oracle Apps 11i, to get to patchset 18.

  1. Download JRE plug-in Oracle E-Business Suite interoperability patch - 6863618
  2. Download the Sun JRE Plug-in 
    • Select Java Runtime Environment (JRE) 6 Update X (select the latest available update, currently it’s 5)
    • Select Windows offline installation, multi-language
  3. Rename the downloaded installation file jre-6_uX-windows-i586-p.exe to j2se1600X.exe
    In my case, rename jre-6_u5-windows-i586-p.exe to ==>> j2se16005.exe
  4. Copy the j2se16005.exe file to $COMMON_TOP/util/jinitiator on the Apps Tier node (see the sketch after this list)
  5. If you are on Developer 6i patchset 18 you should apply forms patches 6195758 & 5884875.
    ** Skip this step if you are on Developer 6i patchset 19.
    • Download Patches 6195758 & 5884875
    • Apply patch 6195758
      • Stop all applications processes by adstpall.sh
      • Unzip p6195758_60827_GENERIC.zip
      • cd 6195758
      • cp -r $ORACLE_HOME/forms60/java/oracle/forms/handler/UICommon.class $ORACLE_HOME/forms60/java/oracle/forms/handler/UICommon.class.PRE_BUG6195758
      • cp -r $ORACLE_HOME/forms60/java/oracle/forms/handler/ComponentItem.class $ORACLE_HOME/forms60/java/oracle/forms/handler/ComponentItem.class.PRE_BUG6195758
      • cp oracle/forms/handler/UICommon.class $ORACLE_HOME/forms60/java/oracle/forms/handler/UICommon.class
      • cp oracle/forms/handler/ComponentItem.class $ORACLE_HOME/forms60/java/oracle/forms/handler/ComponentItem.class
    • Apply Patch 5884875
      • Unzip p5884875_60827_GENERIC.zip
      • cd 5884875
      • cp -r $ORACLE_HOME/forms60/java/oracle/forms/engine/Main.class $ORACLE_HOME/forms60/java/oracle/forms/engine/Main.class.PRE_BUG5884875
      • cp -r $ORACLE_HOME/forms60/java/oracle/forms/handler/AlertDialog.class $ORACLE_HOME/forms60/java/oracle/forms/handler/AlertDialog.class.PRE_BUG5884875
      • cp oracle/forms/engine/Main.class $ORACLE_HOME/forms60/java/oracle/forms/engine/Main.class
      • cp oracle/forms/handler/AlertDialog.class $ORACLE_HOME/forms60/java/oracle/forms/handler/AlertDialog.class
      • Run adadmin -> Generate Applications Files menu -> Generate product JAR files
  6. Apply the Interoperability patch 6863618
    • Make sure all applications processes are down
    • Enable maintenance mode:
      Execute adadmin -> Change Maintenance Mode (5) -> Enable Maintenance Mode (1)
    • Unzip p6863618_11i_GENERIC.zip
    • cd 6863618
    • Run adpatch to apply patch driver u6863618.drv
    • cd [PATCH_NUMBER]/fnd/bin
    • Execute the following command where X represent the update number:
      $ txkSetPlugin.sh 1600X
      In my case:
      $ txkSetPlugin.sh 16005
    • Disable maintenance mode:
      Execute adadmin -> Change Maintenance Mode (5) -> Disable Maintenance Mode (2)
  7. Start all applications processes by adstrtall.sh
  8. Verify the installation by signing in to Oracle EBS and selecting a forms-based responsibility.
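A consolidated sketch of the file staging described in steps 3 and 4 above (the download directory is illustrative, and it assumes the installer has already been copied onto the apps tier node):

$ cd /tmp/downloads
$ mv jre-6_u5-windows-i586-p.exe j2se16005.exe
$ cp j2se16005.exe $COMMON_TOP/util/jinitiator/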

For those who worry about performance, take a look at this benchmark:
Benchmark comparison test with Jinitiator 1.3.1.23 and Java Plug-in 1.5.0_05 – performance whitepaper

For more information - Note: 290807.1 - Upgrading Sun JRE with Oracle Applications 11i

Aviad

Categories: APPS Blogs

Do-Gooderitis

Mary Ann Davidson - Thu, 2008-03-06 09:26

You know there are too many labor-saving devices in the world when you see the sheer number of professional do-gooders trying to solve problems hardly anybody else worries about. If you have a day job, having someone with too much free time tell you why you need to be concerned about "Making the World Better Through FOO" is often just about as irritating as those old TV commercials moaning about ugly yellow wax buildup on your kitchen floors (my solution: paint your kitchen walls yellow to match the floor).

 

There are, of course, many people who devote time and passion to making the world a better place. I'm not talking about them here. I am talking about the people who seize on something they care about without bothering to find out if there is an actual problem that needs to be solved. Or, if there is a "problem," asking what the cost is of fixing it and what one could do with those same resources that might solve a more pressing problem (a concept known as "opportunity cost" to economists). It's all you can do, when confronted with an earnest but clueless do-gooder, not to say, "Ask me if I care."

 

Where I live in Idaho, there are a couple of professional Do-Gooder Projects that engender a lot of whining in the local newspapers. One of them is the Relocate the Airport lobby. The claim is that we need to 1) build a new airport 2) with longer landing strips 3) so that larger commercial planes will fly here. (Never mind the fact that commercial airlines have said repeatedly they will not land larger planes here because there isn't enough demand to support it.) There isn't actually a problem the community needs to solve via a new airport, but we'd create a bunch of new problems, like people having to drive an hour or more to get to Sun Valley instead of the current half hour from Friedman Memorial Airport.

 

The other local Do-Gooder Project relates to "affordable housing." Mind you, there is no actual housing shortage in town: if you want to work here, you can easily find an affordable place to rent. Many people who work here who want to own property live in another county - where they can get a lot more land for a lot less money. The idea that anyone who works here - regardless of income - should be entitled to own a free-standing home isn't reasonable given market (and geographic) realities (e.g., the land around us is Bureau of Land Management land and cannot be developed). As one of my friends put it to a local Affordable Housing Do-Gooder: "You didn't live next door to your gardener in Marin, either."

 

My personal opinion is that a lot of these do-gooders retired early, miss running something and want to run everyone else in town by solving problems that don't exist.

 

There are Do-Gooder Initiatives in the IT industry, too, a number of which are in security.  Security Do-Gooder Initiatives sometimes come under the guise of a laundry list of 5,000 things that everyone should do to be more secure. Even if all 5,000 of those things are somewhat useful, just like New Year's Resolutions, they are likely to be more actionable and "accomplishable" if the list is shorter. Putting it differently, I know very well that I should eat less, exercise more, eat more nutritious food, read better books, improve my skate skiing technique by lengthening my glide and so on. I can't actually process 5,000 "should dos" so I try to parse them down to a smaller list of things that I can actually do that will also make the most difference to my health, my skate skiing, or whatever it is I am trying to improve upon. Many Do-Gooder Initiatives do not have any sense of "nobody can do everything all at once, so maybe doing something now and doing more later is a better way to slice the pie." The initiatives fail due to the expectations - and failure to prioritize - that they entail. You might actually just give up from the frustration of trying to comply with 5,000 "shoulds." 

 

(It turns out that the people who actually do make good on their New Year's Resolutions start with a small, actionable list instead of a 30-page life plan. A small list of things you can do and will do is better than a much larger list of things that you are never going to get to. Less really is more.)

 

The reality is that some things matter more than others if you are trying to make constructive change. If I drink a bottle of wine a night (I don't) and have 40 "better health things" I want to do, saving my liver might be among the most important ones. So maybe, trying to cut down to a glass or so a night would be the biggest payoff on my better health things list and I can skip the other 39 items or relegate them to next year. Unfortunately, there are a lot of Do-Gooder Initiatives that not only have too many things on the list; the list is not weighted at all for where the value is in making change. (Opportunity cost again: what could I do with the same resources that would have a bigger payoff?)

 

I wonder if a lot of Do-Gooders get out enough in the real world. Maybe they are academics who think "theory" is enough. ("Theory" of baking doesn't get you a pie.) Or think-tankers who are paid to develop secure Internet toaster protocols that they then want to standardize. (Does anybody really worry about who is accessing their bagels remotely?)

 

Whenever I participate in public-private partnerships where a lot of "improve security" initiatives are generated and where there is typically a broad tent of participants (a good thing, in general), I try to ask that the people putting the laundry lists together grab someone who is either a cost accountant or an economist to look at where the bang for the buck goes in what's being proposed. Because if they do not do that, these initiatives are doomed to fail. Or, they will be so expensive that nobody does them because they can't afford the entire megillah.

 

The one take-away lesson I got from my nerdy quantitative methods class in business school is that when you are trying to solve an optimization problem, you can't optimize on all parameters. Time is constrained. Resources are (ultimately) constrained. Answering the question, "How can I do X while making best use of scarce resources?" means I need to take account of what I most want to accomplish and how valuable it is to me that I accomplish those things.

 

For example, there are security initiatives around "what metrics and artifacts at every stage of development you should produce to 'prove' assurance claims." People measuring the assurance of software believe that there are things you ought to be able to produce and measure at each stage of development. However, there is a cost to producing metrics and artifacts. If the cost of producing these is greater than the value of more information, you shouldn't put the work in to produce them. Even if everything has some value, some things are more critical than others or provide greater value for the work you put into getting them. One of the ways I tranche our metrics project is to look at a) what can we data mine today to give us security metrics? b) what else would we like to know (in some order)? c) what will it cost to get that information? and d) is the cost less than or greater than the benefit of the information?

 

If you are a small company, maybe you can't - in the beginning - do every single Best Practice Recommendation (or produce every single metric or every single artifact that anybody in a theoretically perfect world would want). But you can do something, and you'd be willing to do something if someone helped you by telling you what the most important things are to do first that make the biggest impact. Something is almost always better than nothing.

 

Even people who know they ought to do more in security - and are willing to improve - will fight tooth and nail if they are confronted with a "my way or the highway" mandate that takes little account of real world constraints.

 

For example, consider the Federal Desktop Core Configuration (FDCC), a recent initiative to mandate that US Federal agencies lock down their environments to a specific Windows configuration (which, as a matter of course, means packaged applications will need to run on those locked down Windows configurations). I have said often and publicly that I think one of the easiest things vendors can do to help improve security is to lock down default configurations - better security out-of-the-box, cheaper lifecycle cost for customers. I've also said that one of the things customers can do to be "smart buyers" is to insist that their vendors lock down default configurations: "You don't ask; you don't get." I don't have any issue with the goodness of this concept (and we have a company-wide initiative related to locking down default configurations). In that sense, FDCC is not a "Do-Gooder Initiative" the way I've defined it since it actually does address a problem that people worry about, that needs looking after.

 

The problem with the way FDCC has been mandated is that it did not, first of all, define what a "desktop" configuration is. Is it desktop software? Or anything installed on the Microsoft operating system (which can and is used on desktops)? There might be a huge (and legitimate) difference between the configuration of middleware or servers on Windows and the client piece of an application configured on Windows. There's certainly a big scope difference between "validating how client pieces of applications running on desktops are configured to run with FDCC" and "validating how every single component of every application that runs on Windows is configured with FDCC." What problem, exactly, is it that is being solved? "Desktops used to launch attacks?" or "locking down the Windows operating system for every single application running on it?" Nobody knows, especially since this is called a "desktop" configuration initiative, and nobody on the mandate side of this issue has yet answered that basic question.

 

Most vendors have product lifecycles such that they do not make configuration changes in anything other than a major product release. That is, when customers uptake patch sets, their expectation is that there won't be configuration changes that could break their existing applications. One time in almost 20 years at Oracle, I tried to change a configuration parameter in a patch set (for good security reasons). The configuration change broke all our business applications, so we backed it out before the patch set shipped and I've been apologizing to the release manager ever since. (We later made the configuration change in a major product release.) Unfortunately, FDCC was mandated without adequately taking into account vendors' product lifecycles. Some vendors simply will need more time to phase in needed configuration changes. A lot more, if your major release product lifecycle is years and not months.

 

Nobody was evil-minded here, but even people who support the idea of FDCC are dead in the water until they can get some basic questions answered and a dialogue going. Ideally, this dialogue should have taken place before FDCC was mandated. Industry (including Oracle) is still working to try to get clarification on the specifics of FDCC and also asking that in future these types of configuration mandates be developed with industry and with adequate phase-in that allows for product lifecycles. How you implement change is as important as what the change is if you want people to move the security ball down the field. Otherwise, even a worthy initiative like FDCC can sink into the morass of Do-Gooder Projects.

 

A better example (where really, "there is no there there," to quote Gertrude Stein) is the recent proposal to develop an ISO standard for vulnerability disclosure. I know of no vendor who thinks this is a good idea. For a start, what problem are we trying to solve? Does anybody think that we can come up with a one-size-fits-all standard for how long it should take to fix a security bug, the exact "rules" on how much information gets put into security advisories and the specific format of how that vulnerability information is expressed? Software vendors have different release cycles, customer bases, risk profiles, and more. (One-size-fits-all pantyhose, as any woman knows, only fits Hilda Mae Throckmorton of Muncie, Indiana.) There are plenty of industry guidelines for good practice on vulnerability disclosure already. Most of these acknowledge that you can't standardize this business practice any more than you can standardize apple-pie making ("Allspice? Death to infidels!"). There are also existing standards on vulnerability disclosure that vendors are adopting, such as the Common Vulnerability Scoring System (CVSS). Oracle was an early adopter of CVSS and customers have told us that it's really useful to them.

 

It is unwise (no, make that "really stupid") to try to standardize what is in effect both a business process and a set of business practices. Ira Gershwin (who knew he was a security maven?) penned the perfect lyric that applies to this Unneeded Standard Attempt: "You say po-TAY-to, I say po-TAH-to, let's call the whole thing off."

 

I offer one last example that isn't quite in line with Do-Gooder Initiatives but relates to what problem to solve and at what price. It's also a big pet peeve of mine: I get a lot of phone calls from vendors trying to shill their security products to Oracle. (Though I do not have operational security responsibility - wonderful, capable colleagues look after that - vendors assume that since my title is "CSO," I am the person who buys Cool Security Products for the IT department.)

 

I hate to mention how many cold callers do not even do basic homework before trying to sell me true love and security happiness. My favorite was the cold caller who said his firm had expertise in securing Oracle Applications deployments. I had to point out to him that, "Uh, we are Oracle, we run on Oracle Applications, and since we build the software, we'd be unlikely to hire a third party to 'securely deploy' it for us." Or, the vendors selling solutions that run on a non-Oracle database. You know, that's just a religious problem for us: we are not going to deploy a third party security solution that runs on <insert name of competitor database here>.

 

My basic pet peeve is the people who do not think about the customer perspective before they launch into their "cure cancer, raise the dead, protect against every attack known to mankind with zero false positive" shill. They claim this shill will only be "twenty minutes of your time" (only "twenty minutes" is measured on a calendar, not a watch).

 

Forthwith, here is my script for parsing through shill-meisters as quickly as possible:

 

1. "What problem does this solve?" (If you can't articulate that in 25 words or less, do not waste my time or anyone else's.)

2. "Is it a problem we are worried about or care about solving?" (Secure remote bagel access is not something that concerns me, so forget the 'Internet Toaster Protocol' pitch.)

3. and 4. "Does it address the problem better, cheaper or faster than what I am doing now? How much better, cheaper or faster?" (If it doesn't, why would I switch from something that may not be sexy or "a breakthrough technology" but gets the job done? I don't have an electric salad tosser, either, because the salad spinner I have - or a pair of tongs - works just fine and has fewer moving parts.)

5. "How can it be broken?"  (Especially for a security product, knowing and being honest about how it can be broken is important. A claim of "zero false positives," for example, should cause anyone to run screaming in the opposite direction.)

 

Do-Gooders, the next time you come up with A Cause, a small request. Please, in the interests of making it a better world without wasting everyone else's time, use your skills on a problem that really needs a solution (or on a better, faster, or cheaper way of solving an existing problem), not on a solution in search of a problem to solve.

 

For More Information:

 

Book of the week: Hog Pilots, Blue Water Grunts by Robert Kaplan (who also wrote Imperial Grunts). If you want to know what the military really does, this is a great read. Robert Kaplan was embedded with a number of different types of units, in multiple services, around the globe: special forces, marines, aviators, and submariners. A really cool read. Mahalo nui loa, all you soldiers, sailors, airmen and marines for keeping us safe.

 

http://www.amazon.com/Hog-Pilots-Blue-Water-Grunts/dp/1400061334

 

We aren't going to have "oldies" rap stations anytime in the future. If anybody has written a more clever lyric than Ira Gershwin (OK, maybe Cole Porter) I have yet to hear it. Songs with lyrics by Ira Gershwin:

 

http://en.wikipedia.org/wiki/Category:Songs_with_lyrics_by_Ira_Gershwin

 

Totally off topic, but Go! Airlines has just done a web page where you can book your next interisland trip totally in Hawaiian. E ola mau ka 'olelo Hawai'i (May the language of Hawai'i live!).

 

Check it out at:

 

http://www.lelegowau.com/

 

When all else fails, follow instructions.

Claudia Zeiler - Tue, 2008-03-04 18:26
  1. Today I tried to open my test database and got:

ORA-16038: log 3 sequence# 729 cannot be archived
ORA-19809: limit exceeded for recovery files
ORA-00312: online log 3 thread 1: 'C:\ORACLE\ORA10\ORADATA\DB10\REDO03.LOG'

Researching the problem I found:

  • Maybe your Oracle instance has reached the maximum db recovery file destination size assigned to it? If so, your Oracle instance will probably stop working, and you will get the 3 error messages listed above while trying to start it.
  • The relevant parameter is: DB_RECOVERY_FILE_DEST_SIZE - specifies (in bytes) the hard limit on the total space to be used by target database recovery files created in the flash recovery area.
  • It works in concert with DB_RECOVERY_FILE_DEST, which specifies the default location for the flash recovery area. The flash recovery area contains multiplexed copies of current control files and online redo logs, as well as archived redo logs, flashback logs, and RMAN backups. (A sketch of checking and raising the limit follows this list.)
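A minimal sketch of checking the current limit and, provided the underlying filesystem actually has the room, raising it (the 4G figure is purely illustrative):

SQL> show parameter db_recovery_file_dest
SQL> alter system set db_recovery_file_dest_size=4G scope=both;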

  2. What is the maximum db recovery file destination file size assigned in my DB?

SQL> show parameter db_recovery_file_dest_size
db_recovery_file_dest_size 2G

At least this wasn't a surprise.


  3. I removed excess files

I shut down the database. I then removed the archive log files and the flashback files from the file system, even though they didn't amount to the 2GB total that the system was complaining about. (This is a play database, so I can do things like kill these files.)

  4. Still had a problem.

I brought the database up to mount state, but immediately there was a trace file message:

ORA-19815: WARNING: db_recovery_file_dest_size of 2147483648 bytes is 99.85% used, and has 3320320 remaining bytes available.

*** 2008-03-03 10:03:19.635

************************************************************

You have following choices to free up space from flash recovery area:

1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard, then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating system command was used to delete files, then use RMAN CROSSCHECK and DELETE EXPIRED commands.

  5. The files were still being seen by RMAN! When all else fails, follow instructions; in this case, #4 above: "4. Delete unnecessary files using RMAN DELETE command. If an operating system command was used to delete files, then use RMAN CROSSCHECK and DELETE EXPIRED commands."

RMAN> crosscheck archivelog all;
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1

channel ORA_DISK_1: sid=146 devtype=DISK
validation failed for archived log
archive log filename=C:\ORACLE\FLASH_RECOVERY_AREA\DB10\ARCHIVELOG\2008_03_03\O1
_MF_1_733_3WRQ14DY_.ARC recid=70 stamp=648389669

validation failed for archived log
archive log filename=C:\ORACLE\FLASH_RECOVERY_AREA\DB10\ARCHIVELOG\2008_03_03\O1
_MF_1_734_3WRQ4V7R_.ARC recid=71 stamp=648389787


What a nice list of all the files that aren't there, and all the locations where they aren't: 68 files that don't exist! [apologies to any reader of English as a second language]

RMAN> delete expired archivelog all;

released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=146 devtype=DISK
List of Archived Log Copies
Key Thrd Seq S Low Time Name
------- ---- ------- - --------- ----
70 1 733 X 02-MAR-08 C:\ORACLE\FLASH_RECOVERY_AREA\DB10\ARCHIVELOG\2
008_03_03\O1_MF_1_733_3WRQ14DY_.ARC

71 1 734 X 03-MAR-08 C:\ORACLE\FLASH_RECOVERY_AREA\DB10\ARCHIVELOG\2
008_03_03\O1_MF_1_734_3WRQ4V7R_.ARC

(etc)

Do you really want to delete the above objects (enter YES or NO)? yes

deleted archive log
archive log filename=C:\ORACLE\FLASH_RECOVERY_AREA\DB10\ARCHIVELOG\2008_03_03\O1
_MF_1_733_3WRQ14DY_.ARC recid=70 stamp=648389669

deleted archive log
archive log filename=C:\ORACLE\FLASH_RECOVERY_AREA\DB10\ARCHIVELOG\2008_03_03\O1
_MF_1_734_3WRQ4V7R_.ARC recid=71 stamp=648389787


(etc)

Deleted 68 EXPIRED objects


  6. Now all is well (with the database, if not with this blog entry's numbers). I bounced the database, and the alert log shows:

db_recovery_file_dest_size of 2048 MB is 0.38% used. This is a user-specified limit on the amount of space that will be used by this database for recovery-related files, and does not reflect the amount of space available in the underlying filesystem or ASM diskgroup.
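To keep an eye on this going forward, the space usage can also be queried directly (a sketch):

SQL> select name, space_limit, space_used, space_reclaimable from v$recovery_file_dest;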




Guest Blog: Smart Space from an end-user’s perspective

Oracle EPM Smart Space - Mon, 2008-03-03 13:54
I had the pleasure of being an early user of Smart Space while Mike, Matt and the team were building it.

I’ve been using software since the days of Wordstar, dBase II, and Visicalc (on CP/M before DOS no less). Using Smart Space was not like using software. This was an environment I was experiencing. I think Matt’s vision of “I just want to ride a bike” is spot on.

Smart Space is:

Intuitive: I didn’t know I was doing enterprise performance management, data mining, query & analysis – I thought I was making decisions (I further didn’t know I was making those decisions based on judgments & knowledge learned from information that came from data!). Gadgets solve problems.

Social: my favorite part of using Smart Space was collaborating with other users. I could slice & dice data and quickly see an anomaly (variance outside of a threshold, for example), and instantly ping a colleague about it, share my view with her and record the decisions/actions taken as a result.

Personal: my favorite part of Smart Space is that my desktop was role and function specific; that is, it was set up (by me) to contain only those gadgets, workflows, and points of view that had to do with the role I played. So it should be with enterprise performance management: Finance cares about the balance sheet, Operations cares about the supply chain, Sales cares about revenue and pipeline, and so on. Executives care about summarized results and trends, managers care about tracking targets and variance, and analysts care about cause and effect. And, in an "EPM enlightened" organization, people across all functions care about cross-functional things. They would have gadgets that let them get to revenue growth, operational efficiency, customer satisfaction, and so on.

Ready at hand: Smart Space is always-on EPM. It's always there, always ready, and becomes an extension of my work, like the phone, IM, e-mail, and coffee.

Expandable: there is a fairly large (and growing) community of EPM and BI devotees: customers, consultants, vendors, analysts. These folks are candidates for contributing (dreaming-up & developing) the library of gadgets available to get your job done. I just thought of one while typing this (I’ll let you know what it is as soon as I get a developer to write it for me, any volunteers?)

This is pretty exciting stuff, and I can’t wait to see what Matt & Mike have in store for us in ’08.
-Ron
Ron Dimon
Business Foundation
EPM Advisory
Categories: Development

Soft-skills (not oracle related post)

Fadi Hasweh - Sun, 2008-03-02 23:20
Dears,
As part of my master's program we have to create a blog related to an e-business course. The blog will be evaluated, and one of the evaluation criteria is the number of users accessing it and commenting, so if you have some free time please visit it and leave a comment; it has good and helpful information.
The blog talks about soft skills and how important they are in our lives. Check, for example, the speed reading post and the managing a busy schedule post, among many others.
Please check it out and leave comments if you can.
Thank you
Fadi

Interesting Corrupted Redo Log Case

Renaps' Blog - Sun, 2008-03-02 11:59

It has been a while since I wrote on Renaps' Blog, most probably because I hadn't run into an interesting case to talk about for a while!

Yesterday, at one of our main clients, the production (soon to be, as of Monday) database hung. Was it because it was Feb 29th of a bissextile year? Or simply because the weekend arriving in the next 4 hours was the go-live of the main data warehouse? It's hard to say; I believe it was just a little wink from our old friend Murphy!

The environment is a 16-CPU Itanium Windows server running 10.2.0.3.0; ASM is used even though the database runs in single-instance mode.

When the database hung, the alert.log showed the following error messages (of course, no one could connect or run any transaction on the database):

ARC0: All Archive destinations made inactive due to error 354
Committing creation of archivelog ‘+FLASH/inf1/archivelog/arc17972_0627748574.001’ (error 354)
ARCH: Archival stopped, error occurred. Will continue retrying
Fri Feb 29 11:48:05 2008
Errors in file c:\ora\admin\inf1\bdump\inf1_arc0_10236.trc:
ORA-16038: Message 16038 not found; No message file for product=RDBMS, facility=ORA; arguments: [1] [17972]
ORA-00354: Message 354 not found; No message file for product=RDBMS, facility=ORA
ORA-00312: Message 312 not found; No message file for product=RDBMS, facility=ORA; arguments: [1] [1] [+FLASH/inf1/onlinelog/group_1.257.627748575]

The first line of this message stack made me believe that the database just hung because there was no more space for the archiver to do its job. But after further analysis, it appeared that the problem was much more serious.

The actual cause of the issue was related to Redo-Log Corruption. The cause of the corruption was a “write or rewrite” SAN related error.

When Redo-Log corruption occurs, Oracle first tries to use the second (or third) member of the group showing corruption to avoid unavailability of the database. In our case, this was impossible since there was only one member per Redo-Log group; I was surprised to see that this client did not multiplex the Redo-Logs …

My first attempt to get the database back on its legs was to dump the content of the redo-log showing corruption issues (in this case, group 1).

SQL> ALTER SYSTEM DUMP LOGFILE ‘+FLASH/inf1/onlinelog/group_1.257.627748575’;
ALTER SYSTEM DUMP LOGFILE ‘+FLASH/inf1/onlinelog/group_1.257.627748575’
*
ERROR at line 1:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 53504 change 70389305 time 02/29/2008 11:48:02
ORA-00334: archived log: ‘+FLASH/inf1/onlinelog/group_1.257.627748575’

As you can see, this attempt failed. I then attempted to clear the unarchived redo log:
SQL> alter database clear unarchived logfile ‘+FLASH/inf1/onlinelog/group_1.257.627748575’;

At this point, my command hung indefinitely, and I realized that PMON was holding a lock on the Redo-Log header block…

My margin of maneuver had become much smaller; here is the sequence of commands that brought the database back on track:

  1. shutdown abort;
  2. startup mount;
  3. issue the same command that was previously hanging (this time, it worked right away) :
    alter database clear unarchived logfile ‘+FLASH/inf1/onlinelog/group_1.257.627748575’;
  4. alter database open;
  5. shutdown immediate;
  6. Immediately take a FULL backup at this point;
  7. startup;
  8. Don't forget to multiplex your Redo-Logs to avoid running into this situation again!

There you go. Redo-Log corruption on the header block of a group when only one member is used does look bad at first sight, but I hope this article can help you if you run into a similar scenario.
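As a postscript, here is a quick way to check whether your groups are multiplexed, followed by a minimal sketch of adding a second member per group. The '+FLASH2' disk group name is purely hypothetical; point the new members at whatever second destination you actually have.

-- How many members does each group have? (members = 1 means no multiplexing)
select group#, members, status from v$log;
select group#, member from v$logfile order by group#;

-- Add a second member to each group (ASM disk group name below is hypothetical)
alter database add logfile member '+FLASH2' to group 1;
alter database add logfile member '+FLASH2' to group 2;
alter database add logfile member '+FLASH2' to group 3;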

References:

Metalink Notes:
145769.1
332672.1
438176.1
114734.1

Patrick Hamou
E-Mail: phamou@renaps.com
Web: www.renaps.com


Categories: DBA Blogs

APEX 3.0.1 on XE

Jornica - Sat, 2008-03-01 09:17

On 29 February 2008 Oracle released Application Express 3.1. With this release you can upgrade Application Express within Oracle XE to 3.1. For more information on this topic see Oracle Database XE and Application Express 3.1 and APEX 3.1 on XE.

Oracle released Application Express 3.0.1 recently. With this release you can upgrade APEX within Oracle XE from 2.1 to 3.0.1 (see Oracle Database XE and Application Express 3.0.1). But how do you upgrade when you have already installed APEX 3.0 on XE?

There is a patch available on Metalink to upgrade from APEX 3.0 to 3.0.1. According to the Patch Set Notes, Oracle9i Release 2 and higher support this patch if you already have APEX 3.0.0.00.20 installed (the Patch Set Notes also include a list of fixed bugs). In order to download the patch from Metalink, you need a support contract (which is not included with XE).

Another option is to remove APEX 3.0 completely. After the removal, XE (still) has APEX 2.1 installed, and the next step is of course to install APEX 3.0.1. The big advantage of this option is that you follow the main route of upgrading XE/APEX as proposed by Oracle. However, there are some things to keep in mind.

  • As always start with making a backup of your database.

  • Export all applications you would like to see again in APEX 3.0.1. If you remove APEX 3.0 completely, you also remove all applications stored in the FLOWS_030000 (APEX 3.0) schema! Note that applications stored in the FLOWS_020100 schema will be migrated to FLOWS_030000 (APEX 3.0.1).

  • Make a backup of changed or added cascading style sheets (CSS), images and JavaScript files separately.

  • Run the apex_3.0.1\apxremov.sql script (the script is the same for both versions).

  • Run the apex_3.0.1\apxins.sql script. Check the installation log for errors.

  • Copy the apex_3.0.1\images directory to the /i/ directory. Also copy the previously backed up CSS, images and JavaScript files to the /i/ directory.

  • Import all applications exported in the first step. Note that existing applications in APEX 2.1 are migrated to APEX 3.0.1 as part of the upgrade process.

  • Time to make a backup again.

Have a look at the APEX reports in SQL Developer: no more errors.
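For what it is worth, the whole sequence condenses to a SQL*Plus session along these lines. This is only a sketch of the steps above; apxins.sql takes the arguments documented in the installation guide, which are omitted here.

-- Run from the apex_3.0.1 directory, connected as SYS
sqlplus sys as sysdba

SQL> spool apex_301_upgrade.log
SQL> @apxremov.sql     -- removes the existing APEX 3.0 (FLOWS_030000) installation
SQL> @apxins.sql       -- installs 3.0.1; supply the arguments from the installation guide
SQL> spool off

-- Then copy apex_3.0.1\images (plus your backed-up CSS, images and JavaScript) to /i/
-- and import the application export files taken in the first step.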

APEX 3.0 on XE part 2

Jornica - Sat, 2008-03-01 09:16

On 29 February 2008 Oracle released Application Express 3.1. With this release you can upgrade Application Express within Oracle XE to 3.1. For more information on this topic see Oracle Database XE and Application Express 3.1 and APEX 3.1 on XE.

If you followed the guidelines of APEX 3.0 on XE, you have a working APEX 3.0 interface. However, the administrative interface of APEX is not installed on XE by default. Using this interface you can manage all aspects of the service, including creating, removing, and managing workspaces. Barney Mattox discovered a workaround to install this interface as well.

In order to install the administrative interface perform the following steps:

  1. Start SQL*Plus. Make sure your working folder is the APEX installation folder (where apexins.sql is located).

  2. Connect as user SYS.

  3. ALTER SESSION SET CURRENT_SCHEMA = FLOWS_030000
    /

  4. The following command imports the administrative interface: @script\f4050.sql

  5. The following command imports the request for workspace interface: @script\f4700.sql

  6. Change the password for the ADMIN user in order to login in the administrative interface: @apxxepwd.sql

  7. Use the following URL to access the administrative interface: http://server:port/apex/f?p=4550:10, e.g. http://127.0.0.1:8080/apex/f?p=4550:10.

Additional notes:

  • Issue a spool install.log command after logging in to SQL*Plus. If an error occurs, SQL*Plus will exit; you can use the log file to find out what went wrong.

  • I experienced problems with step 3: "Illegal security group id value for this schema". A workaround is to log in as FLOWS_030000 and run the scripts. First you need to unlock the FLOWS_030000 user:
    ALTER USER FLOWS_030000 IDENTIFIED BY password ACCOUNT UNLOCK
    /
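Putting steps 2 to 6 together with the notes above, the session looks roughly like this (a sketch only; supply your own ADMIN password when apxxepwd.sql asks for it):

-- Run from the APEX installation folder, connected as SYS
SQL> spool install.log
SQL> ALTER SESSION SET CURRENT_SCHEMA = FLOWS_030000;
SQL> @script\f4050.sql     -- administrative interface
SQL> @script\f4700.sql     -- request-for-workspace interface
SQL> @apxxepwd.sql         -- set the ADMIN password
SQL> spool off

-- Then browse to http://127.0.0.1:8080/apex/f?p=4550:10 and log in as ADMIN.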

Workflow performance case study: Don't Repeat History, Learn from It

Gaurav Verma - Sat, 2008-03-01 02:06

A Sanskrit shlok underscoring the importance of careful planning

Preface

They say: "Those who do not know history are condemned to repeat it." But why repeat history if you don't have to?

Many questions come to light on this topic of discussion:
  • Is there a better way of designing conditional implementation in workflow?
  • What are the merits/demerits of repeated checking of a particular condition as opposed to executing something when the actual condition happens? What are the performance implications?
  • Are there workflow APIs available which simulate event based subscription model?
  • What are the performance advantages of using the basic Workflow APIs that simulate this event based subscription model?
This case study discusses the performance benefits of keeping unnecessary looping execution in a workflow design to a bare minimum and, if possible, eliminating it altogether. If there are several short-timed loops in the workflow design that get executed very often, the bad news is that the system spends valuable resources spinning its wheels in vain.

The intended audience is workflow designers, support personnel and Oracle Apps DBAs alike. This case study can potentially help users design their workflows more optimally and succinctly, with minimal impact on execution. The I/O profile and the time taken by the Workflow Background Process bear this out. As far as possible, trends have been depicted from the customer's Production system.

The Business Scenario

The business requirement to be implemented related to the ATP (Available to Promise) phase in the life of an order line. ATP simply means that we can guarantee the customer that a particular item can be made available from a particular warehouse on a particular date.

When scheduling an order line, if a financial hold existed on the line for some reason, the line needed to be unscheduled. Also, as soon as the financial hold was released by the Accounts Receivable department, the requirement was to proceed with the remaining business flow in the OEOL workflow design.

How It Was Designed

This is how it was designed:

Let us translate the flow:

  • When the WF background program executes custom ATP process, it checks if the order is on hold or not.
  • If the order header or one of the lines is on hold, the custom activity "Unschedule Lines On Hold" will unschedule all those lines and wait for an hour before checking the line status again.
  • This loop repeats until all holds on the header/line are released.
Let's look more closely at the Wait activity definition:

Checking its Node Attributes tab:

Now, checking its actual timeout value:

So, the bottom line is that the WAIT activity was timing out every 15 minutes (per the node-level timeout value shown above). Just imagine the volume of OEOL item keys sitting at this activity in any given hour, and the kind of load they would be putting on the system.

Now, is that an intelligent way of designing the workflow for this requirement?

Some shocking revelations...

A little analysis of the volume in the WF_ITEM_ACTIVITY_STATUSES_H (history) table showed something shocking (!!):

The XXONT_CUSTOM_ATP->WAIT step was generating the most history records. In this case, it had created 27 million rows (75% of the entire volume) out of the 35 million rows in the WF_ITEM_ACTIVITY_STATUSES_H table.
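For the record, the volume analysis boils down to a query of the following shape. This is a sketch only, using the same join between the status tables and WF_PROCESS_ACTIVITIES that appears in the queries later in this article.

-- Which OEOL workflow steps account for the most history rows?
select pa.process_name||'->'||pa.instance_label step,
       count(*) history_rows
from   wf_item_activity_statuses_h h,
       wf_process_activities pa
where  h.item_type        = 'OEOL'
and    h.process_activity = pa.instance_id
group by pa.process_name||'->'||pa.instance_label
order by count(*) desc;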
PS: This is one of the reasons why this case study is so interesting: its impact was tremendous.

How are history records created?

Let's clear the air and get our basics right about how and when history records are created.

When the workflow engine encounters an activity which has a status of COMPLETE, it moves (figuratively) that record from WF_ITEM_ACTIVITY_STATUSES to WF_ITEM_ACTIVITY_STATUSES_H and creates a new record in WF_ITEM_ACTIVITY_STATUSES with status ACTIVE.

Therefore, the more times a COMPLETEd activity is revisited, the more history records for that activity are created. This happens when a loop containing a WAIT activity with a short timeout is designed: the more the workflow background process flows through the activities of such a loop, the more records get created in WF_ITEM_ACTIVITY_STATUSES_H.

The performance implications...

Let's think about the other implications of such a poor design:

  • The system keeps looping on idle timeout activity every 15 mins
  • Immediately after the timeout completes, the activity becomes eligible to be processed by the workflow.
  • The next activity checks whether the hold has been released
  • Statistically speaking, since a financial hold is unlikely to have been released already, control goes back to the WAIT activity and the cycle repeats
While this is happening:
  • Every 15 minutes, the backlog (timed-out or deferred) count of unprocessed OEOL item keys bloats
  • Due to such frequent time-out activities, other more important processes/activities in the OEOL workflow are not executed in a realistic timeframe
  • Valuable system resources are spent on scheduling and executing the loop's activities on the Production machine
  • The OS on Production machine spends vital I/O resources in servicing these application calls
Simply speaking, that is very inefficient.
This is a vicious cycle. The more data you have in the WF_ITEM_ACTIVITY_STATUSES and WF_ITEM_ACTIVITY_STATUSES_H tables, the more time it takes to query and insert into them (also, the more indexes you have, the more time deletes/updates/inserts take due to the index maintenance required). It just kills online performance at times.
The following diagram bears out the above ideas. It shows the top activity (in terms of number of item keys) in the breakup of the OEOL backlog. It can be seen that, without fail, XXONT_CUSTOM_ATP was the top wait activity in the OEOL backlog. Not only was it a liability to itself, it was also a liability to other legitimate activities perfectly eligible to be processed:
 Design Considerations: The Danger Of Introducing Infinite Loops

There is another big danger here: that of introducing an infinite loop in the execution of the Workflow Background Process (also referred to as FNDWFBG) for OEOL.

Let us assume that there is a standard WAIT activity with a very short timeout/wait time in a loop with a condition checking activity. When the condition is checked, the activity will be deferred or timed out for a very short time.
Meanwhile, the Workflow Background Process would be busy processing all the other deferred or timed-out activities. If FNDWFBG is not done processing them by the time the wait/timeout of the previous activities expires, the re-timed-out or re-deferred activities will be enqueued and processed AGAIN by the very same FNDWFBG (the Workflow Background Process for, say, item_type=OEOL/OEOH) job over and over, as long as the condition (e.g. "line on hold") is met.
This creates an infinite loop, so the FNDWFBG process almost never completes, eating up CPU time, creating massive archive logs (if archiving is turned on), and pumping rows into the WF activity status tables. It also creates unnecessary rows in the WF_ITEM_ACTIVITY_STATUSES_H table, which can be the nemesis of online order entry performance in the 11i E-Business Suite later on.

It is a good idea for activities in custom workflows to have a Wait/Timeout time greater than the longest normal execution time of the Workflow Background Process.
Also, a workflow activity having a near-zero Relative (Wait) Time, or a Wait Mode of Absolute Date with no Absolute Date specified (which is equivalent to a zero Relative Wait Time), can cause the same issue (an infinite loop) for FNDWFBG runs with Process Deferred = Yes.

Hence, these poor design practices should be avoided. Remember, once runtime data for a particular workflow definition has been initiated, the existing runtime data will not be automatically corrected even if the workflow definition is corrected and uploaded later. So it's prudent to get the design right before letting your production data run on it.

More design considerations..

Whenever possible, the standard Wait activity (WF_STANDARD.WAIT()) should not be placed in a loop with a small wait time (shorter than several hours); alternative workflow designs should be evaluated.
WF_STANDARD.WAIT() with a wait time of zero or near zero, or the standard Defer activity (WF_STANDARD.DEFER()), should NEVER be placed in a loop in a workflow process design diagram.

Simply extending the Wait time should only be used if the alternative described below is not acceptable for a very good reason.

The Superior Design Alternative....

For the above business scenario, the best solution is not to use a Wait activity in a loop at all. A much more attractive option is to use the published Workflow blocking/continuation APIs: WF_STANDARD.BLOCK() or OE_STANDARD_WF.STANDARD_BLOCK() can be used to stop the processing flow of a workflow item key, and WF_ENGINE.COMPLETEACTIVITY() can be used to resume it.

An additional advantage of using OE_STANDARD_WF.STANDARD_BLOCK is that any OEOL workflow item which is blocked on it can also be progressed through the Order Entry form via the "Actions" button's Progress Order option. This gives more flexibility to the user/designer.

If one wants to progress the flow of a workflow item programmatically, the WF_ENGINE.COMPLETEACTIVITY() API can be called. This API notifies the Workflow Engine that the specified asynchronous activity has been completed for a particular item. It requires that the activity currently has a status of 'Notified' (the internal status equivalent to blocked). An optional activity completion result can also be passed; the result can determine what transition the process takes next.

Example

/* Complete the BLOCK activity which is used in multiple
* subprocesses in parallel splits. */

wf_engine.CompleteActivity('ORDER', '10045',
'XXONT_CUSTOM_ATP:WAIT', 'null');

To enhance the design AND achieve the same business functionality, we need to replace the WAIT activity with a BLOCK activity which will wait till the last hold is released from the order header/line without causing any looping. The COMPLETEACTIVITY API would be called from the Financial Hold release form at Order Header level.

It's as simple as that. The new design would look like this:

Seeing the Performance Benefits in Production..
Adopting this approach creates only 1 record in the WF_ITEM_ACTIVITY_STATUSES table and 2 records in the WF_ITEM_ACTIVITY_STATUSES_H table.


Contrast this with continuous looping until a condition is met: for each pass through the loop, more records are created in the WF_ITEM_ACTIVITY_STATUS* tables.

As a result of this change, the landscape of OEOL backlog also changed drastically.

The average runtime of the Workflow background Process for OEOL also came down consistently.

Summary of Learnings


  • It pays to know the published Workflow APIs. The Workflow Developer Guide is a good place to start. Spend time reading the fine manual. More information can never hurt.
  • It pays to spend time and effort evaluating the re-design of loops in a workflow early, especially since, once initiated, runtime data for a specific version of the workflow does not automatically get updated when a newer version of the workflow design is uploaded (through WFLOAD)
  • Poor design involving WAIT and poll activities in a loop can cause infinitely running Workflow Background Process (FNDWFBG)
  • Short-timeout activities in loops should be replaced with OE_STANDARD_WF.STANDARD_BLOCK or WF_STANDARD.BLOCK APIs and WF_ENGINE.COMPLETEACTIVITY APIs
  • Simple design changes can bring HUGE performance benefits, especially when a lot of volume is involved
  • One thing that was also learnt in hindsight was that the Development/Testing team needs to be extremely careful to account for all instantiated financial holds. We ran into a problem after the removal of the loop where a certain set of financial holds was not being completed. For this set of scenarios (automatically released holds not going through the quick order form) we created a custom concurrent program to identify the blocks and call the WF_ENGINE.COMPLETEACTIVITY API.


Simply put: know your facts, be smart, and don't repeat history if you don't have to. Above all, do your testing well.
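For completeness, the custom concurrent program mentioned in the summary boils down to something like the sketch below. It is purely illustrative: the XXONT_CUSTOM_ATP process name and BLOCK node label are assumptions, and the real program applied the business rules for deciding which holds had actually been released before completing anything.

declare
  -- Assumed names: the custom process and its BLOCK node label are placeholders
  cursor c_blocked is
    select ias.item_key,
           pa.process_name || ':' || pa.instance_label activity
    from   wf_item_activity_statuses ias,
           wf_process_activities pa
    where  ias.item_type        = 'OEOL'
    and    ias.process_activity = pa.instance_id
    and    pa.process_name      = 'XXONT_CUSTOM_ATP'   -- assumed process name
    and    pa.instance_label    = 'BLOCK'              -- assumed node label
    and    ias.activity_status  = 'NOTIFIED';          -- i.e. blocked
begin
  for r in c_blocked loop
    -- The real program first verified that the financial hold had indeed been released
    wf_engine.CompleteActivity('OEOL', r.item_key, r.activity, 'null');
  end loop;
  commit;
end;
/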

Suggested Related Readings

When Conventional Thinking Fails: A Performance Case Study in Order Management Workflow customization -- Posted on 17 May 2007; 12:09:31 PM
http://blogs.oracle.com/gverma/performance/om_wf_xxont_cascade

A Performance Case Study: Taming the Workflow Purge -- Posted on 21 March 2007; 6:58:11 PM
http://blogs.oracle.com/gverma/2007/03/21#a72

We're Skewed! Using 10053 event for being Histogram-wise -- Posted on 27 March 2007; 3:02:33 PM
http://blogs.oracle.com/gverma/2007/03/27#a80

When Conventional Thinking Fails: A Performance Case Study in Order Management Workflow customization

Gaurav Verma - Sat, 2008-03-01 01:57

Foreword

Order management workflow interacts with a variety of components like core workflow, order management and pricing application call interfaces.

Customizing it without understanding the implications can be a double-edged sword, and the nature of the customization may unwittingly worsen the situation. An undesirable combination of these aspects can cause a situation that is very difficult to diagnose and troubleshoot, as shall be demonstrated in this case study.

This article is essentially the distilled wisdom of a severity 1 situation at a client site that was unable to ship ordered items to its customers because the OM order line workflow background process had a severe performance issue. The business impact was tremendous: $1.5 to 2 million worth of shipping and invoicing was being held up.

Many a time, performance troubleshooting involves having some functional insight as well; TKPROF and 10046 traces are not the panacea for all performance issues. In this case, some insight into the nature of the Order Management APIs was also required to get traction on the performance problem. A painful discovery path followed, riddled with bumps and insightful discoveries. The troubleshooting approach used was really out of the box and involved breaking down the code being executed and using logical deduction.

The generic learnings from the ordeal are presented in this case study. It is hoped that they will help the Oracle Applications user community be conscious of the hidden implications of customizing the OM workflow.

Feeling the pinch...
The client had heavily customized their order management workflow during an implementation of the Order to Cash business cycle of Oracle Applications 11i.

Immediately after the April 2007 financial quarter end, the OEOL workflow background process started exhibiting extremely long execution times, such that the backlog of unprocessed OEOLs came to ~150,000. This was a sharp contrast to the normal processing throughput.

This was a moving problem. The hanging of the OEOL workflow background process was quite apparently due to a 'few' order lines belonging to a few orders. Once the 'problematic' orders' OEOL workflows were either CANCELLED or ERRORED, the backlog started clearing at a slow pace. This went on for a little while, until the OEOL workflows of similar order lines seemed to take inordinately long to process and held up the entire workflow queue.

Coming to grips with it...

As a knee-jerk approach, many things were tried, including killing the currently running workflow background process and re-submitting it. The clincher was that, once or twice, when the workflow background process was killed, it marked the OEOL ITEM_KEY being processed into ERROR state and hence took it out of the deferred queue. This forced the workflow background process to consider the next deferred OEOL item keys in line.

In ~50% of the cases this would work, and the new thread would start exhibiting a more acceptable throughput. But after a certain point, even killing and re-submitting the workflow background process was simply not working. It did seem that the logic being executed by FNDWFBG was taking extremely long to process one order line.

The observation was baffling. The following query was used to find out approximately which order line a single thread of the workflow background process was working on:

Query 1:

SQL> select * from (select to_char(enq_time,'dd-mon-rr hh24:mi'), user_data
     from wf_deferred_table_m where state = 0 and corrid = 'APPSOEOL'
     order by priority, enq_time) where rownum < 5;

TO_CHAR(ENQ_TIM USER_DATA(ITEMTYPE, ITEMKEY, ACTID, FUNCTION_NAME, PARAM_LIST, RESULT)
---------------------------------------------------------------------------------------
02-apr-07 14:25 WF_PAYLOAD_T('OEOL', '30273337', 141255, NULL, NULL, NULL)
02-apr-07 16:25 WF_PAYLOAD_T('OEOL', '30458068', 140850, NULL, NULL, NULL)
02-apr-07 16:25 WF_PAYLOAD_T('OEOL', '30458075', 140850, NULL, NULL, NULL)
02-apr-07 16:25 WF_PAYLOAD_T('OEOL', '30458089', 140850, NULL, NULL, NULL)

4 rows selected.


Tying it back to the order management tables, the following query gives us order-level details like order_number, line_number, ordered_item etc., which is even more useful:

Query 2:

select distinct oh.org_id, oh.order_number,
ol.line_number, ol.shipment_number,              
ol.option_number, ol.ordered_item, ol.line_id, wiash.begin_date,
priority, wdtm.enq_time
from applsys.wf_deferred_table_m wdtm,
    applsys.wf_process_activities wpa,
     applsys.wf_item_activity_statuses wiash,
     apps.oe_order_lines_all ol,
     apps.oe_order_headers_all oh
where substr (wdtm.corrid,5) = 'OEOL'
and (wdtm.delay < sysdate or wdtm.delay is null)
and wpa.instance_id=wdtm.user_data.actid
and wiash.item_type=wdtm.user_data.itemtype
and wiash.item_key=wdtm.user_data.itemkey
and wiash.process_activity=wdtm.user_data.actid
and wiash.item_key = ol.line_id
and oh.header_id = ol.header_id
and wdtm.state = 0
group by
oh.org_id, oh.order_number, ol.line_number, ol.shipment_number,
ol.option_number,
ol.ordered_item, ol.line_id, wiash.begin_date, priority, wdtm.enq_time
order by priority, wdtm.enq_time ;

Observing the output over some time made it possible to identify whether the workflow background process was stuck on a specific OEOL workflow.

Limited insight by conventional analysis..

The conventional debugging approaches provided limited insight into the root cause of the issue.

A 10046 trace could not clearly bring out the drill-down of time spent in the PL/SQL API calls. In addition, there was an OS file size limit on the trace file on most production servers, even when max_dump_file_size=unlimited is used.

Another option was to use the dbms_profiler API on the FNDWFBG call. It could have worked, except the FNDWFBG call never completed, so the data could never be collected.

Searching Metalink gives a host of notes for getting OM debug (level 5), Pricing debug, etc.:

1. Note 255040.1:bde_wf_data.sql - Profiling Workflow Runtime Data related

2. Note 163208.1:bde_last_analyzed.sql - Verifies Statistics for all

3. Note 353991.1:OMSuiteDataChk.sql and OMSuiteDataChk11i10.sql


This is all good information, except that there can also be too much meaningless and undirected information. Interpreting detailed application debug logs is useful only when it is known what should be checked.

Taking the whitebox approach....

A lot of debug/trace information could be generated, but it was probably best to take the FNDWFBG execution apart in slow motion. Once the execution model has been understood, further brainstorming using a divide-and-conquer strategy becomes effective.

An alternative way of looking at FNDWFBG is that it is nothing but a serial simulator of the steps outlined in the Oracle Workflow designer.

A whitebox approach (a software engineering term, as opposed to blackbox) was the need of the hour, coupled with some simple thinking. Using the workflow status monitor from the Workflow Administrator responsibility in Oracle Applications 11i, the step at which an OEOL currently sits can be seen.

This may also be verified from the output of $FND_TOP/sql/wfstat.sql. Combining the observations from Query 2 above with a drill-down via the Workflow Monitor, it was clear that almost all the lines with the performance issue were at the Fulfill - Deferred step, which means that if FNDWFBG were to pick them up, it would then execute the next activity in line, the Fulfill step.

Note: A quick workaround was to 1) kill the FNDWFBG process, 2) SUSPEND the problematic OEOL workflow via the Workflow Monitor (this status prevents any further processing of that item by workflow), and 3) resubmit the FNDWFBG process.
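The SUSPEND part of that workaround can also be done programmatically; WF_ENGINE provides SuspendProcess and ResumeProcess calls. A minimal sketch, using one of the item keys from the Query 1 output purely as a placeholder:

begin
  -- Park the problematic OEOL item so FNDWFBG skips it for now
  wf_engine.SuspendProcess('OEOL', '30273337');
  commit;
end;
/

-- Later, once the root cause is addressed:
begin
  wf_engine.ResumeProcess('OEOL', '30273337');
  commit;
end;
/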

Irreconcilable paradoxes of a Test case..

We really needed to analyze the drill down on the bad test case.

From an online examination of the SQL being executed by the FNDWFBG session, it was seen that a lot of time was being spent in XXONT Cascade Terms and Agreement, multiple times, for apparently processing one OEOL ITEM_KEY. This was a really paradoxical observation and did not make any sense.

The only acid test of this was to call OE_FULFILL_WF.START_FULFILLMENT for a specific OEOL line id or ITEM_KEY. The script run_start_fulfillment_with_om_debug.sql (listing in Appendix A) was used to simulate a single execution of the Fulfill step, which in turn called OE_FULFILL_WF.START_FULFILLMENT and did a rollback at the end. One such case ran for ~6 hours before it finally finished.

Most surprisingly, an examination of the raw trace file did show that subsequent steps like XXONT Cascade Terms and Agreement and Invoice Interface - Line had indeed been called multiple times. Therefore, it seemed that a single call to OE_FULFILL_WF.START_FULFILLMENT for one kind of OEOL was triggering sequential execution of the OM workflow for other OEOLs in the same order, which were somehow related to the order line in question.

$ grep -i xxont_cascade_pkg.set_ln_trm_agr_proc prod_ora_677526.trc 
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end;
begin xxont_cascade_pkg.set_ln_trm_agr_proc (:v1, :v2,:v3, :v4, :v5); end; 
value="XXONT_CASCADE_PKG"
value="XXONT_CASCADE_PKG"

This simply did not make any sense, because the Fulfill and XXONT Cascade Terms and Agreement steps were discrete steps, as verifiable from the OM workflow diagram above.

Another very interesting and puzzling observation was that, without fail, the top 2 SQL calls in the TKPROF output (raw trace sorted with sort=prsela,exeela,fchela) showed that ~45% of the time was spent in the OE_FULFILL_WF.START_FULFILLMENT API and ~45% in the call to the xxont_cascade_pkg.set_ln_trm_agr_proc API.

"tkprof output for 10046 of start_fulfillment API call"

Resolving the paradox...

The answer, though difficult to digest, does make sense. Order Management has unique processing requirements for configured items created using the Oracle Configurator module. Since OE_FULFILL_WF.START_FULFILLMENT is a workflow-aware API (notice the _WF in its name), it was obviously calling more Workflow routines internally to progress the business flow to the next logical set of activities.

We first need to understand the concept of a fulfillment set. A fulfillment set is a set of related order lines that represent a configurable item. Whenever a fulfillment set or a model is ready to fulfill or be built on the production floor - i.e., all members' OEOL workflows are internally in FULFILL_LINE:Notified (blocked) status - the last member reaching this activity signals completion of the FULFILL_LINE activities of all the other members before proceeding itself. Since the XXONT Cascade Terms and Agreement step comes right after the FULFILL_LINE step, it will be executed for all members of the current model or fulfillment set in that one go.

The relation between the other lines whose OM workflow was being resumed and the problematic line in question could be found using the following SQL:

Select top_model_line_id from oe_order_lines_all where line_id=<problematic_line>;

This query defines a FULFILLMENT SET (a set of lines which are related together and constitute an orderable item):

Select line_id from oe_order_lines_all where top_model_line_id=<top_model_line_id_from_above_query>;

The user first configures the components of a configurable product/item and then the item is actually built. Depending on the kind of configured model (Assemble to Order -- ATO, or Pick to Order -- PTO), the oe_order_lines_all.flow_status_code (order management workflow status) stays at either SHIPPED or BOOKED status or becomes AWAITING_FULFILLMENT (internally blocked).

Here is an example of this intermediate stage of processing for different kinds of configured items (ATO/PTO/Stock):

Now, since a lot of time was being spent on the XXONT Cascade Terms and Agreement step, the code in the package XXONT_CASCADE_PKG.pkb was reviewed. It could be seen that the OE_ORDER_PUB.PROCESS_ORDER() API was being called to sync up the terms and agreements between the order header and the line in question.

There were some penalties for doing this: updating the payment term and agreement fields on the order line triggered repricing of the order lines. Also, for especially large orders (~4000 lines) with mostly mis-matched payment terms and agreements, the repeated calls to OE_ORDER_PUB.PROCESS_ORDER() were causing a huge overhead.

The reason the timings for OE_FULFILL_WF.START_FULFILLMENT() showed up high in the trace files was that the FULFILL_LINE step synchronizes across all components in a configurable model or fulfillment set. The following output from the workflow query in Appendix B lists the time recorded by workflow for executing the XXONT_PROD_SEC_STANDARD_LINE->FULFILL_LINE step:

Visualizing it in slow motion...

This means that whenever a fulfillment set or a model is ready to fulfill - i.e., all members' OEOL workflows are internally in FULFILL_LINE:Notified (blocked) status - the last member reaching this activity will complete the FULFILL_LINE activities of all the other members before proceeding itself.

Since the Cascading activity comes right after FULFILL_LINE, it will be executed for all members of the current model or fulfillment set in that one go. The aggregate time for completing all the other related OEOL workflows in the same fulfillment set goes into the timing recorded for the OE_FULFILL_WF.START_FULFILLMENT() step.

The following diagrams attempt to represent the dynamics more clearly:

BEFORE calling OE_FULFILL_WF.START_FULFILLMENT() step:

AFTER calling OE_FULFILL_WF.START_FULFILLMENT() step:

Projecting the poor performance..

Based on the situation described above, a simple projection of the poor performance can be done for a particular kind of order. Let's assume that there are 100 lines in a configurable model/item, and that for 75% of the order lines the payment terms and/or agreements do not match those on the header. For each mis-matched line, the time taken for repricing varied from 150 seconds to 750 seconds.

When we calculate the cumulative time taken to execute the xxont_cascade_pkg.set_ln_trm_agr_proc API for the 75 mis-matched lines out of 100, at ~5 minutes per line, the overall time comes to 75 x 5 minutes, i.e. 375 minutes, or well over 6 hours. That's the time for processing the OE_FULFILL_WF.START_FULFILLMENT() step for one model, and that was exactly the performance problem being experienced.

Now, just imagine the performance strain if the order had 10 such configurable items.

Mathematically speaking:

The Solution: Simplify by taking the Highest Common Factor...

There were two important dimensions to the solution:

There was a bug in the 11.5.9 Order Management module which prevented the cascading of order header changes to order lines from working properly (the profile option OM: Sales Order Form: Cascade Header Changes to Line should be Yes). This was what prompted the calls to the OE_ORDER_PUB.PROCESS_LINE() API.

Patch 5981436 was to be delivered by Oracle Order Management development to resolve this issue. The fix is also part of 5950291 - 11.5.9 OM CUMULATIVE PATCH. This would automatically match the payment terms and agreements from the order header to the order lines in the majority of cases, except where the user purposely chose to say No to the cascade option.

Another smart option was to batch process all the mis-matched component lines of a configured model in ONE call to the PROCESS_LINE() API by passing them as a PL/SQL table argument, and to do this ONLY if the order line in question was for a shippable item. This was especially beneficial since only 1-5% of the sub-components in a configured model are shippable, which made it quite restrictive. This single change took out 95-97% of the repetitive work being done by the OE_ORDER_PUB.PROCESS_ORDER() calls.

       --
       -- Calling API to update Terms and/or Agreement value(s) on the order line
       --
       oe_order_pub.process_line (p_line_tbl      => l_line_tbl,
                                  x_line_out_tbl  => o_line_tbl,
                                  x_return_status => l_return_status,
                                  x_msg_count     => l_msg_count,
                                  x_msg_data      => l_msg_data);
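To make the batching idea concrete, the PL/SQL table passed to process_line() is built up along the following lines. This is only an illustrative sketch under assumptions: the cursor, the shippable_flag test, the placeholder model line id and the variable names are mine, not the client's actual code.

declare
   l_top_model_line_id  number := 12345;   -- placeholder: the model's top line id
   l_line_tbl           oe_order_pub.line_tbl_type;
   o_line_tbl           oe_order_pub.line_tbl_type;
   l_return_status      varchar2(1);
   l_msg_count          number;
   l_msg_data           varchar2(2000);
   i                    pls_integer := 0;
begin
   -- Hypothetical cursor: shippable component lines of one model whose
   -- payment term and/or agreement differ from the header
   for r in (select ol.line_id, oh.payment_term_id, oh.agreement_id
             from   oe_order_lines_all ol, oe_order_headers_all oh
             where  ol.header_id = oh.header_id
             and    ol.top_model_line_id = l_top_model_line_id
             and    nvl(ol.shippable_flag,'N') = 'Y'
             and   (nvl(ol.payment_term_id,-1) <> nvl(oh.payment_term_id,-1)
                 or nvl(ol.agreement_id,-1)    <> nvl(oh.agreement_id,-1)))
   loop
      i := i + 1;
      l_line_tbl(i)                 := oe_order_pub.g_miss_line_rec;
      l_line_tbl(i).line_id         := r.line_id;
      l_line_tbl(i).payment_term_id := r.payment_term_id;
      l_line_tbl(i).agreement_id    := r.agreement_id;
      l_line_tbl(i).operation       := oe_globals.g_opr_update;
   end loop;

   -- One call for the whole batch instead of one call per mis-matched line
   oe_order_pub.process_line (p_line_tbl      => l_line_tbl,
                              x_line_out_tbl  => o_line_tbl,
                              x_return_status => l_return_status,
                              x_msg_count     => l_msg_count,
                              x_msg_data      => l_msg_data);
end;
/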

Seeing the performance Boost...

Before putting in the code change:

For a fulfillment set with top_model_line_id=9905479, there were 59 mis-matched lines, and the avg time for processing one call of process_line() was ~ 580 seconds.

After putting in the code change:

The time taken to process the same 59 mis-matched lines (of the same fulfillment set, top_model_line_id=29905479) in one PL/SQL table argument to a single process_line() API call was ~625 seconds.

Putting statspack data into perspective...

This also explained why statspack reports showed xxont_cascade_pkg.set_ln_trm_agr_proc as one of  the major I/O offenders per execution:

Summary of learnings from the Case study..


  • Don't have expensive custom activities defined after the START_FULFILLMENT step in the OM workflow. It will amplify the performance hit many times over, especially when the Oracle Configurator and Order Management modules are being used in tandem. (Better still, don't have expensive custom workflow activities at all.)
  • Batch processing should be preferred over piecemeal processing, especially when an API has a provision for it. This eliminates most of the repetitive processing
  • The whitebox (drill down) approach works for taking apart a baffling performance problem. Trying to simulate a test case usually leads to the heart of the performance issue in a more reliable way
  • Getting extensive debugs and traces is great, but only when it is known what is being looked for. Asking the right probing questions is very important. Question, question, question.
  • A well thought out plan with minimal tracing and a drill-down approach can bring better results than a shotgun or blunderbuss approach
  • Sometimes, a high-level functional knowledge of the processing being done can be very useful in understanding the nature of the problem. A balance between strictly technical and purely functional knowledge can be fruitful in solving performance issues

TRIVIA: Why was the customization put in the first place?

Cascading is the process in which values from the sales order header level automatically populate to the corresponding fields at the sales order line level.

Due to a bug in 11.5.9, the Agreement field at the header level was not populating to the child lines of a configuration.

Because of this, the autoinvoicing program split lines into separate invoices by agreement. There were thousands of invoices where the components were being invoiced separately from the top model. It was an invoicing fiasco.

Appendix A:

sandbox:dev> more run_start_fulfillment_with_om_debug.sql

var resultout varchar2(1000)
set serveroutput on time on timing on

prompt
prompt setting userid, responsibility and application id for JHASLAGE for Order Management Super User
prompt

define user_id=7962
define resp_id=52560
define appl_id=660

define item_key=29904309
define activity_id=141236

declare

l_debug_file varchar2(400);
l_org_id NUMBER;
l_count NUMBER;
l_result VARCHAR2(30);
l_spid varchar2(20);

begin

fnd_global.apps_initialize(&user_id,&resp_id,&appl_id);

select p.spid
into l_spid
from v$process p, v$session s
where s.audsid=userenv('sessionid')
and paddr = addr;

OE_Standard_WF.OEOL_SELECTOR
(p_itemtype => 'OEOL'
,p_itemkey => '&&item_key'
,p_actid => 12345
,p_funcmode => 'SET_CTX'
,p_result => l_result
);

oe_debug_pub.debug_off;

--enable OM debug
oe_debug_pub.debug_on;
oe_debug_pub.initialize;
l_debug_file := OE_DEBUG_PUB.Set_Debug_Mode('FILE');
oe_debug_pub.setdebuglevel(5);

dbms_output.put_line('Debug file is : '|| l_debug_file);

--enable trace
EXECUTE IMMEDIATE ('alter session set max_dump_file_size=unlimited');
EXECUTE IMMEDIATE ('alter session set tracefile_identifier=''gverma_start_fulfillment_with_om_debug_&&item_key''');
EXECUTE IMMEDIATE ('ALTER SESSION SET EVENTS ''10046 TRACE NAME CONTEXT FOREVER, LEVEL 12''');

OE_FULFILL_WF.START_FULFILLMENT('OEOL','&&item_key',&&activity_id,'RUN',:resultout);

--disable trace
EXECUTE IMMEDIATE ('ALTER SESSION SET EVENTS ''10046 TRACE NAME CONTEXT OFF''');

--turn debug off
oe_debug_pub.debug_off;
dbms_output.put_line('File name '||OE_DEBUG_PUB.G_DIR||'/'||OE_DEBUG_PUB.G_FILE);
dbms_output.put_line('Trace file generated for this request:'||'/*'||l_spid||'*.trc');

rollback;

EXCEPTION WHEN OTHERS THEN

dbms_output.put_line('reached error in script ' ||sqlerrm);

rollback;
end;
/

select :resultout from dual
/

Appendix B:

col ITEM_KEY for a10 head "Item Key"
col BEGIN_DATE for a18 head "Begin Date"
col END_DATE for a18 head "End Date"
col ROOT_ACTIVITY for a30 head "Root Activity"
col RESULT for a15
col STATUS for a8 head "Status"
col time_taken_mins for 99999 head "Time(Mins)"
col step format a60

set pages 100 time on timing on lines 150

spool get_details_for_wf_step

select ias.item_key,
       to_char(ias.begin_date,'DD-MON-RR HH24:MI:SS') begin_date,
       to_char(ias.end_date,'DD-MON-RR HH24:MI:SS') end_date,
       (ias.end_date-ias.begin_date)*24*60 time_taken_mins,
       pa.process_name||'->'||pa.instance_label step,
       ias.activity_status status,
       ias.activity_result_code result
from   wf_item_activity_statuses ias,
       wf_process_activities pa
where  ias.item_type = 'OEOL'
and    ias.process_activity = pa.instance_id
and    ias.begin_date >= sysdate-14
and    pa.process_name||'->'||pa.instance_label = 'XXONT_PROD_SEC_STANDARD_LINE->FULFILL_LINE'
and    ias.activity_status = 'COMPLETE'
--
-- and the step was not cancelled
--
and    ias.activity_result_code != '#FORCE'
--
-- it took more than so many mins to complete this step
--
and    (ias.end_date-ias.begin_date)*24*60 > 30
order by (ias.end_date-ias.begin_date)*24*60 desc, ias.begin_date
/

spool off
