RE: oracle parallel server

From: Adams, Matthew (GEA, 088130) <MATT.ADAMS_at_APPL.GE.COM>
Date: Fri, 22 Sep 2000 10:37:00 -0400


OK, there seems to be enough interest to post this publicly rather than reply directly to Chuck as I had planned.  

Attend the class as soon as possible. Hope and pray that Dave Austin is still teaching it.  

I've done a few OPS systems, mostly on IBM SP and 8.1.6 on HP-UX clusters. Overall they have worked well for what we designed them for. And that is the key to a successful OPS implementation.
YOU CANNOT JUST THROW AN APPLICATION AT AN OPS SERVER AND EXPECT OPS TO SOLVE ALL YOUR PROBLEMS. (I'm not shouting, I'm just making sure that this point gets made)  

You must decide what you want the OPS to give you and then plan and configure and design and code with that objective in mind.  

Some OPS systems I've done were implemented primarily to allow the system to scale up, others were primarily for fault tolerance. All of these systems were OLTP systems, which means a lot of thought needs to be put into the PCM locking scheme. You may luck into a situation as one major retail chain did with their OPS installation, where the transactions were random enough that they had no pinging problems, but I wouldn't bet on it. OPS was originally designed for read-intensive, minimal-write databases (decision support, data warehouse, etc.). Some of the advancements in recent years have made OLTP possible. Our systems have handled over 5000 tpm on occasion. We probably could have gotten more out of them, but we didn't need to at the time.  

In our case, we used dynamic PCM lock allocation (a beta feature on [...]) rather than the static lock mechanism. This added a little overhead (1-3%) but
dramatically cut down the false pinging problem. Just make sure that you have a very fast DIRECT connection between all the nodes involved.  
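
To make "false pinging" concrete, here is a rough toy model in Python. It is not how Oracle's lock manager works internally and it does not model dynamic lock allocation. It only assumes that each block is covered by lock number `block % num_locks` (a made-up hashing scheme) and counts how often a lock has to move between nodes. A transfer counts as "false" when the two nodes were actually working on different blocks that happen to share a lock. The function name and data layout are hypothetical.

```python
def count_pings(accesses, num_locks):
    """Count lock transfers between nodes in a toy hashed-lock model.

    accesses:  list of (node, block) writes in time order.
    num_locks: number of locks; block b is covered by lock b % num_locks
               (an assumed scheme, for illustration only).
    Returns (total_pings, false_pings).
    """
    holder = {}  # lock id -> (node, block) of the most recent writer
    total = 0
    false = 0
    for node, block in accesses:
        lock = block % num_locks
        prev = holder.get(lock)
        if prev is not None and prev[0] != node:
            total += 1              # lock must move to another node
            if prev[1] != block:
                false += 1          # nodes wanted different blocks
        holder[lock] = (node, block)
    return total, false


# Node A only touches block 0, node B only touches block 5.
workload = [("A", 0), ("B", 5), ("A", 0), ("B", 5)]

coarse = count_pings(workload, 5)  # blocks 0 and 5 share lock 0
fine = count_pings(workload, 8)    # blocks 0 and 5 get separate locks
print(coarse, fine)
```

With the coarse setting every transfer is a false ping even though the nodes never touch the same block; with more locks the same workload causes none. Real contention on a single block would still ping under either setting.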

Backup and recovery is not too terribly difficult for OPS; just be sure to back up the archive files from both nodes. Each node has its own set of online redo logs and generates its own set of archive log files. The redo logs must
be on raw devices (along with the control files and data files) so that a surviving node can perform a recovery for a failed node, but other than that, the logs are independent of each other.  

Any load balancing you want to do across nodes will be your responsibility to plan for and implement. I've never found any way to get Oracle to load balance anything. It can do some connection balancing across SQL*Net, but
that balances connections, not load.  

If you're going to use 8.1.6 or better (and I would recommend that you do), connection
failover is really pretty easy. You can set up the TNSNAMES file to reconnect to the
alternate node automatically in the event of a failure. The Oracle docs say this only works for
applications written with OCI, but it worked with our NAS servers with no problem.  
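
For illustration only, a tnsnames.ora entry along these lines is one way to set that up in 8.1.x. This is a sketch, not the configuration from the systems described above: the alias OPSDB, the host names node1 and node2, the port, and the service name are all placeholders, and the FAILOVER_MODE settings are an assumption you would adjust to your own needs.

```
# Hypothetical alias, hosts, and service name; adjust for your site.
OPSDB =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (FAILOVER = ON)
      (LOAD_BALANCE = OFF)
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = node2)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = opsdb)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC))
    )
  )
```

FAILOVER = ON lets a new connection try the second address if the first is unreachable. FAILOVER_MODE is what asks an already-connected session to reconnect to the other node after a failure. Uncommitted work is still rolled back either way.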

When a node does fail, all in-flight uncommitted transactions will be rolled back.
Recovery for the failed node is performed by the surviving node. Data modified by
the failed node that has been committed will be unavailable on the surviving
node until this recovery is complete. If you set up your system well, this recovery
time will be short (less than 30 seconds for our HP cluster in every test case).  

I have heard that some clusters do not detect a network card failure well, but when
we tried to simulate that by turning off the card at the OS level, the node panicked
and went down, and everything was fine because all applications failed over to the other node.  

That's probably enough for today. Let me know what questions I've missed or other questions you thought of.

R. Matt Adams - GE Appliances - It will make sense when you stop thinking logically and start thinking Oracle-ly - Jim Droppa  

-----Original Message-----

From: Chuck Hamilton
Sent: Thursday, September 21, 2000 8:54 AM
To: Adams, Matthew (GEA, 088130)
Subject: RE: oracle parallel server

Anything you can tell me. I've never used OPS and plan to attend a class in November.

What version of Oracle did you use? Does it work as expected? What type of application is it? What sort of transaction volumes do you have? How many users? How many instances are you running on how many nodes? How is locking between instances handled? Have you seen performance degradation because of lock management between instances or any other inter-instance communication? How is logging handled? Are there separate logs for each instance? Have you successfully tested backup and recovery? Are there any administration things to watch out for that are introduced by OPS? How well does the load balancing work and is it automatic? How much application work is involved to switch instances in the event of a failure? Is anything lost when a node fails? Have you encountered situations where a node fails but the applications haven't detected the failure?

TIA

"Adams, Matthew (GEA, 088130)" <MATT.ADAMS_at_APPL.GE.COM> wrote:

Yes. What would you like to know?

matt adams

against the Martian Landscape>
"That is the top of the calibration target, that is _not_ in fact a monolith."
- NASA TV Commentator - 7/5/1997

-----Original Message-----

Sent: Wednesday, September 20, 2000 10:01 AM
To: Multiple recipients of list ORACLE-L

Has anyone successfully used OPS in a mission critical production environment? I'd like to hear about it.

Received on Fri Sep 22 2000 - 09:37:00 CDT
