Message-Id: <10627.117665@fatcity.com>
From: "Adams, Matthew (GEA, 088130)"
Date: Fri, 22 Sep 2000 10:37:00 -0400
Subject: RE: oracle parallel server

OK, there seems to be enough interest to post this publicly rather than reply directly to Chuck as I had planned.

Attend the class as soon as possible. Hope and pray that Dave Austin is still teaching it.

I've done a few OPS systems, mostly 7.3.3.3 on IBM SP and 8.1.6 on HP-UX clusters. Overall they have worked well for what we designed them for. And that is the key to a successful OPS implementation.

YOU CANNOT JUST THROW AN APPLICATION AT AN OPS SERVER AND EXPECT OPS TO SOLVE ALL YOUR PROBLEMS. (I'm not shouting, I'm just making sure that this point gets made.)

You must decide what you want OPS to give you and then plan, configure, design and code with that objective in mind. Some OPS systems I've done were implemented primarily to allow the system to scale up, others primarily for fault tolerance.

All of these systems were OLTP systems, which means a lot of thought needs to be put into the PCM locking scheme. You may luck into a situation as one major retail chain did with their OPS installation, where the transactions were random enough that they had no pinging problems, but I wouldn't bet on it. OPS was originally designed for read-intensive, minimal-write databases (decision support, data warehouse, etc). Some of the advancements in recent years have made OLTP possible.

Our systems have handled over 5000 tpm on occasion. We probably could have gotten more out of them, but we didn't need to at the time.

In our case, we used dynamic PCM lock allocation (a beta feature on 7.3.3.3) rather than the static lock mechanism. This adds a little overhead (1-3%) but dramatically cut down the false pinging problem. Just make sure that you have a very fast DIRECT connection between all the nodes involved.
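To make the false pinging point concrete, here is a minimal Python sketch. It is not Oracle code and does not model how 7.3.3.3 actually allocates locks. It assumes a simplified scheme in which block b is covered by lock (b % num_locks), and it calls a lock transfer "false" when the instance giving up the lock never modified the block being requested. The function name and the workload are hypothetical.

```python
def count_pings(accesses, num_locks):
    """Count lock transfers between instances for a list of writes.

    accesses  -- ordered list of (instance_id, block_id) write requests
    num_locks -- number of locks; block b is covered by lock b % num_locks
                 (an assumed, simplified hashing scheme)
    Returns (total_pings, false_pings).
    """
    lock_owner = {}         # lock id  -> instance currently holding the lock
    block_last_writer = {}  # block id -> instance that last wrote the block
    total_pings = 0
    false_pings = 0
    for instance, block in accesses:
        lock = block % num_locks
        owner = lock_owner.get(lock)
        if owner is not None and owner != instance:
            # The lock has to move from one instance to the other.
            total_pings += 1
            # If the current owner never wrote this particular block, the
            # transfer is caused only by blocks sharing a lock: a false ping.
            if block_last_writer.get(block) != owner:
                false_pings += 1
        lock_owner[lock] = instance
        block_last_writer[block] = instance
    return total_pings, false_pings


if __name__ == "__main__":
    # Two instances writing completely disjoint blocks, interleaved:
    # instance 0 writes blocks 0-7, instance 1 writes blocks 8-15.
    workload = []
    for i in range(8):
        workload.append((0, i))
        workload.append((1, i + 8))
    print("4 locks :", count_pings(workload, 4))
    print("16 locks:", count_pings(workload, 16))
```

In this toy workload the two instances never touch the same block, so with one lock per block (16 locks) nothing has to move. With only 4 locks the blocks collide on shared locks and every transfer is a false ping. Finer-grained locking costs more lock management, which is the trade-off described above.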

Backup and recovery is not too terribly difficult for OPS, just be sure to back up the archive files from both nodes. Each node has its own set of online redo logs and generates its own set of archive log files. The redo logs must be on raw devices (along with the control files and data files) so that a surviving node can perform a recovery for a failed node, but other than that, the logs are independent of each other.

Any load balancing you want to do across nodes will be your responsibility to plan for and implement. I've never found any way to get Oracle to load balance anything. It can do some connection balancing across sqlnet, but that balances connections, not load.

If you're going to use 8.1.6 or better (and I would recommend that you do), connection failover is really pretty easy. You can set up the TNSNAMES file to reconnect to the alternate node automatically in the event of failure. Oracle docs say this only works for applications written with OCI, but it worked with our NAS servers with no problem.

When a node does fail, all in-flight uncommitted transactions will be rolled back. Recovery for the failed node is performed by the surviving node. Data modified by the failed node that has been committed will be unavailable on the surviving node until this recovery is complete. If you set up your system well, this recovery time will be short (less than 30 seconds for our HP cluster in every test case).

I have heard that some clusters do not detect a network card failure well, but when we tried to simulate that by turning off the card at the OS level, the node panicked and went down, and everything was fine because all applications failed over to the other node.

That's probably enough for today. Let me know what questions I've missed or other questions you thought of.

---- R.
Matt Adams - GE Appliances - matt.adams@appl.ge.com
It will make sense when you stop thinking logically
and start thinking Oracle-ly - Jim Droppa

-----Original Message-----
From: Chuck Hamilton [mailto:chuck_hamilton@yahoo.com]
Sent: Thursday, September 21, 2000 8:54 AM
To: Adams, Matthew (GEA, 088130)
Cc: ORACLE-L@fatcity.com
Subject: RE: oracle parallel server

Anything you can tell me. I've never used OPS and plan to attend a class in November.

What version of Oracle did you use? Does it work as expected? What type of application is it? What sort of transaction volumes do you have? How many users? How many instances are you running on how many nodes?

How is locking between instances handled? Have you seen performance degradation because of lock management between instances or any other inter-instance communication?

How is logging handled? Are there separate logs for each instance? Have you successfully tested backup and recovery? Are there any administration things to watch out for that are introduced by OPS?

How well does the load balancing work and is it automatic? How much application work is involved to switch instances in the event of a failure? Is anything lost when a node fails? Have you encountered situations where a node fails but the applications haven't detected the failure?

TIA

"Adams, Matthew (GEA, 088130)" wrote:

Yes. What would you like to know?

----
matt adams - matt.adams@appl.ge.com
against the Martian Landscape> "That is the top of the calibration target, that is _not_ in fact a monolith." - NASA TV Commentator - 7/5/1997

-----Original Message-----
Sent: Wednesday, September 20, 2000 10:01 AM
To: Multiple recipients of list ORACLE-L

Has anyone successfully used OPS in a mission critical production environment? I'd like to hear about it.

--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
Author: Adams, Matthew (GEA, 088130)
INET: MATT.ADAMS@APPL.GE.COM