RE: clustering

From: Ron Yount <ronwy_at_swbell.net>
Date: Wed, 30 Jul 2003 18:09:24 -0800
Message-ID: <F001.005C7EAE.20030730180924@fatcity.com>


Bala,  

I would be interested in knowing more as well. We are on AIX (recently 5.2) and using HACMP. We have done very extensive testing, but I have not run into this issue (not to say that I won't). Do you have standby NICs for your HSIC? We have performed tests where we fail the primary and then allow the standby to take over. This has worked okay under load so far.

Regards,
-Ron-

-----Original Message-----

From: Loughmiller, Greg
Sent: Wednesday, July 30, 2003 6:54 PM
To: Multiple recipients of list ORACLE-L

I would be interested to hear if it's on other platforms as well.

We are embarking on the "golden path of RAC".

-----Original Message-----

Sent: Wednesday, July 30, 2003 11:30 AM
To: Multiple recipients of list ORACLE-L

Bala,  

Do you have a bug# associated with this one? We are on AIX, so this is *very* interesting to me.  

TIA
Raj

Rajendra dot Jamadagni at nospamespn dot com
All views expressed in this email are strictly personal.
QOTD: Any clod can have facts, having an opinion is an art!

-----Original Message-----

From: Balakrishnan.Ashok_at_vectorscm.com
Sent: Tuesday, July 29, 2003 6:09 PM
To: Multiple recipients of list ORACLE-L

We used to experience problems in our RAC environment when there was an interconnect failure. There's a workaround for that problem that worked for us:

Create a directory under $ORACLE_HOME/rdbms called ".aixopt". Inside it, create (touch) a zero-byte file called SUSTAIN_IPC_FAILURE (uppercase).
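Roughly, the commands come out to something like this (a sketch assuming ORACLE_HOME is set and you run them as the Oracle software owner on each node):

    # create the marker directory and the zero-byte flag file
    mkdir $ORACLE_HOME/rdbms/.aixopt
    touch $ORACLE_HOME/rdbms/.aixopt/SUSTAIN_IPC_FAILURE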

We're using 9.2.0.3 2-node RAC on AIX 5L / HACMP 4.4.

Does Sun or Tru64 have similar workarounds, or does it work flawlessly without the workaround? Having this workaround tells RAC to make sure there's at least one surviving instance in the cluster instead of all instances crashing. Here's the section from the alert log file with an example of handling failures of all 3 interconnects.

Marking down Network with IP 192.168.17.11
Thu Apr 10 23:29:28 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:28 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:28 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:28 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:29 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:29 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:29 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:30 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:33 2003
Marking down Network with IP 192.168.19.11
WARNING!!! NO COMMON NETWORKS FOR ALL NODES TO COMMUNICATE
SUSTAINING IPC FAILURE
THIS SHOULD BE THE ONLY INSTANCE RUNNING IN THIS CLUSTER
-----Original Message-----

Sent: Tuesday, July 29, 2003 9:29 AM
To: Multiple recipients of list ORACLE-L  

Hrrrmm - well, we've never seen the problem you describe, and we've got a pretty big RAC environment here (clusters from two to six nodes, and we combine dev clusters to build bigger ones as needed). The situation you describe sounds like what happens when there's an interconnect failure. Each node independently thinks that it's been separated from the rest of the cluster and (effectively) shoots itself in the head. This causes every instance to hang. This is why the crafty RAC Jedi designs the interconnect architecture well.

But yes, if you're willing to take the "completely 2n capacity" cluster route and have two databases, double the Oracle licenses, two storage arrays, two fibre channel networks, etc., that is the highest availability/reliability cluster you can have - although at the highest cost and complexity.

Which clustering solution is right for you? Cheap and inelegant? Expensive and bullet-proof? Well, that's why we get paid the big bucks, right? :)  

Thanks,
Matt

--

Matthew Zito
GridApp Systems
Email: mzito_at_gridapp.com
Cell: 646-220-3551
Phone: 212-358-8211 x 359
http://www.gridapp.com

-----Original Message-----

From: Tanel Poder
Sent: Monday, July 28, 2003 7:05 PM
To: Multiple recipients of list ORACLE-L

However, failed transactions must be handled from the client side; queries may migrate to surviving nodes transparently. Also, RAC currently has many problems, such as all nodes hanging when one node dies. Completely separate systems are still (and will always be) the most available solution.
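For illustration, the transparent query migration mentioned here is what Oracle's Transparent Application Failover (TAF) provides. A minimal client-side tnsnames.ora sketch - the host names racnode1/racnode2 and service name racdb are placeholders, not from this thread - might look like:

    # TAF entry: open queries resume on a surviving instance after a node dies
    RACDB =
      (DESCRIPTION =
        (ADDRESS_LIST =
          (ADDRESS = (PROTOCOL = TCP)(HOST = racnode1)(PORT = 1521))
          (ADDRESS = (PROTOCOL = TCP)(HOST = racnode2)(PORT = 1521)))
        (CONNECT_DATA =
          (SERVICE_NAME = racdb)
          (FAILOVER_MODE =
            (TYPE = SELECT)
            (METHOD = BASIC)
            (RETRIES = 20)
            (DELAY = 5))))

TYPE = SELECT lets in-flight queries continue after failover; even so, open transactions are rolled back and must be resubmitted by the client, which is exactly the caveat above.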

Tanel.  

To: Multiple recipients of list ORACLE-L <ORACLE-L_at_fatcity.com>
Sent: Monday, July 28, 2003 7:49 PM

Another important difference is that RAC is the best high-availability solution in case of system/instance failure. With an HP or Veritas cluster, all of the resources get stopped on the live node of the cluster and then started on the second node, so users are affected. With RAC, by contrast, user sessions transition seamlessly on a system or instance failure.

Indy Johal

        "Ron Rogers" <RROGERS_at_galottery.org>
Sent by: ml-errors_at_fatcity.com

07/28/03 12:29 PM
Please respond to ORACLE-L

        
        To:        Multiple recipients of list ORACLE-L
<ORACLE-L_at_fatcity.com> 
        cc:         
        Subject:        Re: clustering	



ak,
As I understand it, an HP cluster is 2 boxes that have the capability to access the same disks and data but only one can have the oracle instance running and accessing the datafiles(active). Sort of like a high availability option.
With RAC both boxes can access the instance and datafiles at the same time.
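As a quick illustration (a sketch, not output from any system in this thread), you can see the difference from the database side by querying GV$INSTANCE from either box:

    -- expect two rows, both OPEN, on a healthy 2-node RAC database;
    -- an HP/Veritas active/passive cluster only ever has one instance up
    SELECT inst_id, instance_name, host_name, status
    FROM   gv$instance;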
List, correct me if I need it.
Ron

>>> oramagic_at_hotmail.com 07/28/03 12:14PM >>>
Hi guys,
I am new to this clustering concept. Just trying to understand a few basics. Need your help.

What is the difference between Oracle running on a Sun/HP cluster with 2 nodes and Oracle with RAC running on 2 nodes?

thanks,
-ak