Home » Server Options » RAC & Failsafe » How to create a service that fails over in a 2 node admin-managed RAC (12.1.0.2 SE2, rh 7.3)
How to create a service that fails over in a 2 node admin-managed RAC [message #668191] Mon, 12 February 2018 02:05 Go to next message
juniordbanewbie
Messages: 217
Registered: April 2014
Senior Member
Dear all,

How to create a service that fails over in a 2 node admin-managed RAC if the other node is down

srvctl add service -database dwhdb -service dwh_auto_failover -preferred dwhdb1,dwhdb2 -tafpolicy BASIC -failovertype SESSION -verbose

however when i down dwhdb1, the service fail to fail over to dwhdb2, why is this so?

Maybe I was creating the service in a wrong way, so it cannot failover.

many thanks in advance!
Re: How to create a service that fails over in a 2 node admin-managed RAC [message #668193 is a reply to message #668191] Mon, 12 February 2018 02:18 Go to previous messageGo to next message
John Watson
Messages: 7620
Registered: January 2010
Location: Global Village
Senior Member
Your service is preferred on both instances. It can't failover from one to the other, because it will already be running on both. Did you want to make it "preferred" on one and "available" on the other?
Re: How to create a service that fails over in a 2 node admin-managed RAC [message #668298 is a reply to message #668193] Sat, 17 February 2018 03:51 Go to previous messageGo to next message
juniordbanewbie
Messages: 217
Registered: April 2014
Senior Member
Dear John,

yes are you right.

this is how I resolve my issues.

create two service which has preferred and available instances opposite of each other


srvctl add service -db dwhdb -service auto_failover -preferred dwhdb1 -available dwhdb2 -tafpolicy BASIC -failovertype SESSION -verbose

srvctl add service -db dwhdb -service auto_failover_pre_2 -preferred dwhdb2 -available dwhdb1 -tafpolicy BASIC -failovertype SESSION -verbose


from the 32 bit client server
in $ORACLE_HOME/network/admin/tnsnames.ora

insert the following network service name

dwhdb =
 (DESCRIPTION_LIST=
  (DESCRIPTION=
   (ADDRESS=(PROTOCOL = TCP)(HOST = rh73-12102-scan.moonlight.com)(PORT = 1521))
   (CONNECT_DATA=
     (SERVICE_NAME=auto_failover.moonlight.com)
         (INSTANCE_NAME=dwhdb1)))
  (DESCRIPTION=
   (ADDRESS=(PROTOCOL = TCP)(HOST = rh73-12102-scan.moonlight.com)(PORT = 1521))
   (CONNECT_DATA=
     (SERVICE_NAME=auto_failover.moonlight.com)
         (INSTANCE_NAME=dwhdb2)))
  (DESCRIPTION=
   (ADDRESS=(PROTOCOL = TCP)(HOST = rh73-12102-scan.moonlight.com)(PORT = 1521))
   (CONNECT_DATA=
     (SERVICE_NAME=auto_failover_pre_2.moonlight.com)
         (INSTANCE_NAME=dwhdb2)))
  (DESCRIPTION=
   (ADDRESS=(PROTOCOL = TCP)(HOST = rh73-12102-scan.moonlight.com)(PORT = 1521))
   (CONNECT_DATA=
     (SERVICE_NAME=auto_failover.moonlight.com)
         (INSTANCE_NAME=dwhdb1)))
)

instance name has to be added due to sqlplus internal error state 2130 ORA-24315:Illegal Attribute Type (Doc ID 1060880.1)

I wonder if thing can be so complicated I were to implement this in a policy manged db rather the admin managed db.

thanks

[Updated on: Sat, 17 February 2018 03:51]

Report message to a moderator

Re: How to create a service that fails over in a 2 node admin-managed RAC [message #668299 is a reply to message #668298] Sat, 17 February 2018 04:03 Go to previous messageGo to next message
John Watson
Messages: 7620
Registered: January 2010
Location: Global Village
Senior Member
THat is wrong. What you need is this,
dwhdb =
  (DESCRIPTION=
   (ADDRESS=(PROTOCOL = TCP)(HOST = rh73-12102-scan.moonlight.com)(PORT = 1521))
   (CONNECT_DATA=
     (SERVICE_NAME=auto_failover)))

[Updated on: Sat, 17 February 2018 04:31]

Report message to a moderator

Re: How to create a service that fails over in a 2 node admin-managed RAC [message #668325 is a reply to message #668299] Sun, 18 February 2018 23:04 Go to previous messageGo to next message
juniordbanewbie
Messages: 217
Registered: April 2014
Senior Member
Dear John,


set linesize 100;
SET TRIMSPOOL ON

col failover_type for a33;
col FAILOVER_METHOD for a20;
col failed_over for a20;

select FAILOVER_TYPE,FAILOVER_METHOD,FAILED_OVER from v$session where sid in (select distinct sid from v$mystat) 
union select name,'(DB NAME)',null from v$database 
union select instance_name,host_name,null from v$instance 
union select machine,to_char(sid),osuser from v$session where sid in (select distinct sid from v$mystat)
union SELECT sys_context('userenv','service_name'), 'service_name', null FROM dual
union SELECT to_char(systimestamp,'YYYY-MM-DD HH24:MI:SS.FF TZR'), 'current_time_stamp', null FROM dual order by 1;


output as below:


FAILOVER_TYPE                     FAILOVER_METHOD      FAILED_OVER
--------------------------------- -------------------- --------------------
2018-02-19 03:44:42.459847 +01:00 current_time_stamp
dwhDB                             (DB NAME)
dwhProdApp1                       156                  oracle
SESSION                           BASIC                NO
auto_failover.moonlight.com            service_name
dwhdb1                            dwhProdDB1

6 rows selected.

SYSTEM@dwhprodapp1 auto_failover>host ssh root@dwhProdDB1 systemctl reboot

SYSTEM@dwhprodapp1 auto_failover>/
select FAILOVER_TYPE,FAILOVER_METHOD,FAILED_OVER from v$session where sid in (select distinct sid from v$mystat)
*
ERROR at line 1:
ORA-25408: can not safely replay call

SYSTEM@dwhprodapp1 auto_failover>/

FAILOVER_TYPE                     FAILOVER_METHOD      FAILED_OVER
--------------------------------- -------------------- --------------------
2018-02-19 03:46:50.380190 +01:00 current_time_stamp
dwhDB                             (DB NAME)
dwhProdApp1                       547                  oracle
SESSION                           BASIC                YES
auto_failover.moonlight.com            service_name
dwhdb2                            dwhProdDB2


SYSTEM@dwhprodapp1 auto_failover>host ping -c 3 dwhProdDB1

SYSTEM@dwhprodapp1 auto_failover>host ssh root@dwhProdDB2 systemctl reboot

6 rows selected.
SYSTEM@dwhprodapp1 auto_failover>/
select FAILOVER_TYPE,FAILOVER_METHOD,FAILED_OVER from v$session where sid in (select distinct sid from v$mystat)
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 12773
Session ID: 547 Serial number: 8463


SYSTEM@dwhprodapp1 auto_failover>/
ERROR:
ORA-03114: not connected to ORACLE

as you can observe the session cannot failover when to dwhproddb1 when dwhproddb2 is down.

there is no doubt when I did a srvctl status service -db dwhdb -service "auto_failover,auto_failover_pre_2", the services are in dwhproddb1, so do you consider it a bug?

that's why I implement an alternative, which is to make it work



dwhdb =
 (DESCRIPTION_LIST=
  (DESCRIPTION=
   (ADDRESS=(PROTOCOL = TCP)(HOST = rh73-12102-scan.moonlight.com)(PORT = 1521))
   (CONNECT_DATA=
     (SERVICE_NAME=auto_failover.moonlight.com)
         (INSTANCE_NAME=dwhdb1)))
  (DESCRIPTION=
   (ADDRESS=(PROTOCOL = TCP)(HOST = rh73-12102-scan.moonlight.com)(PORT = 1521))
   (CONNECT_DATA=
     (SERVICE_NAME=auto_failover.moonlight.com)
         (INSTANCE_NAME=dwhdb2)))
  (DESCRIPTION=
   (ADDRESS=(PROTOCOL = TCP)(HOST = rh73-12102-scan.moonlight.com)(PORT = 1521))
   (CONNECT_DATA=
     (SERVICE_NAME=auto_failover_pre_2.moonlight.com)
         (INSTANCE_NAME=dwhdb2)))
  (DESCRIPTION=
   (ADDRESS=(PROTOCOL = TCP)(HOST = rh73-12102-scan.moonlight.com)(PORT = 1521))
   (CONNECT_DATA=
     (SERVICE_NAME=auto_failover_pre_2.moonlight.com)
         (INSTANCE_NAME=dwhdb1)))
)

however i discover even my implementation run into problem when I stop listener in dwhproddb1, it has this error:


ORA-12521: TNS:listener does not currently know of instance requested in
connect descriptor

however it can connect successfully when I up listener in dwhproddb1 and down listener dwhproddb2

Re: How to create a service that fails over in a 2 node admin-managed RAC [message #668330 is a reply to message #668325] Mon, 19 February 2018 01:00 Go to previous messageGo to next message
John Watson
Messages: 7620
Registered: January 2010
Location: Global Village
Senior Member
Quote:
however i discover even my implementation run into problem when I stop listener in dwhproddb1, it has this error:


ORA-12521: TNS:listener does not currently know of instance requested in
connect descriptor

however it can connect successfully when I up listener in dwhproddb1 and down listener dwhproddb2
If you stop the node listener, nothing can connect to a service on that node. THis is the expected behaviour. And I have already told you not specify the instance name.
Re: How to create a service that fails over in a 2 node admin-managed RAC [message #668334 is a reply to message #668330] Mon, 19 February 2018 02:16 Go to previous messageGo to next message
juniordbanewbie
Messages: 217
Registered: April 2014
Senior Member
Dear John,

SP2-0642: SQL*Plus Internal Error State 2130 ORA-24315: Illegal Attribute Type (Doc ID 1060880.1)

Quote:

The tnsnames.ora file entry must include the Instance Name in order to use the Service Name in the connect string.
See the online Oracle Database Documentation Library -- subtopic and keywords,
"Oracle Database Net Services Reference"
For example:
Oracle Databa se Net Services Reference
11g Release 1 (11.1)
Part Number B28317- 02
(available on the Oracle OTN website)
Section: 6.6.4.5 INSTANCE_NAME
Use the INSTANCE_NAME parame ter to identify the database instance to access. Set the value to the value specified by
the INSTANCE_NAME parameter in the initialization parameter file.
Embed this parameter under the CONNECT_DATA parameter.


Add the INSTANCE_NAME parameter (to identify the database instance) for access in the tnsnames.ora file.
I encounter the above error that's why I include the instance name. may be it was implemented wrongly so how do I implement it?

many thanks in advance!

Re: How to create a service that fails over in a 2 node admin-managed RAC [message #668337 is a reply to message #668334] Mon, 19 February 2018 02:44 Go to previous messageGo to next message
John Watson
Messages: 7620
Registered: January 2010
Location: Global Village
Senior Member
Why do you think that document has any relevance to your problem? My very professional opinion of that article is that it is a perfect example of all that is wring with MOS. It was written five yeas ago and supposedly updated one year ago. And it is rubbish. The idea that you cannot have more than one entry in a tnsnames.ora file unless you specify an instance name is ridiculous.

So, some facts:
Nowhere in the RAC documentation will you EVER find an instruction to use an instance name.
The whole point of load balancing and fault tolerance is that you use service names only.
If you were to be using policy management, you wouldn't even know what the the instance names are.
How can you possible have failover when you demand a named instance?

If you do not believe me, raise a TAR and Support will tell you the same thing. Assuming that this system is in fact under a support contract, which it would have to be for your use of MOS to be legal.
Re: How to create a service that fails over in a 2 node admin-managed RAC [message #668551 is a reply to message #668337] Thu, 01 March 2018 02:10 Go to previous message
juniordbanewbie
Messages: 217
Registered: April 2014
Senior Member
Hi John, you are right.

Quote:

RAC: SQL*Plus Connection to Service Yields SP2-0642: SQL*Plus Internal Error State 2130
(Doc ID 1669959.1)

Do not include any form of the word FAIL in the creation of a RAC Service.

The use of the word FAIL as part of the SERVICE label caused this failure. In this case, when the service label was
changed and the word FAILOVER eliminated,
the error cleared.
Further testing revealed that the SERVICE_NAME label cannot contain any form of the word FAIL.
but right now i have another issue,

when the private interconnect is disconnected, node eviction occurs=> this one is expected

but when the private interconnects is reconnected back, the node don't seem to be able to join back the cluster.

I have to do the following:
su -c 'crsctl stop crs -f'
su -c 'crsctl start crs'

I have read many online articles, but they never mentioned about manual intervention.

so when the private interconnect is connected back, the evicted nodes is supposed to join back the cluster automatically without intervention right?

thanks
Previous Topic: 12.1.0.2 root.sh fails to start after deconfiguring clusterware
Next Topic: [INS-30512] Automatic Storage Management software is not configured on this cluster
Goto Forum:
  


Current Time: Sun Oct 21 10:03:35 CDT 2018