Using srvctl to Manage your 10g RAC Database

By Natalka Roshak

Oracle recommends that RAC databases be managed with srvctl, an Oracle-supplied tool that was first introduced with 9i RAC. The 10g version of srvctl is slightly different from the 9i implementation. In this article, we will look at how -- and why -- to manage your 10g databases with srvctl.

RAC Architecture Overview

Let's begin with a brief overview of RAC architecture.

  • A cluster is a set of 2 or more machines (nodes) that share or coordinate resources to perform the same task.
  • A RAC database is 2 or more instances running on a set of clustered nodes, with all instances accessing a shared set of database files.
  • Depending on the O/S platform, a RAC database may be deployed on a cluster that uses vendor clusterware plus Oracle's own clusterware (Cluster Ready Services), or on a cluster that solely uses Oracle's own clusterware.

Thus, every RAC database sits on a cluster that is running Cluster Ready Services. srvctl is the primary tool DBAs use to configure CRS for their RAC database and its processes.

Cluster Ready Services and the OCR

Cluster Ready Services, or CRS, is a new feature for 10g RAC. Essentially, it is Oracle's own clusterware. On most platforms, Oracle supports vendor clusterware; in these cases, CRS interoperates with the vendor clusterware, providing high availability support and service and workload management. On Linux and Windows clusters, CRS serves as the sole clusterware. In all cases, CRS provides a standard cluster interface that is consistent across all platforms.

CRS consists of four processes (crsd, ocssd, evmd, and evmlogger) and two disks: the Oracle Cluster Registry (OCR), and the voting disk.

CRS manages the following resources:

  • The ASM instances on each node
  • Databases
  • The instances on each node
  • Oracle Services on each node
  • The cluster nodes themselves, including the following processes, or "nodeapps":
    • VIP
    • GSD
    • The listener
    • The ONS daemon

CRS stores information about these resources in the OCR. If the information in the OCR for one of these resources becomes damaged or inconsistent, then CRS is no longer able to manage that resource. Fortunately, the OCR automatically backs itself up regularly and frequently.

Interacting with CRS and the OCR: srvctl

srvctl is the tool Oracle recommends that DBAs use to interact with CRS and the cluster registry. Oracle does provide several tools to interface with the cluster registry and CRS more directly, at a lower level, but these tools are deliberately undocumented and intended only for use by Oracle Support. srvctl, in contrast, is well documented and easy to use. Using other tools to modify the OCR or manage CRS without the assistance of Oracle Support runs the risk of damaging the OCR.

Using srvctl

Even if you are experienced with 9i srvctl, it's worth taking a look at this section; 9i and 10g srvctl commands are slightly different.

srvctl must be run from the $ORACLE_HOME of the RAC you are administering. The basic format of a srvctl command is

srvctl <command> <target> [options]

where command is one of

enable|disable|start|stop|relocate|status|add|remove|modify|getenv|setenv|unsetenv|config

and the target, or object, can be a database, instance, service, ASM instance, or the nodeapps.
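Because srvctl must be run from the $ORACLE_HOME of the RAC you are administering, it can be worth checking which srvctl your PATH actually resolves to before running anything. The following is a minimal sketch, not an Oracle-supplied tool: the function name check_srvctl_home is made up for illustration, and it assumes srvctl lives in $ORACLE_HOME/bin and that ORACLE_HOME is set without a trailing slash.

```shell
#!/usr/bin/env bash

# check_srvctl_home (hypothetical helper, not part of Oracle's tooling)
# Succeeds only if the srvctl found on PATH is $ORACLE_HOME/bin/srvctl.
# Assumption: ORACLE_HOME has no trailing slash.
check_srvctl_home() {
  if [ -z "${ORACLE_HOME:-}" ]; then
    echo "ORACLE_HOME is not set" >&2
    return 2
  fi

  local found
  found="$(command -v srvctl)" || {
    echo "srvctl not found on PATH" >&2
    return 2
  }

  if [ "$found" = "$ORACLE_HOME/bin/srvctl" ]; then
    echo "OK: $found"
  else
    echo "MISMATCH: PATH resolves srvctl to $found, expected $ORACLE_HOME/bin/srvctl" >&2
    return 1
  fi
}
```

Calling check_srvctl_home at the top of an administration script gives an early, readable failure when the environment points at a different Oracle home than the one you intended.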

The srvctl commands are summarized in this table:

Table 1. Summary of srvctl commands.

Command | Targets | Description
srvctl add, srvctl modify, srvctl remove | database, instance, service, nodeapps | srvctl add / remove adds/removes the target's configuration information to/from the OCR. srvctl modify allows you to change some of the target's configuration information in the OCR without wiping out the rest.
srvctl relocate | service | Allows you to reallocate a service from one named instance to another named instance.
srvctl config | database, service, nodeapps, asm | Lists configuration information for the target from the OCR.
srvctl disable, srvctl enable | database, instance, service, asm | srvctl disable disables the target, meaning CRS will not consider it for automatic startup, failover, or restart. This option is useful to ensure an object that is down for maintenance is not accidentally automatically restarted. srvctl enable reenables the specified object.
srvctl getenv, srvctl setenv, srvctl unsetenv | database, instance, service, nodeapps | srvctl getenv displays the environment variables stored in the OCR for the target. srvctl setenv allows these variables to be set, and unsetenv unsets them.
srvctl start, srvctl status, srvctl stop | database, instance, service, nodeapps, asm | Start, stop, or display status (started or stopped) of the target.

As you can see, srvctl is a powerful utility with a lot of syntax to remember. Fortunately, there are only really two commands to memorize: srvctl -help displays a basic usage message, and srvctl -h displays full usage information for every possible srvctl command.
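If you wrap srvctl in your own scripts, the command/target pairings in Table 1 can be encoded as a small lookup so that a typo is caught before srvctl is ever invoked. This is only a sketch: srvctl_supports is a hypothetical helper name, and it encodes exactly what Table 1 lists, not the full syntax that srvctl -h reports for your release.

```shell
#!/usr/bin/env bash

# srvctl_supports COMMAND TARGET (hypothetical helper)
# Returns 0 if Table 1 lists TARGET for COMMAND, 1 otherwise.
# The pairings below are copied from Table 1 of this article only.
srvctl_supports() {
  local cmd="$1" target="$2" allowed

  case "$cmd" in
    add|modify|remove)      allowed="database instance service nodeapps" ;;
    relocate)               allowed="service" ;;
    config)                 allowed="database service nodeapps asm" ;;
    enable|disable)         allowed="database instance service asm" ;;
    getenv|setenv|unsetenv) allowed="database instance service nodeapps" ;;
    start|status|stop)      allowed="database instance service nodeapps asm" ;;
    *)                      return 1 ;;   # command not in Table 1
  esac

  # Pad with spaces so that only whole words match.
  case " $allowed " in
    *" $target "*) return 0 ;;
    *)             return 1 ;;
  esac
}
```

For instance, srvctl_supports relocate service succeeds, while srvctl_supports relocate database fails, mirroring the single target the table gives for relocate.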

Examples

Example 1. Bring up the MYSID1 instance of the MYSID database.

[oracle@myserver oracle]$ srvctl start instance -d MYSID -i MYSID1

Example 2. Stop the MYSID database: all its instances and all its services, on all nodes.

[oracle@myserver oracle]$ srvctl stop database -d MYSID

Example 3. Stop the nodeapps on the myserver node. NB: Instances and services also stop.

[oracle@myserver oracle]$ srvctl stop nodeapps -n myserver

Example 4. Add the MYSID3 instance, which runs on the myserver node, to the MYSID clustered database.

[oracle@myserver oracle]$ srvctl add instance -d MYSID -i MYSID3 -n myserver

Example 5. Add a new node, the mynewserver node, to a cluster.

[oracle@myserver oracle]$ srvctl add nodeapps -n mynewserver -o $ORACLE_HOME -A 149.181.201.1/255.255.255.0/eth1

(The -A flag precedes an address specification.)

Example 6. To change the VIP (virtual IP) on a RAC node, use the command

[oracle@myserver oracle]$ srvctl modify nodeapps -n myserver -A new_address

Example 7. Find out whether the nodeapps on mynewserver are up.

[oracle@myserver oracle]$ srvctl status nodeapps -n mynewserver
VIP is running on node: mynewserver
GSD is running on node: mynewserver
Listener is not running on node: mynewserver
ONS daemon is running on node: mynewserver

Example 8. Disable the ASM instance on myserver for maintenance.

[oracle@myserver oracle]$ srvctl disable asm -n myserver
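Disabling a target is usually only one step of a maintenance window: the target is also stopped beforehand and re-enabled and restarted afterwards. A rough sketch of that sequence follows. The helper names (run, maintenance_begin, maintenance_end) and the DRY_RUN variable are invented for this illustration, and the stop-then-disable ordering is an assumption rather than an Oracle recommendation. With DRY_RUN=1, the default here, the commands are only printed, so you can review them before letting the script touch the cluster.

```shell
#!/usr/bin/env bash

# DRY_RUN=1 (default in this sketch) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# maintenance_begin TARGET [OPTIONS...]
# Stop the target, then disable it so CRS does not restart it automatically.
maintenance_begin() {
  run srvctl stop "$@" &&
  run srvctl disable "$@"
}

# maintenance_end TARGET [OPTIONS...]
# Re-enable the target, then start it again.
maintenance_end() {
  run srvctl enable "$@" &&
  run srvctl start "$@"
}
```

For the ASM instance above, maintenance_begin asm -n myserver prints the stop and disable commands in that order, and maintenance_end asm -n myserver prints the matching enable and start commands. The same helpers accept any target that Table 1 lists for both disable/enable and start/stop, for example maintenance_begin instance -d MYSID -i MYSID1.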

Debugging srvctl

Debugging srvctl in 10g couldn't be easier. Simply set the SRVM_TRACE environment variable.

[oracle@myserver bin]$ export SRVM_TRACE=true

Let's repeat the nodeapps status check from the examples above (srvctl status nodeapps -n mynewserver) with SRVM_TRACE set to true:

[oracle@myserver oracle]$ srvctl status nodeapps -n mynewserver
/u01/app/oracle/product/10.1.0/jdk/jre//bin/java -classpath 
/u01/app/oracle/product/10.1.0/jlib/netcfg.jar:/u01/app/oracle/product/10.1.0/jdk/jre//lib/rt.jar:
/u01/app/oracle/product/10.1.0/jdk/jre//lib/i18n.jar:/u01/app/oracle/product/10.1.0/jlib/srvm.jar:
/u01/app/oracle/product/10.1.0/jlib/srvmhas.jar:/u01/app/oracle/product/10.1.0/jlib/srvmasm.jar:
/u01/app/oracle/product/10.1.0/srvm/jlib/srvctl.jar 
-DTRACING.ENABLED=true -DTRACING.LEVEL=2 oracle.ops.opsctl.OPSCTLDriver status nodeapps -n 
mynewserver
[main] [19:53:31:778] [OPSCTLDriver.setInternalDebugLevel:165]  tracing is true at level 2 to 
file null
[main] [19:53:31:825] [OPSCTLDriver.<init>:94]  Security manager is set
[main] [19:53:31:843] [CommandLineParser.parse:157]  parsing cmdline args
[main] [19:53:31:844] [CommandLineParser.parse2WordCommandOptions:900]  parsing 2-word 
cmdline
[main] [19:53:31:866] [GetActiveNodes.create:212]  Going into GetActiveNodes constructor...
[main] [19:53:31:875] [HASContext.getInstance:191]  Module init : 16
[main] [19:53:31:875] [HASContext.getInstance:216]  Local Module init : 19
...
[main] [19:53:32:285] [ONS.isRunning:186]  Status of ora.ganges.ons on mynewserver is true
ONS daemon is running on node: mynewserver
[oracle@myserver oracle]$
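Exporting SRVM_TRACE leaves tracing switched on for every later srvctl call in that shell, and the trace lines bury the actual status output. The sketch below shows one way to deal with both. The helper names with_trace and stopped_components are hypothetical, and the sed pattern assumes the status lines keep the "<component> is not running on node: <node>" wording shown in the sample output earlier.

```shell
#!/usr/bin/env bash

# with_trace CMD [ARGS...] (hypothetical helper)
# Run one command with SRVM_TRACE=true in its environment only; the calling
# shell's own environment is left untouched.
with_trace() {
  env SRVM_TRACE=true "$@"
}

# stopped_components (hypothetical helper)
# Read "srvctl status nodeapps" output on stdin, with or without trace lines,
# and print the name of each component reported as not running.
stopped_components() {
  sed -n 's/^\(.*\) is not running on node: .*$/\1/p'
}
```

A possible use is with_trace srvctl status nodeapps -n mynewserver > nodeapps_trace.log 2>&1 to capture a full trace for Oracle Support, followed by stopped_components < nodeapps_trace.log to list what is down. The file name nodeapps_trace.log is just an example.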

Pitfalls

A little impatience when dealing with srvctl can corrupt your OCR, i.e., put it into a state where the information for a given object is inconsistent or partially missing. Specifically, the srvctl remove command provides the -f option, which forces removal of an object from the OCR. Use this option judiciously, as it can easily put the OCR into an inconsistent state.

Restoring the OCR from an inconsistent state is best done with the assistance of Oracle Support, who will guide you in using the undocumented $CRS_HOME/bin/crs_* tools to repair it. The OCR can also be restored from backup.

Error messages

srvctl errors are PRK% errors, which are not documented in the 10gR1 error messages manual. However, those with a Metalink account can find them documented on Metalink.

Conclusion

srvctl is a powerful tool that will allow you to administer your RAC easily and effectively. In addition, it provides a valuable buffer between the DBA and the OCR, making it more difficult to corrupt the OCR.

Comments

Thanks for nice doc.

What is nodeapps as mentioned in the following ???

--------- srvctl modify nodeapps -A new_address

Rgds

I am new to RAC, though not to OPS, and this document was useful. However, it might help to clarify the difference between server control and the CRS commands such as crs_start INSTANCE_NAME. At the moment they seem interchangeable, although I know that if I want to bounce my instance on one node the crs_shut command is the one I want. It seems that if I want to bounce the database across all nodes then I might use srvctl stop as in your example.

Two other thoughts for comment might be the change in entries in the oratab to reflect RAC and non-RAC databases, OFA standards for installing Oracle software including RAC and the variables that now need assigning other than ORACLE_HOME and ORACLE_SID (ORA_CRS_HOME for instance)

John

Thanks for sharing this with us. It is very informative. Appreciate your efforts.