Wednesday, April 1, 2009

Cluster Ready Services (RAC)

CRS and 10g Real Application Clusters:

PURPOSE
----------
This document provides additional information on CRS (Cluster Ready Services) in 10g Real Application Clusters.

SCOPE & APPLICATION
--------------------------
This document is intended for RAC Database Administrators and Oracle support engineers.

CRS and 10g REAL APPLICATION CLUSTERS
-------------------------------------
CRS (Cluster Ready Services) is a new feature for 10g Real Application Clusters that provides a standard cluster interface on all platforms and performs new high availability operations not available in previous versions.

CRS KEY FACTS
-----------------
Prior to installing CRS and 10g RAC, there are some key points to remember about CRS and 10g RAC:
- CRS is REQUIRED to be installed and running prior to installing 10g RAC.
- CRS can either run on top of the vendor clusterware (such as Sun Cluster, HP Serviceguard, IBM HACMP, TruCluster, Veritas Cluster, Fujitsu Primecluster, etc...) or can run without the vendor clusterware. The vendor clusterware was required in 9i RAC but is optional in 10g RAC.
- The CRS HOME and ORACLE_HOME must be installed in DIFFERENT locations.
- Shared location(s) or devices for the Voting File and OCR (Oracle Cluster Registry) file must be available PRIOR to installing CRS. The voting file should be at least 20MB and the OCR file should be at least 100MB.
- CRS and RAC require that the following network interfaces be configured prior to installing CRS or RAC:
  - Public Interface
  - Private Interface
  - Virtual (Public) Interface
  For more information on this, see Note 264847.1 in metalink.oracle.com
- The root.sh script at the end of the CRS installation starts the CRS stack.
If your CRS stack does not start, see Note 240001.1 in metalink.oracle.com
- Only one set of CRS daemons can be running per RAC node.
- On Unix, the CRS stack is run from entries in /etc/inittab with "respawn".
- If there is a network split (nodes lose communication with each other), one or more nodes may reboot automatically to prevent data corruption.
- The supported method to start CRS is booting the machine. MANUAL STARTUP OF THE CRS STACK IS NOT SUPPORTED UNTIL 10.1.0.4 OR HIGHER.
- The supported method to stop CRS is to shut down the machine or use "init.crs stop".
- Killing CRS daemons is not supported, because flag files can become mismatched. The only exception is when you are removing the CRS installation via Note 239998.1 in metalink.oracle.com.
- For maintenance, go to single user mode at the OS.

Once the stack is started, you should be able to see all of the daemon processes with a ps -ef command:
[rac1]/u01/home/beta> ps -ef | grep crs
oracle 1363 999 0 11:23:21 ? 0:00 /u01/crs_home/bin/evmlogger.bin -o /u01
oracle 999 1 0 11:21:39 ? 0:01 /u01/crs_home/bin/evmd.bin
root 1003 1 0 11:21:39 ? 0:01 /u01/crs_home/bin/crsd.bin
oracle 1002 1 0 11:21:39 ? 0:01 /u01/crs_home/bin/ocssd.bin

CRS DAEMON FUNCTIONALITY
------------------------
Here is a short description of each of the CRS daemon processes:
CRSD:
- Engine for HA operation
- Manages 'application resources'
- Starts, stops, and fails 'application resources' over
- Spawns separate 'actions' to start/stop/check application resources
- Maintains configuration profiles in the OCR (Oracle Cluster Registry)
- Stores current known state in the OCR.
- Runs as root
- Is restarted automatically on failure
OCSSD:
- OCSSD is part of RAC and Single Instance with ASM
- Provides access to node membership
- Provides group services
- Provides basic cluster locking
- Integrates with existing vendor clusterware, when present
- Can also run without integration with vendor clusterware
- Runs as Oracle.
- Failure exit causes machine reboot.
  This is a feature to prevent data corruption in the event of a split brain.
EVMD:
- Generates events when things happen
- Spawns a permanent child evmlogger
- Evmlogger, on demand, spawns children
- Scans callout directory and invokes callouts.
- Runs as Oracle.
- Restarted automatically on failure
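
As a quick check that the daemons described above are present, the following is a minimal sketch rather than an Oracle-supplied tool. The function name check_crs_daemons is hypothetical, and the match on "/bin/<daemon>" assumes the process paths look like those in the ps listing shown earlier. It reads a ps -ef style listing on standard input, so it can be tried on saved output as well as on a live node.

```shell
#!/bin/sh
# Hypothetical helper (not part of CRS): report which CRS daemons appear
# in a `ps -ef`-style listing supplied on standard input.
check_crs_daemons() {
    listing=$(cat)          # capture the whole listing once
    missing=0
    for daemon in crsd.bin ocssd.bin evmd.bin evmlogger.bin; do
        # -F: match the string literally, -q: no output, status only
        if printf '%s\n' "$listing" | grep -F -q "/bin/$daemon"; then
            echo "$daemon running"
        else
            echo "$daemon NOT FOUND"
            missing=1
        fi
    done
    # Non-zero status if any daemon was absent from the listing
    return $missing
}
```

On a cluster node this could be invoked as: ps -ef | check_crs_daemons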

CRS LOG DIRECTORIES
--------------------------
When troubleshooting CRS problems, it is important to review the directories under the CRS Home.
$ORA_CRS_HOME/crs/log - This directory includes traces for CRS resources that are joining, leaving, restarting, and relocating as identified by CRS.
$ORA_CRS_HOME/crs/init - Any core dumps for the crsd.bin daemon should be written here. Note 1812.1 of metalink.oracle.com could be used to debug these.
$ORA_CRS_HOME/css/log - The css logs indicate all actions such as reconfigurations, missed checkins, connects, and disconnects from the client CSS listener. In some cases the logger logs messages with the category of (auth.crit) for the reboots done by Oracle. This can be used to check the exact time when the reboot occurred.
$ORA_CRS_HOME/css/init - Core dumps from the ocssd primarily and the pid for the css daemon whose death is treated as fatal are located here. If there are abnormal restarts for css then the core files will have the formats of core.. Note 1812.1 in metalink.oracle.com could be used to debug these.
$ORA_CRS_HOME/evm/log - Log files for the evm and evmlogger daemons. Not used as often for debugging as the CRS and CSS directories.
$ORA_CRS_HOME/evm/init - Pid and lock files for EVM. Core files for EVM should also be written here. Note 1812.1 in metalink.oracle.com could be used to debug these.
$ORA_CRS_HOME/srvm/log - Log files for OCR.
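
When starting a troubleshooting session it can help to see which files in these directories changed most recently. The following is a minimal sketch under stated assumptions: the function name recent_crs_logs and the default of three files per directory are illustrative choices, and the directory list is simply the one given above.

```shell
#!/bin/sh
# Sketch: list the most recently modified files in each CRS log directory.
# $1 = CRS home, $2 = number of files to show per directory (default 3).
recent_crs_logs() {
    crs_home=$1
    count=${2:-3}
    for sub in crs/log crs/init css/log css/init evm/log evm/init srvm/log; do
        dir="$crs_home/$sub"
        if [ -d "$dir" ]; then
            echo "== $dir"
            # ls -t sorts newest first; head keeps only the top entries
            ls -t "$dir" | head -n "$count"
        else
            echo "== $dir (missing)"
        fi
    done
}
```

A possible invocation is: recent_crs_logs "$ORA_CRS_HOME" 5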

STATUS FOR CRS RESOURCES
----------------------------------
After installing RAC and running the VIPCA (Virtual IP Configuration Assistant) launched with the RAC root.sh, you should be able to see all of your CRS resources with crs_stat. Example:
cd $ORA_CRS_HOME/bin
./crs_stat
NAME=ora.rac1.gsd
TYPE=application
TARGET=ONLINE
STATE=ONLINE

NAME=ora.rac1.oem
TYPE=application
TARGET=ONLINE
STATE=ONLINE

NAME=ora.rac1.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE

NAME=ora.rac1.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE

NAME=ora.rac2.gsd
TYPE=application
TARGET=ONLINE
STATE=ONLINE

NAME=ora.rac2.oem
TYPE=application
TARGET=ONLINE
STATE=ONLINE

NAME=ora.rac2.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE

NAME=ora.rac2.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE
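
With many resources, the ones that need attention are those whose STATE differs from their TARGET. The following is a minimal sketch, not an Oracle utility: the function name crs_mismatches is hypothetical, and it assumes the NAME=/TARGET=/STATE= record layout shown above. It also assumes that some outputs may append text after the state word (for example a node name), so only the first word of STATE is compared.

```shell
#!/bin/sh
# Sketch: print CRS resources whose STATE does not match their TARGET.
# Reads crs_stat-style NAME=/TARGET=/STATE= records on standard input.
crs_mismatches() {
    awk -F= '
        $1 == "NAME"   { name = $2 }
        $1 == "TARGET" { target = $2 }
        $1 == "STATE"  {
            # Assumption: STATE may carry trailing text; compare first word only
            split($2, parts, " ")
            if (parts[1] != target)
                printf "%s target=%s state=%s\n", name, target, $2
        }
    '
}
```

A possible invocation is: $ORA_CRS_HOME/bin/crs_stat | crs_mismatches
No output means every resource is in its target state.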

There is also a script available to view CRS resources in a format that is easier to read. Just create a shell script with:
--------------------------- Begin Shell Script -------------------------------
#!/usr/bin/ksh
#
# Sample 10g CRS resource status query script
#
# Description:
# - Returns formatted version of crs_stat -t, in tabular
# format, with the complete rsc names and filtering keywords
# - The argument, $RSC_KEY, is optional and if passed to the script, will
# limit the output to HA resources whose names match $RSC_KEY.
# Requirements:
# - $ORA_CRS_HOME should be set in your environment
RSC_KEY=$1
QSTAT=-u
AWK=/usr/xpg4/bin/awk # if not available use /usr/bin/awk

# Table header:
echo ""
$AWK \
'BEGIN {printf "%-45s %-10s %-18s\n", "HA Resource", "Target", "State";
printf "%-45s %-10s %-18s\n", "-----------", "------", "-----";}'

# Table body: pipe the crs_stat output through awk, one row per resource
$ORA_CRS_HOME/bin/crs_stat $QSTAT | $AWK \
'BEGIN { FS="="; state = 0; }
$1~/NAME/ && $2~/'"$RSC_KEY"'/ {appname = $2; state=1};
state == 0 {next;}
$1~/TARGET/ && state == 1 {apptarget = $2; state=2;}
$1~/STATE/ && state == 2 {appstate = $2; state=3;}
state == 3 {printf "%-45s %-10s %-18s\n", appname, apptarget, appstate; state=0;}'
--------------------------- End Shell Script -------------------------------

Note: For more information on CRS, refer to the CRS Administration (RAC) link of this blog.
