IBM HACMP - Administration tasks

 

Cluster resource group fallover/fallback

Use smitty to move a resource group between nodes in a cluster

smitty hacmp

                    HACMP for AIX

Move cursor to desired item and press Enter.

  Initialization and Standard Configuration
  Extended Configuration
  System Management (C-SPOC)
  Problem Determination Tools

Select "System Management"

----------------------------------------------------------------------------------------------------------------

         System Management (C-SPOC)

Move cursor to desired item and press Enter.

  Manage HACMP Services
  HACMP Communication Interface Management
  HACMP Resource Group and Application Management
  HACMP Log Viewing and Management
  HACMP File Collection Management
  HACMP Security and Users Management
  HACMP Logical Volume Management
  HACMP Concurrent Logical Volume Management
  HACMP Physical Volume Management

  Open a SMIT Session on a Node

Select "HACMP Resource Group and Application Management"

----------------------------------------------------------------------------------------------------------------

          HACMP Resource Group and Application Management

Move cursor to desired item and press Enter.

  Show the Current State of Applications and Resource Groups
  Bring a Resource Group Online
  Bring a Resource Group Offline
  Move a Resource Group to Another Node / Site

  Suspend/Resume Application Monitoring
  Application Availability Analysis

Select "Move a Resource Group to Another Node / Site"

----------------------------------------------------------------------------------------------------------------

           Move a Resource Group to Another Node / Site

Move cursor to desired item and press Enter.

  Move Resource Groups to Another Node
  Move Resource Groups to Another Site

Select "Move Resource Groups to Another Node"

----------------------------------------------------------------------------------------------------------------

		   Select a Resource Group                      
                                                                         
 Move cursor to desired item and press Enter.                             
                                                                          
   #                                                                      
   # Resource Group                State                Node(s) / Site    
   #                                                                      
     app_rg                       ONLINE                node1 /         
     data_rg                      ONLINE                node2 /        
                                                                          
   #                                                                      
   # Resource groups in node or site collocation configuration:           
   # Resource Group(s)                           State    Node / Site  
   #  

Select the resource group to be moved.

----------------------------------------------------------------------------------------------------------------

		   Select a Destination Node                         
                                                                          
  Move cursor to desired item and press Enter.                             
                                                                                                 
    # *Denotes Originally Configured Highest Priority Node                 
      node2   
      

Select a destination node.

----------------------------------------------------------------------------------------------------------------

         Move Resource Group(s) to Another Node

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
  Resource Group(s) to be Moved                       app_rg
  Destination Node                                    node2

Confirm the entry fields are correct and press Enter

----------------------------------------------------------------------------------------------------------------

  Command: OK            stdout: yes           stderr: no

Before command completion, additional instructions may appear below.

Attempting to move resource group app_rg to node node2.

Waiting for the cluster to process the resource group movement request....

Waiting for the cluster to stabilize........................................

Resource group movement successful.
Resource group app_rg is online on node2.


Cluster Name: node2

Resource Group Name: app_rg
Node                         State
---------------------------- ---------------
node2                       ONLINE
node1                       OFFLINE

Resource Group Name: data_rg
Node                         State
---------------------------- ---------------
node2                       OFFLINE
node1                       ONLINE

----------------------------------------------------------------------------------------------------------------

Whilst the resource group is being moved, open another terminal and monitor the log file /tmp/hacmp.out by typing the following.

tail -f /tmp/hacmp.out
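
The move can also be driven from the command line with the clRGmove utility, and the resulting resource group state checked with clRGinfo. A minimal sketch using the group and node names from the example above (the exact flags vary between HACMP releases, so check the man pages on your system first):

/usr/es/sbin/cluster/utilities/clRGmove -g app_rg -n node2 -m
/usr/es/sbin/cluster/utilities/clRGinfo app_rg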

 

Creating a snap for IBM analysis

First remove any previous snap output from the /tmp/ibmsupt directory.

snap -r

Then gather the HACMP-specific data.

snap -e

To verify the snap

zcat snap.pax.Z | pax -vf -
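
The archive is created under /tmp/ibmsupt by default, so list that directory to confirm snap.pax.Z is present and has a reasonable size before sending it to IBM, for example:

ls -l /tmp/ibmsupt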

 

To start the cluster services on a node, use the following HACMP script.

/usr/sbin/cluster/etc/rc.cluster

Nov 14 2009 12:36:09  Starting execution of /usr/sbin/cluster/etc/rc.cluster
with parameters:

Nov 14 2009 12:36:14  Checking for srcmstr active...
Nov 14 2009 12:36:14 complete.
 213174      -  0:00 syslogd
Setting routerevalidate to 1
Nov 14 2009 12:36:14
/usr/sbin/cluster/utilities/clstart : called with flags -m -G -A

Verifying Cluster Configuration Prior to Starting Cluster Services.

Verifying node(s): node2 against the running node node1


WARNING: The following resource type have the same resource name.
Resource name: cluster
Resource Types: Service IP Label and Application
WARNING: Having cluster resources with the same name can lead to confusion and
difficulties with cluster planning and administration. Cluster services may
not function properly if resources do not have unique names.

WARNING: Application monitors are required for detecting application failures
in order for HACMP to recover from them.  Application monitors are started
by HACMP when the resource group in which they participate is activated.
The following application(s), shown with their associated resource group,
do not have an application monitor configured:

  Application Server                Resource Group
  --------------------------------  ---------------------------------
   cluster                           cluster
WARNING: The LVM time stamp for shared volume group: datavg is inconsistent
with the time stamp in the VGDA for the following nodes:
node1

Successfully verified node(s): node2
0513-059 The topsvcs Subsystem has been started. Subsystem PID is 495748.
0513-059 The grpsvcs Subsystem has been started. Subsystem PID is 467176.
0513-059 The emsvcs Subsystem has been started. Subsystem PID is 409756.
0513-059 The emaixos Subsystem has been started. Subsystem PID is 401634.
Nov 14 2009 12:36:48

Completed execution of /usr/sbin/cluster/etc/rc.cluster
with parameters: .
Exit Status = 0.
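
Once rc.cluster exits with status 0, confirm that the cluster subsystems are active using the SRC, for example:

lssrc -g cluster
lssrc -g topsvcs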

 

Alternatively, to start the cluster services on a node, use smitty clstart.

smitty clstart

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Start now, on system restart or both                now                                                                 
  Start Cluster Services on these nodes              [node1]                                                             
* Manage Resource Groups                              Automatically                                                       
  BROADCAST message at startup?                       false                                                               
  Startup Cluster Information Daemon?                 true                                                                
  Ignore verification errors?                         false                                                               
  Automatically correct errors found during           Interactively                                                       
  cluster start?

Modify the options as required, such as the node on which to start the cluster services, and press Enter.

 
Starting Cluster Services on node: node1
This may take a few minutes.  Please wait...
node1: start_cluster: Starting HACMP
node1: 0513-029 The portmap Subsystem is already active.
node1: Multiple instances are not supported.
node1: 0513-029 The inetd Subsystem is already active.
node1: Multiple instances are not supported.
node1:    77872      -  0:00 syslogd
node1: Setting routerevalidate to 1
node1: 0513-059 The topsvcs Subsystem has been started. Subsystem PID is 585862.
node1: 0513-059 The grpsvcs Subsystem has been started. Subsystem PID is 860414.
node1: 0513-059 The emsvcs Subsystem has been started. Subsystem PID is 929926.
node1: 0513-059 The emaixos Subsystem has been started. Subsystem PID is 704586.
node1: 0513-059 The gsclvmd Subsystem has been started. Subsystem PID is 643118.
node1: 0513-059 The clinfoES Subsystem has been started. Subsystem PID is 794646.
node1: Mar  9 2010 17:10:25 Starting execution of /usr/es/sbin/cluster/etc/rc.cluster
node1: with parameters: -boot -N -A -i -C interactive -P cl_rc_cluster
node1:
node1: Mar  9 2010 17:10:34 Checking for srcmstr active...
node1: Mar  9 2010 17:10:34 complete.
node1: Mar  9 2010 17:10:34
node1: /usr/es/sbin/cluster/utilities/clstart: called with flags -m -G -i -P cl_rc_cluster -C interactive -B -A
node1:
node1:         Mar  9 2010 17:10:56
node1: Completed execution of /usr/es/sbin/cluster/etc/rc.cluster
node1: with parameters: -boot -N -A -i -C interactive -P cl_rc_cluster.
node1: Exit status = 0
node1:
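
With the clinfoES daemon started (Startup Cluster Information Daemon set to true above), the cluster state can be watched from another terminal with the clstat utility:

/usr/es/sbin/cluster/clstat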

 

To stop the cluster services on a node, use the following HACMP script.

/usr/sbin/cluster/etc/clstop
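
The shutdown mode (graceful, graceful with takeover, or forced) can be supplied as flags to clstop, or chosen interactively through smitty clstop. As an illustrative sketch only, since the exact flags differ between HACMP levels (verify with the man page), a graceful stop with resource group takeover is commonly invoked as:

/usr/sbin/cluster/etc/clstop -y -N -gr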

 

To start the clcomdES daemon

startsrc -s clcomdES
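
To confirm the daemon is active:

lssrc -s clcomdES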

 

How to modify the heartbeat failure detection rate

This procedure should be done on an active HACMP node

1. Type smitty hacmp.
2. Go to Extended Configuration.
3. Select Extended Topology Configuration.
4. Select Configure HACMP Network Modules.
5. Select Change a Network Module using Predefined Values and press Enter. SMIT displays a list of defined network modules.
6. Select the name of the network module for which you want to see current settings and press Enter.

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* Network Module Name                                 tmscsi
  Description                                         TMSCSI Serial protocol
  Failure Detection Rate                              Normal

  NOTE: Changes made to this panel must be
        propagated to the other nodes by
        Verifying and Synchronizing the cluster 

Change Normal to Slow if, for instance, you are experiencing network contention and heartbeats are taking longer to complete. The failure detection time is broadly the heartbeat interval multiplied by the number of consecutive missed heartbeats the module tolerates, so the Slow setting gives transient delays more time to clear before a node is declared failed.

Do not repeat the above procedure on the other nodes in the cluster; instead, synchronise the change by completing the following procedure.

1. Type smitty hacmp.
2. Go to Extended Configuration.
3. Select Extended Verification and Synchronization.
4. Press Enter with the default settings.
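
The current network module settings can also be listed from the command line with the cllsnim utility (the usual location on HACMP 5.x is shown below; verify the path on your system):

/usr/es/sbin/cluster/utilities/cllsnim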

Changing the location of HACMP log files

If you know the name of the log file whose location you would like to change, use the following command, replacing 'clstrmgr.debug' and '/var/adm' as necessary.

/usr/es/sbin/cluster/utilities/cllog -c 'clstrmgr.debug' -v '/var/adm'
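
To check the current setting before or after the change, the HACMP log definitions held in the ODM can be queried directly. A read-only check, assuming the HACMPlogs ODM class used by HACMP 5.x:

odmget HACMPlogs | grep -p clstrmgr.debug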

If you are not sure of the current log file names and locations, use smitty as follows.

smitty hacmp

                    HACMP for AIX

Move cursor to desired item and press Enter.

  Initialization and Standard Configuration
  Extended Configuration
  System Management (C-SPOC)
  Problem Determination Tools

Select "Problem Determination Tools"

----------------------------------------------------------------------------------------------------------------

               Problem Determination Tools

Move cursor to desired item and press Enter.

  HACMP Verification
  View Current State
  HACMP Log Viewing and Management
  Recover From HACMP Script Failure
  Restore HACMP Configuration Database from Active Configuration
  Release Locks Set By Dynamic Reconfiguration
  Clear SSA Disk Fence Registers
  HACMP Cluster Test Tool
  HACMP Trace Facility
  HACMP Event Emulation
  HACMP Error Notification
  Manage RSCT Services

  Open a SMIT Session on a Node

Select "HACMP Log Viewing and Management"

----------------------------------------------------------------------------------------------------------------

             HACMP Log Viewing and Management

Move cursor to desired item and press Enter.

  View/Save/Remove HACMP Event Summaries
  View Detailed HACMP Log Files
  Change/Show HACMP Log File Parameters
  Change/Show Cluster Manager Log File Parameters
  Change/Show a Cluster Log Directory
  Collect Cluster log files for Problem Reporting

Select "Change/Show a Cluster Log Directory"

----------------------------------------------------------------------------------------------------------------

              Select a Cluster Log Directory                      
                                                                         
  Move cursor to desired item and press Enter. Use arrow keys to scroll.   
                                                                                                 
  clstrmgr.debug           - Generated by the clstrmgr daemon            
  cluster.log              - Generated by cluster scripts and daemons    
  cluster.mmddyyyy         - Cluster history files generated daily       
  cspoc.log                - Generated by CSPOC commands                 
  emuhacmp.out             - Generated by the event emulator scripts     
  hacmp.out                - Generated by event scripts and utilities    
  clavan.log               - Generated by Application Availability Analy 
  clverify.log             - Generated by Cluster Verification utility   
  clcomd.log               - Generated by clcomd daemon                  
  clcomddiag.log           - Generated by clcomd daemon, debug informati 
  clconfigassist.log       - Generated by Two-Node Cluster Configuration 
  clutils.log              - Generated by cluster utilities and file pro 
  cl_testtool.log          - Generated by the Cluster Test Tool          
  autoverify.log           - Generated by Auto Verify and Synchronize    
  sa.log                   - Generated by Application Discovery  

Select a log file such as clstrmgr.debug.

----------------------------------------------------------------------------------------------------------------

              Change/Show a Cluster Log Directory

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
  Cluster Log Name                                    clstrmgr.debug
  Cluster Log Description                             Generated by the clstrmgr daemon
  Default Log Destination Directory                   /tmp
* Log Destination Directory                          [/var/adm]
  Allow Logs on Remote Filesystems                    false    

Change the "Log Destination Directory" value

Do not repeat the above procedure on the other nodes in the cluster; instead, synchronise the change by completing the following procedure.

1. Type smitty hacmp.
2. Go to Extended Configuration.
3. Select Extended Verification and Synchronization.
4. Press Enter with the default settings.
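
After the synchronisation completes, confirm that new log entries are being written to the chosen directory, for example:

ls -l /var/adm/clstrmgr.debug*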