Graceful shutdown of an SFRAC 5.1SP1 node triggered a panic on the other node.
The messages log of node 0, which was being shut down for maintenance, shows that CVM did not go down cleanly because the application had not been shut down properly.
Jun 1 14:13:19 top1 snmpd: Received TERM or STOP signal... shutting down...
Jun 1 14:13:20 top1 xinetd: Exiting...
Jun 1 14:13:22 top1 kernel: GAB INFO V-15-1-20032 Port d closed
Jun 1 14:13:32 top1 kernel: GAB ERROR V-15-1-20015 unconfigure failed: clients still registered <<<<<===
The vxfen.log shows that vxfenconfig -U failed because VCS was still running. GAB port h is HAD, the VCS engine.
Fri Jun 1 14:13:22 EDT 2012 vxfenconfig -U returned 1
Fri Jun 1 14:13:22 EDT 2012 vxfenconfig -U output is VXFEN vxfenconfig ERROR V-11-2-1023 Unable to unconfigure fencing since clients still active
VXFEN vxfenconfig ERROR V-11-2-1060 Please retry after shutting down VCS (GAB port h) <<<<<===
and/or CVM (GAB ports u/v/w).
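The two errors above are the signature of an unclean stop: GAB and vxfenconfig both refuse to unconfigure while clients are still registered. As a minimal sketch, assuming the logs are available as plain text lines, the following Python scans for those message IDs. The IDs are copied from the excerpts above; the function name and the descriptions in the table are illustrative, not part of any Veritas tool.

```python
import re

# Message IDs copied from the log excerpts above; descriptions are paraphrases.
BLOCKING_IDS = {
    "V-15-1-20015": "GAB unconfigure failed: clients still registered",
    "V-11-2-1023": "vxfenconfig could not unconfigure fencing: clients still active",
}

# Matches identifiers of the form V-<n>-<n>-<n>.
MSG_ID = re.compile(r"\bV-\d+-\d+-\d+\b")


def find_unclean_shutdown(lines):
    """Return (message_id, line) pairs for lines showing a blocked unconfigure."""
    hits = []
    for line in lines:
        for msg_id in MSG_ID.findall(line):
            if msg_id in BLOCKING_IDS:
                hits.append((msg_id, line.strip()))
    return hits


sample = [
    "Jun 1 14:13:22 top1 kernel: GAB INFO V-15-1-20032 Port d closed",
    "Jun 1 14:13:32 top1 kernel: GAB ERROR V-15-1-20015 unconfigure failed: clients still registered",
    "Fri Jun 1 14:13:22 EDT 2012 vxfenconfig -U output is VXFEN vxfenconfig ERROR "
    "V-11-2-1023 Unable to unconfigure fencing since clients still active",
]

for msg_id, _line in find_unclean_shutdown(sample):
    print(msg_id, "->", BLOCKING_IDS[msg_id])
```

The informational "Port d closed" line is ignored; only the two blocking errors are reported.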
The following is from the vxfendebug log captured after the vxfen stop failed. Note that node 0 is racing for the coordinator disks because the forced unload triggered the race.
lbolt: 5367264401 vxfen_io.c ln 2574 VXFEN: vxfen_set_roles: RACER NODE is: 0
lbolt: 5367264401 vxfen_io.c ln 2575 VXFEN: Change fence state node: 0 from: STADIUM_OPEN to: RACER
lbolt: 5367264401 vxfen_io.c ln 2621 VXFEN: vxfen_set_roles: end
lbolt: 5367264401 vxfen_fence.c ln 154 VXFEN: vxfen_grab_coord_pt - begin
lbolt: 5367264401 vxfen_scsi3.c ln 198 VXFEN: vxfen_grab_coord_disks: - begin
lbolt: 5367264401 vxfen_scsi3.c ln 210 VXFEN: vxfen_grab_coord_disks: lowest_node: -1
lbolt: 5367264401 vxfen_scsi3.c ln 465 vxfen_grab_coord_disks: ejecting other node: 1
lbolt: 5367264401 vxfen_scsi3_device.c ln 469 VXFEN: vxfen_preempt_abort: begin lowest node 1
lbolt: 5367264401 vxfen_scsi3_device.c ln 483 VXFEN: node_num: 0
lbolt: 5367264401 vxfen_scsi3_device.c ln 494 VXFEN: resv_key: V victim: V <<<<== these are the PGR keys.
Node 0 was able to clear the PGR keys for node 1.
Because node 1 saw that GAB ports u/v/w/h were still active when node 0 went down, fencing was triggered on node 1 (top2) and it raced for the coordinator disks. However, it could not find its PGR keys because node 0 had already cleared them before shutting down. Node 1 therefore concluded that it had lost the race and panicked.
Jun 1 14:14:03 top2 Had: VCS INFO V-16-1-10077 Received new cluster membership
Jun 1 14:14:03 top2 kernel: sd 1:0:0:88: reservation conflict
Jun 1 14:14:03 top2 kernel: VXFEN WARNING V-11-1-65 Could not eject node 0 from disk
Jun 1 14:14:03 top2 kernel: with serial number 60060480000190103338533031374632 since
Jun 1 14:14:03 top2 kernel: keys of node 1 are not registered with it
Jun 1 14:14:03 top2 kernel: sd 1:0:0:89: reservation conflict
From the core, we see the following in vxfendebug:
lbolt: 4577747279 vxfen_linux.c ln 557 VXFEN: vxfen_plat_pgr_in: end. Status: 0
lbolt: 4577747279 vxfen_scsi3_device.c ln 114 VXFEN: vxfen_readkeys: end ret: 0
lbolt: 4577747279 vxfen_scsi3_device.c ln 170 VXFEN: vxfen_checkreg: num_keys: 2 <<<== two PGR keys found
lbolt: 4577747279 vxfen_scsi3_device.c ln 208 VXFEN: vxfen_checkreg: end
lbolt: 4577747279 vxfen_scsi3.c ln 343 vxfen_grab_coord_disks:READ KEYS shows LOCAL NODE no longer registered. <<<<=== but not for this node
lbolt: 4577747289 vxfen_scsi3.c ln 69 vxfen_skip_multipaths: skipping the device: device_num: 8388736 serial_num: 60060480000190103338533031374634 npaths: 1
lbolt: 4577747289 vxfen_scsi3.c ln 432 Total coord disks: 3, grabbed disks: 0
lbolt: 4577747289 vxfen_scsi3.c ln 443 : vxfen_grab_coord_disks: node 1 lost the race and committing suicide <<<<===
lbolt: 4577747289 vxfen_io.c ln 1627 VXFEN: vxfen_bcast_msg: begin
lbolt: 4577747289 vxfen_io.c ln 1643 VXFEN: vxfen_bcast_msg: end
lbolt: 4577747289 vxfen_fence.c ln 307 VXFEN: vxfen_racer_lost: Sent VXFEN_MSG_LOST_RACE. Shall panic when broadcast completes, or after 6 seconds
lbolt: 45777472
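The sequence in the two debug logs can be illustrated with a toy model. This is a sketch under stated assumptions, not the vxfen implementation: each coordinator disk is modeled as a set of node IDs whose keys are registered, a racer can only preempt on a disk where its own key is still registered, and a racer wins by grabbing a majority of the disks. The function name and data layout are hypothetical.

```python
# Toy model only: each coordinator disk is the set of node IDs with a registered key.

def race(disks, racer, victim):
    """Try to eject the victim's key from each disk; return 'won' or 'lost'.

    Assumption: the racer must still be registered on a disk to grab it,
    and must grab a majority of the disks to win.
    """
    grabbed = 0
    for keys in disks:
        if racer not in keys:
            # Corresponds to "READ KEYS shows LOCAL NODE no longer registered".
            continue
        keys.discard(victim)
        grabbed += 1
    return "won" if grabbed > len(disks) // 2 else "lost"


# Three coordinator disks, both nodes registered on each.
disks = [{0, 1}, {0, 1}, {0, 1}]

first = race(disks, racer=0, victim=1)   # node 0 races during its failed shutdown
second = race(disks, racer=1, victim=0)  # node 1 races after the membership change

print(first, second)
```

In this model node 0 wins and removes node 1's keys, so node 1 grabs zero of three disks and loses, which matches the "grabbed disks: 0" line in the core.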
The key point is that the cluster was not shut down gracefully, which triggered fencing (vxfen) on both nodes to race for the coordinator disks. VCS fencing did not have checks to stop the node that was going down from winning that race. These checks will be included in the 6.0 and 5.1SP1RP3 releases, so that a node leaving the cluster (gracefully or ungracefully) cannot win the race for the coordinator disks.
This race condition can be prevented by shutting down the application before unmounting the cluster file systems. Symantec recommends configuring the application as an Application resource under VCS so that the application is shut down properly and all files are closed before the CFS/CVM shutdown is initiated.
Symantec also identified a problem where a graceful shutdown of a cluster node triggered a race in the remaining cluster, the racer could not eject the keys of the rebooting node, and the racer therefore panicked. A fix for graceful node shutdown is available in 6.0 and 5.1SP1RP3, but there is no hotfix (HF) on top of 5.1SP1RP2. 5.1SP1RP3 is expected to be released by the end of August 2012.
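The ordering requirement can be written down as a small check. This is a sketch only: the step names are hypothetical labels rather than real commands, and the constraints encode just what this note states (application before CFS unmount and CVM shutdown; CVM and VCS stopped before fencing is unconfigured, per the vxfenconfig error text).

```python
# (must_come_first, must_come_later) pairs drawn from this note.
# Step names are hypothetical labels, not actual commands.
CONSTRAINTS = [
    ("stop_application", "unmount_cfs"),
    ("stop_application", "stop_cvm"),
    ("stop_cvm", "unconfigure_vxfen"),
    ("stop_vcs", "unconfigure_vxfen"),
]


def ordering_violations(steps):
    """Return the constraints that a proposed shutdown sequence breaks."""
    position = {step: i for i, step in enumerate(steps)}
    violations = []
    for first, later in CONSTRAINTS:
        if later not in position:
            continue  # the dependent step is not performed at all
        if first not in position or position[first] > position[later]:
            violations.append((first, later))
    return violations


good = ["stop_application", "unmount_cfs", "stop_cvm", "stop_vcs", "unconfigure_vxfen"]
bad = ["unmount_cfs", "stop_application", "unconfigure_vxfen", "stop_cvm", "stop_vcs"]

print(ordering_violations(good))
print(ordering_violations(bad))
```

The relative order of the CVM and VCS steps in the `good` list is an arbitrary choice for the example; the check does not constrain it.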
The following is extracted from the VCS 6.0 Release Notes.
Changes introduced in 6.0
Graceful shutdown of a node no longer triggers I/O fencing race condition on peer nodes
In the earlier releases, a gracefully leaving node clears its I/O fencing keys from coordination points. But the remaining sub-cluster races against the gracefully leaving node to remove its registrations from the data disks. During this operation, if the sub-cluster loses access to the coordination points, the entire cluster may panic if the racer loses the race for coordination points.
In this release, this behavior has changed. When a node leaves gracefully, the CVM or other clients on that node are stopped before the VxFEN module is unconfigured. Hence, data disks are already clear of its keys. The remaining sub-cluster tries to clear the gracefully leaving node’s keys from the coordination points but does not panic if it is not able to clear the keys.