Error: "PR operation failed: retry to add a node failed" reported when node(s) attempt to rejoin a cluster.
Problem
Node(s) fail to rejoin a cluster.
Error Message
The following is written in the /opt/VRTSvcs/log/cvm_cluster_A.log:
VCS ERROR V-16-20006-1005 (xxxxxx) CVMCluster:cvm_clus:monitor:node - state: out of cluster
reason: SCSI-3 PR operation failed: retry to add a node failed
The vxfen service group may also fault, with the following written to /opt/VRTSvcs/log/Coordpoint_A.log:
VCS WARNING V-16-10061-663 CoordPoint:coordpoint:monitor:Node 0 not registered on disk /dev/vx/rdmp/xxxxxxx_1 2 .. 3
where /dev/vx/rdmp/xxxxxxx_1 2 .. 3 are devices in the fencing disk group.
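The two symptoms above can be picked out of the agent logs by message ID. The following is a minimal sketch only: it assumes the log lines have already been read into a list of strings, and the names PATTERNS and classify are illustrative and not part of any InfoScale tooling.

```python
import re

# Message IDs and key phrases are taken from the log excerpts above.
# The dictionary keys are hypothetical labels chosen for this sketch.
PATTERNS = {
    "cvm_pr_failure": re.compile(r"V-16-20006-1005\b.*state: out of cluster"),
    "coordpoint_unregistered": re.compile(r"V-16-10061-663\b.*not registered on disk"),
}

def classify(lines):
    """Return a dict mapping each symptom label to the log lines that match it."""
    hits = {name: [] for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line.rstrip("\n"))
    return hits
```

If both labels come back with matches, the node is failing to register its SCSI-3 keys and the cause described below is worth checking.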
Cause
The cluster is running on VMware with SCSI-3 disk-based fencing enabled, and two or more nodes in the cluster are running on the same ESXi host.
Solution
Ensure that each node in the cluster runs on a separate physical ESXi host. When fencing is configured, Arctera InfoScale Enterprise nodes that belong to the same cluster must not share an ESXi host. This requirement is documented in several of the user guides, in particular the InfoScale Virtualization Guide.
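The placement rule can be checked against an inventory of which ESXi host each cluster node runs on. The following is a minimal sketch only: it assumes that mapping has already been collected by some other means, and the node names, host names, and the shared_hosts function are hypothetical.

```python
from collections import defaultdict

def shared_hosts(placement):
    """Return {esxi_host: [nodes]} for every host running more than one cluster node."""
    by_host = defaultdict(list)
    for node, host in placement.items():
        by_host[host].append(node)
    # Keep only hosts that violate the one-node-per-host rule.
    return {host: sorted(nodes) for host, nodes in by_host.items() if len(nodes) > 1}

# Hypothetical inventory: cluster node name -> ESXi host it currently runs on.
placement = {"node0": "esxi-a", "node1": "esxi-a", "node2": "esxi-b"}
print(shared_hosts(placement))
```

An empty result means no two nodes share a host. In the hypothetical inventory above, esxi-a is reported because node0 and node1 both run on it, so one of them would need to be moved to another host.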