CVM cluster join cannot be established unless the nodes are started in a specific order
Problem
A CVM cluster join cannot be established unless the nodes are started in a specific order. This happens when the nodes in the cluster do not all see the same number of paths to the disks.
For example, node A sees 2 paths to all of the data disk groups, whereas node B sees only 1 path to one of the data disk groups.
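The condition described above can be checked by comparing per-disk path counts across nodes. The following is a minimal sketch, assuming the path counts have already been collected by hand from each node's multipathing output; the function name, node names, and disk names are hypothetical and are not part of any Veritas tool.

```python
def find_path_mismatches(paths_by_node):
    """Return the disks whose path count differs between nodes.

    paths_by_node maps a node name to a {disk name: path count} dict.
    The result maps each mismatched disk to its {node: count} view.
    A disk that a node does not list at all is counted as 0 paths.
    """
    # Collect every disk name seen by any node.
    disks = set()
    for counts in paths_by_node.values():
        disks.update(counts)

    mismatches = {}
    for disk in sorted(disks):
        per_node = {node: counts.get(disk, 0)
                    for node, counts in paths_by_node.items()}
        # More than one distinct count means the nodes disagree.
        if len(set(per_node.values())) > 1:
            mismatches[disk] = per_node
    return mismatches


# Hypothetical data matching the example: node B has lost one path to disk2.
example = {
    "nodeA": {"disk1": 2, "disk2": 2},
    "nodeB": {"disk1": 2, "disk2": 1},
}
print(find_path_mismatches(example))
```

An empty result would suggest that the nodes agree on path counts, so the start-order dependency would have to be caused by something else.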
Error Message
The following sequence of events is logged in /var/adm/messages when this situation occurs. Ports v and w establish membership (i.e., membership 01), but a connection timed out error message follows, which causes the CVM join to fail.
Aug 15 03:21:16 H22RRMOBDB01 gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port v gen 143b11b membership 01
Aug 15 03:21:27 H22RRMOBDB01 gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port w gen 143b11d membership 01
Aug 15 03:21:27 H22RRMOBDB01 vxvm:vxconfigd: [ID 511694 daemon.error] V-5-1-8756 allow join for node 1 failed: Connection timed out
Aug 15 03:21:27 H22RRMOBDB01 vxvm:vxconfigd: [ID 448643 daemon.notice] V-5-1-3765 master: cluster join complete for node 1
Aug 15 03:21:27 H22RRMOBDB01 vxvm:vxconfigd: [ID 699813 daemon.notice] V-5-1-7899 CVM_VOLD_CHANGE command received
Aug 15 03:21:27 H22RRMOBDB01 vxvm:vxconfigd: [ID 322665 daemon.notice] V-5-1-7961 establishing cluster
Aug 15 03:21:27 H22RRMOBDB01 vxvm:vxconfigd: [ID 277465 daemon.notice] V-5-1-8062 master: not a cluster startup
Aug 15 03:24:42 H22RRMOBDB01 gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port w gen 143b11e membership 0
Aug 15 03:24:42 H22RRMOBDB01 gab: [ID 674723 kern.notice] GAB INFO V-15-1-20038 Port w gen 143b11e k_jeopardy ;1
Aug 15 03:24:42 H22RRMOBDB01 gab: [ID 513393 kern.notice] GAB INFO V-15-1-20040 Port w gen 143b11e visible ;1
Aug 15 03:24:42 H22RRMOBDB01 gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port v gen 143b11c membership 0
Aug 15 03:24:42 H22RRMOBDB01 gab: [ID 674723 kern.notice] GAB INFO V-15-1-20038 Port v gen 143b11c k_jeopardy ;1
Aug 15 03:24:42 H22RRMOBDB01 gab: [ID 513393 kern.notice] GAB INFO V-15-1-20040 Port v gen 143b11c visible ;1
Aug 15 03:24:42 H22RRMOBDB01 vxvm:vxconfigd: [ID 699813 daemon.notice] V-5-1-7899 CVM_VOLD_CHANGE command received
Aug 15 03:24:42 H22RRMOBDB01 vxvm:vxconfigd: [ID 778436 daemon.error] V-5-1-4109 -1 returned from volcvm_establish
Aug 15 03:24:42 H22RRMOBDB01 vxvm:vxconfigd: [ID 886039 daemon.error] V-5-1-4852 cluster_establish: timed out
Aug 15 03:24:42 H22RRMOBDB01 vxvm:vxconfigd: [ID 391371 daemon.error] V-5-1-11111 kernel_fail_join() : master_takeover is -1
Aug 15 03:24:42 H22RRMOBDB01 vxvm:vxconfigd: [ID 565473 daemon.notice] V-5-1-9543 Timeout is not reset: another reconfig in progress
Aug 15 03:24:42 H22RRMOBDB01 vxvm:vxconfigd: [ID 322665 daemon.notice] V-5-1-7961 establishing cluster
Aug 15 03:24:42 H22RRMOBDB01 vxvm:vxconfigd: [ID 277465 daemon.notice] V-5-1-8062 master: not a cluster startup
Aug 15 03:24:42 H22RRMOBDB01 vxvm:vxconfigd: [ID 451250 daemon.notice] V-5-1-8061 master: no joiners
Aug 15 03:24:42 H22RRMOBDB01 vxvm:vxconfigd: [ID 738708 daemon.notice] V-5-1-4123 cluster established successfully
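To spot the same pattern in a longer log, the excerpt above can be scanned for the GAB membership changes and the vxconfigd error messages it contains. This is a rough sketch only: the regular expressions are written against the line format shown in this excerpt, the set of message IDs is limited to the daemon.error entries that appear here, and the function name is hypothetical.

```python
import re

# Message IDs of the daemon.error entries seen in the excerpt above.
JOIN_FAILURE_IDS = {"V-5-1-8756", "V-5-1-4109", "V-5-1-4852", "V-5-1-11111"}

# Matches GAB port membership lines, e.g. "... Port v gen 143b11b membership 01".
GAB_MEMBERSHIP = re.compile(
    r"GAB INFO V-15-1-20036 Port (\w) gen (\w+) membership (\d+)")

# Matches vxconfigd lines and captures the message ID and the message text.
VXVM_MESSAGE = re.compile(
    r"vxvm:vxconfigd: \[ID \d+ daemon\.\w+\] (V-5-1-\d+) (.*)")


def summarize_join_log(lines):
    """Return (memberships, failures) found in an iterable of log lines.

    memberships is a list of (port, membership) tuples in log order;
    failures is a list of (message ID, message text) tuples.
    """
    memberships = []
    failures = []
    for line in lines:
        match = GAB_MEMBERSHIP.search(line)
        if match:
            memberships.append((match.group(1), match.group(3)))
            continue
        match = VXVM_MESSAGE.search(line)
        if match and match.group(1) in JOIN_FAILURE_IDS:
            failures.append((match.group(1), match.group(2).strip()))
    return memberships, failures


sample = [
    "Aug 15 03:21:16 H22RRMOBDB01 gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port v gen 143b11b membership 01",
    "Aug 15 03:21:27 H22RRMOBDB01 vxvm:vxconfigd: [ID 511694 daemon.error] V-5-1-8756 allow join for node 1 failed: Connection timed out",
    "Aug 15 03:24:42 H22RRMOBDB01 gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port w gen 143b11e membership 0",
    "Aug 15 03:24:42 H22RRMOBDB01 vxvm:vxconfigd: [ID 886039 daemon.error] V-5-1-4852 cluster_establish: timed out",
]
memberships, failures = summarize_join_log(sample)
print(memberships)
print(failures)
```

A membership that goes from 01 back to 0 on ports v and w, together with one of the listed errors, corresponds to the failed join shown in the excerpt.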