CVM: Clusterid mismatch is preventing CVM from starting and importing shared disk groups

Article: 100046796
Last Published: 2025-07-02
Product(s): InfoScale & Storage Foundation

Problem


In this instance, CVM fails to start on a single host because one or more disks carry a cluster ID that conflicts with the ID on the same disks as seen and imported by the other nodes in the existing cluster.
 

In a CVM cluster, the vxdg -s import option imports a disk group as cluster-sharable.
When attempting to start the CVM service group on a node, the attempt fails with "the disks are in use by another cluster" when starting the cluster, or with "No valid disk found containing disk group: retry to add a node failed" when one or more nodes of the running cluster have rebooted.

The vxdg -s import operation is only valid if the CVM clustering components are active on the importing host.

Ensure that all the disks in a shared disk group are physically accessible by all hosts. A host which cannot access all the disks in a shared disk group cannot join the cluster.

Disks in a shared disk group are stamped with the ID of the cluster and with the shared flag.

When a host joins the cluster, it automatically imports disk groups whose disks are stamped with the cluster ID. 
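Because a shared import is only valid while the CVM clustering components are active, it is worth gating the import on the clustering state. A minimal sketch follows; the here-doc embeds sample "vxdctl -c mode" output for illustration, so in practice replace it with the live command as shown in the comment:

```shell
# Hedged sketch: check the CVM clustering state before a shared import.
# In practice capture the live command instead of the embedded sample:
#   mode_line=$(vxdctl -c mode | grep '^mode:')
mode_line=$(grep '^mode:' <<'EOF'
mode: enabled: cluster active - MASTER
EOF
)
case "$mode_line" in
  *"cluster active"*) echo "CVM is active - a shared import (vxdg -s import <dg-name>) is valid" ;;
  *)                  echo "CVM is not active - a shared import will fail" ;;
esac
```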

 

Error Message


When attempting to bring the CVM service group online, a log message similar to the following is seen:

2019/12/03 12:19:39 VCS ERROR V-16-20006-1005 CVMCluster:cvm_clus:monitor:node - state: out of cluster
reason: Disk in use by another cluster: retry to add a node failed

 

Or

Jul  1 17:32:09 VCS ERROR V-16-20006-1005 CVMCluster:cvm_clus:monitor:node - state: out of cluster#012reason: No valid disk found containing disk group: retry to add a node failed

Cause


When verifying the cluster ID for all shared disks, one of the nodes in the cluster reports a cluster ID mismatch.

In this instance, the correct cluster ID should be "fred"; however, a subset of the shared disks report an incorrect cluster ID of "barney".


How to display the cluster ID for all shared disks

# disk="3pardata0_60 3pardata0_61 3pardata0_62 3pardata0_63"     <<<< set to the list of shared disk names
# for i in $disk; do echo $i; vxdisk list $i | grep -i clusterid; done
3pardata0_60
clusterid: barney
3pardata0_61
clusterid: barney
3pardata0_62
clusterid: fred
3pardata0_63
clusterid: fred
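
The loop output above can also be scanned automatically for mismatches. A minimal sketch follows; the here-doc embeds the sample disk/clusterid pairs from this article, and in practice the live loop output would be piped into the function instead:

```shell
# Hedged sketch: flag any disk whose clusterid differs from the expected value.
expected="fred"
check_clusterids() {
  awk -v want="$expected" '
    # On a "clusterid:" line, compare against the expected value.
    /clusterid:/ { if ($2 != want) printf "MISMATCH on %s: %s (expected %s)\n", disk, $2, want; next }
    # Any other line is a disk name; remember it for the report.
    { disk = $1 }
  '
}
check_clusterids <<'EOF'
3pardata0_60
clusterid: barney
3pardata0_61
clusterid: barney
3pardata0_62
clusterid: fred
3pardata0_63
clusterid: fred
EOF
# → flags 3pardata0_60 and 3pardata0_61 as mismatched
```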


The cluster ID mismatch needs to be corrected and aligned across all the nodes in the Veritas cluster.

The /etc/vx/diag.d/vxprivutil utility can be used to validate the cluster ID written in the disk's private region (on-disk).
 
# /etc/vx/diag.d/vxprivutil list /dev/vx/rdmp/3pardata0_60 hostid
diskid:  1478039917.53.charlie
group:   name=datadg id=1478040527.89.charlie
flags:   shared autoimport cds
hostid:  barney                                     <<<< should be stating fred
version: 3.1
iosize:  512
public:  slice=3 offset=65792 len=503229520
private: slice=3 offset=256 len=65536
update:  time=1574987544  seqno=0.75
ssb:     actual_seqno=0.0
headers: 0 240
configs: count=1 len=51360
logs:    count=1 len=4096
tocblks: 0
tocs:    16/65520
Defined regions:
 config   priv 000048-000239[000192]: copy=01 offset=000000 enabled
 config   priv 000256-051423[051168]: copy=01 offset=000192 enabled
 log      priv 051424-055519[004096]: copy=01 offset=000000 enabled
 lockrgn  priv 055520-055663[000144]: part=00 offset=000000
 tagid    priv 065488-065503[000016]: tag=udid_asl=3PARdata%5FVV%5FA536%5F003C0001A536


NOTE: In the above example, the incorrect cluster ID of "barney" is shown instead of the expected cluster ID of "fred".

Occasionally, the cluster ID reported by the VxVM kernel (via "vxdisk list <disk-name>") may disagree with what is actually written on-disk.


NOTE: It is critical to check every host that references the conflicting cluster ID, and to confirm that the shared disk group is not imported on any of those hosts.
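
That per-host check can be scripted as a minimal sketch like the following. The group name "datadg" comes from this article's listings, and the here-doc embeds sample "vxdg list" output; in practice, pipe the live command output into the function on each host instead:

```shell
# Hedged sketch: confirm the shared disk group is NOT imported on this host.
dg=datadg
check_not_imported() {
  # First field of each "vxdg list" line is the group name.
  if awk -v dg="$dg" '$1 == dg { found = 1 } END { exit !found }'; then
    echo "WARNING: $dg is imported on this host - deport it before correcting the cluster ID"
  else
    echo "OK: $dg is not imported on this host"
  fi
}
check_not_imported <<'EOF'
NAME         STATE           ID
otherdg      enabled,shared  1478040000.10.barney
EOF
```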

Solution


To refresh the on-disk copy of the VxVM disk group configuration from the kernel, run "vxdg flush <dg-name>" from the master node (after confirming that the master node has the correct cluster ID stamped on all of its disks).

The flush operation updates the on-disk content, correcting any mismatch between the kernel and on-disk views and thereby clearing the conflicting cluster ID.

Following the flush operation, validate that the expected cluster ID is now shown by re-running the for loop and vxprivutil list commands above.

 

# /etc/vx/diag.d/vxprivutil list /dev/vx/rdmp/3pardata0_60 hostid
diskid:  1478039917.53.charlie
group:   name=datadg id=1478040527.89.charlie
flags:   shared autoimport cds
hostid:  fred                                     <<<< the correct cluster ID is now shown
version: 3.1
iosize:  512
public:  slice=3 offset=65792 len=503229520
private: slice=3 offset=256 len=65536
update:  time=1574989544  seqno=0.76
ssb:     actual_seqno=0.0
headers: 0 240
configs: count=1 len=51360
logs:    count=1 len=4096
tocblks: 0
tocs:    16/65520
Defined regions:
 config   priv 000048-000239[000192]: copy=01 offset=000000 enabled
 config   priv 000256-051423[051168]: copy=01 offset=000192 enabled
 log      priv 051424-055519[004096]: copy=01 offset=000000 enabled
 lockrgn  priv 055520-055663[000144]: part=00 offset=000000
 tagid    priv 065488-065503[000016]: tag=udid_asl=3PARdata%5FVV%5FA536%5F003C0001A536
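
The final comparison can also be scripted. A minimal sketch follows; the here-docs embed the relevant lines from the listings above as stand-ins, and in practice the live "vxdisk list" and vxprivutil output would be substituted:

```shell
# Hedged sketch: confirm the kernel view ("vxdisk list" clusterid) and the
# on-disk view (vxprivutil "hostid") now agree after the flush.
kern=$(awk '/clusterid:/ {print $2}' <<'EOF'
clusterid: fred
EOF
)
ondisk=$(awk '/hostid:/ {print $2}' <<'EOF'
hostid:  fred
EOF
)
if [ "$kern" = "$ondisk" ]; then
  echo "OK: kernel clusterid ($kern) matches on-disk hostid ($ondisk)"
else
  echo "MISMATCH: kernel=$kern on-disk=$ondisk - flush did not take effect"
fi
```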

 
