VxVM DiskGroup import failed when VCS ServiceGroup is switched or failed over in VMware environments using VMwareDisks resource

Article: 100023572
Last Published: 2023-03-31
Product(s): InfoScale & Storage Foundation

Problem
 

A Veritas Volume Manager (VxVM) disk group may fail to import when a Veritas Cluster Server (VCS) service group (SG) is switched or failed over in a VMware environment. The import fails with the error "No valid disk found containing disk group".
 

Error Message
 

Sample Errors
 

vxvm:vxconfigd: V-5-1-11551 array_da_to_disk: Cannot find active path for dmpnode vmdk0_10
vxvm:vxconfigd:
vxvm:vxconfigd: V-5-1-11551 array_da_to_disk: Cannot find active path for dmpnode vmdk0_7
vxvm:vxconfigd:
vxvm:vxconfigd: V-5-1-11551 array_da_to_disk: Cannot find active path for dmpnode vmdk0_9
vxvm:vxconfigd:
vxvm:vxconfigd: V-5-1-11551 array_da_to_disk: Cannot find active path for dmpnode vmdk0_8
vxvm:vxconfigd:
vxvm:vxconfigd: V-5-1-11551 array_da_to_disk: Cannot find active path for dmpnode vmdk0_10
vxvm:vxconfigd:
vxvm:vxconfigd: V-5-1-11551 array_da_to_disk: Cannot find active path for dmpnode vmdk0_7
vxvm:vxconfigd:
vxvm:vxconfigd: V-5-1-16253 Disk group import of ***dg failed with error 150 - No valid disk found containing disk group
vxvm:vxconfigd: V-5-1-16253 Disk group import of ***dg failed with error 150 - No valid disk found containing disk group
Had[3517]: VCS ERROR V-16-10031-1503 (*****) DiskGroup:***dgres:online:** ERROR: vxdg import (force) failed on Disk Group ***dg. 
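
When many of these messages are logged, it can help to reduce them to the list of affected DMP nodes. The following is a minimal sketch, assuming the messages have the same layout as the sample errors above; the function name missing_dmpnodes is hypothetical and is not part of VxVM or VCS.

```shell
#!/bin/sh
# missing_dmpnodes: read log text on stdin and print each DMP node name
# reported by message V-5-1-11551 once, sorted.
# Assumption: the log lines follow the layout of the sample errors above.
missing_dmpnodes() {
    sed -n 's/.*V-5-1-11551.*Cannot find active path for dmpnode \([^ ]*\).*/\1/p' |
        LC_ALL=C sort -u
}
```

The function reads standard input, so a saved copy of the relevant log can be redirected into it, for example: missing_dmpnodes < saved_log.txt (the file name is a placeholder).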

 

Cause

 

This issue occurs when the VCS DiskGroup resource is brought online before vxconfigd finishes updating the VxVM and DMP databases.

Currently there is no synchronization between the DiskGroup resource online operation and vxconfigd device scanning.
 

Known issue:
 

DiskGroup resource online may take time if it is configured along with VMwareDisks resource [3638242]

If a service group is configured with both VMwareDisks and DiskGroup resources, the DiskGroup resource may take time to come online when the service group is brought online. This is because VxVM takes time to recognize a new disk that is attached by the VCS VMwareDisks resource.

A VMwareDisks resource attaches a disk to the virtual machine when the resource comes online and a DiskGroup resource, which depends on VMwareDisks resource, tries to import the disk group.

If vxconfigd does not detect the new disk attached to the virtual machine, the DiskGroup resource online fails with the following error message because the resource is not up even after the resource online is complete.

VCS ERROR V-16-2-13066 ... Agent is calling clean for resource(...)


As a result, the VCS VMwareDisks resource online routine attaches the VMDK disk(s) but does not wait for VxVM and DMP to create the required DMP nodes before the online sequence for the VCS DiskGroup resource starts.

 

Solution


There is no solution for this issue at this time.


The disk.EnableUUID parameter must be set to "TRUE" on every virtual machine (VM) participating in the cluster.

This parameter ensures that the VMDK disk always presents a consistent UUID to all virtual machines.

Follow the steps below from the vSphere client to enable the disk UUID on each Virtual Machine:
 

Enabling disk UUID on virtual machines
 

1. Power off the guest

2. Select the guest and select Edit Settings

3. Select the Options tab on top

4. Select General under the Advanced section

5. Select Configuration Parameters... on the right-hand side

6. Check whether the parameter disk.EnableUUID is present. If it is, make sure it is set to TRUE.
If the parameter is not present, select Add Row and add it with the value TRUE

7. Power on the guest
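
The setting can also be checked outside the vSphere client. The following is a minimal sketch, assuming you have read access to a copy of the virtual machine's .vmx configuration file and that the parameter is stored there in the usual key = "value" form, for example disk.EnableUUID = "TRUE". The function name uuid_enabled and the file path in the usage line are hypothetical.

```shell
#!/bin/sh
# uuid_enabled FILE
# Succeeds (exit status 0) if FILE contains a line that sets
# disk.EnableUUID to TRUE. The match is case-insensitive and the
# quotes around the value are optional.
# Assumption: FILE is a .vmx file that uses the key = "value" layout.
uuid_enabled() {
    grep -Eiq '^[[:space:]]*disk\.EnableUUID[[:space:]]*=[[:space:]]*"?true"?[[:space:]]*$' "$1"
}
```

Example usage: uuid_enabled /path/to/guest.vmx && echo "disk.EnableUUID is TRUE"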



Workaround:

The recommendation is to increase OnlineRetryLimit or to delay the start of the DiskGroup resource online operation (for example, with a preonline trigger).

 

Set OnlineRetryLimit to an appropriate value. For example, set OnlineRetryLimit to 3 for the DiskGroup resource.


For example, if the DiskGroup resource name is datadg:

# haconf -makerw
# hares -override datadg OnlineRetryLimit
# hares -modify datadg OnlineRetryLimit 3
# haconf -dump -makero

 

The VCS OnlineRetryLimit attribute specifies the number of times the Online function is retried if the initial attempt to bring a resource online is unsuccessful.
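
If you induce a delay instead of relying on retries, polling for the disk is usually preferable to a fixed sleep. The following is a minimal sketch of such a wait loop, not a supported Veritas script. The function names wait_for and disk_visible are hypothetical, and the sketch assumes that "vxdisk list" exits with a non-zero status for a device that VxVM does not yet know about. How the loop would be wired into a trigger or other script is left open.

```shell
#!/bin/sh
# wait_for TIMEOUT INTERVAL COMMAND [ARGS...]
# Runs COMMAND repeatedly, sleeping INTERVAL seconds between attempts,
# until it succeeds (returns 0) or about TIMEOUT seconds have passed
# (returns 1). Output of COMMAND is discarded.
wait_for() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# disk_visible NAME (hypothetical helper)
# Assumption: "vxdisk list NAME" fails until VxVM has discovered NAME.
disk_visible() {
    vxdisk list "$1"
}
```

Example usage, with the device name taken from the sample errors: wait_for 60 5 disk_visible vmdk0_7 || echo "vmdk0_7 is still not visible to VxVM"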


 

Applies To

VCS VMwareDisks resource in VMware environments
VMDK

 
