Important Update: Cohesity Products Knowledge Base Articles


All Cohesity Knowledge Base Articles are now managed via the Cohesity Support Portal: https://support.cohesity.com/s/searchunify. The Knowledge Base articles available here may not reflect the latest information and may no longer be accessible.

VxFS module will fail to unload during a Storage Foundation upgrade involving Solaris non global zones in the environment

Article: 100012184
Last Published: 2014-03-24
Product(s): InfoScale & Storage Foundation

Problem

When upgrading a Storage Foundation product on a system with Solaris non-global zones configured, the upgrade fails at the stage of unloading the existing VxFS module from the kernel. The upgrade cannot continue until the VxFS kernel module is unloaded successfully.

 

Error Message

During the upgrade, when the installer attempts to unload the currently loaded modules, the unload of the vxfs module fails and the following message is reported in the installation/upgrade logs under the "/opt/VRTS/install/logs/" directory:

 

Veritas Storage Foundation Shutdown did not complete successfully
vxfs failed to stop on <node_name>
can't unload the module: Device busy
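
If the installer output has scrolled past, the same failure can be located in the logs afterward. A minimal sketch; the log directory is the one named in this article, and the fallback message is illustrative:

```shell
# Search the installer logs for the module-unload failure reported above.
# On systems where the log directory does not exist, the fallback message
# is printed instead of a match.
grep -R "can't unload the module" /opt/VRTS/install/logs/ 2>/dev/null \
  || echo "no matching log entries found"
```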

 

Cause

The issue is specific to Solaris configurations where a VxFS file system is mounted within a non-global zone, either as a direct mount or as a loopback (lofs) file system. The Storage Foundation installation/upgrade documentation requires that all non-global zones be booted at the time of the install/upgrade. When a non-global zone boots, it automatically mounts the file systems defined in its zone configuration.

For instance, below is a sample snippet of a Solaris non-global zone configuration for a VxFS file system that is mounted in the global zone and shared via a loopback mount within the non-global zone:

e.g. The /etc/zones/<zonename>.xml file contains:

fs:
        dir: /<mount_point_in_zone>
        special: /<vxfs_filesystem_in_global_zone>
        raw not specified
        type: lofs
        options: []

 

When the non-global zone is booted and in the running state, it accesses the VxFS file system mounted in the Solaris global zone through the loopback driver, which holds a lock on the VxFS driver in the kernel. If the VxFS file system is then unmounted in the global zone for any reason, the non-global zone loses access to the contents of the file system; however, the loopback driver still holds its lock on the VxFS driver. As a result, an attempt to unload the VxFS driver from the kernel of the Solaris global zone fails whenever at least one booted non-global zone was accessing a VxFS file system, because of the locks still held by the loopback driver.

This is a 'catch 22' situation: the non-global zones must be booted and running for a successful Storage Foundation upgrade, but the upgrade cannot complete because the VxFS module is held open in the kernel by the loopback driver of the non-global zone.
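
Whether the VxFS module is still loaded can be checked from the global zone before starting the upgrade. A minimal sketch, assuming the Solaris `modinfo` command, which lists loaded kernel modules when run with no arguments; this check is not part of the documented upgrade procedure:

```shell
# Report whether the vxfs kernel module is currently loaded (sketch only).
# On Solaris, modinfo with no arguments lists loaded modules; if no vxfs
# line is found, the fallback message is printed.
modinfo 2>/dev/null | grep -iw vxfs || echo "vxfs module not listed as loaded"
```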

 

Solution

To resolve the Storage Foundation product upgrade failure caused by the VxFS kernel module being unable to unload, use the following workaround:

 

1.  Stop all the non global zones running on the system

# zlogin <zone_name> shutdown
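
With several zones configured, step 1 can be repeated for each zone in a loop. A minimal sketch, assuming the standard Solaris `zoneadm` and `zlogin` commands and the non-interactive `shutdown -y -g0 -i0` invocation; the guard simply prints a note on systems without zoneadm:

```shell
# Shut down every running non-global zone in one pass (sketch only).
if command -v zoneadm >/dev/null 2>&1; then
  for z in $(zoneadm list | grep -v '^global$'); do
    zlogin "$z" shutdown -y -g0 -i0   # immediate, non-interactive shutdown
  done
else
  echo "zoneadm not found: run this on the Solaris global zone"
fi
```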


2.  Take a backup of the zone configuration files

# cp -rp /etc/zones  /<backup_directory>/

 

3.  Edit each zone configuration file (<zonename>.xml) using an editor and temporarily remove each entry for a direct-mount or loopback-mount file system configured within the non global zone that points to a VxFS file system

Example:

# cd /etc/zones/

# vi zonename.xml

<<--- example snippet of the configuration to remove -->>

fs:
        dir: /<mount_point_in_zone>
        special: /<vxfs_filesystem_in_global_zone>
        raw not specified
        type: lofs
        options: []

<<--- End of example snippet of the configuration to remove -->>
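
Before editing, the zone configuration files that actually contain a lofs entry can be listed. A minimal, self-contained sketch; the sample file and its contents are illustrative, and on the real system the grep would be run against /etc/zones/*.xml instead:

```shell
# Build a throwaway sample file resembling the entry format shown in the
# example above, then list every file containing a lofs entry -- these
# are the files to edit in step 3.
tmpdir=$(mktemp -d)
cat > "$tmpdir/zonename.xml" <<'EOF'
fs:
        dir: /zonemount
        special: /vxfsdata
        raw not specified
        type: lofs
        options: []
EOF
grep -l 'type: lofs' "$tmpdir"/*.xml   # prints the matching file name(s)
rm -rf "$tmpdir"
```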

 

Then save the file and quit the editor.

 

4.  Verify and boot all the non global zones manually

# zoneadm -z <zonename> verify

# zoneadm -z <zonename> boot

 

5.  Confirm all non global zones are in 'running' state:

# zoneadm list -civ

 

6.  Perform the Storage Foundation product upgrade using the product documentation

 

7.  Once the upgrade has completed successfully, stop all the non global zones and restore the original zone configuration *.xml files from the backup taken in step 2 above.

# zlogin <zone_name> shutdown

# cp -rp /<backup_directory>/zones /etc/

 

8.  Verify and boot all the non global zones manually

# zoneadm -z <zonename> verify

# zoneadm -z <zonename> boot

 

9.  Confirm all non global zones are in 'running' state:

# zoneadm list -civ

 

10.  Manually verify that the loopback file systems are available within the non global zones
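
Step 10 can be performed from the global zone with `zlogin`. A minimal sketch, assuming the Solaris `zlogin` and `mount -p` commands; the zone name here is a placeholder, not from this article:

```shell
# List the lofs mounts visible inside one zone (sketch only; the zone
# name is illustrative and should be replaced with a real zone name).
zone=zone01
if command -v zlogin >/dev/null 2>&1; then
  zlogin "$zone" mount -p | grep lofs
else
  echo "zlogin not available: run this on the Solaris global zone"
fi
```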

 

Note: Even a reboot of the system will not resolve the issue; the procedure above is required to work around this problem.


Applies To

The issue can occur in any Storage Foundation environment that uses VxFS file systems on Solaris systems configured with non global zones, where one or more VxFS file systems are accessed from within a non global zone, either as a direct mount or as a loopback mount.
