Troubleshooting missing disks, foreign disks and unknown disk groups (Unknown Dg)

Article: 100018086
Last Published: 2022-01-23
Product(s): InfoScale & Storage Foundation

Problem

This article discusses how to troubleshoot missing disks, foreign disks and unknown disk groups (Unknown Dg).

Solution

Table of Contents

1. Introduction
2. Initial troubleshooting steps
3. Disks are not detected by Veritas
4. Disks are detected as "foreign"
5. Disks are detected as "basic"  
6. Problem has been determined to be outside of the scope of Veritas

 

1. Introduction

This article covers multiple causes that can result in disks showing as Missing, Foreign, or Unknown in the Veritas Enterprise Administrator (VEA) console. It also covers troubleshooting steps that can be performed to narrow down the cause or provide a solution.

 

Missing Disks

When VEA displays missing disks, it is possible that the disks are actually present but are being detected as foreign or basic disks instead. If VEA displays missing disks, look for any foreign or basic disks that may actually be the missing disks. Foreign disks will often appear under a disk group called "Unknown Dg," but they may show up under any disk group. There are also cases where the missing disks are indeed missing and will not be detected by Veritas at all.


Foreign Disks

When VEA displays foreign disks, it means that the disks are detected but there is either a problem with the private region or a problem reading the disk. It is also possible that the disk cannot be written to, for example because it is read-only or SCSI reserved. Missing and foreign disks often occur together, and the initial troubleshooting steps for the two situations overlap considerably. For this reason, this article can be used as a general reference for troubleshooting both.

 

2. Initial Troubleshooting Steps

 

Before continuing, review the following:

1. Know the exact number of disks that should be present. This information will assist with determining the nature of the problem. Compare this number to the number of disks detected by the Microsoft Windows Device Manager, as well as by the Veritas Enterprise Administrator.

2. Consider any recent hardware or software changes that have been made. It is very common to dismiss this step, only to later discover (after much troubleshooting and effort) that a recent, minor change was the cause of the problem and the resolution was to simply revert this change. Some common updates and changes that can be potentially problematic include:
  • Zoning changes: Verify that the HBAs (host bus adapters) for the affected server(s) have been included in the same zone as the disks.
  • Masking: Verify that all the disks are masked to the correct HBA, especially when multipathing is used.
  • HBAs: Use the HBA management software to verify that the HBA settings are correct. Compare the settings for an affected server with those of a server that is not exhibiting problems (if present).
  • Storage area network (SAN) hardware changes, such as HBAs, switches or disk arrays
  • Changes to Multipathing settings
  • Driver updates
  • Firmware or Microcode updates
  • Other Hardware changes
  • Other Software installations

3. Verify that the configuration matches the Hardware Compatibility List (HCL). Information about reviewing the HCL can be obtained from Veritas SORT:  https://sort.veritas.com
 
4. Perform a rescan.
  • From VEA, select the Actions drop-down menu, and select Rescan.

5. In the case of cluster systems, verify the disk group is not imported on another node.  It is possible that one node imports some of the disks in the disk group and another node imports the remainder.  Both systems will then show missing disks.  Deport the disk group on one of the nodes and rescan the other node.

6. Verify that there are no read errors, write errors, or other disk access errors on the server. Review the system event log for any messages logged by vxio during rescan or attempted imports. Typical messages at this point can indicate problems accessing the disk due to SCSI reservations.
 
7. Bring the system down to a single path.

In many cases, missing and foreign disks appear as the result of a problem with multipathing. To rule this possibility out, disable all but one path to the disks. It is recommended that this step be taken early in the troubleshooting process, before more drastic troubleshooting steps are attempted. It is particularly important to rule out multipathing issues, because the solutions described below can make the issue worse if multipathing is the cause.

Multipathing issues are usually apparent when:
  • The Microsoft Partition Manager driver (partmgr) reports duplicate disks in the system event log after a system restart.
  • The disks making up a disk group keep changing between the disk group and the Unknown Dg when rescanning.
  • The number of disks on a system is a multiple of the number of paths. For example, if you expect 10 disks with 4 paths, but see 40 disks, then the multipathing driver has not claimed the disks as multipath disks.

To bring the system down to a single path:
a. Disable all but one of the paths. This can be done by disconnecting the fiber cable from an HBA or by disabling an HBA from the Windows Device Manager.
b. Perform a rescan.
c. Reconnect each path, one at a time, performing a rescan after each. This will determine if the disks are being detected by the host down one path, but not the other paths.

 
8. Check for failed providers. The steps for checking for failed providers can be found in the following article:

Checking and troubleshooting failed providers in Veritas Storage Foundation for Windows
https://www.veritas.com/docs/000033008
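
Steps 1 and 7 above both rest on simple arithmetic: the number of disks observed is compared with the number expected, and a count equal to the expected number multiplied by the number of paths suggests that the multipathing driver has not claimed the disks. The following Python sketch illustrates that comparison only. The function name diagnose_disk_count and its return labels are hypothetical, chosen for illustration; the counts themselves are assumed to be read manually from the Windows Device Manager or VEA.

```python
def diagnose_disk_count(expected: int, observed: int, paths: int = 1) -> str:
    """Classify an observed disk count against the expected count.

    expected: number of disks that should be present (step 1).
    observed: number of disks actually shown by Device Manager or VEA.
    paths:    number of paths configured to the disks (step 7).

    The return labels are illustrative, not Veritas terminology.
    """
    if expected <= 0 or paths <= 0 or observed < 0:
        raise ValueError("expected and paths must be positive; observed must not be negative")

    if observed == expected:
        return "ok"
    # One copy of every disk per path: the multipathing driver has
    # probably not claimed the disks as multipath disks.
    if paths > 1 and observed == expected * paths:
        return "multipath-unclaimed"
    if observed < expected:
        return "disks-missing"
    return "unexpected-extra-disks"


if __name__ == "__main__":
    # The example from step 7: 10 disks expected, 4 paths, 40 disks seen.
    print(diagnose_disk_count(10, 40, paths=4))
```

A result of "multipath-unclaimed" is only a hint that the single-path test in step 7 is worth performing; it does not replace that test.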

 

3. Disks are Not Detected by Veritas



This section may also be applied to cases where missing disks are listed, but the missing disks do not appear to correlate with any visible foreign or basic disks.

1. Verify that the disks are detected by the Operating System (OS). This can be done from the Windows Device Manager or by running Diskpart from a Windows command prompt. 

2. Verify that the disks are detected by the HBA Management Software. Examples of HBA management software include:

  • Emulex HBAnyware
  • QLogic SANsurfer
  • EMC Navisphere
 
a. If the disks are not detected by either Windows or the HBA Management software, review the following:

If either the HBA management software or Windows is unable to detect the disks, the issue lies outside the scope of Veritas Storage Foundation for Windows (SFW). Both the HBA drivers and the Windows disk drivers reside below SFW. If the disks are not detected by either the Windows disk drivers or the HBA drivers, SFW will not detect the disks either.

Note: In some cases, the HBA management software may detect the disks even though neither Windows nor SFW can detect them. This is possible because the HBA drivers reside at a lower layer than either the Windows disk driver or SFW. In this case, the HBA settings, firmware and drivers should be examined to determine why the disks are not being presented to the Windows disk drivers.

Review Section 6 for recommendations for situations where the problem has been determined to be outside the scope of Veritas.


b. If both Windows and the HBAs detect the disks, but SFW cannot, review the following:
 
i. Check for failed providers (if this has not already been done). Further information on this can be found in the following article:

Checking and troubleshooting failed providers in Veritas Storage Foundation for Windows
https://www.veritas.com/docs/000033008


ii. Verify that the latest maintenance packs or roll-up patches have been installed for SFW. Information about the latest patches can be obtained from Veritas SORT:  https://sort.veritas.com

iii. Uninstall and reinstall SFW.
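
The layering described in this section (HBA drivers below the Windows disk drivers, which are in turn below SFW) can be summarized as a small decision function. This is a minimal sketch in Python, assuming the three detection results have already been checked by hand; the function name locate_fault and the labels it returns are hypothetical and are not part of any Veritas tool.

```python
def locate_fault(hba_sees: bool, windows_sees: bool, sfw_sees: bool) -> str:
    """Suggest which layer to investigate, following Section 3.

    hba_sees:     disks are visible in the HBA management software.
    windows_sees: disks are visible in Device Manager or Diskpart.
    sfw_sees:     disks are visible in VEA.

    The return labels are illustrative only.
    """
    if not windows_sees:
        if hba_sees:
            # Case from the Note: the HBA sees the disks but does not
            # present them to the Windows disk drivers. Examine HBA
            # settings, firmware and drivers.
            return "hba-to-windows"
        # Neither layer sees the disks: see Section 6.
        return "outside-sfw"
    if not hba_sees:
        # Either layer failing to detect the disks puts the issue
        # outside the scope of SFW.
        return "outside-sfw"
    if not sfw_sees:
        # Both lower layers see the disks: check providers, patches,
        # and as a last resort reinstall SFW.
        return "sfw"
    return "detected"


if __name__ == "__main__":
    print(locate_fault(hba_sees=True, windows_sees=True, sfw_sees=False))
```

The "hba-to-windows" case is still outside the scope of SFW; it is reported separately only because the Note above gives it a more specific starting point.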

 

4. Disks are detected as Foreign



This section may also be applied to cases where missing disks are listed, but the missing disks appear to correspond with disks that have  been marked as foreign or basic.

1. Attempt to reactivate the disk 

Sometimes a foreign status is caused by a STALE or BAD-STATE status that has been set in the private region. Reactivating the disk can sometimes clear this flag and restore the disk to a healthy status.

a. Right-click on the disk.
b. Select Reactivate Disk.


2. Attempt a Merge Foreign disk operation.

This operation changes the disksetid of the foreign disk to match the disksetid of the other disks in the disk group.
a. Right-click on the disk.
b. Select  Merge Foreign Disk.

 
3. Determine if there are any other disks in the disk group that are still healthy (have a valid copy of the private region). If healthy disks still exist in the disk group, a replace disk operation may be attempted. A replace disk operation points a "missing" disk record to a disk that is actually present. This is useful in cases where a disk has been marked as "missing" even though it is clearly present.
 
Note: If multiple disks are missing, a replace disk operation is not recommended unless it is clearly understood which disk should be associated with which missing disk record. This operation is only possible if the disk(s) that are missing are in the basic group. A disk signature is rarely reset other than manually; foreign disks, for example, may require their signature to be manually reset for this operation to succeed. Because of the potential for data corruption in these scenarios, it is recommended to contact Veritas Technical Support for assistance in confirming that the replace disk operation is using the correct disks.

To perform a replace disk operation, perform the following steps:
a. Right-click on the missing disk.
b. Select Replace Disk.
c. Choose the disk that correlates with the missing disk record.
 
Note: If the replace disk operation returns "no new disks exist in the system," the affected disk will need to be modified to allow the replace disk operation to succeed. Contact Veritas Technical Support for further information.
 
4. Restore the private region with VxCBR.
 
How to use VxCBR to back up and restore the disk group configuration, private region, of a Storage Foundation (tm) for Windows (SFW) dynamic disk group
https://www.veritas.com/docs/000032223
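
The preconditions for the replace disk operation in step 3 can be written down as a short check. The Python sketch below is only an illustration of the guidance above, not a Veritas utility: the function name replace_disk_advisable and its parameters are hypothetical, and it does not model the requirement that the missing disks be in the basic group or the possible need to reset a disk signature.

```python
def replace_disk_advisable(healthy_disks: int,
                           missing_records: int,
                           mapping_known: bool) -> bool:
    """Return True if a replace disk operation is reasonable to attempt.

    healthy_disks:   disks in the disk group that still hold a valid
                     copy of the private region.
    missing_records: number of disk records marked as missing.
    mapping_known:   True if it is clearly understood which physical
                     disk belongs to which missing disk record.
    """
    if healthy_disks < 1:
        # No valid private region remains in the disk group; consider
        # restoring it with VxCBR (step 4) instead.
        return False
    if missing_records < 1:
        # Nothing to replace.
        return False
    if missing_records > 1 and not mapping_known:
        # Multiple missing disks without a clear mapping risks data
        # corruption; contact Veritas Technical Support.
        return False
    return True


if __name__ == "__main__":
    print(replace_disk_advisable(healthy_disks=2, missing_records=1,
                                 mapping_known=False))
```

A False result with several missing records corresponds to the situation in the first Note of step 3, where contacting Veritas Technical Support is recommended before proceeding.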

 

5. Disks are detected as Basic


This internal article contains information on troubleshooting disks that unexpectedly reverted to basic disks.
Dynamic disk unexpectedly reverts to a basic disk without volumes in Veritas Storage Foundation for Windows

 

6. Problem has been determined to be outside of the scope of Veritas


The following are recommendations for situations where the problem has been determined to be outside the scope of SFW:

1. Shut down the other nodes and (if possible) reboot this node.
2. Reset the SCSI bus. This can be performed from VEA.
 
Warning: A reset SCSI bus operation will break the SCSI reservations for all devices on the bus. Do not perform this step without a full  understanding of which devices are connected to the same SCSI bus and how they will be affected by this operation.
 
3. Uninstall any multipathing software.
 
Warning: Before uninstalling multipathing software, ensure that there is only one path to the disks.
 
4. Check the disk array to verify that there is not a formatting problem with the actual LUNs (Logical Unit Numbers).
5. Review the HBA settings using the HBA management software.
6. Review zoning. In particular, ensure that the HBAs for the affected servers are included in the same zones as the disks.
7. Review LUN Masking. Ensure that the LUNs are being presented to the HBAs of the affected servers.
