Veritas InfoScale™ 7.3.1 Troubleshooting Guide - Solaris
Cannot boot from unusable or stale plexes
If a disk is unavailable when the system is running, any mirrors of volumes that reside on that disk become stale. This means that the data on that disk is inconsistent relative to the other mirrors of that volume. During the boot process, the system accesses only one copy of the root volume (the copy on the boot disk) until a complete configuration for this volume can be obtained.
If the plex of this volume that was used for booting turns out to be stale, the system must be rebooted from an alternate boot disk that contains non-stale plexes. This problem can occur, for example, if the system was booted from one of the disks made bootable by VxVM while the original boot disk was turned off. The system boots normally, but the plexes that reside on the unpowered disk are stale. If the system is later rebooted from the original boot disk with that disk turned back on, the system boots using the stale plex.
Another possible problem can occur if errors in the VxVM headers on the boot disk prevent VxVM from properly identifying the disk. In this case, VxVM does not know the name of that disk. This is a problem because plexes are associated with disk names, so any plexes on the unidentified disk are unusable.
A problem can also occur if the root disk has a failure that affects the root volume plex. At the next boot attempt, the system still expects to use the failed root plex for booting. If the root disk was mirrored at the time of the failure, an alternate root disk (with a valid root plex) can be specified for booting.
If any of these situations occurs, the configuration daemon, vxconfigd, detects it while configuring the system during the init processing of the boot sequence. vxconfigd displays a message that describes the error and what can be done about it, and then halts the system. For example, if the plex rootvol-01 of the root volume rootvol on disk rootdisk is stale, vxconfigd may display this message:
VxVM vxconfigd ERROR V-5-1-1049: System boot disk does not have a valid root plex
Please boot from one of the following disks:
Disk: disk01 Device: c0t1d0s2
vxvm:vxconfigd: Error: System startup failed
The system is down.
This informs the administrator that the alternate boot disk named disk01 contains a usable copy of the root plex and should be used for booting. When this message is displayed, reboot the system from the alternate boot disk.
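When the console output is captured (for example, from a serial console log), the alternate boot disks can be picked out of the message programmatically. The following is a minimal Python sketch, not a Veritas-supplied tool. It assumes the "Disk: ... Device: ..." layout shown in the sample message above; the function name `alternate_boot_disks` and the constant `SAMPLE_MESSAGE` are hypothetical names introduced for illustration.

```python
import re

# Sample text copied from the vxconfigd message shown above.
SAMPLE_MESSAGE = """\
VxVM vxconfigd ERROR V-5-1-1049: System boot disk does not have a valid root plex
Please boot from one of the following disks:
Disk: disk01 Device: c0t1d0s2
vxvm:vxconfigd: Error: System startup failed
The system is down.
"""

# Assumption: each alternate boot disk is reported as
# "Disk: <disk name> Device: <device name>", as in the sample.
_DISK_ENTRY = re.compile(r"Disk:\s*(\S+)\s+Device:\s*(\S+)")


def alternate_boot_disks(message):
    """Return a list of (disk_name, device) pairs found in the message."""
    return _DISK_ENTRY.findall(message)


if __name__ == "__main__":
    for disk_name, device in alternate_boot_disks(SAMPLE_MESSAGE):
        print(f"Boot from disk {disk_name} (device {device})")
```

The pattern is case-sensitive and requires the trailing colon, so the phrase "System boot disk" in the first line of the message is not mistaken for a disk entry.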
Once the system has booted, determine the exact problem. If the plexes on the boot disk were simply stale, they are caught up (resynchronized) automatically as the system comes up. If, on the other hand, there was a problem with the private area on the disk or the disk failed, you need to re-add or replace the disk.
If the plexes on the boot disk are unavailable, you should receive mail from VxVM utilities describing the problem. Another way to determine the problem is to list the disks with the vxdisk utility. If the problem is a failure in the private area of the root disk (for example, because of media failures or because the VxVM private region on the disk was accidentally overwritten), vxdisk list shows a display such as this:
DEVICE       TYPE      DISK         GROUP        STATUS
-            -         rootdisk     bootdg       failed was: c0t3d0s2
c0t1d0s2     sliced    disk01       bootdg       ONLINE
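A listing like the one above can be scanned for failed disks with a short script. The following Python sketch is illustrative only. It assumes the five-column layout (DEVICE, TYPE, DISK, GROUP, STATUS) and the "failed was: <device>" status text shown in the sample; the function name `failed_disks` and the constant `SAMPLE_LISTING` are hypothetical, and the sample text stands in for real vxdisk list output.

```python
# Sample text copied from the vxdisk list display shown above.
SAMPLE_LISTING = """\
DEVICE       TYPE      DISK         GROUP        STATUS
-            -         rootdisk     bootdg       failed was: c0t3d0s2
c0t1d0s2     sliced    disk01       bootdg       ONLINE
"""


def failed_disks(listing):
    """Return one dict per row whose STATUS column begins with 'failed'.

    Each dict holds the disk name, the disk group, and the device the
    disk was last seen on (None if the status has no "was:" part).
    """
    results = []
    for line in listing.splitlines():
        # Split into at most five fields so that a STATUS value
        # containing spaces ("failed was: c0t3d0s2") stays in one piece.
        fields = line.split(None, 4)
        if len(fields) < 5:
            continue
        _device, _type, disk, group, status = fields
        if status.startswith("failed"):
            previous = status.partition("was:")[2].strip() or None
            results.append({"disk": disk, "group": group, "was": previous})
    return results


if __name__ == "__main__":
    for entry in failed_disks(SAMPLE_LISTING):
        print(f"{entry['disk']} in {entry['group']} failed "
              f"(previously on {entry['was']})")
```

The header row needs no special handling because its STATUS field is the literal word STATUS, which does not begin with "failed".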