Veritas InfoScale™ 7.4.2 Troubleshooting Guide - Linux
Manually unencapsulating a root disk
The following steps recover the system in the unlikely event that an error makes the system unbootable during the root disk encapsulation or unencapsulation process.
To manually unencapsulate a boot disk
- Turn on the system and boot it from the installation CD number 1.
- Run the following command at the boot prompt to put the system in rescue mode:
boot: linux rescue
- Select the language and keyboard, and choose to skip the step that finds your installation.
- Use the fdisk command to inspect the boot disk for the partitions that VxVM created to logically manage the disk:
# fdisk -l /dev/sda
The boot disk may contain a VxVM partition, either the VxVM Public Region partition (tag 7e), the VxVM Private Region partition (tag 7f), or both. If these partitions are present, delete the partitions from the disk using the following command:
# fdisk /dev/sda
See the fdisk(8) manual page for details.
The following example shows the output before and after removing the VxVM partitions from the disk.
VxVM Public Region in primary partition 3 (tag 7e) and VxVM Private Region in logical partition 6 (tag 7f) were found on the root disk:
# fdisk -lu /dev/sda
Disk /dev/sda: 36.4 GB, 36420075008 bytes
255 heads, 63 sectors/track, 4427 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1              63     1028159      514048+  83  Linux
/dev/sda2         1028160    15358139     7164990   83  Linux
/dev/sda3              63    71119754    35559846   7e  Unknown
/dev/sda4        15566985    71119754    27776385    5  Extended
/dev/sda5        15567048    17667277     1050115   82  Linux swap
/dev/sda6        17667341    17669388        1024   7f  Unknown
/dev/sda7        17671563    71119754    26724096   83  Linux
After you remove the VxVM partitions from the root disk, the following output displays:
# fdisk -lu /dev/sda
Disk /dev/sda: 36.4 GB, 36420075008 bytes
255 heads, 63 sectors/track, 4427 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1              63     1028159      514048+  83  Linux
/dev/sda2         1028160    15358139     7164990   83  Linux
/dev/sda4        15566985    71119754    27776385    5  Extended
/dev/sda5        15567048    17669388     1051170   82  Linux swap
/dev/sda6        17671563    71119754    26724096   83  Linux
In this example, the VxVM Private Region was taken from the swap partition because the required free space was not available.
- Make a temporary mount point, /vxvm, and mount the root partition on it:
# mkdir /vxvm
# mount -t ext3 /dev/sda1 /vxvm
- If the disk has a separate boot partition, mount this partition on /vxvm/boot:
# mount -t ext3 /dev/sda2 /vxvm/boot
- Before restoring the /etc/fstab and /etc/lilo.conf files, save the files for problem analysis.
To save the /etc/fstab definitions, use the following command:
# cp /vxvm/etc/fstab /vxvm/etc/fstab_savefile
To save the boot configuration file, use one of the following methods.
For the LILO boot loader:
# cp /vxvm/etc/lilo.conf /vxvm/etc/lilo.conf_savefile
For the GRUB boot loader:
# cp /vxvm/etc/grub.conf /vxvm/etc/grub.conf_savefile
The following file may also be needed for problem analysis:
hostname=`uname -n`
/etc/vx/rootdisk_info.$hostname
You can obtain the file after the system is rebooted.
- Restore the /etc/fstab file:
# cp /vxvm/etc/fstab.b4vxvm /vxvm/etc/fstab
- Restore the boot loader configuration, using one of the following methods:
For the LILO boot loader:
# cp /vxvm/etc/lilo.conf.b4vxvm /vxvm/etc/lilo.conf
# /vxvm/sbin/lilo -r /vxvm
For the GRUB boot loader:
# cp /vxvm/etc/grub.conf.b4vxvm /vxvm/etc/grub.conf
- Unmount the partitions, run sync, and exit the rescue shell:
# cd /
# umount /vxvm/boot
# umount /vxvm
# sync
# exit
- Shut down and reboot the system.
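The partition check in the fdisk step of this procedure can be sketched as a small shell function. This is an illustration only, not a VxVM or fdisk feature: the function name find_vxvm_partitions is hypothetical, and it assumes the classic fdisk -l listing layout shown in the example output, where each partition line starts with the device name and carries the tag in the Id column.

```shell
# Hypothetical helper (not part of VxVM or fdisk): list the partitions whose
# Id column is 7e (VxVM Public Region) or 7f (VxVM Private Region).
# Reads "fdisk -l" style output on standard input.
find_vxvm_partitions() {
    awk '$1 ~ /^\/dev\// {
        # Scan every column after the device name, so that an optional
        # boot flag (*) shifting the columns does not matter.
        for (i = 2; i <= NF; i++) {
            if ($i == "7e" || $i == "7f") {
                print $1, $i
                break
            }
        }
    }'
}
```

A possible use in the rescue shell is `fdisk -l /dev/sda | find_vxvm_partitions`. If the function prints nothing, no partition with tag 7e or 7f was listed. The function only reports partitions; deleting them is still done interactively with fdisk as described in the procedure.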
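The save and restore steps of this procedure follow one pattern: copy the current file to a _savefile name, then copy the .b4vxvm version back over it. The sketch below wraps that pattern in two shell functions. The function names save_for_analysis and restore_b4vxvm are hypothetical, and the sketch assumes the root partition is mounted on the directory passed as the first argument (/vxvm in the procedure).

```shell
# Hypothetical helpers; the function names are illustrative only.

# Copy each boot-related file that exists under ROOT to FILE_savefile.
save_for_analysis() {
    sfa_root=$1
    for sfa_f in etc/fstab etc/lilo.conf etc/grub.conf; do
        if [ -f "$sfa_root/$sfa_f" ]; then
            cp "$sfa_root/$sfa_f" "$sfa_root/${sfa_f}_savefile" || return 1
        fi
    done
}

# Restore each named file from its .b4vxvm copy. All copies are checked
# first, so nothing is overwritten if one of them is missing.
restore_b4vxvm() {
    rb_root=$1
    shift
    for rb_f in "$@"; do
        if [ ! -f "$rb_root/$rb_f.b4vxvm" ]; then
            echo "missing: $rb_root/$rb_f.b4vxvm" >&2
            return 1
        fi
    done
    for rb_f in "$@"; do
        cp "$rb_root/$rb_f.b4vxvm" "$rb_root/$rb_f" || return 1
    done
}
```

For a GRUB system, this could be run as `save_for_analysis /vxvm && restore_b4vxvm /vxvm etc/fstab etc/grub.conf`. For LILO, pass etc/lilo.conf instead and still run `/vxvm/sbin/lilo -r /vxvm` afterwards, as the procedure states. Checking for every .b4vxvm file before copying is a design choice of this sketch: it avoids leaving the system with a restored fstab but an unrestored boot loader configuration.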