Veritas Access Appliance Troubleshooting Guide
- Introduction
- General troubleshooting procedures
- About general troubleshooting procedures
- Viewing the Access Appliance log files
- About event logs
- About shell-activity logs
- Setting the CIFS log level
- Setting the NetBackup client log levels and debugging options
- Retrieving and sending debugging information
- Insufficient delay between two successive OpenStack commands may result in failure
- Monitoring Access Appliance
- Common recovery procedures
- About common recovery procedures
- Restarting servers
- Bringing services online
- Recovering from a non-graceful shutdown
- Testing the network connectivity
- Troubleshooting with traceroute
- Using the traceroute command
- Collecting the metasave image of a file system
- Replacing an Ethernet interface card (online mode)
- Replacing an Ethernet interface card (offline mode)
- Replacing an Access Appliance node
- Replacing a disk
- Speeding up replication
- Uninstalling a patch release or software upgrade
- Troubleshooting the Access Appliance cloud as a tier feature
- Troubleshooting Access Appliance installation and configuration issues
- Troubleshooting Access Appliance CIFS issues
- Troubleshooting Access Appliance GUI startup issues
- Index
General techniques for the troubleshooting process
After you apply general troubleshooting tips to narrow the scope of a problem, use the following techniques to isolate the problem further:
Swap identical parts.
In a system with identical or parallel parts and subsystems, it is a good idea to swap components between those subsystems and see whether the problem moves with the swapped component. For example, if you experience Access Appliance network connection problems on one node in a cluster, you can swap Ethernet interface cards to determine whether the problem moves to the new node.
Remove parallel components.
If a system is composed of several parallel or redundant components that can be removed without crippling the whole system, remove these components one at a time and see if the system starts to work. For example, in a cluster, shut down the nodes one by one to see if the problem persists.
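The remove-and-retest loop can be sketched as a small script. This is an illustrative sketch only, not an Access Appliance command: the node names (node_01, node_02) and the health check (a simple ping) are assumptions standing in for whatever operation actually fails in your environment.

```shell
# Hypothetical sketch: after taking each redundant component out of
# service, re-run the failing operation and record which component the
# symptom follows. Replace check_symptom with the real failing operation.
check_symptom() {
    # Placeholder health check; node names here are illustrative.
    ping -c 1 -W 1 "$1" > /dev/null 2>&1
}
for node in node_01 node_02; do
    if check_symptom "$node"; then
        echo "$node: check passed"
    else
        echo "$node: check failed -- suspect this node or its network path"
    fi
done
```

Running the check against each node in turn narrows the fault to the first component whose removal makes the symptom disappear.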
Divide the system into sections.
In a system with multiple sections or stages, carefully measure the variables going into and coming out of each stage until you find a stage where things do not look right. For example, if you encounter a problem with a replication job, check whether the job has run successfully before, and try to determine the time frame in which the job started to fail.
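Pinpointing when a recurring job began failing often comes down to scanning its history for the first failed run. The sketch below is hypothetical: the log path and the `job=... status=...` line format are invented for illustration and do not reflect the actual Access Appliance replication log layout.

```shell
# Hypothetical sketch: find the first failing run in a job history log.
# The path and log format are assumptions, not the real appliance logs.
LOG=/tmp/replication-demo.log
printf '%s\n' \
  '2024-01-10 02:00 job=nightly status=SUCCESS' \
  '2024-01-11 02:00 job=nightly status=SUCCESS' \
  '2024-01-12 02:00 job=nightly status=FAILED' > "$LOG"
# The first FAILED entry marks the time frame to investigate.
grep -m1 'status=FAILED' "$LOG"
# prints: 2024-01-12 02:00 job=nightly status=FAILED
```

Once the first failure is dated, you can correlate that time frame with configuration changes, patches, or load on the stages feeding the job.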
Monitor system behavior over time (or location).
Display a list of services and their current status by using the Support> services show all command.
Then set up a process (such as the Support> traceroute command or a series of Support> iostat commands) to monitor system activity over a period of time or across the network. This kind of monitoring is especially helpful for tracking down intermittent problems, processor activity problems, node connection problems, and so on.
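As a rough illustration of "a series of iostat commands," the sketch below samples the standard Linux iostat utility (which the Support> iostat command is comparable to) at fixed intervals and timestamps each sample for later comparison. The interval, sample count, and log path are assumptions chosen for the example.

```shell
# Illustrative sketch (not the Access CLISH itself): collect timestamped
# iostat samples over time so that intermittent spikes can be correlated
# with the times a problem was observed. Path and counts are assumptions.
LOG=/tmp/iostat-monitor.log
: > "$LOG"
for sample in 1 2 3 4 5; do
    date >> "$LOG"
    # One report per sample; fall back gracefully if sysstat is absent.
    iostat 1 1 >> "$LOG" 2>&1 || echo "iostat unavailable" >> "$LOG"
done
echo "collected 5 samples in $LOG"
```

Reviewing the samples side by side over hours or days makes intermittent CPU or disk anomalies visible that a single snapshot would miss.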