Veritas NetBackup™ Troubleshooting Guide

Product(s): NetBackup (8.3.0.1)

Testing the media server and clients

If you use media servers, use the following steps to verify that they are operational. Before testing the media servers, eliminate all problems on the master server.

Table: Steps for testing the media server and clients

Step

Action

Description

Step 1

Enable legacy debug logs.

Enable the appropriate legacy debug logs on the servers by entering the following:

UNIX/Linux: /usr/openv/netbackup/logs/mklogdir

Windows: install_path\NetBackup\logs\mklogdir.bat

See the NetBackup Logging Reference Guide.

If you are uncertain which logs apply, enable them all until you solve the problem. Delete the legacy debug log directories when you have resolved the problem.
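After the log directories are created, it can be useful to confirm that a directory exists for each process that this test exercises. The following is a minimal sketch, assuming that each legacy log directory is named after its process and sits under the default UNIX log location. The function name missing_log_dirs is hypothetical and is not a NetBackup command.

```shell
#!/bin/sh
# Sketch: report which legacy debug log directories are missing.
# Assumption: each legacy log directory is named after its process and
# lives under the base directory that is passed as the first argument.
missing_log_dirs() {
    base=$1
    shift
    for proc in "$@"; do
        # Print the process name only when its directory does not exist.
        [ -d "$base/$proc" ] || printf '%s\n' "$proc"
    done
}

# Example call, guarded so that it only runs where NetBackup is installed.
if [ -d /usr/openv/netbackup/logs ]; then
    missing_log_dirs /usr/openv/netbackup/logs bpcd bpbrm bpdm bptm
fi
```

An empty result suggests that all of the listed directories are present.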

Step 2

Configure a test policy.

Configure a test policy with a user schedule (set the backup window to be open while you test) by doing the following:

  • Name the media server as the client, and specify a storage unit that is on the media server (preferably a nonrobotic drive).

  • Add a volume on the EMM database host for the devices in the storage unit. Ensure that the volume is in the NetBackup volume pool.

  • Insert the volume in the drive. If you do not pre-label the volume by using the bplabel command, NetBackup automatically assigns a previously unused media ID.

Step 3

Verify the daemons and services.

Verify that all NetBackup daemons or services are running on the master server. Also, verify that all Media and Device Management daemons or services are running on the media server.

To perform this check, do one of the following:

  • On a UNIX system, run:

    /usr/openv/netbackup/bin/bpps -x
  • On a Windows system, use the Services application in the Windows Control Panel.
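On UNIX, the bpps listing can be compared against a list of expected process names. The following is a minimal sketch. The function name missing_procs is hypothetical, and the list of names in the example call is an assumption that should be adjusted to the role of the host (master server or media server).

```shell
#!/bin/sh
# Sketch: read a process listing on stdin and print each expected
# process name that does not appear in it.
missing_procs() {
    listing=$(cat)
    for name in "$@"; do
        # -w matches the name as a whole word, so "nbpem" in the
        # listing does not satisfy a search for "nbemm".
        printf '%s\n' "$listing" | grep -qw -- "$name" || printf '%s\n' "$name"
    done
}

# Example call, guarded so that it only runs where bpps is installed.
# The names listed here are an assumption; adjust them to the host role.
if [ -x /usr/openv/netbackup/bin/bpps ]; then
    /usr/openv/netbackup/bin/bpps -x | missing_procs nbpem nbjm nbrb nbemm vmd
fi
```

Any name that is printed was not found in the listing and may need to be started before you continue.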

Step 4

Back up and restore a file.

Perform a user backup and then a restore of a file from a client that has been verified to work with the master server.

This test verifies the following:

  • NetBackup media server software.

  • NetBackup on the media server can mount the media and use the drive that you configured.

  • Communications between the master server processes nbpem, nbjm, and nbrb, the EMM server process nbemm, and the media server processes bpcd, bpbrm, bpdm, and bptm.

  • Communications between the media server processes bpbrm, bpdm, and bptm and the client processes bpcd and bpbkar.

For failures that relate to drives or media, ensure that the drive is in an UP state and that the hardware functions correctly.
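One simple way to confirm the round trip is to compare the restored file with the file that was backed up. The following is a minimal sketch, assuming that the restore was directed to a different location so that the original is still available. The function name verify_restore and both paths are hypothetical; the backup and restore themselves are started separately (for example, with the bpbackup and bprestore commands).

```shell
#!/bin/sh
# Sketch: confirm that a restored file matches the file that was backed up.
# Both paths are hypothetical; the restore is assumed to have been directed
# to a different location so that the original is still available.
verify_restore() {
    original=$1
    restored=$2
    if [ ! -f "$restored" ]; then
        echo "restored file not found: $restored"
        return 2
    fi
    # cmp -s is silent and returns 0 only when the files are byte-identical.
    if cmp -s "$original" "$restored"; then
        echo "match"
    else
        echo "mismatch"
        return 1
    fi
}
```

For example, verify_restore /etc/hosts /tmp/restore/etc/hosts prints match when the two files are identical.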

Step 5

Verify communication between the master server and the media servers.

If you suspect a communications problem between the master server and the media servers, check the debug logs for the pertinent processes.

If the debug logs don't help you, check the following:

  • On a UNIX server, the System log

  • On a Windows server, the Event Viewer Application and System log

  • vmd debug logs
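When several log files need to be checked, a small helper can show which files contain lines of interest. The following is a minimal sketch. The function name scan_logs is hypothetical, and the search pattern is an assumption, because the wording of log messages varies by process.

```shell
#!/bin/sh
# Sketch: count the lines that match a pattern in every file of a log
# directory, and print the count and file name where the count is not zero.
# The pattern is an assumption; NetBackup log messages vary by process.
scan_logs() {
    dir=$1
    pattern=$2
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # grep -c prints the number of matching lines; -i ignores case.
        n=$(grep -ci -- "$pattern" "$f" || true)
        if [ "${n:-0}" -gt 0 ]; then
            printf '%s %s\n' "$n" "$f"
        fi
    done
    return 0
}

# Example call, guarded so that it only runs where the directory exists.
if [ -d /usr/openv/netbackup/logs/bptm ]; then
    scan_logs /usr/openv/netbackup/logs/bptm error
fi
```

Files with the highest counts are usually a reasonable place to start reading.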

Step 6

Ensure that the hardware runs correctly.

For failures that relate to drives or media, ensure that the drive is running and that the hardware functions correctly.

See the vendor manuals for information on hardware failures.

If you use a robot and this is an initial configuration, verify that the robotic drive is configured correctly.

In particular, verify the following:

  • The same robot number is used both in the Media and Device Management and storage unit configurations.

  • Each robot has a unique robot number.
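The uniqueness check can be sketched as a small filter over a list of robots. This is a minimal sketch that assumes a hand-made input of one robot per line, written as a label followed by the robot number. That input format and the function name duplicate_robot_numbers are hypothetical and are not the output of any NetBackup command.

```shell
#!/bin/sh
# Sketch: find robot numbers that appear more than once.
# Input (stdin): one robot per line as "<label> <robot_number>".
# The input format is hypothetical; it is not NetBackup command output.
duplicate_robot_numbers() {
    awk 'NF >= 2 { count[$2]++ }
         END { for (n in count) if (count[n] > 1) print n }' | sort -n
}
```

For example, printf 'mediaA 0\nmediaB 1\nmediaC 0\n' | duplicate_robot_numbers prints 0, because robot number 0 is listed twice.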

On a UNIX server, you can verify only the Media and Device Management part of the configuration. Perform the following steps from the media server. Use the tpreq command to request a media mount, verify that the mount completes, and check the drive on which the media was mounted. When you are done, use tpunmount to unmount the media. Repeat the process until the media has been mounted and unmounted on each drive from the host where the problem occurred. If this works, the problem is probably with the policy or the storage unit configuration on the media server.
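The mount and unmount cycle can be sketched as follows. This is a minimal sketch: the function name mount_test, the DRY_RUN and VOLMGR_BIN variables, the media ID, and the request file path are all hypothetical, and the tpreq and tpunmount options shown are written from memory and should be checked against the NetBackup Commands Reference Guide for your release.

```shell
#!/bin/sh
# Sketch: mount and unmount one volume through Media and Device Management.
# Assumptions: the media ID and the request file path are hypothetical, and
# the tpreq/tpunmount options should be verified for your release.
VOLMGR_BIN=${VOLMGR_BIN:-/usr/openv/volmgr/bin}

mount_test() {
    media_id=$1
    req_file=$2
    # When DRY_RUN is set to any non-empty value, the commands are
    # printed instead of executed.
    run=${DRY_RUN:+echo}
    $run "$VOLMGR_BIN/tpreq" -m "$media_id" -f "$req_file" || return 1
    # Before the unmount, note which drive holds the media.
    $run "$VOLMGR_BIN/tpunmount" -f "$req_file"
}
```

Setting DRY_RUN=1 before calling mount_test A00001 /tmp/tpreq_test prints the two commands without contacting any device, which makes it possible to review them first.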

Step 7

Include a robotic device in the test policy.

If you previously configured a nonrobotic drive and a robot is attached to your media server, change the test policy to name the robot. Also, add a volume for the robot to the EMM server. Verify that the volume is in the NetBackup volume pool and in the robot.

Start with step 3 to repeat this procedure for a robot. This procedure verifies that NetBackup can find the volume, mount it, and use the robotic drive.

If a failure occurs, check the NetBackup All Log Entries report. Look for any errors that relate to devices or media.

See the NetBackup Administrator's Guide, Volume I.

If the All Log Entries report doesn't help, check the following:

  • On a UNIX server, the system logs on the media server

  • vmd debug logs on the EMM server for the robot

  • On a Windows system, the Event Viewer Application and System log

In an initial configuration, verify that the robotic drive is configured correctly. Do not use a robot number that is already configured on another server.

Try the test utilities.

Do not use the Robotic Test Utilities when backups or restores are active. These utilities prevent the corresponding robotic processes from performing robotic actions, such as loading and unloading media. As a result, media mounts can time out, and other robotic operations such as robotic inventory and inject or eject can fail.

Step 8

Test other clients or storage units.

When the test policy operates satisfactorily, repeat specific steps as necessary to verify other clients and storage units.

Step 9

Test the remaining policies and schedules.

When all clients and storage units are in operation, test the remaining policies and schedules that use storage units on the media server. If a scheduled backup fails, check the All Log Entries report for errors. Then follow the suggested actions for the appropriate status code.
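To look up the suggested actions for a status code from the command line, the bperror command can be used. The following is a minimal sketch that only prints the lookup command. The function name status_lookup is hypothetical, and the bperror path and options are written from memory and should be checked against the NetBackup Commands Reference Guide.

```shell
#!/bin/sh
# Sketch: print the lookup command for a non-zero NetBackup status code.
# Assumption: bperror is in the default admincmd directory on UNIX.
status_lookup() {
    code=$1
    case $code in
        # Reject empty input and anything that is not all digits.
        ''|*[!0-9]*) echo "not a status code: $code"; return 2 ;;
        0) echo "status 0: the operation succeeded" ;;
        *) echo "/usr/openv/netbackup/bin/admincmd/bperror -S $code -r" ;;
    esac
}
```

For example, status_lookup 96 prints the bperror command for status code 96, which you can then run on the master server.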