Veritas NetBackup™ Troubleshooting Guide

Product(s): NetBackup (8.2)

Testing the master server and clients

If the NetBackup, installation, and configuration troubleshooting procedures do not reveal the problem, perform the following procedure. Skip those steps that you have already performed.

The procedure assumes that the software was successfully installed, but not necessarily configured correctly. If NetBackup never worked properly, you probably have configuration problems. In particular, look for device configuration problems.

You may also want to perform each backup and restore twice. On UNIX, perform them first as a root user and then as a nonroot user. On Windows, perform them first as a user that is a member of the Administrators group. Then perform them as a user that is not a member of the Administrators group. In all cases, ensure that you have read and write permissions on the test files.

The explanations in these procedures assume that you are familiar with the backup processes and restore processes. For further information, see the NetBackup Logging Reference Guide.

Several steps in this procedure mention the All Log Entries report. For more information on this report and others, see the NetBackup Administrator's Guide, Volume I.

Table: Steps for testing the master server and clients

Step

Action

Description

Step 1

Enable debug logs.

Enable the appropriate debug logs on the master server.

For information on logging, see the NetBackup Logging Reference Guide.

If you do not know which logs apply, enable them all until you solve the problem. Delete the debug log directories when you have resolved the problem.

Step 2

Configure a test policy.

Configure a test policy to use a basic disk storage unit.

Or, configure a test policy and set the backup window to be open while you test. Name the master server as the client and specify a storage unit that is on the master server (preferably a nonrobotic drive). Also, configure a volume in the NetBackup volume pool and insert the volume in the drive. If you do not label the volume by using the bplabel command, NetBackup automatically assigns a previously unused media ID.

Step 3

Verify the daemons and services.

To verify that the NetBackup daemons or services are running on the master server, do the following:

  • To check the daemons on a UNIX system, enter the following command:

    /usr/openv/netbackup/bin/bpps -x
  • To check the services on a Windows system, use the NetBackup Activity Monitor or the Services application of the Windows Control Panel.
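The UNIX check can be scripted. The following is a minimal sketch, not a NetBackup utility: missing_daemons is a hypothetical helper that reads a process listing on standard input and reports which of the names you pass are absent. The daemon names are supplied by the caller, because the set that should be running depends on your configuration.

```shell
#!/bin/sh
# Sketch: report which expected daemons are absent from a process listing.
# The listing is read from stdin (for example, the output of bpps -x) and
# the names to look for are passed as arguments. No daemon names are
# built in; the caller decides what should be running.
missing_daemons() {
    listing=$(cat)
    missing=""
    for name in "$@"; do
        # -w matches the name as a whole word, -F treats it as a fixed string
        if ! printf '%s\n' "$listing" | grep -qwF -- "$name"; then
            missing="$missing $name"
        fi
    done
    # Print the missing names one per line; print nothing if all were found
    for name in $missing; do
        printf '%s\n' "$name"
    done
    # Succeed only when nothing is missing
    [ -z "$missing" ]
}
```

For example, /usr/openv/netbackup/bin/bpps -x | missing_daemons bprd bpcd prints any of the two names that do not appear in the listing and returns a nonzero status if at least one is missing.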

Step 4

Back up and restore a policy.

Start a manual backup of a policy by using the manual backup option in the NetBackup administration interface. Then, restore the backup.

These actions verify the following:

  • NetBackup server software is functional, which includes all daemons or services, programs, and databases.

  • NetBackup can mount the media and use the drive you configured.

Step 5

Check for failure.

If a failure occurs, check the job's Detailed Status in the Activity Monitor.

You can also try the NetBackup All Log Entries report. For the failures that relate to drives or media, verify that the drive is in an UP state and that the hardware functions.

To isolate the problem further, use the debug logs.

For an overview of the sequence of processing, see the information on backup processes and restore processes in the NetBackup Logging Reference Guide.

Step 6

Consult information besides the debug logs.

If the debug logs do not reveal the problem, check the following:

  • System logs on UNIX systems

  • Event Viewer and System logs on Windows systems

  • Media Manager debug logs on the media server that performed the backup, restore, or duplication

  • The bpdm and bptm debug logs on the media server that performed the backup, restore, or duplication

See the vendor manuals for information on hardware failures.

Step 7

Verify robotic drives.

If you use a robot and the configuration is an initial configuration, verify that the robotic drive is configured correctly.

In particular, verify the following:

  • The same robot number is used both in the Media and Device Management and storage unit configurations.

  • Each robot has a unique robot number.

On a UNIX NetBackup server, you can verify only the Media and Device Management part of the configuration. To verify, use the tpreq command to request a media mount. Verify that the mount completes and check the drive on which the media was mounted. Repeat the process until the media has been mounted and unmounted on each drive from the host where the problem occurred. If every mount succeeds, the problem is probably with the policy or the storage unit configuration. When you are done, use the tpunmount command to unmount the media.
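One mount and unmount cycle might look like the following sketch. It assumes that tpreq and tpunmount are installed in /usr/openv/volmgr/bin and that the options shown (-m media ID, -a access mode, -d density, -p pool, -f file name) match your release; check the NetBackup Commands Reference Guide before you rely on them. mount_cycle, VOLMGR_BIN, and RUN are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Sketch: request a mount of one volume with tpreq, then release it with
# tpunmount. Assumptions: the commands live in /usr/openv/volmgr/bin, and
# the media ID, density, and pool are placeholders supplied by the caller.
VOLMGR_BIN=${VOLMGR_BIN:-/usr/openv/volmgr/bin}
RUN=${RUN:-}    # set RUN=echo to print the commands instead of running them

mount_cycle() {
    media_id=$1
    density=$2
    pool=$3
    mount_file=${4:-/tmp/tpreq_test.$$}

    # -a r requests read access; the last argument is the file name that
    # tpreq associates with the mounted volume
    $RUN "$VOLMGR_BIN/tpreq" -m "$media_id" -a r -d "$density" -p "$pool" -f "$mount_file" || return 1

    # In a real run, stop here and confirm which drive holds the media
    # before you unmount it.
    $RUN "$VOLMGR_BIN/tpunmount" -f "$mount_file"
}
```

With RUN=echo, a call such as mount_cycle A00001 hcart NetBackup prints the two commands without touching any drive, which is a safe way to review them first. The media ID and density in that call are placeholders.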

Step 8

Include a robot in the test policy.

If you previously configured a nonrobotic drive and your system includes a robot, change your test policy now to specify a robot. Add a volume to the robot. The volume must be in the NetBackup volume pool on the EMM database host for the robot.

Return to step 3 and repeat this procedure for the robot. This procedure verifies that NetBackup can find the volume, mount it, and use the robotic drive.

Step 9

Use the robotic test utilities.

If you have difficulties with the robot, try the test utilities.

Do not use the Robotic Test Utilities when backups or restores are active. These utilities prevent the corresponding robotic processes from performing robotic actions, such as loading and unloading media. As a result, media mounts can time out and other robotic operations, such as robotic inventory and inject or eject, can fail.

Step 10

Enhance the test policy.

Add a user schedule to your test policy (the backup window must be open while you test). Use a storage unit and media that were verified in previous steps.

Step 11

Back up and restore a file.

Start a user backup and restore of a file by using the client-user interface on the master server. Monitor the status and the progress log for the operation. If successful, this operation verifies that the client software is functional on the master server.

If a failure occurs, check the NetBackup All Log Entries report. To isolate the problem further, check the appropriate debug logs from the following list.

On a UNIX system, the debug logs are in the /usr/openv/netbackup/logs/ directory. On a Windows computer, the debug logs are in the install_path\NetBackup\logs\ directory.

Debug log directories exist for the following processes:

  • bparchive (UNIX only)

  • bpbackup (UNIX only)

  • bpbkar

  • bpcd

  • bplist

  • bprd

  • bprestore

  • nbwin (Windows only)

  • bpinetd (Windows only)

Explanations about which logs apply to specific client types are available.

For information on logging, see the NetBackup Logging Reference Guide.
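When you check a debug log after a failed operation, the most recently written file in that process's directory is usually the one to read first. The sketch below finds it on a UNIX system. newest_log and LOG_ROOT are hypothetical names for this example; the default path is the UNIX log directory named above, and no particular log file naming scheme is assumed.

```shell
#!/bin/sh
# Sketch: print the most recently modified file in the debug log
# directory of one process. LOG_ROOT defaults to the UNIX location.
LOG_ROOT=${LOG_ROOT:-/usr/openv/netbackup/logs}

newest_log() {
    dir="$LOG_ROOT/$1"
    [ -d "$dir" ] || { echo "no debug log directory for $1" >&2; return 1; }
    # ls -t sorts by modification time, newest first; this is adequate
    # for log file names that contain no newlines
    newest=$(ls -t "$dir" | head -n 1)
    [ -n "$newest" ] || { echo "no log files yet in $dir" >&2; return 1; }
    printf '%s/%s\n' "$dir" "$newest"
}
```

For example, newest_log bpbkar prints the path of the latest bpbkar log file, or an error on standard error if the directory was never created.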

Step 12

Reconfigure the test policy.

Reconfigure your test policy to name a client that is located elsewhere in the network. Use a storage unit and media that have been verified in previous steps. If necessary, install the NetBackup client software.

Step 13

Create debug log directories.

Create debug log directories for the following processes:

  • bprd on the server

  • bpcd on the client

  • bpbkar on the client

  • nbwin on the client (Windows only)

  • bpbackup on the client (except Windows clients)

  • bpinetd (Windows only)

  • tar

  • On the media server: bpbrm, bpdm, and bptm

Explanations about which logs apply to specific client types are available.

For information on logging, see the NetBackup Logging Reference Guide.
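On UNIX hosts, the directories in the list above can be created with a short script. This sketch assumes that a debug log for a process is enabled by a directory named after that process under /usr/openv/netbackup/logs. dirs_for_role and make_debug_dirs are hypothetical helpers; the role-to-process mapping repeats the list above for UNIX hosts only, and it assumes that the tar entry belongs on the client.

```shell
#!/bin/sh
# Sketch: create debug log directories for one role on a UNIX host.
# LOG_ROOT defaults to the UNIX log location. The Windows-only entries
# (nbwin, bpinetd) are left out on purpose.
LOG_ROOT=${LOG_ROOT:-/usr/openv/netbackup/logs}

# Print the process names to log for a given role
dirs_for_role() {
    case $1 in
        server) echo "bprd" ;;
        client) echo "bpcd bpbkar bpbackup tar" ;;
        media)  echo "bpbrm bpdm bptm" ;;
        *)      echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Create one directory per process name; existing directories are kept
make_debug_dirs() {
    for proc in "$@"; do
        mkdir -p "$LOG_ROOT/$proc" || return 1
    done
}
```

Run it on each host with the matching role, for example make_debug_dirs $(dirs_for_role media) on the media server. Remove the directories again when you have resolved the problem.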

Step 14

Verify communication between the client and the master server.

Perform a user backup and then a restore from the client that is specified in step 12. These actions verify communications between the client and the master server, and NetBackup software on the client.

If an error occurs, check the job's Detailed Status in the Activity Monitor.

Also check the All Log Entries report and the debug logs that you created in the previous step. A likely cause for errors is a communications problem between the server and the client.
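If you suspect a communications problem, the bpclntcmd utility can show how host names resolve from the client. The sketch below assumes the utility is installed in /usr/openv/netbackup/bin and uses its -pn and -hn options. check_name_resolution, NB_BIN, and RUN are hypothetical names for this example, and the master server name is a placeholder.

```shell
#!/bin/sh
# Sketch: run on a UNIX client to check host name resolution with
# bpclntcmd. Assumption: the utility lives in /usr/openv/netbackup/bin.
NB_BIN=${NB_BIN:-/usr/openv/netbackup/bin}
RUN=${RUN:-}    # set RUN=echo to print the commands instead of running them

check_name_resolution() {
    master=$1
    # -pn asks the master server which host name it sees for this client
    $RUN "$NB_BIN/bpclntcmd" -pn || return 1
    # -hn looks up the given host name
    $RUN "$NB_BIN/bpclntcmd" -hn "$master"
}
```

If the name that the master server reports for the client differs from the name in the test policy, correct the host name configuration before you repeat the backup.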

Step 15

Test other clients or storage units.

When the test policy operates satisfactorily, repeat specific steps as necessary to verify other clients and storage units.

Step 16

Test the remaining policies and schedules.

When all clients and storage units are functional, test the remaining policies and schedules that use storage units on the master server. If a scheduled backup fails, check the All Log Entries report for errors. Then follow the recommended actions for the error status code.