Veritas NetBackup™ Troubleshooting Guide

Product(s): NetBackup (8.1.2)

Troubleshooting Auto Image Replication

Auto Image Replication replicates the backups that are generated in one NetBackup domain to another media server in one or more NetBackup domains.

Note:

Although Auto Image Replication supports replication across different master server domains, the Replication Director does not.

An Auto Image Replication job operates like any duplication job, except that it has no write side. The job must consume a read resource from the disk volume on which the source images reside. If no media server is available, the job fails with status 800.

The Auto Image Replication job operates at a disk volume level. Within the storage unit that is specified in the storage lifecycle policy for the source copy, some disk volumes may not support replication. Use the Disk Pools interface of the NetBackup Administration Console to verify that the image is on a disk volume that supports replication. If the interface shows that the disk volume is not a replication source, click Update Disk Volume or Refresh to update the disk volume(s) in the disk pool. If the problem persists, check your disk device configuration.

The action to take on the automatic replication job depends on several conditions as shown in the following table.

Condition                                             Cause or action
----------------------------------------------------  -------------------------------------------------------------
AIR replication jobs have not started                 Verify the following:
                                                        • The SLP is active.
                                                        • The nbstserv daemon is running.
                                                        • The image has not exceeded the extended retry count.

AIR replication jobs are queued but have not started  No media server or I/O stream is available.

AIR replication jobs fail, for example with           Check the job details for more information about the failure.
status 191                                            For more details, review the bpdm log on the media server
                                                      that processed the replication job.
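The table above can be captured as a small lookup for use in a runbook script. This is only a sketch: the keys `not_started`, `queued`, and `failed` and the helper `checks_for` are hypothetical labels chosen for illustration, not NetBackup identifiers.

```python
# Hypothetical runbook lookup built from the troubleshooting table.
# The dictionary keys are illustrative labels, not NetBackup terms.
AIR_TRIAGE = {
    "not_started": [
        "Verify that the SLP is active.",
        "Verify that the nbstserv daemon is running.",
        "Verify that the image has not exceeded the extended retry count.",
    ],
    "queued": [
        "Check whether a media server or I/O stream is available.",
    ],
    "failed": [
        "Check the job details for more information about the failure.",
        "Review the bpdm log on the media server that processed the replication job.",
    ],
}


def checks_for(symptom: str) -> list[str]:
    """Return the checks for a symptom key; reject unknown keys."""
    try:
        return AIR_TRIAGE[symptom]
    except KeyError:
        raise ValueError(f"unknown symptom: {symptom!r}") from None


for check in checks_for("not_started"):
    print("-", check)
```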

The following procedure is based on NetBackup that operates in an OpenStorage configuration. This configuration communicates with a Media Server Deduplication Pool (MSDP) that uses Auto Image Replication.

To troubleshoot Auto Image Replication jobs

  1. Display the storage server information by using the following command:
    # bpstsinfo -lsuinfo -stype PureDisk -storage_server storage_server_name

    Example output:

    LSU Info:
    Server Name: PureDisk:ss1.acme.com
    LSU Name: PureDiskVolume
    Allocation : STS_LSU_AT_STATIC
    Storage: STS_LSU_ST_NONE
    Description: PureDisk storage unit (/ss1.acme.com#1/2)
    Configuration: 
    Media: (STS_LSUF_DISK | STS_LSUF_ACTIVE | STS_LSUF_STORAGE_NOT_FREED 
       | STS_LSUF_REP_ENABLED | STS_LSUF_REP_SOURCE)
    Save As : (STS_SA_CLEARF | STS_SA_OPAQUEF | STS_SA_IMAGE)
    Replication Sources: 0 ( )
    Replication Targets: 1 ( PureDisk:bayside:PureDiskVolume )
    ...

    This output shows the logical storage unit (LSU) flags STS_LSUF_REP_ENABLED and STS_LSUF_REP_SOURCE for PureDiskVolume. PureDiskVolume is enabled for Auto Image Replication and is a replication source.
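If you capture the bpstsinfo output to a file or variable, the flag check in this step can be scripted. The following is a minimal sketch that assumes the output has the layout of the example above, where the `Media:` field is a parenthesized, `|`-separated list that may wrap across lines. The function name `lsu_media_flags` is hypothetical.

```python
import re

# Excerpt of the example output shown in this step.
SAMPLE = """\
LSU Info:
Server Name: PureDisk:ss1.acme.com
LSU Name: PureDiskVolume
Media: (STS_LSUF_DISK | STS_LSUF_ACTIVE | STS_LSUF_STORAGE_NOT_FREED
   | STS_LSUF_REP_ENABLED | STS_LSUF_REP_SOURCE)
Save As : (STS_SA_CLEARF | STS_SA_OPAQUEF | STS_SA_IMAGE)
"""


def lsu_media_flags(text: str) -> set[str]:
    """Collect the flags from the parenthesized Media field (may span lines)."""
    match = re.search(r"^\s*Media:\s*\((.*?)\)", text, re.MULTILINE | re.DOTALL)
    if match is None:
        return set()
    return {flag.strip() for flag in match.group(1).split("|") if flag.strip()}


flags = lsu_media_flags(SAMPLE)
# Both flags must be present for the volume to act as a replication source.
is_air_source = {"STS_LSUF_REP_ENABLED", "STS_LSUF_REP_SOURCE"} <= flags
print(is_air_source)  # True
```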

  2. To verify that NetBackup recognizes these two flags, run the following command:
    # nbdevconfig -previewdv -stype PureDisk -storage_server storage_server_name -media_server media_server_name -U

    Example output:

    Disk Pool Name      :
    Disk Type           : PureDisk
    Disk Volume Name    : PureDiskVolume
    ...
    Flag                : ReplicationSource
    ...

    The ReplicationSource flag confirms that NetBackup recognizes the LSU flags.

  3. To display the replication targets by using the raw output, run the following command:
    # nbdevconfig -previewdv -stype PureDisk -storage_server storage_server_name -media_server media_server_name

    Example output:

    V_5_ DiskVolume < "PureDiskVolume" "PureDiskVolume" 46068048064 
       46058373120 0 0 0 16 1 >
    V_5_ ReplicationTarget < "bayside:PureDiskVolume" >

    The display shows that the replication target is a storage server called bayside and the LSU (volume) name is PureDiskVolume.
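The raw record can also be read programmatically. This sketch assumes that each `ReplicationTarget` record holds one quoted `storage_server:volume` string, as in the example above, and splits it at the first colon. The function name `replication_targets` is hypothetical, and other record layouts are not handled.

```python
import re

# Raw example output from this step, with the wrapped DiskVolume record joined.
RAW = '''\
V_5_ DiskVolume < "PureDiskVolume" "PureDiskVolume" 46068048064 46058373120 0 0 0 16 1 >
V_5_ ReplicationTarget < "bayside:PureDiskVolume" >
'''


def replication_targets(raw: str) -> list[tuple[str, str]]:
    """Return (storage_server, volume) pairs from ReplicationTarget records."""
    targets = []
    for quoted in re.findall(r'ReplicationTarget\s*<\s*"([^"]*)"\s*>', raw):
        # Assumption: the first colon separates server name from volume name.
        server, _, volume = quoted.partition(":")
        targets.append((server, volume))
    return targets


print(replication_targets(RAW))  # [('bayside', 'PureDiskVolume')]
```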

  4. To ensure that NetBackup captured this configuration correctly, run the following command:
    # nbdevquery -listdv -stype PureDisk -U

    Example output:

    Disk Pool Name      : PDpool
    Disk Type           : PureDisk
    Disk Volume Name    : PureDiskVolume
    ...
    Flag                : AdminUp
    Flag                : InternalUp
    Flag                : ReplicationSource
    Num Read Mounts     : 0
    ...

    This listing shows that disk volume PureDiskVolume is configured in disk pool PDpool, and that NetBackup recognizes the replication capability on the source side. A similar nbdevquery command on the target side should display ReplicationTarget for its disk volume.
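The `-U` listings in steps 2 and 4 use a `Label : value` layout in which the `Flag` label repeats. The following sketch parses captured output of that shape and tests for a flag. It assumes the layout shown in the example output; `parse_fields` is a hypothetical helper name.

```python
# Excerpt of the example nbdevquery -listdv output shown in this step.
SAMPLE = """\
Disk Pool Name      : PDpool
Disk Type           : PureDisk
Disk Volume Name    : PureDiskVolume
Flag                : AdminUp
Flag                : InternalUp
Flag                : ReplicationSource
Num Read Mounts     : 0
"""


def parse_fields(text: str) -> dict[str, list[str]]:
    """Map each label to the list of values seen, because Flag repeats."""
    fields: dict[str, list[str]] = {}
    for line in text.splitlines():
        # Split at the first colon only, so values may contain colons.
        label, sep, value = line.partition(":")
        if sep:
            fields.setdefault(label.strip(), []).append(value.strip())
    return fields


fields = parse_fields(SAMPLE)
print("ReplicationSource" in fields.get("Flag", []))  # True
```

On the target side, the same check would look for `ReplicationTarget` instead.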

  5. If NetBackup does not recognize the replication capability, run the following command:
    # nbdevconfig -updatedv -stype PureDisk -dp PDpool
  6. To ensure that you have a storage unit that uses this disk pool, run the following command:
    # bpstulist

    Example output:

    PDstu 0 _STU_NO_DEV_HOST_ 0 -1 -1 1 0 "*NULL*" 
       1 1 51200 *NULL* 2 6 0 0 0 0 PDpool *NULL*

    The output shows that storage unit PDstu uses disk pool PDpool.
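When many storage units exist, scanning the bpstulist output by eye is tedious. The sketch below rests on two assumptions that the example does not confirm: each storage unit is emitted as one line (the example above is wrapped for display), and the first token is the storage unit label. Because the field positions are not described here, the code only checks whether the disk pool name appears as a whole token anywhere in the record, which could give a false match if another field happens to hold the same string. `storage_units_using` is a hypothetical helper name.

```python
import shlex

# The example record from this step, joined onto one line (assumption).
OUTPUT = (
    'PDstu 0 _STU_NO_DEV_HOST_ 0 -1 -1 1 0 "*NULL*" '
    "1 1 51200 *NULL* 2 6 0 0 0 0 PDpool *NULL*\n"
)


def storage_units_using(pool: str, output: str) -> list[str]:
    """Return labels of records that contain the pool name as a whole token."""
    labels = []
    for line in output.splitlines():
        tokens = shlex.split(line)  # also strips the quotes around "*NULL*"
        if tokens and pool in tokens[1:]:
            labels.append(tokens[0])
    return labels


print(storage_units_using("PDpool", OUTPUT))  # ['PDstu']
```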

  7. Check the settings on the disk pool by running the following command:
    # nbdevquery -listdp -stype PureDisk -dp PDpool -U

    Example output:

    Disk Pool Name   : PDpool
    Disk Pool Id     : PDpool
    Disk Type        : PureDisk
    Status           : UP
    Flag             : Patchwork
    ...
    Flag             : OptimizedImage
    Flag             : ReplicationTarget
    Raw Size (GB)    : 42.88
    Usable Size (GB) : 42.88
    Num Volumes      : 1
    High Watermark   : 98
    Low Watermark    : 80
    Max IO Streams   : -1
    Comment          : 
    Storage Server   : ss1.acme.com (UP)

    Max IO Streams is set to -1, which means the disk pool has unlimited input-output streams.
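Because a shortage of I/O streams is one reason that replication jobs stay queued, it can help to read this field in a script. This sketch assumes the `Max IO Streams : value` layout from the example output and treats -1 as unlimited, as described above. The helper name `max_io_streams` and the value 4 in the usage lines are hypothetical.

```python
import re
from typing import Optional


def max_io_streams(listing: str) -> Optional[int]:
    """Return the stream limit, or None when the value is -1 (unlimited)."""
    match = re.search(
        r"^[ \t]*Max IO Streams[ \t]*:[ \t]*(-?\d+)[ \t]*$", listing, re.MULTILINE
    )
    if match is None:
        raise ValueError("Max IO Streams field not found")
    value = int(match.group(1))
    return None if value == -1 else value


print(max_io_streams("Max IO Streams   : -1\n"))  # None
print(max_io_streams("Max IO Streams   : 4\n"))   # 4
```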

  8. To check the list of media servers that are credentialed to access the storage servers and their disk pools, run the following command:
    # tpconfig -dsh -all_hosts

    Example output:

    ==============================================================
    Media Server:                   ss1.acme.com
    Storage Server:                 ss1.acme.com
    User Id:                        root
        Storage Server Type:        BasicDisk
        Storage Server Type:        SnapVault
        Storage Server Type:        PureDisk
    ==============================================================

    The output shows that only one media server, ss1.acme.com, is credentialed for this disk pool. This completes the storage configuration validation.
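To list the credentialed media servers for one storage server type, the tpconfig output can be scanned record by record. This sketch assumes the layout shown in the example output: a `Media Server:` line starts each record and one or more `Storage Server Type:` lines follow it. `media_servers_for` is a hypothetical helper name.

```python
# Example tpconfig -dsh -all_hosts output from this step.
SAMPLE = """\
==============================================================
Media Server:                   ss1.acme.com
Storage Server:                 ss1.acme.com
User Id:                        root
    Storage Server Type:        BasicDisk
    Storage Server Type:        SnapVault
    Storage Server Type:        PureDisk
==============================================================
"""


def media_servers_for(stype: str, text: str) -> set[str]:
    """Return media servers whose record lists the given storage server type."""
    servers: set[str] = set()
    current = None
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        if not sep:
            continue  # separator lines carry no label
        label, value = label.strip(), value.strip()
        if label == "Media Server":
            current = value
        elif label == "Storage Server Type" and value == stype and current:
            servers.add(current)
    return servers


print(media_servers_for("PureDisk", SAMPLE))  # {'ss1.acme.com'}
```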

  9. The last phase of validation is the storage lifecycle policy configuration. To run Auto Image Replication, the source copy must be on storage unit PDstu. To verify this, list the storage lifecycle policy. For example:
    # nbstl woodridge2bayside -L

    Example output:

                                    Name: woodridge2bayside
                     Data Classification: (none specified)
                Duplication job priority: 0
                                   State: active
                                 Version: 0
     Destination  1              Use for: backup
                                 Storage: PDstu
                             Volume Pool: (none specified)
                            Server Group: (none specified)
                          Retention Type: Fixed
                         Retention Level: 1 (2 weeks)
                   Alternate Read Server: (none specified)
                   Preserve Multiplexing: false
          Enable Automatic Remote Import: true
                                   State: active
                                  Source: (client)
                          Destination ID: 0
     Destination  2              Use for: 3 (replication to remote master)
                                 Storage: Remote Master
                             Volume Pool: (none specified)
                            Server Group: (none specified)
                                     ...
                   Preserve Multiplexing: false
          Enable Automatic Remote Import: false
                                   State: active
                                  Source: Destination 1 (backup:PDstu)
                          Destination ID: 0

    To troubleshoot the Auto Image Replication job flow, use the same commands that you use for other jobs that storage lifecycle policies manage. For example, to list the images that have been replicated to the remote master, run the following:

    # nbstlutil list -copy_type replica -U -copy_state 3

    To list the images that have not yet been replicated to the remote master (either pending or failed), run the following:

    # nbstlutil list -copy_type replica -U -copy_incomplete
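The storage lifecycle policy check in this step can be scripted against captured `nbstl ... -L` output. The sketch below assumes the layout of the example listing, in which each destination starts with a `Destination N ... Use for:` line followed by a `Storage:` line. The function name `destinations` and the dictionary keys are hypothetical.

```python
import re

# Excerpt of the example nbstl woodridge2bayside -L output from this step.
SAMPLE = """\
                                Name: woodridge2bayside
                               State: active
 Destination  1              Use for: backup
                             Storage: PDstu
                              Source: (client)
                      Destination ID: 0
 Destination  2              Use for: 3 (replication to remote master)
                             Storage: Remote Master
                              Source: Destination 1 (backup:PDstu)
                      Destination ID: 0
"""


def destinations(listing: str) -> list[dict[str, str]]:
    """Return one {'use_for', 'storage'} dict per Destination block."""
    result: list[dict[str, str]] = []
    for line in listing.splitlines():
        start = re.match(r"\s*Destination\s+\d+\s+Use for:\s*(.*)$", line)
        if start:
            result.append({"use_for": start.group(1).strip(), "storage": ""})
            continue
        storage = re.match(r"\s*Storage:\s*(.*)$", line)
        if storage and result:
            result[-1]["storage"] = storage.group(1).strip()
    return result


dests = destinations(SAMPLE)
backup_on_pdstu = any(
    d["use_for"] == "backup" and d["storage"] == "PDstu" for d in dests
)
has_replication = any("replication" in d["use_for"] for d in dests)
print(backup_on_pdstu, has_replication)  # True True
```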
  10. To show the status for completed replication copies, run the following command:
    # nbstlutil repllist -U

    Example output:

    Image:
     Master Server            : ss1.acme.com
     Backup ID                : woodridge_1287610477
     Client                   : woodridge
     Backup Time              : 1287610477 (Wed Oct 20 16:34:37 2010)
     Policy                   : two-hop-with-dup
     Client Type              : 0
     Schedule Type            : 0
     Storage Lifecycle Policy : woodridge2bayside2pearl_withdup
     Storage Lifecycle State  : 3 (COMPLETE) 
     Time In Process          : 1287610545 (Wed Oct 20 16:35:45 2010)
     Data Classification ID   : (none specified)
     Version Number           : 0
     OriginMasterServer       : (none specified)
     OriginMasterServerID     : 00000000-0000-0000-0000-000000000000
     Import From Replica Time : 0 (Wed Dec 31 18:00:00 1969)
     Required Expiration Date : 0 (Wed Dec 31 18:00:00 1969)
     Created Date Time        : 1287610496 (Wed Oct 20 16:34:56 2010)
    
     Copy:
       Master Server       : ss1.acme.com
       Backup ID           : woodridge_1287610477
       Copy Number         : 102
       Copy Type           : 3
       Expire Time         : 1290288877 (Sat Nov 20 15:34:37 2010)
       Expire LC Time      : 1290288877 (Sat Nov 20 15:34:37 2010)
       Try To Keep Time    : 1290288877 (Sat Nov 20 15:34:37 2010)
       Residence           : Remote Master
       Copy State          : 3 (COMPLETE) 
       Job ID              : 25
       Retention Type      : 0 (FIXED) 
       MPX State           : 0 (FALSE)
       Source              : 1
       Destination ID      : 
       Last Retry Time     : 1287610614
    
     Replication Destination:
       Source Master Server: ss1.acme.com
       Backup ID           : woodridge_1287610477
       Copy Number         : 102
       Target Machine      : bayside
       Target Info         : PureDiskVolume
       Remote Master       : (none specified)
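The time fields in this listing are epoch seconds followed by a local-time rendering. When you compare listings from servers in different time zones, converting the epoch value to UTC avoids confusion. The following sketch assumes the `Label : value` layout of the example output; the helper names `field` and `epoch` are hypothetical.

```python
import re
from datetime import datetime, timezone

# Excerpt of the example nbstlutil repllist -U output from this step.
SAMPLE = """\
 Backup ID                : woodridge_1287610477
 Backup Time              : 1287610477 (Wed Oct 20 16:34:37 2010)
   Copy Number         : 102
   Expire Time         : 1290288877 (Sat Nov 20 15:34:37 2010)
   Copy State          : 3 (COMPLETE)
"""


def field(name: str, text: str) -> str:
    """Return the value of the first line whose label equals name."""
    pattern = rf"^[ \t]*{re.escape(name)}[ \t]*:[ \t]*(.*?)[ \t]*$"
    match = re.search(pattern, text, re.MULTILINE)
    if match is None:
        raise KeyError(name)
    return match.group(1)


def epoch(name: str, text: str) -> datetime:
    """Convert the leading epoch-seconds token of a time field to UTC."""
    return datetime.fromtimestamp(int(field(name, text).split()[0]), tz=timezone.utc)


backup = epoch("Backup Time", SAMPLE)
expire = epoch("Expire Time", SAMPLE)
print(backup.isoformat())           # 2010-10-20T21:34:37+00:00
print((expire - backup).days)       # 31
print(field("Copy State", SAMPLE))  # 3 (COMPLETE)
```

In this example the copy expires exactly 31 days after the backup time, and the copy state 3 (COMPLETE) indicates that the replication copy finished.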