Unix client backup jobs hang during the backup with status 13, status 41, errno = 110

Article: 100003560
Last Published: 2020-02-10
Product(s): NetBackup & Alta Data Protection

Problem

Unix client backup jobs hang during the backup

Error Message

status 13, status 41, errno = 110

Cause

The NetBackup bpbkar process may be:

  1. busy reading an extremely large file, or scanning a large disk that has few changed files.
  2. waiting for an application to release a lock on a file.
  3. waiting for a response from the OS, which may itself be hung on a corrupt or inaccessible file or directory.

Troubleshooting

To test a problem data path without running a backup:

  1. Open a terminal session on the Unix client.
  2. Run the tar, ls, and bpbkar commands below manually on the client to check for data corruption or directory access problems.
  3. The bpbkar test only reads the client data and discards it to /dev/null. No backup is performed.

 

To test with Unix 'tar' and 'ls' commands:

  • # ls -ltR /YOUR-DATA-PATH-TO-TEST-HERE
  • # tar cvf /dev/null /YOUR-DATA-PATH-TO-TEST-HERE

If problems show up using the above commands, have the administrator resolve the client issue before continuing.
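Because a hung path can block the terminal indefinitely, it may help to run the same two checks under a time limit. The following is a minimal sketch, not part of NetBackup: the function name probe_path and the 600-second default are illustrative, and it assumes the 'timeout' utility (GNU coreutils) is available on the client.

```shell
#!/bin/sh
# Sketch: run the ls and tar read tests with a time limit.
# Assumption: 'timeout' (GNU coreutils) is installed on the client.

# probe_path PATH [SECONDS]
# Returns 0 if both commands finish, 124 if one of them timed out,
# or the failing command's own exit status otherwise.
probe_path() {
    target=$1
    limit=${2:-600}

    # Recursive listing: exercises directory access and stat calls.
    timeout "$limit" ls -ltR "$target" > /dev/null || return $?

    # Archive to /dev/null: same tar test as above, without the verbose listing.
    timeout "$limit" tar cf /dev/null "$target" || return $?

    return 0
}
```

For example, 'probe_path /YOUR-DATA-PATH-TO-TEST-HERE 900; echo $?' prints 124 if either command was still running after 15 minutes. A large data set can legitimately take longer than the limit, so a timeout is only a hint about where to look next.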

 

To test with NetBackup 'bpbkar' command:

To view the last file / directory where the backup is hanging, do the following on the client:

  1. Create the bpbkar debug log directory by running:
    • mkdir /usr/openv/netbackup/logs/bpbkar
  2. Create an empty file named bpbkar_path_tr to enable debug logging into bpbkar log.
    • touch /usr/openv/netbackup/bpbkar_path_tr
      • Note: The 'bpbkar_path_tr' touch file produces larger bpbkar logs than usual.
  3. Enable verbose 5 client logging through the NetBackup GUI on the master.
    • Under NetBackup Management > Host Properties > CLIENTS > double click client-name > Properties > Logging > Global > 5
    • Or, add VERBOSE = 5 to the /usr/openv/netbackup/bp.conf file on the client.
      • Note: No restart is needed after these settings are added.
  4. Run a backup to generate the log information.
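Steps 1 through 3 can also be collected into one small helper. This is a sketch under stated assumptions: the function name enable_bpbkar_debug is illustrative, the optional argument exists only so the helper can be tried against a scratch directory, and an existing VERBOSE entry in bp.conf is left untouched rather than rewritten.

```shell
#!/bin/sh
# Sketch: enable bpbkar debug logging on a Unix client.
# The paths are the ones given in the steps above; the function name is illustrative.

# enable_bpbkar_debug [NB_ROOT]
# NB_ROOT defaults to /usr/openv/netbackup.
enable_bpbkar_debug() {
    nb_root=${1:-/usr/openv/netbackup}

    # Step 1: create the bpbkar debug log directory.
    mkdir -p "$nb_root/logs/bpbkar" || return 1

    # Step 2: create the touch file that enables path tracing in the bpbkar log.
    touch "$nb_root/bpbkar_path_tr" || return 1

    # Step 3: append VERBOSE = 5 only if bp.conf has no VERBOSE entry yet.
    # An existing entry with a different value must be edited by hand.
    if ! grep -q '^[[:space:]]*VERBOSE[[:space:]]*=' "$nb_root/bp.conf" 2>/dev/null; then
        echo "VERBOSE = 5" >> "$nb_root/bp.conf"
    fi
}
```

Run 'enable_bpbkar_debug' as root on the client. When troubleshooting is finished, remove the bpbkar_path_tr file and the VERBOSE entry to return the logs to their normal size.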

 

Run the commands to start the test:

  • # cd /usr/openv/netbackup/bin
  • # ./bpbkar -dt 0 -r 888 -nocont -nfsok   /YOUR-DATA-PATH-TO-TEST-HERE > /dev/null

If the bpbkar command stops or hangs, view the end of the /usr/openv/netbackup/logs/bpbkar/log.<date> file for the last directory / file name and possible OS messages.

Example:

17:03:25.110 [642176] <2> bpbkar SelectFile: cwd=/var/apache/logs path=access_log
17:03:25.297 [642176] <2> bpbkar SelectFile: cwd=/var/apache/logs path=error_log

Note: The job hangs here; there are no further messages for PID [642176].
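On a busy client the log can contain entries from several bpbkar processes, so it may be easier to filter by PID than to read the end of the file. The sketch below assumes the log line format shown in the example above; the function name last_selectfile is illustrative.

```shell
#!/bin/sh
# Sketch: print the last SelectFile entry that a given bpbkar PID wrote.
# Assumes log lines look like the example above:
#   HH:MM:SS.mmm [PID] <2> bpbkar SelectFile: cwd=... path=...

# last_selectfile LOGFILE PID
last_selectfile() {
    # -F matches "[PID]" literally, so the brackets are not read as a pattern.
    grep -F "[$2]" "$1" | grep 'SelectFile:' | tail -n 1
}
```

With the two example lines above saved in a log file, 'last_selectfile log.<date> 642176' prints the error_log entry, which is the file to investigate.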

 

File Hang:

The last file named in the bpbkar log may be corrupt and should be tested with Unix commands such as ls, cp, and mv.

Directory / Mount Point Hang:

If the last entry is a directory or mount point, cd to the path and see whether the shell hangs. Press Ctrl+C to exit the hang.
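The file and directory checks can be run without tying up the interactive shell. The following sketch is an illustration only: check_entry is a hypothetical helper, it reads a file with cat instead of cp or mv, and it assumes the 'timeout' utility (GNU coreutils) is present. A process blocked inside the kernel, for example on a dead NFS mount, may not respond to the signal that timeout sends.

```shell
#!/bin/sh
# Sketch: test one file, directory, or mount point under a time limit.
# Assumption: 'timeout' (GNU coreutils) is installed on the client.

# check_entry PATH [SECONDS]
# Returns 0 if the entry was read, 124 on timeout, non-zero on other errors.
check_entry() {
    entry=$1
    limit=${2:-30}

    # The type test runs inside the timed child shell as well, because even
    # a stat call can block on a hung mount point.
    timeout "$limit" sh -c '
        if [ -d "$1" ]; then
            ls -la "$1" > /dev/null      # directory or mount point: list it
        else
            cat "$1" > /dev/null         # file: read it from start to end
        fi
    ' sh "$entry"
}
```

For the log example above, 'check_entry /var/apache/logs/error_log; echo $?' would show whether the file can be read at the OS level.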

Solution

  1. For file or directory hang tested at the OS level:
    • As a workaround to the file system issue, put the path into the /usr/openv/netbackup/exclude_list on the client.
    • echo "/Path/file_name" >> /usr/openv/netbackup/exclude_list
  2. For hangs on active files in use by applications, add the 'LOCKED_FILE_ACTION = SKIP' entry to the /usr/openv/netbackup/bp.conf file on the client.
    • echo "LOCKED_FILE_ACTION = SKIP"  >>  /usr/openv/netbackup/bp.conf
      • Note: No restart of any server is needed after making this change
    • This setting can also be made through the NetBackup GUI on the master.
      • Under NetBackup Management > Host Properties > CLIENTS > double-click client-name > Properties > Unix Client > Client settings > Locked File Action > 'Skip' 
      • Note: No restart of any server is needed after making this change
  3. For clients with very large files, very large disks, and for servers that are heavily loaded, have the media server allow them more time before shutting down the backup job. 
    • Increase the media server 'Client Read Timeout'.
    • Open the NetBackup GUI on the master
    • Under NetBackup Management > Host Properties > Media Server > double-click media server name > Properties > Timeouts
    • Increase the media server's "Client read timeout" to 1800 or 3600 seconds, or higher.
      • Warning: Do not adjust the "Client connect timeout". That setting should remain at default of 300 seconds.
      • Note: No restart of any server is needed after making this change
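The echo commands in items 1 and 2 append a new line every time they are run, so repeating them leaves duplicate entries. A minimal sketch of an append that is safe to repeat follows; append_once is a hypothetical helper name, not a NetBackup command.

```shell
#!/bin/sh
# Sketch: append a line to a file only if that exact line is not already there.

# append_once LINE FILE
append_once() {
    # -x matches whole lines, -F treats LINE as a literal string,
    # -q suppresses output. A missing FILE is created by the append.
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Usage, with the same entries as in the Solution steps:
#   append_once "/Path/file_name" /usr/openv/netbackup/exclude_list
#   append_once "LOCKED_FILE_ACTION = SKIP" /usr/openv/netbackup/bp.conf
```

The match is on the exact line, so an existing LOCKED_FILE_ACTION entry with a different value or spacing is not detected and should be edited by hand.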
