Article: 100003560
Last Published: 2020-02-10
Product(s): NetBackup
Problem
Unix client backup jobs hang during the backup
Error Message
status 13, status 41, errno = 110
Cause
The NetBackup bpbkar process may be busy reading an extremely large file, or reading a large disk with few file changes.
The NetBackup bpbkar process may be waiting for an application to release a lock on a file.
The NetBackup bpbkar process may be waiting for a response from the OS, which could be hung on a corrupt or inaccessible file or directory.
TROUBLESHOOTING:
TO TEST A PROBLEM DATA PATH WITHOUT RUNNING A BACKUP:
--------------------------------------------------------------------------------------------------------
Open command window on the client.
Run the tar, ls, and bpbkar commands below manually on the client to check for data corruption or directory access problems.
The bpbkar test only reads the client data and writes it to /dev/null. No backup is performed.
Test with Unix 'tar' and 'ls':
# ls -ltR /YOUR-DATA-PATH-TO-TEST-HERE
# tar cvf /dev/null /YOUR-DATA-PATH-TO-TEST-HERE
IF PROBLEMS SHOW UP USING THE ABOVE, have the administrator resolve the client issue before continuing.
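The ls and tar checks above can also be run under a time limit, so that a hang is reported instead of tying up the terminal. The following is a minimal sketch, not part of NetBackup: `probe_read` is a hypothetical helper name, the 600-second default is illustrative, and the sketch assumes the GNU coreutils `timeout` command is installed on the client. GNU tar is known to minimize reads when the archive is /dev/null, so the sketch pipes the archive through `cat` to make sure file contents are actually read. This tar step only detects a timeout, because tar's own errors are discarded.

```shell
# probe_read PATH [LIMIT_SECONDS]
# Runs the ls and tar read tests under a time limit.
# Assumption: GNU coreutils `timeout` is available (exit status 124 = timed out).
probe_read() {
    path=$1
    limit=${2:-600}   # illustrative default, not a NetBackup setting

    timeout "$limit" ls -ltR "$path" > /dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "ls timed out after ${limit}s on $path"
        return 1
    elif [ "$rc" -ne 0 ]; then
        echo "ls reported errors on $path (exit $rc)"
        return 1
    fi

    # Pipe through cat so the archive is a pipe, not /dev/null itself.
    # Only a timeout is detected here; tar error messages are discarded.
    timeout "$limit" sh -c 'tar cf - "$1" 2>/dev/null | cat > /dev/null' probe "$path"
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "tar timed out after ${limit}s on $path"
        return 1
    fi

    echo "no hang detected on $path"
    return 0
}

# Example (not run automatically):
#   probe_read /YOUR-DATA-PATH-TO-TEST-HERE 900
```

A process blocked in uninterruptible I/O (for example on a dead NFS mount) may not respond to the signal that `timeout` sends, so the helper itself can still stall in that case.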
To test with NetBackup 'bpbkar':
To view the last file / directory where the backup is hanging, do the following on the client:
1. Create the bpbkar debug log directory by running:
mkdir /usr/openv/netbackup/logs/bpbkar
2. Create an empty file named bpbkar_path_tr to enable debug logging into ../netbackup/logs/bpbkar/log.<date>
touch /usr/openv/netbackup/bpbkar_path_tr
3. Enable verbose 5 client logging through the NetBackup GUI on the master.
Under NetBackup Management>Host Properties>CLIENTS>double click client-name >properties>Logging>Global>5
Or add ' VERBOSE = 5 ' into the /usr/openv/netbackup/bp.conf file on the client server
4. No restart is needed after these settings are added
5. Run a backup to generate the log information
Note that using the 'bpbkar_path_tr' touch file produces larger bpbkar logs than usual.
Run the following commands to start the test:
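Steps 1 and 2 can be wrapped in a pair of small helpers so that the trace is easy to switch on for a test and off again afterwards. This is a minimal sketch: the function names and the `NB_ROOT` variable are hypothetical, and turning the trace off by deleting the touch file is an assumption based on how the file is created above, not a documented procedure. The VERBOSE setting from step 3 is not touched.

```shell
# Install path taken from this article; override NB_ROOT to try the helpers elsewhere.
NB_ROOT=${NB_ROOT:-/usr/openv/netbackup}

# Create the bpbkar log directory and the bpbkar_path_tr touch file.
enable_bpbkar_trace() {
    mkdir -p "$NB_ROOT/logs/bpbkar" && touch "$NB_ROOT/bpbkar_path_tr"
}

# Assumption: removing the touch file stops the extra path tracing.
# Existing logs are left in place for review.
disable_bpbkar_trace() {
    rm -f "$NB_ROOT/bpbkar_path_tr"
}

# Example (not run automatically):
#   enable_bpbkar_trace
#   ... run the backup or the manual bpbkar test ...
#   disable_bpbkar_trace
```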
# cd /usr/openv/netbackup/bin
# ./bpbkar -dt 0 -r 888 -nocont -nfsok /YOUR-DATA-PATH-TO-TEST-HERE > /dev/null
If the bpbkar command stops or hangs, view the end of the /usr/openv/netbackup/logs/bpbkar/log.<date> file for the last directory or file name and any OS messages.
For example:
17:03:25.110 [642176] <2> bpbkar SelectFile: cwd=/var/apache/logs path=access_log
17:03:25.297 [642176] <2> bpbkar SelectFile: cwd=/var/apache/logs path=error_log
(**Hang** no further messages for PID [642176] )
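When the log is large, the last SelectFile entry for each process can be pulled out with awk rather than by scrolling to the end. The sketch below assumes the log lines look exactly like the excerpt above (time, PID in brackets, then `cwd=` and `path=` fields); `last_paths` is a hypothetical helper name, and paths containing spaces are not handled.

```shell
# last_paths LOGFILE
# For each PID in a bpbkar debug log, print the last SelectFile entry seen.
# Assumes the line layout shown in this article's log excerpt.
last_paths() {
    awk '
        /bpbkar SelectFile:/ {
            pid = $2                      # second field, e.g. [642176]
            cwd = ""; path = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^cwd=/)  cwd  = substr($i, 5)
                if ($i ~ /^path=/) path = substr($i, 6)
            }
            # Later lines overwrite earlier ones, so the last entry wins.
            last[pid] = $1 " " cwd "/" path
        }
        END { for (p in last) print p, last[p] }
    ' "$1"
}

# Example (not run automatically):
#   last_paths /usr/openv/netbackup/logs/bpbkar/log.<date>
```

Each output line gives the PID, the timestamp of its last SelectFile message, and the full path to test at the OS level.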
File Hang:
The last file in the bpbkar log may be corrupt. Test it with Unix commands such as ls, cp, and mv.
Directory / Mount point Hang:
If the last entry is a directory or mount point, cd to the path and see whether the shell hangs. Press Ctrl-C to exit the hang.
Solution
For file or directory hang tested at the OS level:
As a workaround to the file system issue, put the path into the /usr/openv/netbackup/exclude_list on the client.
echo "/Path/file_name" >> /usr/openv/netbackup/exclude_list
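Running the echo command more than once appends the same path again. A minimal sketch of an append that skips lines already present is shown below; `add_line_once` is a hypothetical helper name, not a NetBackup command.

```shell
# add_line_once FILE LINE
# Append LINE to FILE only if an identical line is not already there.
add_line_once() {
    file=$1
    line=$2
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    if [ -f "$file" ] && grep -qxF -e "$line" "$file"; then
        echo "already present: $line"
    else
        printf '%s\n' "$line" >> "$file"
        echo "added: $line"
    fi
}

# Example (not run automatically):
#   add_line_once /usr/openv/netbackup/exclude_list "/Path/file_name"
```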
For hang on active files in use by applications:
Add ' LOCKED_FILE_ACTION = SKIP ' into the /usr/openv/netbackup/bp.conf file on the client server
echo "LOCKED_FILE_ACTION = SKIP" >> /usr/openv/netbackup/bp.conf
No restart of any server is needed after making this change
This setting can also be made through the NetBackup GUI on the master.
Under NetBackup Management>Host Properties>CLIENTS>double click client-name >properties>Unix Client>Client settings>Locked File Action>'Skip'
No restart of any server is needed after making this change
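Before appending to bp.conf, it can help to check whether the entry is already set, since the echo command adds a new line every time it runs. The sketch below lists every occurrence of a key with its line number. `bpconf_show` is a hypothetical helper name, and the sketch assumes entries are written as `KEY = VALUE` on one line, as in the examples in this article.

```shell
# bpconf_show FILE KEY
# Print "line-number: value" for every occurrence of KEY in FILE.
# Assumes one "KEY = VALUE" entry per line; values containing "=" are truncated.
bpconf_show() {
    awk -F '=' -v key="$2" '
        {
            k = $1; v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim spaces around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim spaces around the value
            if (k == key) print NR ": " v
        }
    ' "$1"
}

# Example (not run automatically):
#   bpconf_show /usr/openv/netbackup/bp.conf LOCKED_FILE_ACTION
```

No output means the key is not present in the file. More than one line of output means the entry has been added more than once.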
For clients with very large files, very large disks, and for servers that are heavily loaded, have the media server allow them more time before shutting down the backup job.
Increase the media server 'Client Read Timeout'.
Open the NetBackup GUI on the master
Under NetBackup Management>Host Properties>Media Server>double click media server name>properties>timeouts
Increase the media server's "Client read timeout" to 1800 or 3600 seconds, or higher.
Do not adjust the "Client connect timeout". That setting should remain at its default of 300 seconds.
No restart of any server is needed after making this change