NetBackup STATUS 23, NDMP backups fail after the data has been transferred to the tape storage

Article: 100028631
Last Published: 2019-04-15
Product(s): NetBackup

Problem

Status 23 failures for NDMP backups can be difficult to diagnose.  The failure occurs when a backup has finished sending its data to the storage, but the communications then fail to re-establish when saving the TIR (True Image Restore) metadata.  Affected backups may have run for many hours or for only a short time before encountering the same problem.

Backups to the same storage, using the same Filer and NetBackup media server, can sometimes succeed.  There is no pattern in backup duration, backup size, or the number of files being backed up that predicts which backups fail.  The only pattern is that the backups fail after sending the data to the tape storage and before sending the TIR metadata.

Error Message

Example of failure (BPTM log)

05:25:42.777 [1420] <2> NdmpMediaSession: ndmp_connect_to_server: hostname = <Filer-IP>, portname = 10000
05:25:42.834 [1420] <2> NdmpMoverSession[0]: Creating server for <Filer-IP> in Server Port Window 1025 to 5000
05:25:42.835 [1420] <2> NdmpMoverSession[0]: Listen address = <Media-Server-IP> port 3571
...(75 second delay)...
05:26:57.930 [1420] <2> NdmpMoverSession[0]: ndmp_mover_connect failed, status = NDMP_CONNECT_ERR
05:26:57.930 [1420] <2> NdmpMoverSession[0]: MoverConnect method failed. Reinitializing and attempting to use MoverListen method.
05:26:58.002 [1420] <2> NdmpMoverSession[0]: mover halted reason NDMP_MOVER_HALT_ABORTED
05:26:58.042 [1420] <2> NdmpMoverSession[0]: ndmp_mover_abort status = 0
05:26:58.044 [1420] <2> NdmpMoverSession[0]: ndmp_mover_stop status = 19
05:26:58.044 [1420] <2> NdmpMoverSession[0]: Shutdown complete
05:26:58.044 [1420] <2> NdmpMoverSession[0]: ndmp_mover_set_window(0, 18446744073709551615)
05:26:58.047 [1420] <2> NdmpMoverSession[0]: ndmp_mover_listen failed, status = NDMP_CONN_TYPE_UNCONFIGURED_ERR
05:26:58.047 [1420] <16> NdmpMoverSession[0]: ERROR Start failed
05:26:58.047 [1420] <16> check_and_process_mover_tasks: NDMP Mover Client Setup failed
05:26:58.060 [1420] <2> NdmpMoverSession[0]: Shutdown complete
05:26:58.060 [1420] <2> write_data_tir: ndmp_task = 1
05:26:58.060 [1420] <2> write_data_tir: status_to_return = -1
05:26:58.073 [1420] <2> check_error_history: just tpunmount: called from bptm line 21399, EXIT_Status = 23
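
The delay between the "Listen address" line and the "ndmp_mover_connect failed" line is the signature of this failure. As a rough aid for finding such gaps in a long BPTM log, the following Python sketch reports consecutive timestamped lines that are far apart. The function name find_stalls and the 60 second default threshold are illustrative assumptions, not part of NetBackup, and the sketch assumes every log line starts with an HH:MM:SS.mmm timestamp as in the excerpt above.

```python
from datetime import datetime


def find_stalls(lines, threshold_seconds=60.0):
    """Return (gap_seconds, previous_line, next_line) for consecutive
    timestamped lines that are more than threshold_seconds apart.

    Hypothetical helper: assumes each log line begins with HH:MM:SS.mmm.
    """
    stalls = []
    prev_time = prev_line = None
    for line in lines:
        stamp = line.split(" ", 1)[0]
        try:
            current = datetime.strptime(stamp, "%H:%M:%S.%f")
        except ValueError:
            continue  # not a timestamped log line, e.g. an elision marker
        if prev_time is not None:
            gap = (current - prev_time).total_seconds()
            if gap < 0:
                gap += 24 * 60 * 60  # assume the log crossed midnight once
            if gap > threshold_seconds:
                stalls.append((gap, prev_line, line))
        prev_time, prev_line = current, line
    return stalls
```

Run against the failure excerpt above, the only gap reported is the one between the "Listen address" line and the "ndmp_mover_connect failed" line.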

Cause

The NetBackup BPTM logs differ depending on whether the NDMP Mover is enabled.

Example (1): When the NDMP Mover is enabled.  The key lines are "Creating server for <Filer-IP> in Server Port Window 1025 to 5000" and "Listen address = <Media-Server-IP> port", as these define how the Filer should communicate back to the NetBackup media server.

13:45:00.741 [29687] <2> ndmp_setup_for_write: IS_NDMP (is NDMP backup image) = 1
13:45:00.741 [29687] <2> ndmp_setup_for_write: mover_offset 0, mover_length 18446744073709551615
13:45:00.741 [29687] <2> ndmp_setup_for_write: mover_previous 0, previous_progress_kbytes 0
13:45:00.741 [29687] <2> ndmp_setup_for_write: pre_mover_bytes_needed 0, pre_mover_bytes_processed 0
13:45:00.741 [29687] <2> ndmp_setup_for_write: post_mover_bytes_needed 0, post_mover_bytes_processed 0
13:45:00.741 [29687] <2> NdmpMoverSession[0]: ndmp_mover_set_record_size(65536)
13:45:00.761 [29687] <2> NdmpMoverSession[0]: EnableStart
13:45:00.761 [29687] <2> NdmpMoverSession[0]: CheckTask START
13:45:00.780 [29687] <2> NdmpMoverSession[0]: ndmp_mover_set_window(0, 18446744073709551615)
13:45:00.798 [29687] <2> NdmpMoverSession[0]: Attempting to create server to which remote host <Filer-hostname> can connect - IPv4
13:45:00.800 [29687] <2> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1630] found via getaddrinfo NAME=<Filer-IP> SVC=10000
13:45:00.801 [29687] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4994] targaddr=<Filer-IP> No match, some prohibited, prefnet=IF_LIST
13:45:00.801 [29687] <2> async_connect: [vnet_connect.c:1477] connect in progress 1 0x1
13:45:00.802 [29687] <2> async_connect: [vnet_connect.c:1644] connect async CONNECT FROM <Media-Server-IP>.50813 TO <Filer-IP>.10000 fd = 24
13:45:00.802 [29687] <2> connect_to_service: connect succeeded STATUS (0) SUCCESS FROM <Media-Server-IP> TO <Filer-IP> <Filer-IP> 10000
13:45:00.819 [29687] <2> NdmpMoverSession[0]: Creating server for <Filer-IP> in Server Port Window 1025 to 5000
13:45:00.819 [29687] <2> NdmpMoverSession[0]: Listen address = <Media-Server-IP> port 4328
13:45:00.877 [29687] <2> NdmpMoverSession[0]: Accepted new socket 25
13:45:00.877 [29687] <2> NdmpMoverSession[0]: Start successful
13:45:00.894 [29687] <2> NdmpMoverSession[0]: CheckTask NONE
13:45:00.894 [29687] <2> write_data_tir: writing first TIR block to media, bytes = 65536
13:45:00.932 [29687] <2> write_ndmp_media: Completed mover processing for fragment
13:45:00.932 [29687] <2> write_ndmp_media: bytes_to_write 26624, BUFF_SIZE 65536
13:45:00.933 [29687] <2> write_ndmp_media: image_bytes_processed 0 image_bytes_needed 18446744073709551615
13:45:00.954 [29687] <2> NdmpMoverSession[0]: mover halted reason NDMP_MOVER_HALT_CONNECT_CLOSED
13:45:00.966 [29687] <2> NdmpMoverSession[0]: Flush complete
13:45:01.130 [29687] <2> NdmpMoverSession[0]: ndmp_mover_stop status = 0
13:45:01.130 [29687] <2> NdmpMoverSession[0]: Shutdown complete
13:45:01.159 [29687] <2> NdmpMoverSession[0]: Shutdown complete
13:45:01.159 [29687] <2> write_ndmp_media: Writing 26624 bytes without mover
13:45:01.318 [29687] <2> write_ndmp_media: 0 bytes written by mover, kbytes_this_time = 666
13:45:01.319 [29687] <4> report_throughput: VBRT 1 29687 7 1 IBM.ULTRIUM-TD2.001 MEDIA1 0 1 0 692  692 (bptm.c.27873)
13:45:01.319 [29687] <2> write_data_tir: Total Kbytes transferred 75097


Example (2): When the NDMP Mover is DISABLED, the connection back to the NetBackup media server is not required and the data is written block by block to the device.

In this example, 'NDMP_MOVER_CLIENT_DISABLE' is defined in the '/usr/openv/netbackup/db/config/ndmp.cfg' file.

13:38:30.754 [29159] <2> check_using_mover: Writing image to NDMP device, block by block
13:38:30.754 [29159] <2> check_using_mover: The tape operation will continue but might be slow.
13:38:30.754 [29159] <2> write_data_tir: writing first TIR block to media, bytes = 65536
13:38:31.705 [29159] <4> report_throughput: VBRT 1 29159 7 1 IBM.ULTRIUM-TD2.000 MEDIA1 0 1 0 666  666 (bptm.c.27873)
13:38:31.705 [29159] <2> write_data_tir: Total Kbytes transferred 75097

 

If the BPTM log states "<2> ndmp_setup_for_write: NDMP mover will not be used, tir_size = <value> BUFF_SIZE = 65536", the backup is too small for the NDMP client mover to be used.
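
The three cases above can be told apart by the marker strings quoted from the BPTM log. The following Python sketch classifies a log excerpt on that basis. The function name classify_mover_mode and the returned labels are illustrative assumptions, and the matching relies only on the message text shown in this article, which may differ between NetBackup versions.

```python
def classify_mover_mode(lines):
    """Classify a BPTM log excerpt by the first mover-related marker found.

    Hypothetical helper: the marker strings are taken from the log
    excerpts in this article and are not an official list.
    """
    for line in lines:
        if "NDMP mover will not be used" in line:
            return "too small for mover"
        if "Writing image to NDMP device, block by block" in line:
            return "mover disabled"
        if "Creating server for" in line and "Server Port Window" in line:
            return "mover enabled"
    return "unknown"
```

A result of "mover enabled" means the Filer must be able to connect back to the listen address and port recorded in the log, which is the step that fails in the STATUS 23 case.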

Solution

The problems can be caused by:

1) A firewall restricting communications over certain ports or from certain IP address ranges.

2) NetBackup configuration settings that restrict which ports can be used.

3) Multiple interfaces on the NDMP Filer.  If the Filer is unable to connect back to the requested IP address and port number within 75 seconds, the STATUS 23 failure is reported back to the NetBackup media server.

NB:  If multiple interfaces are suspected, try using the preferred interface option on the Filer to specify the interface the Filer should use for NDMP.

    options ndmpd
    options ndmpd.preferred_interface <interface_name_eg_e0a>
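
For causes 1) and 2), a basic TCP reachability check can help narrow things down before taking packet captures. The Python sketch below simply attempts a TCP connection and reports whether it succeeded. The function name can_connect and the 5 second timeout are illustrative assumptions. It can be run on the media server against <Filer-IP> port 10000, or, assuming a test listener has been opened on the media server inside the Server Port Window, from a host on the same network segment as the Filer. It does not reproduce the NDMP mover connection itself.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: tests plain TCP reachability only, not NDMP.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A False result from a host that should be allowed through points towards a firewall or port restriction rather than the Filer's interface selection.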

To troubleshoot further, use TCP packet capture tools such as 'snoop' (Solaris), 'tcpdump' (Linux, AIX, HP-UX and Solaris) or 'nettl' (HP-UX).  On NetApp Filers there is a program called 'pktt'.
The aim is to determine (with the aid of the NetBackup BPTM, NDMP and NDMPagent logs) how the communications are being performed, and why the backups/communications fail after the data has been sent to storage.  Confirm how the communications are being attempted and over which interfaces, as this will help show which end is failing to communicate correctly.

To use the ' pktt ' program on the Filer, the following commands can be used.

Filer> pktt start all -d /<output_location>/
<interface-name>: started packet trace
<interface-name>: started packet trace
lo: started packet trace

Filer> pktt stop all
<interface-name>: Tracing stopped and packet trace buffers released.
<interface-name>: Tracing stopped and packet trace buffers released.
lo: Tracing stopped and packet trace buffers released

NB: The output is stored in '*.trc' format, which can be read using tools such as Wireshark or 'tcpdump -r <filename.trc>'.

 

Workaround

If the cause cannot be found, an alternative option is available within NetBackup.  To test, disable the NDMP mover by entering 'NDMP_MOVER_CLIENT_DISABLE' in the '/usr/openv/netbackup/db/config/ndmp.cfg' configuration file.  NB:  The NDMP mover client is used during duplication, verify, and import operations of NDMP tape.  With the mover disabled, these operations will be slower, as the entire image is transferred block by block.
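
As a sketch of how the workaround entry could be added without duplicating it, the following Python function appends the directive to a configuration file only if it is not already present. The function name ensure_directive is hypothetical, and the sketch assumes the file holds one directive per line, which should be checked against the NetBackup documentation before use on a production media server.

```python
from pathlib import Path


def ensure_directive(cfg_path, directive="NDMP_MOVER_CLIENT_DISABLE"):
    """Append directive to cfg_path unless a line already matches it.

    Hypothetical helper: assumes one directive per line in the file.
    Returns True if the file was changed, False if it was left as-is.
    """
    path = Path(cfg_path)
    lines = path.read_text().splitlines() if path.exists() else []
    if any(line.strip() == directive for line in lines):
        return False
    lines.append(directive)
    path.write_text("\n".join(lines) + "\n")
    return True
```

On the media server the path argument would be '/usr/openv/netbackup/db/config/ndmp.cfg'. To re-enable the mover after testing, remove the line again.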

 

 

Applies To

NetBackup 7.x, NDMP NetApp Filer version 8.1 (and some previous versions)
