STATUS CODE 41: Possible causes of exit status 41 (network connection timed out) on NetBackup for Lotus Notes Agent backups

Article: 100016299
Last Published: 2021-07-19
Product(s): NetBackup & Alta Data Protection

Problem

STATUS CODE 41: Possible causes of exit status 41 (network connection timed out) on NetBackup for Lotus Notes Agent backups

Error Message

EXIT STATUS 41: network connection timed out

Solution

Overview: There are several causes of status code 41 with Lotus Notes backups. This article outlines those causes.

Troubleshooting:
Typical situations where Lotus Notes backups will fail with status 41 include:

1. Client Read Timeout value set too low to allow backup processes to complete
2. Lotus Notes ID file incorrectly configured
3. Lotus Notes ID file incorrectly specified in notes.ini
4. Networking issues related to the TcpMaxDataRetransmissions setting in the registry
5. Incorrect configuration of the NetBackup (tm) for Lotus Notes extension

Each of these potential problems is covered in depth below.

1. Client Read Timeout value set too low to allow backup processes to complete:

The NetBackup for Lotus Notes backup agent proceeds through four different processes to back up a database. These processes are visible in the bpbkar log (if logging is set high enough) as seen below:

First, the agent will locate the databases:
06/07/02 05:21:39 AM: [441]: INF - FileAction() NSFSearch() Found <database_name>.nsf

Next, the agent will save the database location:
06/07/02 05:22:57 AM: [439]: INF - CopyLocalToMaster() Process object <database_name>.nsf
06/07/02 05:22:57 AM: [439]: INF - CopyLocalToMaster() Allocate memory for the object

Then, the agent will open the database for interrogation:
06/07/02 05:24:43 AM: [439]: INF - NBLN_FindNextFile() <Enter>

Finally, the agent will transfer the data to a NetBackup storage unit:
06/07/02 05:24:44 AM: [439]: TAR - Backup: C:\Notes\<database_name>.nsf
06/07/02 05:24:44 AM: [439]: INF - read non-blocking message of length 1
<snip>
06/07/02 05:31:57 AM: [439]: INF - read non-blocking message of length 1
06/07/02 05:31:57 AM: [439]: FIL - 970574 7 5530 19 33216 root root 970390 1022105311 1022105311 1013440636 /C/Notes/<database_name>.nsf

If any of these processes takes longer than the configured setting for Client Read Timeout, the job will fail.
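One way to spot a slow phase is to measure the gaps between consecutive bpbkar log entries. The Python sketch below is illustrative only. It assumes the timestamp layout seen in the excerpts above (MM/DD/YY hh:mm:ss AM/PM followed by a bracketed process ID), and parse_line and longest_gap are hypothetical helper names, not part of NetBackup. A long gap is only a hint that one phase is slow; the bpbrm log records the actual timeout.

```python
import re
from datetime import datetime

# Matches entries such as: 06/07/02 05:24:44 AM: [439]: TAR - Backup: ...
# Whitespace is optional so that slightly damaged log copies still parse.
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2})\s*(\d{2}:\d{2}:\d{2})\s*([AP]M):\s*\[(\d+)\]:\s*(.*)$"
)


def parse_line(line):
    """Return (timestamp, pid, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    date, time, ampm, pid, message = m.groups()
    stamp = datetime.strptime(f"{date} {time} {ampm}", "%m/%d/%y %I:%M:%S %p")
    return stamp, int(pid), message


def longest_gap(lines):
    """Return (seconds, earlier_message, later_message) for the largest gap
    between consecutive timestamped lines, or None if fewer than two parse."""
    parsed = [p for p in map(parse_line, lines) if p]
    best = None
    for (t0, _, m0), (t1, _, m1) in zip(parsed, parsed[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, m0, m1)
    return best
```

Given only the 05:24:44 and 05:31:57 entries from the excerpt above, longest_gap reports 433 seconds, which is longer than a 300-second timeout.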

The bpbrm log will show the timeout message:

17:21:39 [1892.752] <2> bpbrm spawn_child: "D:\Veritas NetBackup\NetBackup\bin\bptm.exe" -w -pid 1892 -c mailatm -den 13 -rt 8 -rn 0 -stunit adicsrvr-dlt-robot-tld-0 -cl NotesTest -bt 1013041286 -b mailatm_1013041286 -st 0 -cj 6 -p Notes -ru root -rclnt mailatm -rclnthostname mailatm -rl 1 -rp 1209600 -sl Full -ct 25 -v -mediasvr adicsrvr -jobid 633 -masterversion 340000
17:21:39 [1892.752] <2> bpbrm create_mm_terminate: created terminate event pid 1860
17:21:39 [1892.752] <2> bpbrm write_continue_backup: wrote CONTINUE BACKUP on COMM_SOCK
17:21:40 [1892.752] <4> bpbrm main: from client mailatm: TRV - BACKUP 2/6/02 7:19:45 PM mailatm NotesTest Full FULL
17:22:10 [1892.752] <2> bpbrm mm_sig: received ready signal from media manager
17:31:51 [1892.752] <2> bpbrm readline: bpbrm timeout after 300 seconds
17:31:51 [1892.752] <2> bpbrm kill_child_process: start
17:33:13 [1892.752] <2> bpbrm wait_for_child: start
17:33:13 [1892.752] <2> bpbrm wait_for_child: child exit_status = 150
17:33:13 [1892.752] <2> inform_client_of_status: INF - Server status = 41
17:33:18 [1892.752] <4> bpbrm Exit: client backup EXIT STATUS 41: network connection timed out
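In a long bpbrm log, the two lines that matter most are the timeout message and the final exit status. The Python sketch below searches for both. It assumes the message wording shown in the excerpt above, and summarize_bpbrm is a hypothetical helper name rather than a NetBackup tool.

```python
import re

# Wording taken from the bpbrm excerpt; whitespace is matched loosely.
TIMEOUT_RE = re.compile(r"bpbrm timeout after\s*(\d+)\s*seconds")
STATUS_RE = re.compile(r"EXIT STATUS\s*(\d+)")


def summarize_bpbrm(lines):
    """Collect the timeout length and exit status reported in bpbrm log lines."""
    summary = {"timeout_seconds": None, "exit_status": None}
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            summary["timeout_seconds"] = int(m.group(1))
        m = STATUS_RE.search(line)
        if m:
            summary["exit_status"] = int(m.group(1))
    return summary
```

If timeout_seconds matches the configured Client Read Timeout and exit_status is 41, the timeout value is a likely place to start.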

Resolution:
Increasing the Client Read Timeout allows backups to proceed through all the backup processes and finish successfully.

To increase the Client Read Timeout in NetBackup 4.5 using the administrative console, go to the master server and open the administrative console. Then locate the Lotus Notes server in the Clients section of the Host Properties area. Right-click on the Notes server, and select Properties. Go to the Universal Settings tab, and increase the Client Read Timeout value. Depending on the number of databases, this value may need to be set to 1800 seconds or higher.

To increase the Client Read Timeout in NetBackup 5.0 and 5.1 using the administrative console, go to the master server and open the administrative console. Then locate the Lotus Notes server in the Clients section of the Host Properties area. Right-click on the Notes server, and select Properties. Go to the Timeouts section, and increase the Client Read Timeout value. Depending on the number of databases, this value may need to be set to 1800 seconds or higher.

To increase the Client Read Timeout by modifying the registry, start Regedit by selecting Start | Run | Regedit. Then go to HKEY_LOCAL_MACHINE\SOFTWARE\VERITAS\NetBackup\CurrentVersion\Config. In the Config key, find the CLIENT_READ_TIMEOUT value. If it does not exist, create it by selecting Edit | New | DWORD Value, and call the new value CLIENT_READ_TIMEOUT. Then open the value, change the base number from hexadecimal to decimal, and then set the value. Depending on the number of databases, this value may need to be set to 1800 seconds or higher.
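The same registry change can be prepared as a .reg file and imported with Regedit. The Python sketch below only builds the file text; it does not touch the registry. It assumes the older REGEDIT4 file header, which is an ANSI format, and client_read_timeout_reg is a hypothetical helper name. Back up the registry before importing any file.

```python
KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\VERITAS\NetBackup\CurrentVersion\Config"


def client_read_timeout_reg(seconds):
    """Build .reg file text that sets CLIENT_READ_TIMEOUT to `seconds`."""
    if not 0 < seconds <= 0xFFFFFFFF:
        raise ValueError("timeout must be a positive 32-bit DWORD value")
    return "\n".join([
        "REGEDIT4",  # assumed header; an illustration, not a vendor-supplied file
        "",
        f"[{KEY}]",
        # .reg files write DWORD data as eight hexadecimal digits.
        f'"CLIENT_READ_TIMEOUT"=dword:{seconds:08x}',
        "",
    ])


if __name__ == "__main__":
    print(client_read_timeout_reg(1800))
```

For 1800 seconds the value line reads dword:00000708, which is why Regedit displays 708 until the base is switched from hexadecimal to decimal.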

2. Lotus Notes ID file incorrectly configured:

When backing up Lotus Notes databases, the backup fails with status 41 and consistently "hangs" on the same database. A review of the bpbkar log file shows the backup is in the NBLN_FindNextFile() portion of the backup, but provides little else in terms of detail. This is because the "hang" is on the Lotus side of the backup.

05/23/02 10:14:12 AM: [153]: INF - CopyLocalToMaster() Get next object
05/23/02 10:14:12 AM: [153]: INF - CopyLocalToMaster() <Exit>
05/23/02 10:14:12 AM: [153]: INF - SearchForElements() Buffer size: 352176
05/23/02 10:14:12 AM: [153]: INF - SearchForElements() <Exit>
05/23/02 10:14:12 AM: [153]: INF - NBLN_OpenEnumerate() <Exit>
05/23/02 10:14:12 AM: [153]: INF - NBLN_FindNextFile() <Enter>
05/23/02 10:14:13 AM: [153]: INF - NBLN_FindNextFile() <Enter>
05/23/02 10:14:13 AM: [153]: INF - NBLN_FindNextFile() <Enter>
05/23/02 10:14:13 AM: [153]: INF - NBLN_FindNextFile() <Enter>

Please note that a backup can fail at this point (NBLN_FindNextFile) for several reasons, including corrupt databases and client read timeouts set too low.

Resolution:
Adding the following lines to the server's notes.ini file causes Notes to generate more detailed logs (specifically, a debug.txt file).

DEBUG_OUTFILE=C:\DEBUG.TXT
DEBUG_THREADID=1
DEBUG_CAPTURE_TIMEOUT=1
DEBUG_SHOW_TIMEOUT=1

Please refer to Lotus article 162400 on the Lotus Support site (https://www-3.ibm.com/software/lotus/support/) for more information about these parameters.

After adding these parameters to the notes.ini file, reboot the Notes server, and run another backup. Then review the debug.txt file, searching for the text "password". Something similar to the following should be found in the debug text file.

[0034:0002-012B] The ID file being used is: c:\lotus\notes\ids\bfender.id
Enter password (press the Esc key to abort): [0190:0002-018F] 07/12/2002 02:13:19 PM Pushing mailbox01.nsf to tcpip

Note the ID for which a password is being requested, and open that ID file from within Lotus Notes. At the bottom of the dialog box, there is an option to not prompt for password. This option must be selected.
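On a busy server the debug.txt file can be large, so a small script can help locate the prompt. The Python sketch below assumes the two message formats shown in the sample above ("The ID file being used is:" followed later by "Enter password"); ids_prompting_for_password is a hypothetical helper name, and other Notes releases may word these messages differently.

```python
import re

# Wording taken from the debug.txt sample; whitespace is matched loosely.
ID_FILE_RE = re.compile(r"The ID file being used\s*is:\s*(\S.*?)\s*$", re.IGNORECASE)


def ids_prompting_for_password(lines):
    """Return the ID file paths that were followed by an 'Enter password' prompt."""
    prompted = []
    last_id = None
    for line in lines:
        m = ID_FILE_RE.search(line)
        if m:
            # Remember the most recently announced ID file.
            last_id = m.group(1)
        elif "enter password" in line.lower() and last_id is not None:
            if last_id not in prompted:
                prompted.append(last_id)
    return prompted
```

Each path returned is an ID file to open in Lotus Notes and check for the password prompt option.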

 

Once this option is selected, saved, and the server rebooted, the problem should be resolved.

3. Lotus Notes ID file incorrectly specified in notes.ini:

In the Lotus Notes notes.ini file, there are two entries which, if present, may need to be changed: KeyFilename and ServerKeyFilename. The problem will occur if the ID specified for either of those values does not have the necessary permissions to access the databases on the Notes server. This occurs most often when either the KeyFilename or the ServerKeyFilename is set to some ID file other than the server.id file.

Resolution:
Search the server's notes.ini file for the values KeyFilename and ServerKeyFilename. If they are present, change both values to server.id.
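The check can be scripted when several servers need to be reviewed. The Python sketch below reads notes.ini text as plain name=value lines and reports either entry that is not set to server.id. key_file_mismatches is a hypothetical helper name, and the comparison is against the whole value, as the resolution above describes; a site that stores a full path in these entries would need to adjust it.

```python
def key_file_mismatches(ini_text, expected="server.id"):
    """Return {setting: value} for key-file entries that differ from `expected`."""
    wanted = {"keyfilename", "serverkeyfilename"}
    found = {}
    for raw in ini_text.splitlines():
        name, sep, value = raw.partition("=")
        if not sep:
            continue  # not a name=value line (for example, a section header)
        if name.strip().lower() in wanted:
            value = value.strip()
            if value.lower() != expected.lower():
                found[name.strip()] = value
    return found
```

An empty result means that neither entry is present or that both already point to server.id.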


4. Networking issues related to the TcpMaxDataRetransmissions setting in the registry:

A review of the bpbkar log file shows the following messages:

12:19:41.700 AM: [93.142] <4> ov_log::OVLoop: Timestamp
12:22:50.372 AM: [93.142] <4> ov_log::OVLoop: Timestamp
12:24:54.950 AM: [93.142] <4> ov_log::OVLoop: Timestamp
12:26:34.044 AM: [93.142] <4> ov_log::OVLoop: Timestamp
12:27:34.825 AM: [93.142] <4> ov_log::OVLoop: Timestamp
12:28:35.763 AM: [93.142] <4> ov_log::OVLoop: Timestamp
12:28:37.606 AM: [93.142] <4> tar_base::V_vTarMsgW: INF - tar message received from tar_backup::backup_data_state
12:28:37.606 AM: [93.142] <2> tar_base::V_vTarMsgW: FTL - tar file write error (10054)
12:28:37.606 AM: [93.142] <4> lotus_access::V_CloseForRead: INF - <Enter>

Resolution:
Increasing the TcpMaxDataRetransmissions value from the default of 5 to 10 allows backups to complete successfully. Refer to Microsoft Knowledge Base article 170359 for details on how to change this value.

The text of the article is reproduced below for convenience:

*Begin Microsoft Article
-----------------------------------------------------------------------------------------------------------------------------------------
How to Modify the TCP/IP Maximum Retransmission Timeout
The information in this article applies to:

   * Microsoft Windows 2000 Server
   * Microsoft Windows 2000 Advanced Server
   * Microsoft Windows 2000 Professional
   * Microsoft Windows 2000 Datacenter Server
   * Microsoft Windows NT Workstation 4.0
   * Microsoft Windows NT Server 4.0

This article was previously published under Q170359
IMPORTANT: This article contains information about modifying the registry. Before you modify the registry, make sure to back it up and make sure that you understand how to restore the registry if a problem occurs. For information about how to back up, restore, and edit the registry, click the following article number to view the article in the Microsoft Knowledge Base: 256986 Description of the Microsoft Windows Registry

SUMMARY
TCP starts a retransmission timer when each outbound segment is handed down to IP. If no acknowledgment has been received for the data in a given segment before the timer expires, then the segment is retransmitted, up to the TcpMaxDataRetransmissions times. The default value for this parameter is 5.

The retransmission timer is initialized to three seconds when a TCP connection is established; however, it is adjusted on the fly to match the characteristics of the connection using Smoothed Round Trip Time (SRTT) calculations as described in RFC 793. The timer for a given segment is doubled after each retransmission of that segment. Using this algorithm, TCP tunes itself to the normal delay of a connection. TCP connections over high-delay links will take much longer to time out than those over low-delay links.

By default, after the retransmission timer hits 240 seconds, it uses that value for retransmission of any segment that needs to be retransmitted. This can be a cause of long delays for a client to time out on a slow link.

For additional information about the latest service pack for Windows 2000, click the article number below to view the article in the Microsoft Knowledge Base:
260910 - How to Obtain the Latest Windows 2000 Service Pack

MORE INFORMATION
Warning: If you use Registry Editor incorrectly, you may cause serious problems that may require you to reinstall your operating system. Microsoft cannot guarantee that you can solve problems that result from using Registry Editor incorrectly. Use Registry Editor at your own risk.

Windows provides a mechanism to control the initial retransmit time, and then the retransmit time is self-tuning. To change the initial retransmit time, modify the following values in the following registry key:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters
Value Name:  TcpMaxDataRetransmissions
Data Type:   REG_DWORD - Number
Valid Range: 0 - 0xFFFFFFFF
Default:    5

Description: This parameter controls the number of times TCP retransmits an individual data segment (non connect segment) before aborting the connection. The retransmission timeout is doubled with each successive retransmission on a connection. It is reset when responses resume. The base timeout value is dynamically determined by the measured round-trip time on the connection.
Change the following key in Windows NT 4.0:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters
Value Name:  InitialRtt
Data Type:   REG_DWORD
Valid Range: 0-65535 (decimal)
Default:     0xBB8 (3000 decimal)

Description: This parameter controls the initial retransmission timeout used by TCP on each new connection. It applies to the connection request (SYN) and to the first data segment(s) sent on each connection.

For example, the value data 5000 decimal sets the initial retransmit time to five seconds.
Change the following key in Windows 2000:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\ID for Adapter
Value Name:  TCPInitialRtt
Data Type:  REG_DWORD
Valid Range: 3000-65535 (decimal)
Default:     0xBB8 (3000 decimal)

Description: This parameter controls the initial retransmission timeout used by TCP on each new connection. It applies to the connection request (SYN) and to the first data segments sent on each connection. For example, the value data 5000 decimal sets the initial retransmit time to five seconds.
------------------------------------------------------------------------------------------------------------------------------------------------------------
*End of Microsoft Article


Note: You can only increase the value for the initial timeout. Decreasing the value is not supported. For additional information about retransmit time, click the article numbers below to view the articles in the Microsoft Knowledge Base:
232512 - TCP/IP may Retransmit Packets Prematurely
223450 - TCP Initial Retransmission Timer Adjustment Added to Windows NT
For additional information, search the Web for RFC 793 (Section 3.7) TCP Protocol Specification.
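To get a rough sense of what raising TcpMaxDataRetransmissions from 5 to 10 means, the doubling rule described in the Microsoft article can be worked through numerically. The Python sketch below is a simplified model under stated assumptions, not a description of the actual Windows TCP stack: it assumes the timer starts at the 3-second initial value, doubles after every retransmission, is capped at 240 seconds, and that the connection is aborted when the timer that follows the last retransmission expires. On a real connection the starting timeout is derived from the measured round-trip time and is usually much shorter. seconds_until_abort is a hypothetical helper name.

```python
def seconds_until_abort(max_retransmissions, initial_timeout=3.0, cap=240.0):
    """Estimate how long TCP keeps retrying one unacknowledged segment.

    Simplified model: one wait before each retransmission, plus a final
    wait after the last retransmission before the connection is aborted.
    """
    total = 0.0
    timeout = initial_timeout
    for _ in range(max_retransmissions + 1):
        total += min(timeout, cap)  # the timer never exceeds the cap
        timeout *= 2                # doubled after each retransmission
    return total


if __name__ == "__main__":
    for retries in (5, 10):
        print(retries, seconds_until_abort(retries))
```

Under these assumptions, 5 retransmissions allow about 189 seconds of silence before the connection is dropped, and 10 retransmissions allow about 1341 seconds.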


5. Incorrect configuration of the NetBackup for Lotus Notes extension:

In the bpbkar log file, the last two lines associated with the backup show the following:

11:38:14.985 AM: [333.379] <4> dos_backup::V_VerifyFileList: INF - Replaced: F:\notes\data with LotusNotes:\F:\notes\data
11:38:14.985 AM: [333.379] <333> nbex_DebugLog: INF - NBLN_Connect() <Enter> NotesIniPath:'F:\notes\notes.ini'

Resolution:
A review of the NetBackup registry on the Lotus Notes server shows the LOTUS_NOTES_INI and the LOTUS_NOTES_PATH are not configured correctly. These entries are not required for all installations, but if they do exist, the values must be correct for the backup to run successfully. Syntax is critical.

1. Click Start | Run and type regedt32
2. Select the HKEY_LOCAL_MACHINE key
3. Navigate to SOFTWARE\VERITAS\NetBackup\CurrentVersion\Config
4. Highlight the Config key, and from the Edit menu, click Add Value... to add a new value
5. Set the Value Name to LOTUS_NOTES_PATH, and select REG_SZ for the Data Type
6. Click OK to add value, and the String Editor dialog box appears
7. For the value data, enter the path to the Notes nserver.exe file; for example, if a search of the hard drive for the nserver.exe file determined the file was located in D:\Lotus\domino, enter D:\Lotus\domino, and click OK to accept the value
8. Repeat this process for the LOTUS_NOTES_INI value. Specify both the directory location and the file name; for example, D:\Lotus\domino\notes.ini for the value data.
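Because syntax is critical, it can help to sanity-check the two strings before entering them. The Python sketch below checks only the shape of the values as described in the steps above: LOTUS_NOTES_PATH should be the directory that holds nserver.exe, and LOTUS_NOTES_INI should be a full path ending in the .ini file name. It does not read the registry or the disk, and check_lotus_values is a hypothetical helper name.

```python
import ntpath  # Windows path rules, usable on any platform


def check_lotus_values(notes_path, notes_ini):
    """Return a list of problems found in the two registry value strings."""
    problems = []
    if ntpath.basename(notes_path).lower() == "nserver.exe":
        problems.append("LOTUS_NOTES_PATH should be the directory, not nserver.exe itself")
    if ntpath.splitext(notes_ini)[1].lower() != ".ini":
        problems.append("LOTUS_NOTES_INI should include the file name, such as notes.ini")
    for name, value in (("LOTUS_NOTES_PATH", notes_path), ("LOTUS_NOTES_INI", notes_ini)):
        if not ntpath.isabs(value):
            problems.append(f"{name} should be a full path including the drive letter")
    return problems
```

With the example values from the steps above, D:\Lotus\domino and D:\Lotus\domino\notes.ini, the function returns an empty list.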
 
 

 
