Inconsistent name resolution behavior causes intermittent status 46 and status 59

Article: 100029136
Last Published: 2013-04-06
Product(s): NetBackup & Alta Data Protection

Problem

During backup, restore, and other operations, NetBackup relies on forward and reverse host name lookups to connect to hosts and to authenticate inbound connections. If these operating system functions do not consistently return values that match the NetBackup configuration, operations may fail intermittently. Because the behavior is intermittent, the problem often corrects itself after an hour but may recur hours or days later.
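
The kind of inconsistency described above can be observed from the operating system's side with a short script. The following is a minimal sketch, not a NetBackup tool: the function names collect_reverse_names and forward_matches are hypothetical, and the script queries the operating system resolver through Python's standard socket module, so its results may differ from whatever NetBackup currently holds in its own host cache.

```python
import socket
import time
from collections import Counter


def collect_reverse_names(ip, attempts=5, delay=0.0, resolver=socket.gethostbyaddr):
    """Reverse-resolve ip several times and count each hostname returned.

    resolver is injectable so the function can be exercised without a network;
    by default it is the operating system's reverse lookup.
    """
    seen = Counter()
    for attempt in range(attempts):
        try:
            hostname, _aliases, _addresses = resolver(ip)
            seen[hostname.lower()] += 1
        except OSError as exc:
            # socket.herror and socket.gaierror are both subclasses of OSError.
            seen["<lookup failed: %s>" % type(exc).__name__] += 1
        if delay and attempt < attempts - 1:
            time.sleep(delay)
    return seen


def forward_matches(hostname, ip, resolver=socket.gethostbyname_ex):
    """Return True if hostname forward-resolves to a list of addresses containing ip."""
    try:
        _name, _aliases, addresses = resolver(hostname)
    except OSError:
        return False
    return ip in addresses
```

For example, calling collect_reverse_names("2.2.2.2", attempts=10, delay=30) on the client host and finding more than one key in the result would suggest that reverse lookups for the media server address are not consistent. Each distinct name can then be passed to forward_matches to confirm that it also resolves back to the same address.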

 

Error Message

The Job Details show that the backup failed with:

access to the client was not allowed (59)

 

The bpcd debug log shows that connections from source IP address 2.2.2.2 are initially resolved to the hostname myserver.domain, which is in the server list, so the connection is accepted.

13:05:28.673 [11120] <2> logconnections: BPCD ACCEPT FROM 2.2.2.2.38700 TO 3.3.3.3.1556 fd = 0
13:05:28.682 [11120] <2> bpcd peer_hostname: Connection from host myserver.domain (2.2.2.2) port 38700
13:05:28.682 [11120] <2> bpcd valid_server: comparing mymaster.domain and myserver.domain
13:05:28.683 [11120] <2> bpcd valid_server: comparing myserver.domain and myserver.domain

13:05:28.683 [11120] <4> bpcd valid_server: hostname comparison succeeded
...snip...
13:05:28.882 [11120] <2> bpcd exit_bpcd: exit status 0  ----------->exiting

 

A few minutes later, the source IP is resolved to a different, alternate form of the hostname, which is not in the server list, and the connection is closed without accepting any commands from the server.

13:06:06.016 [11254] <2> logconnections: BPCD ACCEPT FROM 2.2.2.2.43626 TO 3.3.3.3.1556 fd = 0
13:06:06.024 [11254] <2> bpcd peer_hostname: Connection from host myserver.sub.domain (2.2.2.2) port 43626
13:06:06.024 [11254] <2> bpcd valid_server: comparing mymaster.domain and myserver.sub.domain
13:06:06.025 [11254] <2> bpcd valid_server: comparing myserver.domain and myserver.sub.domain
13:06:06.025 [11254] <4> bpcd valid_server: myserver.sub.domain is not a master server
13:06:06.025 [11254] <16> bpcd valid_server: myserver.sub.domain is not a media server either

...snip...
13:06:06.172 [11254] <2> bpcd exit_bpcd: exit status 46  ----------->exiting
13:06:06.172 [11254] <4> bpcd exit_bpcd: FTL - BPCD EXIT STATUS 46

 

An hour later, the source IP is resolved to a third hostname, which is also not in the server list. Because this hostname does not match any configured server, the connection appears to come from an invalid host and is also rejected.

14:06:15.979 [11279] <2> logconnections: BPCD ACCEPT FROM 2.2.2.2.43456 TO 3.3.3.3.1556 fd = 0
14:06:15.988 [11279] <2> bpcd peer_hostname: Connection from host myserver.sub.sub.domain (2.2.2.2) port 43456
14:06:15.988 [11279] <2> bpcd valid_server: comparing mymaster.domain and myserver.sub.sub.domain
14:06:15.988 [11279] <2> bpcd valid_server: comparing myserver.domain and myserver.sub.sub.domain
14:06:15.988 [11279] <4> bpcd valid_server: myserver.sub.sub.domain is not a master server
14:06:15.988 [11279] <16> bpcd valid_server: myserver.sub.sub.domain is not a media server either

...snip...
14:06:16.133 [11279] <2> bpcd exit_bpcd: exit status 46  ----------->exiting
14:06:16.133 [11279] <4> bpcd exit_bpcd: FTL - BPCD EXIT STATUS 46

 

An hour later, the source IP resolves to the original hostname and operations are again successful.

15:06:45.455 [11350] <2> logconnections: BPCD ACCEPT FROM 2.2.2.2.55728 TO 3.3.3.3.1556 fd = 0
15:06:45.464 [11350] <2> bpcd peer_hostname: Connection from host myserver.domain (2.2.2.2) port 55728
15:06:45.464 [11350] <2> bpcd valid_server: comparing mymaster.domain and myserver.domain
15:06:45.464 [11350] <2> bpcd valid_server: comparing myserver.domain and myserver.domain
15:06:45.464 [11350] <4> bpcd valid_server: hostname comparison succeeded

...snip...
15:06:45.660 [11350] <2> bpcd exit_bpcd: exit status 0  ----------->exiting
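
When the bpcd debug log is large, it may help to summarize which hostnames each source IP was resolved to and how each connection ended. The sketch below is illustrative only: it assumes each log entry occupies a single line in the format of the excerpts above, that a process ID is not reused within the lines being examined, and the function names summarize_connections and names_per_ip are hypothetical.

```python
import re

# Matches the "peer_hostname" entries shown in the excerpts above.
PEER_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) \[(?P<pid>\d+)\] <\d+> "
    r"bpcd peer_hostname: Connection from host (?P<host>\S+) "
    r"\((?P<ip>[^)]+)\) port \d+"
)
# Matches the "exit status N" entries (not the "FTL - BPCD EXIT STATUS" line).
EXIT_RE = re.compile(
    r"^\S+ \[(?P<pid>\d+)\] <\d+> bpcd exit_bpcd: exit status (?P<status>\d+)"
)


def summarize_connections(lines):
    """Return one record per bpcd process: time, ip, host, and exit status."""
    connections = {}  # keyed by pid; dicts preserve insertion order
    for line in lines:
        line = line.strip()
        peer = PEER_RE.match(line)
        if peer:
            connections[peer["pid"]] = {
                "time": peer["time"],
                "ip": peer["ip"],
                "host": peer["host"],
                "status": None,  # filled in when the exit line is seen
            }
            continue
        done = EXIT_RE.match(line)
        if done and done["pid"] in connections:
            connections[done["pid"]]["status"] = int(done["status"])
    return list(connections.values())


def names_per_ip(records):
    """Map each source IP to the distinct hostnames it resolved to, in order seen."""
    names = {}
    for record in records:
        seen = names.setdefault(record["ip"], [])
        if record["host"] not in seen:
            seen.append(record["host"])
    return names
```

An IP address that maps to more than one hostname in the result of names_per_ip, with status 46 on the connections that used the unexpected names, would match the pattern described in this article.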

 

The bpbrm debug log shows the failed connection attempt and that the client status 46 is mapped to a server status 59.

14:06:40.165 [25689] <2> bpbrm start_bpcd_stat: bpbrm.c.22004: bpcd_client_hostname: myclient.domain
14:06:40.167 [25689] <2> vnet_pbxConnect: pbxConnectEx Succeeded
14:06:40.167 [25689] <2> logconnections: BPCD CONNECT FROM 2.2.2.2.43456 TO 3.3.3.3.1556 fd = 8

...snip...
14:06:40.364 [25689] <2> ConnectToBPCD: bpcd_connect_and_verify(myclient.domain, myclient.domain) failed: 46

14:06:40.364 [25689] <16> bpbrm start_bpcd_stat: bpcd on myclient.domain exited with status 59: access to the client was not allowed
...snip...
14:06:40.397 [25689] <2> bpbrm send_status_to_parent: EXIT myclient.domain_1365188798 59 sent to parent process for jobid = 65694.

 

Cause

The root problem is that the DNS servers did not respond to all requests in a consistent manner.

In the situation above, the primary DNS server was offline, and the configuration of the secondary did not match the primary. The secondary could not handle the request load, so the client operating system contacted the tertiary DNS server, which returned a third, different hostname. One hour later, the primary DNS server was back online and the next backup attempt succeeded.

Note: On NetBackup 7.1 and higher, the NetBackup host cache retains the unexpected values for an hour. This explains the one-hour delay before each change in behavior shown above.

 

Solution

Ensure that the configured DNS servers are responsive to the load and consistently return the correct and expected hostnames and IP addresses.

Note: To allow NetBackup 7.1 or higher to use any name resolution changes without delay, clear the NetBackup host cache on that host. For example:

UNIX/Linux: /usr/openv/netbackup/bin/bpclntcmd -clear_host_cache
Windows: install_path\NetBackup\bin\bpclntcmd -clear_host_cache

 

If the DNS inconsistency cannot be corrected quickly, several workarounds are possible.

Workaround #1:

  • Configure the NetBackup hosts to consult their hosts files before DNS, and configure the forward and reverse lookups in those files to match the NetBackup configuration.
  • Be sure to clear the NetBackup Host Cache after making the changes.
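
Before relying on Workaround #1, the hosts file entries can be checked for forward and reverse agreement. The following is a rough sketch under stated assumptions: it assumes the resolver uses the first matching line in the hosts file and returns the first name on that line for a reverse lookup, which is common but should be confirmed for the platform in use. The function names parse_hosts and check_host_entry are hypothetical.

```python
def parse_hosts(text):
    """Return a list of (ip, [names]) entries from hosts-file text."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2:
            entries.append((fields[0], fields[1:]))
    return entries


def check_host_entry(text, ip, expected_name):
    """Return a list of problems found for ip <-> expected_name; empty if none."""
    problems = []
    entries = parse_hosts(text)
    wanted = expected_name.lower()

    # Reverse direction: assumed to return the first name on the first line for ip.
    names_for_ip = [names for addr, names in entries if addr == ip]
    if not names_for_ip:
        problems.append(f"no entry for address {ip}")
    elif names_for_ip[0][0].lower() != wanted:
        problems.append(f"{ip} reverse-resolves to {names_for_ip[0][0]}, not {expected_name}")

    # Forward direction: assumed to use the first line that lists the name.
    addrs_for_name = [addr for addr, names in entries
                      if wanted in (n.lower() for n in names)]
    if not addrs_for_name:
        problems.append(f"no entry for name {expected_name}")
    elif addrs_for_name[0] != ip:
        problems.append(f"{expected_name} forward-resolves to {addrs_for_name[0]}, not {ip}")
    return problems
```

With the example values from this article, a line such as "2.2.2.2 myserver.domain myserver" would produce an empty problem list for check_host_entry(text, "2.2.2.2", "myserver.domain"), whereas a line that lists myserver.sub.domain first would be reported.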

Workaround #2:

  • Extend the NetBackup configuration to accommodate the name resolution results from any of the DNS servers that could potentially be used.

 

In this case, the hostnames returned by the secondary and tertiary DNS servers for the IP address of the media server were temporarily added to the /usr/openv/netbackup/bp.conf file on the client host. That allowed the client host to treat those hostnames as valid NetBackup servers.

Example:

SERVER = mymaster.domain
SERVER = myserver.domain
SERVER = myserver.sub.domain        # For secondary DNS
SERVER = myserver.sub.sub.domain    # For tertiary DNS

Once the network administrator updated the configuration on the secondary and tertiary DNS servers, the bottom two server entries were no longer needed and were removed.
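
To decide which names Workaround #2 would need to cover, the hostnames seen in the bpcd log can be compared against the SERVER entries in the client's bp.conf. The sketch below is an approximation, not NetBackup's actual matching logic: it reads only SERVER entries, strips a trailing "#" annotation as written in the example above (whether bp.conf itself accepts such annotations is not confirmed here), and assumes a case-insensitive exact match. The function names server_entries and unlisted_names are hypothetical.

```python
def server_entries(bp_conf_text):
    """Return the hostnames from SERVER = ... lines in bp.conf text."""
    servers = []
    for raw in bp_conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop any trailing annotation
        key, sep, value = line.partition("=")
        if sep and key.strip() == "SERVER" and value.strip():
            servers.append(value.strip())
    return servers


def unlisted_names(bp_conf_text, resolved_names):
    """Return the resolved hostnames that have no matching SERVER entry.

    Assumption: names are compared case-insensitively and must match exactly.
    """
    known = {name.lower() for name in server_entries(bp_conf_text)}
    return [name for name in resolved_names if name.lower() not in known]
```

Given a bp.conf that contains only the first two SERVER lines of the example, unlisted_names would return myserver.sub.domain and myserver.sub.sub.domain when passed the three names seen in the log excerpts. Those are the entries that were temporarily added in this case.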

 

Applies To
Any operating system platform and any NetBackup version

 

References

Etrack : 3137835
