SAN Client backups failing intermittently with STATUS 174 and nbftsrvr buffer processing failures in [ProcessReadWrite]

Article: 100004435
Last Published: 2013-08-30
Product(s): NetBackup & Alta Data Protection

Problem

 
Certain initiator driver errors or environmental issues can cause SAN Client jobs to encounter buffer sequence errors during data transfers. Those jobs can fail with STATUS 174.
Some initiator driver errors can also cause our application to resend a CDB (Command Descriptor Block) while the FT Media Server still has an outstanding DMA (Direct Memory Access) transfer from the previous command.
 
Note: This second condition is more likely to occur if the file /usr/openv/var/FTCLIENT_VALIDATE_VIA_CHECK_CONDITION exists but is empty.
 
 

 

Error Message

Job details
Critical bptm(pid=xxxxx) data buffers out of sequence, expected number 1842595, received 1842598    <<<---Received number is higher

Critical bptm(pid=xxxxx) data buffers out of sequence, expected number 17, received 20

BPTM log
11:44:35.144 [5167] <32> write_data: data buffers out of sequence, expected number 17, received 20
11:44:35.144 [5167] <2> write_backup: write_data() returned, exit_status = 174, CINDEX = 0, TWIN_INDEX = 0, backup_status = -8

NBFTSRVR (OID:199) log
06/22/09 11:44:34.297 [Debug] NB 51216 FATServer 199 PID:7667 TID:4124814240 File ID:199 [No context] 1 [ProcessReadWrite] BUFFER sequence error, cmd = 0x0x3b, pipe = 0x0x1, ftseq = 0x0x10, rseq = 0x0x13, state = 0x0x1
06/22/09 11:44:34.297 [Debug] NB 51216 FATServer 199 PID:7667 TID:4124814240 File ID:199 [No context] 1 [ProcessReadWrite] WRITE BUFFER sequence error 16 19


OR

[ProcessReadWrite] DMAbuff in use on pipe 6 IOpending=3 at ftseq 2113882 index 10? <<<---Pending commands for a buffer
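
To check whether a log file contains either signature shown above, a small script can help. The following is a minimal sketch, not a NetBackup tool: the function name scan_lines and the two regular expressions are assumptions built only from the sample log lines in this article, so other releases may word the messages differently.

```python
import re

# Patterns derived from the sample log lines in this article (assumption:
# the message wording is the same in the release being examined).
BPTM_SEQ = re.compile(
    r"data buffers out of sequence, expected number (\d+), received (\d+)"
)
DMA_BUSY = re.compile(r"\[ProcessReadWrite\] DMAbuff in use on pipe (\d+)")


def scan_lines(lines):
    """Return a list of (kind, details) tuples, one per matching log line."""
    hits = []
    for line in lines:
        m = BPTM_SEQ.search(line)
        if m:
            expected, received = int(m.group(1)), int(m.group(2))
            hits.append(("sequence", {
                "expected": expected,
                "received": received,
                # How far ahead of the expected buffer the received one was.
                "skipped": received - expected,
            }))
            continue
        m = DMA_BUSY.search(line)
        if m:
            hits.append(("dma_in_use", {"pipe": int(m.group(1))}))
    return hits
```

A typical use would be to open a bptm or nbftsrvr log file and pass the file object to scan_lines(); an empty result means neither signature was found in that file.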

 

 

Cause

This problem is usually caused by defective initiator drivers, or by an excess of commands sent by client utilities or device monitoring programs.

Solution

To avoid the issue, prevent any client utilities or device monitoring programs on the clients from interfacing with the FT devices. Also verify that all SAN clients have up-to-date initiator drivers; using updated initiator drivers on NetBackup SAN Clients is recommended as a best practice.

Workaround
NetBackup 6.5.6 and later releases include support for a touch file ( /usr/openv/var/FTCLIENT_VALIDATE_VIA_CHECK_CONDITION ).

Create this touch file on each client that experiences the failures, and make sure it contains the value '1'. With the touch file in place, additional Command Descriptor Block (CDB) information is supplied, which counteracts the problems that occur when the additional TUR (Test Unit Ready) data is sent. The trade-off is reduced performance, because the CDBs are sent twice.

NetBackup 7.1.0.3 and later releases also include support for this touch file, and it must likewise contain the value '1'. Creating the file without that value (for example, as an empty file) can have a negative impact on performance and stability.
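
Because an empty touch file is worse than none, it can be worth checking the file's contents on each affected client. The following is a minimal sketch under stated assumptions: the helper names touch_file_state and write_touch_file are hypothetical, and writing the value followed by a newline (as a shell echo would) is assumed to be acceptable, since the article only says the file must contain '1'.

```python
from pathlib import Path

# Path as given in this article.
TOUCH_FILE = Path("/usr/openv/var/FTCLIENT_VALIDATE_VIA_CHECK_CONDITION")


def touch_file_state(path=TOUCH_FILE):
    """Classify the touch file as 'missing', 'empty', 'ok', or 'unexpected'."""
    path = Path(path)
    if not path.exists():
        return "missing"
    content = path.read_text().strip()
    if content == "":
        return "empty"  # the condition the Note in the Problem section warns about
    return "ok" if content == "1" else "unexpected"


def write_touch_file(path=TOUCH_FILE):
    """Create or overwrite the touch file so that it contains the value 1.

    Assumption: a trailing newline after the value is harmless.
    """
    Path(path).write_text("1\n")
```

On a client, touch_file_state() with no argument inspects the default path; write_touch_file() needs write permission on /usr/openv/var and is normally run as root.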


Applies To

NetBackup 6.x / 7.x, SAN Client backups

References

Etrack : 1736627
