SAN Client backups failing intermittently with STATUS 174 and nbftsrvr buffer processing failures in [ProcessReadWrite]
Problem
SAN Client backups fail intermittently with status 174. Some initiator driver errors can cause NetBackup to resend a CDB (Command Descriptor Block) while the FT Media Server still has an outstanding DMA (Direct Memory Access) transfer from the previous command.
Error Message
Job details
Critical bptm(pid=xxxxx) data buffers out of sequence, expected number 1842595, received 1842598 <<<---Received number is higher
Critical bptm(pid=xxxxx) data buffers out of sequence, expected number 17, received 20
BPTM log
11:44:35.144 [5167] <32> write_data: data buffers out of sequence, expected number 17, received 20
11:44:35.144 [5167] <2> write_backup: write_data() returned, exit_status = 174, CINDEX = 0, TWIN_INDEX = 0, backup_status = -8
NBFTSRVR (OID:199) log
06/22/09 11:44:34.297 [Debug] NB 51216 FATServer 199 PID:7667 TID:4124814240 File ID:199 [No context] 1 [ProcessReadWrite] BUFFER sequence error, cmd = 0x0x3b, pipe = 0x0x1, ftseq = 0x0x10, rseq = 0x0x13, state = 0x0x1
06/22/09 11:44:34.297 [Debug] NB 51216 FATServer 199 PID:7667 TID:4124814240 File ID:199 [No context] 1 [ProcessReadWrite] WRITE BUFFER sequence error 16 19
OR
[ProcessReadWrite] DMAbuff in use on pipe 6 IOpending=3 at ftseq 2113882 index 10? <<<---Pending commands for a buffer
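When many log lines need to be checked, the sequence errors above can be pulled out with a short script. The following is a minimal sketch, not a NetBackup tool: the function name find_sequence_errors is hypothetical, the patterns are based only on the sample messages shown above, and the two numbers in the nbftsrvr "WRITE BUFFER sequence error" line are assumed to be the expected and received sequence numbers.

```python
import re

# Pattern for the bptm message shown in the job details and the bptm log.
BPTM_RE = re.compile(
    r"data buffers out of sequence, expected number (\d+), received (\d+)"
)
# Pattern for the nbftsrvr message. Assumption: the two decimal numbers are
# the expected and received sequence numbers, as the sample above suggests.
FT_RE = re.compile(r"WRITE BUFFER sequence error (\d+) (\d+)")


def find_sequence_errors(lines):
    """Return (source, expected, received, gap) for each matching log line."""
    results = []
    for line in lines:
        for source, pattern in (("bptm", BPTM_RE), ("nbftsrvr", FT_RE)):
            match = pattern.search(line)
            if match:
                expected = int(match.group(1))
                received = int(match.group(2))
                results.append((source, expected, received, received - expected))
                break
    return results


if __name__ == "__main__":
    sample = [
        "11:44:35.144 [5167] <32> write_data: data buffers out of sequence, "
        "expected number 17, received 20",
        "[ProcessReadWrite] WRITE BUFFER sequence error 16 19",
    ]
    for row in find_sequence_errors(sample):
        print(row)
```

A positive gap means the received buffer number is ahead of the expected one, which matches the "Received number is higher" annotation in the job details.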
Cause
This issue is usually caused by defective initiator drivers, or by excessive commands issued by client utilities or device monitoring programs.
Solution
To avoid the issue, prevent any client utilities or device monitoring programs on the clients from interfacing with the FT devices. Also verify that all SAN Clients have up-to-date initiator drivers.
Using up-to-date initiator drivers on NetBackup SAN Clients is a recommended best practice.
Workaround
NetBackup 6.5.6 and later releases support a touch file ( /usr/openv/var/FTCLIENT_VALIDATE_VIA_CHECK_CONDITION ).
Create this touch file on each client that experiences the failures, with the value '1' in it. Once the touch file is in place, additional Command Descriptor Block (CDB) information is supplied, which counteracts the problems that occur when the additional TUR (Test Unit Ready) data is sent. The trade-off is a reduction in performance, because the CDBs are sent twice.
NetBackup 7.1.0.3 and later releases also support this touch file. It must be created with the value '1' in it; failure to do so can have a negative impact on performance and stability.
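Creating the touch file can be scripted when several clients are affected. The following is a minimal sketch under stated assumptions: the function name create_touch_file is hypothetical, the script is assumed to run on the client with permission to write under /usr/openv/var, and a trailing newline after the '1' is assumed to be acceptable (as a shell echo would produce).

```python
from pathlib import Path

# Touch file path as given in this article.
TOUCH_FILE = Path("/usr/openv/var/FTCLIENT_VALIDATE_VIA_CHECK_CONDITION")


def create_touch_file(path=TOUCH_FILE, value="1"):
    """Write the value into the touch file and return what was stored."""
    path = Path(path)
    # Assumption: a trailing newline after the value is acceptable.
    path.write_text(value + "\n")
    # Read the file back so the caller can confirm its content.
    return path.read_text().strip()


if __name__ == "__main__":
    stored = create_touch_file()
    print(f"{TOUCH_FILE} now contains: {stored}")
```

Reading the value back after writing gives a simple confirmation that the file holds '1' rather than being empty.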
Applies To
NetBackup 6.x / 7.x, SAN Client backups