On Windows media servers, occasional error 1117 (drive I/O error) failures are seen in the Problems report and in the job details in the Activity Monitor.

Article: 100029110
Last Published: 2013-04-15
Product(s): NetBackup

Problem

HP has identified an issue that can appear on Microsoft Windows Server 2003 and 2008, in both 32-bit and 64-bit editions, when HP Insight Management Agents are installed.

The issue is described as "Random backup failures when HP StorageWorks Ultrium Tape Drive is connected to a LSI based Host Bus Adapter and Storport driver version installed on the system is later than 5.2.3790.3959, due to Insight Manager Storage Agent timeout if the driver returns SCSI status BUSY and Storport driver retries the command unlimited times. In most of the cases, the tape drive will be discovered properly by the Operative System and will work fine when tested with HP Library And Tape Tools. Even if all possible polling to the tape drive is already prevented, the drive will fail backups randomly. System Event Log will not show any data that can be related to a drive or HBA failure (Event IDs 7, 9, 11 or 15). "
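The version condition in HP's description can be checked with a simple dotted-version comparison. The following is a minimal sketch, assuming the installed Storport driver version has already been read by hand (for example from the file properties of storport.sys); the function names and the sample version strings are illustrative and do not come from HP or NetBackup.

```python
def parse_version(text):
    """Split a dotted version string such as '5.2.3790.3959' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Threshold quoted in the HP description above.
THRESHOLD = parse_version("5.2.3790.3959")


def is_later_than_threshold(installed_version):
    """Return True when the installed version is later than 5.2.3790.3959."""
    return parse_version(installed_version) > THRESHOLD


if __name__ == "__main__":
    # Hypothetical version strings, used only to exercise the comparison.
    for candidate in ("5.2.3790.3959", "5.2.3790.4173"):
        print(candidate, is_later_than_threshold(candidate))
```

Comparing tuples of integers avoids the mistakes of plain string comparison, where "5.2.3790.10000" would sort before "5.2.3790.3959".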

 

Error Message

When this occurs, the bptm log at verbose 5 will show the following:

backup start:

16:20:01.452 [7132.6636] <2> bptm: INITIATING (VERBOSE = 5): -pid 7788 -den 6 -rt 8 -rn 0 -cj 1 -mpx 20 -reqid -1341926046 -jm -brm -p NetBackupFull -stunit [MEDIA_SERVER]-hcart-robot-tld-0 -eari 0 -maxfrag 1048576 -v -masterversion 710000 -mediasvr [MEDIA_SERVER]

 

The backup does actually write for some time, as evidenced by the throughput messages in the logs:

16:41:18.118 [7132.6636] <4> report_throughput: VBRT 1 7132 1 1 HP.ULTRIUM4-SCSI.000 NF2529 0 3 0 1 2 13716352 12990144 12880000  39586496(bptm.c.25449)
16:41:18.118 [7132.6636] <2> write_data: Total Kbytes transferred 39586496


At random times, the drive reports an I/O error, which triggers error recovery on the drive:

16:44:18.736 [7132.6636] <2> write_data: write of 65536 bytes indicated only 0 bytes were written, err = 1117
16:44:18.751 [7132.6636] <4> write_data: WriteFile failed with: The request could not be performed because of an I/O device error. (1117);bytes written = 65536; size = 0
16:44:18.751 [7132.6636] <2> send_brm_msg: MEDIA NOT READY
16:44:18.751 [7132.6636] <2> write_data: attempting write error recovery, err = 1117
16:44:18.751 [7132.6636] <2> tape_error_rec: error recovery to block 713471 requested
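To find these failures in a large bptm log, the write_data failure line shown above can be matched with a regular expression. This is a minimal sketch, assuming the log lines follow exactly the layout of the excerpt; the function name and the returned tuple layout are illustrative and are not part of NetBackup.

```python
import re

# Matches the write_data failure line as it appears in the excerpt above.
WRITE_ERROR = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) \[(?P<pid>\d+)\.(?P<tid>\d+)\] <\d+> "
    r"write_data: write of (?P<size>\d+) bytes indicated only "
    r"(?P<written>\d+) bytes were written, err = (?P<err>\d+)"
)


def find_write_errors(lines, err_code=1117):
    """Return (time, pid, requested_bytes) for each failed write with err_code."""
    hits = []
    for line in lines:
        match = WRITE_ERROR.match(line.strip())
        if match and int(match.group("err")) == err_code:
            hits.append(
                (match.group("time"), int(match.group("pid")), int(match.group("size")))
            )
    return hits


if __name__ == "__main__":
    sample = [
        "16:44:18.736 [7132.6636] <2> write_data: write of 65536 bytes "
        "indicated only 0 bytes were written, err = 1117",
        "16:44:18.751 [7132.6636] <2> write_data: attempting write error "
        "recovery, err = 1117",
    ]
    for hit in find_write_errors(sample):
        print(hit)
```

Only the first failure line of each incident is matched; the follow-up recovery messages are skipped so that each I/O error is counted once.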
 

Cause

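HP has identified this issue and has provided a solution. According to HP's description, the random backup failures are due to an Insight Manager Storage Agent timeout that occurs when the driver returns SCSI status BUSY and the Storport driver retries the command an unlimited number of times.

The issue can appear on Microsoft Windows Server, in both 32-bit and 64-bit editions, when HP Insight Management Agents are installed.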

Solution

 

HP documented solution:
 
1. Click Start > Run and type regedit.
2. Open the path HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI\[device identifier name for the LTO-3 tape device]\[numeric device instance id for the LTO-3 tape device]\Device Parameters\
3. Right-click Device Parameters, click New > Key, and name the new key Storport.
4. Right-click the Storport key, click New > DWORD, name the value BusyRetryCount, and set it to 250 (decimal).
5. Exit regedit.
6. Reboot the server.
 
More information about this registry setting can be found in the Microsoft documentation. In this case, the value is changed from the default of 20 to 250.
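For servers with several tape drives, the same change could be prepared as a .reg file rather than typed by hand. The following is a minimal sketch that only renders the file text; the function name is illustrative, and the device identifier and instance id are left as placeholders because they must be read from the registry of the affected server. The manual steps above remain the procedure documented by HP.

```python
def storport_reg_file(device_id, instance_id, busy_retry_count=250):
    """Render .reg file text that sets BusyRetryCount under the Storport key.

    device_id and instance_id are the two variable path components from
    step 2 above; they are not known in advance and must be supplied.
    """
    key = (
        "HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Enum\\SCSI\\"
        f"{device_id}\\{instance_id}\\Device Parameters\\Storport"
    )
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        # .reg files store DWORD values as eight hexadecimal digits.
        f'"BusyRetryCount"=dword:{busy_retry_count:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Placeholder path components, to be replaced with the real values.
    print(storport_reg_file("<device identifier>", "<device instance id>"))
```

Note that the decimal value 250 appears in the file as 000000fa, because .reg files express DWORD data in hexadecimal. A reboot is still required after the value is set.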

 

**** Updated NIC drivers should also be installed. Without the updated drivers, the fix may not work correctly.

 

 
