Product Enhancement: DMP should try all possibilities to service I/O upon receipt of a SCSI illegal request following HBA failure

Article: 100004151
Last Published: 2014-01-02
Product(s): InfoScale & Storage Foundation

Problem

A DMP device is disabled when only one of its paths has failed. DMP treats the "illegal request" sense key as a device failure, which results in application I/O failure even though the device is healthy and other paths are available.
 
When an I/O error occurs on a path of a DMP device, DMP performs I/O error analysis on that path and, as part of that analysis, sends a SCSI inquiry command down that path. In this instance, the SCSI inquiry returns a check condition status with a sense key of "illegal request". DMP treats this event as a disk failure instead of a path failure, and hence disables the DMP device and fails the application I/O on it.

An "illegal request" typically suggests that the I/O will fail on every path, and hence DMP does not retry. The probable sequence is:

•    DMP sends a correct SCSI inquiry request
•    The HBA layer receives the request but responds with an "illegal request"
•    The device responds to DMP saying the request is invalid
•    Since the response came from the end device, which is typically the same across all paths, DMP fails the I/O

Because the error returned in this case indicated a media failure, DMP did not retry the I/O on any of the other paths. Had the error indicated a path failure, or had the HBA been unable to send the request to the device, DMP would have tried an alternate path.

To handle this event, DMP now tries alternate paths instead of failing the I/O: the action for this case is changed to a DMP_PATH_FAILURE event.
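The change in action can be sketched in C as follows. This is a minimal illustration, not DMP source code: only DMP_KEY_GOOD, DMP_KEY_ILLREQ and the DMP_PATH_FAILURE name appear in this article, while ACTION_NONE, ACTION_DEVICE_FAILURE, ACTION_NOT_MODELLED and classify_sense_key() are hypothetical names. Sense keys other than "illegal request" are deliberately left unmodelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Sense key values as listed in this article. */
#define DMP_KEY_GOOD   0x00
#define DMP_KEY_ILLREQ 0x05

/* Only the DMP_PATH_FAILURE name comes from the article; the others are hypothetical. */
enum dmp_action {
    ACTION_NONE,            /* hypothetical: nothing to do */
    DMP_PATH_FAILURE,       /* fail the path, retry the I/O elsewhere */
    ACTION_DEVICE_FAILURE,  /* hypothetical: fail the whole DMP device */
    ACTION_NOT_MODELLED     /* hypothetical: outside the scope of this sketch */
};

/*
 * Map a sense key to an action.
 * enhanced == false models the behaviour described under "Problem";
 * enhanced == true models the behaviour after the enhancement.
 */
enum dmp_action classify_sense_key(unsigned key, bool enhanced)
{
    assert(key <= 0x0F);  /* the SCSI sense key is a 4-bit field */

    switch (key) {
    case DMP_KEY_GOOD:
        return ACTION_NONE;
    case DMP_KEY_ILLREQ:
        return enhanced ? DMP_PATH_FAILURE : ACTION_DEVICE_FAILURE;
    default:
        return ACTION_NOT_MODELLED;
    }
}
```

With enhanced set to false, an "illegal request" is classified as a device failure; with enhanced set to true, the same sense key is classified as DMP_PATH_FAILURE, so the I/O can be retried on another path.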


 

Error Message

/etc/vx/dmpevents.log


<snippet>
Thu Nov 11 16:45:31.158: I/O error occured on Path c2t0d4s2 belonging to Dmpnode EMC2_3
Thu Nov 11 16:45:31.160: SCSI error occured on Path c2t0d4s2: opcode=0x12 reported illegal request (status=0x2, key=0x5, asc=0x20, ascq=0x0)
Thu Nov 11 16:45:31.160: Media error occured on Dmpnode EMC2_3
Thu Nov 11 16:45:31.160: I/O analysis done on Path c2t0d4s2 belonging to Dmpnode EMC2_3
<snippet>


NOTE:  DMP treats the above SCSI "illegal request" as a media failure, and thus does not try any other path.
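When scanning /etc/vx/dmpevents.log for this signature, the sense tuple can be extracted with a few lines of C. The sketch below assumes the "(status=..., key=..., asc=..., ascq=...)" layout shown in the snippet above; struct scsi_sense_summary, parse_sense_tuple() and is_illegal_request() are hypothetical helper names, not part of DMP. In standard SCSI terms, opcode 0x12 is INQUIRY, status 0x2 is CHECK CONDITION, sense key 0x5 is ILLEGAL REQUEST, and ASC/ASCQ 0x20/0x00 is "invalid command operation code".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the values printed in the log line. */
struct scsi_sense_summary {
    unsigned status;
    unsigned key;
    unsigned asc;
    unsigned ascq;
};

/*
 * Extract the sense tuple from one log line.
 * Returns 0 on success, -1 if the line does not contain a complete tuple.
 */
int parse_sense_tuple(const char *line, struct scsi_sense_summary *out)
{
    assert(line != NULL && out != NULL);

    const char *p = strstr(line, "(status=");
    if (p == NULL)
        return -1;

    /* %x accepts an optional 0x prefix, as printed in the log. */
    if (sscanf(p, "(status=%x, key=%x, asc=%x, ascq=%x)",
               &out->status, &out->key, &out->asc, &out->ascq) != 4)
        return -1;

    return 0;
}

/* True for CHECK CONDITION (0x2) with sense key ILLEGAL REQUEST (0x5). */
bool is_illegal_request(const struct scsi_sense_summary *s)
{
    return s->status == 0x2 && s->key == 0x5;
}
```

Passing the "reported illegal request" line from the snippet to parse_sense_tuple() fills in status 0x2, key 0x5, asc 0x20 and ascq 0x0, and is_illegal_request() then returns true for that tuple.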
 

Cause

DMP is behaving as designed, and the behaviour is currently the same in 5.1SP1.
 
DMP examines the sense key values, which are part of the SCSI standard:
 
/*
 * SCSI Sense Key values
 */
#define DMP_KEY_GOOD            0x00
#define DMP_KEY_RECOVERED       0x01
#define DMP_KEY_NOTRDY          0x02
#define DMP_KEY_MEDERR          0x03
#define DMP_KEY_HWERR           0x04
#define DMP_KEY_ILLREQ          0x05
#define DMP_KEY_UNITATT         0x06
#define DMP_KEY_DATA_PROTECT    0x07
#define DMP_KEY_ABORTED_CMD     0x0A
#define DMP_KEY_ABORTED         0x0B
#define DMP_KEY_OVRFLW          0x0D

 

Solution

DMP Product Enhancement


1) A SCSI "illegal request" is not expected in response to a SCSI inquiry, because any SCSI device should respond to an inquiry if it is accessible.

2) Even in a genuine case of an "illegal request", DMP will now try all the paths before eventually failing the I/O. The only side effect is that the paths are marked as DISABLED, which is the same outcome as when the DMP restore task sends out a SCSI probe and it fails.

Instead of failing the device, DMP now fails the path, so the I/O request is retried via the other enabled paths. This prevents a component failure on a single path from disabling the whole device.
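The retry behaviour described above can be sketched as a loop over the enabled paths. This is an illustration under stated assumptions, not DMP code: struct dmp_path, enum path_state, issue_io_fn, service_io() and demo_issue_io() are hypothetical names, and the demo stub simply pretends that every path on controller c2 fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum path_state { PATH_ENABLED, PATH_DISABLED };   /* hypothetical */

struct dmp_path {                                   /* hypothetical */
    const char     *name;
    enum path_state state;
};

/* Caller-supplied routine that issues the I/O on one path; true on success. */
typedef bool (*issue_io_fn)(const struct dmp_path *path);

/*
 * Try the I/O on each enabled path in turn. A path that fails is marked
 * DISABLED and the next enabled path is tried. Returns the index of the
 * path that serviced the I/O, or -1 once every enabled path has failed.
 */
int service_io(struct dmp_path *paths, size_t npaths, issue_io_fn issue_io)
{
    assert(paths != NULL && issue_io != NULL);

    for (size_t i = 0; i < npaths; i++) {
        if (paths[i].state != PATH_ENABLED)
            continue;
        if (issue_io(&paths[i]))
            return (int)i;
        paths[i].state = PATH_DISABLED;   /* fail the path, not the device */
    }
    return -1;                            /* all paths tried: fail the I/O */
}

/* Demo stub (assumption): I/O fails on any path whose name starts with "c2". */
bool demo_issue_io(const struct dmp_path *path)
{
    return strncmp(path->name, "c2", 2) != 0;
}
```

Given two enabled paths where only the first is on controller c2, service_io() marks the first path DISABLED and returns the index of the second. If every path fails, all of them end up DISABLED and the function returns -1, which corresponds to point 2 above.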

 

Hot-fixes have been created for the following Solaris SPARC and Linux versions:


Solaris (SPARC)    Etrack 2216441    (2201149)    VM 5.0 MP3 RP4 HF9
Linux              Etrack 2216442    (2201149)    VM 5.0 MP4 HF7

 


Applies To

Cross Platform

 

References

Etrack : 2201149
