Veritas Volume Manager (VxVM) 5.1 may report an EMC SRDF R2 device in an "error" state if the source SRDF R1 device is labelled with a Solaris (EFI) disk label when the physical disk size is less than 1TB

Article: 100008766
Last Published: 2017-11-14
Product(s): InfoScale & Storage Foundation

Problem

Veritas Volume Manager (VxVM) 5.1 may report an EMC SRDF R2 device in an "error" state, if the source SRDF R1 device is labelled with a Solaris (EFI) disk label when the physical disk size is less than 1TB.

If the disk is labelled with a Solaris (SMI) disk label, as is appropriate for a disk smaller than 1TB, VxVM continues to report the correct "online" device status for the EMC SRDF R2 related devices.

Error Message


Sample messages
(from the /etc/vx/dmpevents.log and /var/adm/messages files)
 

Applicable to the EMC SRDF R2 (Write-Disabled "WD") lun with the Solaris "EFI" disk label:

/etc/vx/dmpevents.log
Fri Jul  6 15:15:38.466: Disabled Path c1t5006048C536979A0d65s2 belonging to Dmpnode emc1_0072 due to path failure
 

# grep dmp_tur /var/adm/messages | grep 0xca | tail
Jul  6 15:07:05 bashful vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x13d/0xca

 

Cause


Testing revealed that the SRDF-R2 lun is reported as "error" if the SRDF-R1 lun, less than 1TB in size, is stamped with an EFI label.


If the source SRDF-R1 lun (SYMDEV "008c") is labelled with a Solaris "SMI" disk label, the SRDF-R2 device no longer reports the false VxVM "error" device status.
The "error" state normally refers to disks reported in a Not-Ready (NR) state, or highlights a potential disk (hardware) failure.


Dopey # vxdg -g SRDFdg rmdisk emc0_008c
Dopey # /etc/vx/bin/vxdisksetup -i emc0_008c label=smi



If this is not done, the "dmp_tur_temp_pgr" messages will continue to be reported in the /var/adm/messages file:

Jul  6 15:07:05 bashful vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x13d/0xca


Minor number 0xca (hex) is 202 in decimal
============================


Bashful # ls -al /dev/vx/rdmp/* | grep -w 202

crw-------   1 root     root     317, 202 Jul  6 12:49 /dev/vx/rdmp/emc1_0072s2
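
The conversion above can be sketched in shell. This is only an illustration: decode_dev is a hypothetical helper name (not a VxVM command), and it assumes the token always has the form dev=0xMAJOR/0xMINOR as in the sample message.

```shell
# Hypothetical helper: convert the "dev=0xMAJOR/0xMINOR" token from a vxdmp
# message into decimal major and minor numbers.
decode_dev() {
    dev=${1#dev=}          # strip the leading "dev=" if present
    major_hex=${dev%%/*}   # text before the slash, e.g. 0x13d
    minor_hex=${dev##*/}   # text after the slash, e.g. 0xca
    printf '%d %d\n' "$major_hex" "$minor_hex"
}

decode_dev 'dev=0x13d/0xca'   # prints: 317 202
```

The two numbers correspond to the "317, 202" major and minor pair shown by ls for /dev/vx/rdmp/emc1_0072s2.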


Following the disk label change from EFI to SMI, the input redirection workaround ("< /dev/rdsk/c#t#d#") needs to be applied to the affected paths.


Bashful # vxdisk -eo alldgs list | grep -i srdf
emc1_0072    auto           -            -           error                c1t5006048C536979A0d65 srdf-r2
emc1_0073    auto:cdsdisk   -            (SRDFdg)    online               c1t5006048C536979A0d66s2 srdf-r2
emc1_0074    auto:cdsdisk   -            (SRDFdg)    online               c1t5006048C536979A0d67s2 srdf-r2
emc1_0075    auto:cdsdisk   -            (SRDFdg)    online               c1t5006048C536979A0d68s2 srdf-r2
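
A listing like the one above can be filtered for SRDF R2 devices in the "error" state. The sketch below is an assumption-laden illustration: srdf_r2_in_error is a hypothetical helper name, and it assumes the seven-column layout and single-word status values seen in this sample output.

```shell
# Hypothetical helper: read "vxdisk -eo alldgs list" output on stdin and
# print the device name of every srdf-r2 device whose status is "error".
# Assumes column 5 is the status and column 7 is the attribute, as in the
# sample listing.
srdf_r2_in_error() {
    awk '$7 == "srdf-r2" && $5 == "error" { print $1 }'
}

# Demo on two lines copied from the sample listing.
srdf_r2_in_error <<'EOF'
emc1_0072    auto           -            -           error                c1t5006048C536979A0d65 srdf-r2
emc1_0073    auto:cdsdisk   -            (SRDFdg)    online               c1t5006048C536979A0d66s2 srdf-r2
EOF
```

On a live system the function could be fed directly, for example: vxdisk -eo alldgs list | srdf_r2_in_error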


Bashful # vxdisk list emc1_0072
Device:    emc1_0072
devicetag: emc1_0072
type:      auto
flags:     error private autoconfig
pubpaths:  block=/dev/vx/dmp/emc1_0072 char=/dev/vx/rdmp/emc1_0072
guid:      {a4b7e6d6-c772-11e1-af5d-0003baa8421b}
udid:      EMC%5FSYMMETRIX%5F000290301414%5F1400072000
site:      -
errno:     Configuration daemon error 6
Multipathing information:
numpaths:   1
c1t5006048C536979A0d65  state=disabled


Workaround to remove stale OS EFI device handle:


- Redirect I/O to the applicable paths


Bashful # </dev/rdsk/c1t5006048C536979A0d65

Bashful # vxdisk scandisks

Bashful # vxdisk list emc1_0072

Device:    emc1_0072
devicetag: emc1_0072
type:      auto
hostid:    rdgv240sol15
disk:      name= id=1341584044.142.dopey
group:     name=SRDFdg id=1341574853.131.dopey
info:      format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags:     online ready private autoconfig autoimport
pubpaths:  block=/dev/vx/dmp/emc1_0072s2 char=/dev/vx/rdmp/emc1_0072s2
guid:      {d76e4b54-c774-11e1-af5d-0003baa8421b}
udid:      EMC%5FSYMMETRIX%5F000290301414%5F1400072000
site:      -
version:   3.1
iosize:    min=512 (bytes) max=2048 (blocks)
public:    slice=2 offset=65792 len=4058368 disk_offset=0
private:   slice=2 offset=256 len=65536 disk_offset=0
update:    time=1341584051 seqno=0.7
ssb:       actual_seqno=0.0
headers:   0 240
configs:   count=1 len=48144
logs:      count=1 len=7296
Defined regions:
 config   priv 000048-000239[000192]: copy=01 offset=000000 enabled
 config   priv 000256-048207[047952]: copy=01 offset=000192 enabled
 log      priv 048208-055503[007296]: copy=01 offset=000000 enabled
 lockrgn  priv 055504-055647[000144]: part=00 offset=000000
Multipathing information:
numpaths:   1
c1t5006048C536979A0d65s2        state=enabled

Bashful # date
Friday,  6 July 2012 15:18:35 BST


/etc/vx/dmpevents.log ( whilst an EFI lun )
=================
Fri Jul  6 15:15:38.466: Disabled Path c1t5006048C536979A0d65s2 belonging to Dmpnode emc1_0072 due to path failure


Bashful # grep dmp_tur /var/adm/messages | grep 0xca | tail
Jul  6 15:15:48 bashful vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x13d/0xca
Jul  6 15:16:41 bashful vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x13d/0xca
Jul  6 15:16:46 bashful vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x13d/0xca
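
To check whether these messages are still being logged after a label change, the open failures can be counted per device. This is a minimal sketch: count_tur_open_failures is a hypothetical helper name, and it assumes the dev= token is the last field of each message, as in the samples.

```shell
# Hypothetical helper: read syslog lines on stdin and print, for each
# "dev=..." token, how many dmp_tur_temp_pgr open failures were logged.
count_tur_open_failures() {
    awk '/dmp_tur_temp_pgr: open failed/ { n[$NF]++ }
         END { for (d in n) print d, n[d] }'
}

# Demo on the three sample messages.
count_tur_open_failures <<'EOF'
Jul  6 15:15:48 bashful vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x13d/0xca
Jul  6 15:16:41 bashful vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x13d/0xca
Jul  6 15:16:46 bashful vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x13d/0xca
EOF
```

On a live system the input would be /var/adm/messages, for example: count_tur_open_failures < /var/adm/messages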
 

Solution


Change the Solaris disk label from EFI to SMI

 

If the impacted disk (emc0_008c) is labelled with a Solaris "SMI" disk label, neither the DMP-related "dmp_tur_temp_pgr" messages nor the "path failure" messages are reported for the SRDF-R2 related devices, because the paths are no longer disabled. This avoids the VxVM "error" device status.


The data residing on the source SRDF R1 lun needs to be relocated, so the disk can be relabelled with a valid SMI Solaris disk label. It is also advisable to re-initialize the disk using the vxdisksetup command to ensure the VxVM VTOC is updated for the specific disk.

# vxdisksetup -i <da-name>

 


With the product enhancement, the Solaris EFI labelled SRDF-R2 LUN will no longer go into an error state.
The required change is implemented as part of vxdmp, /kernel/drv/sparcv9/vxdmp.

Background:
 
SCSI inquiry shows the SRDF R2 LUN in a write-protected mode (offset=108, first bit = 0x01).
 
# vxscsiinq -d /dev/vx/rdmp/emc1_0072
[… …]
Bytes: 104 - 111    0x12  0x00  0x40  0x00  0x55  0x10  0x6a  0x16  ..@.U.j.
[… …]
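
In the row shown, offset 108 is the fifth byte (0x55). The bit test can be sketched as below. This assumes that "first bit" means the least significant bit (mask 0x01); is_write_protected is a hypothetical helper name, not part of vxscsiinq.

```shell
# Hypothetical helper: succeed if the least significant bit of the given
# byte is set. Assumes "first bit = 0x01" refers to mask 0x01.
is_write_protected() {
    [ $(( $1 & 0x01 )) -ne 0 ]
}

# Byte at offset 108 in the sample inquiry data is 0x55.
if is_write_protected 0x55; then
    echo "write protected"
else
    echo "writable"
fi
```

With the sample value 0x55 the low bit is set, so the script prints "write protected".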


The SRDF R2 device with an EFI label goes into the error state because the library call efi_alloc_and_read() fails on the character device (cdev): prtvtoc (which calls efi_alloc_and_read()) works on the raw device but fails on the DMP path. On the R1 side, the EFI label placed on a disk smaller than 1TB works fine.
 
The function efi_alloc_and_read() is called by both vxconfigd and prtvtoc. In either case it issues a DKIOCGGEOM pass-through ioctl to check whether the device is an EFI device. If the ioctl fails with an ENOTSUP response, the device is treated as an EFI device.
 

If any pass-through ioctl fails, DMP calls the Veritas APM routine or the DMP internal routine to get the "path state", and starts a path failover if needed. Because the device is in a write-protected state, the get path state routine for the EMC A/A array type returns DMP_PATH_FAILURE, so the path and dmpnode are disabled. The following ioctl issued by vxconfigd or prtvtoc (DKIOCGEXTVTOC) then fails with VT_EIO because no active path is detected.
 
In 5.0MP3RP5, DMP does not check the "path state" or initiate a failover within the pass-through ioctl procedure, so the issue does not occur in that release.

This is a Solaris-specific issue, as the efi_alloc_and_read() library call performs the DKIOC ioctls on Solaris only.
On other platforms, the EFI partition table is detected and allocated by reading the disk header, so the path is not disabled and the DA online does not fail.

With the enhancement, the path is no longer disabled even when the LUN is incorrectly configured with the wrong Solaris label (EFI where SMI should be used).
 


Applies To


Solaris 10 - VxVM 5.1 SP1 RP2 P2 HF8 example


Sample configuration

Server "Dopey" contains the SRDF-R1 luns, 4 in total: one is configured with a Solaris EFI disk label, and the other three are configured with Solaris "SMI" disk labels, as they are less than 1TB in size.

Server "Bashful" contains the SRDF-R2 luns, 4 in total, which inherit the Solaris disk content from the SRDF R1 devices.


The Veritas Volume Manager (VxVM) diskgroup is created on the production server (Dopey) consisting of the EMC SRDF R1 devices.


Dopey # vxdg init SRDFdg emc0_008c emc0_008d emc0_008e emc0_008f


Note: 008c is an EFI disk (no s2 slice). The other disks have a Solaris "SMI" disk label and do not impact the SRDF R2 related devices.


Dopey # vxdisk -eo alldgs list | grep -i srdf
emc0_008c    auto:cdsdisk   emc0_008c    SRDFdg      online               c1t5006048C5368E580d266 srdf-r1                 <<< EFI disk
emc0_008d    auto:cdsdisk   emc0_008d    SRDFdg      online               c1t5006048C5368E580d267s2 srdf-r1
emc0_008e    auto:cdsdisk   emc0_008e    SRDFdg      online               c1t5006048C5368E580d268s2 srdf-r1
emc0_008f    auto:cdsdisk   emc0_008f    SRDFdg      online               c1t5006048C5368E580d269s2 srdf-r1
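
Following the note above, the OS native name column gives a quick hint of the label type: the EFI disk is listed without an s2 slice. The sketch below is only a heuristic drawn from this sample listing, and label_from_os_name is a hypothetical helper name; prtvtoc or format remain the authoritative checks.

```shell
# Hypothetical helper: guess the label type from the OS native name shown
# by "vxdisk -e list". Heuristic only: a name ending in "s2" is taken as
# SMI, anything else as EFI, matching the sample listing.
label_from_os_name() {
    case $1 in
        *s2) echo smi ;;
        *)   echo efi ;;
    esac
}

label_from_os_name c1t5006048C5368E580d266     # prints: efi
label_from_os_name c1t5006048C5368E580d267s2   # prints: smi
```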


Bashful
======


The DR server (Bashful) has the SRDF R2 luns provisioned. The SRDF R2 related luns will be in a write-disabled state whilst the SRDF replication state is active.


Bashful # vxdisk -eo alldgs list | grep -i srdf
emc1_0072    auto           -            -           error                c1t5006048C536979A0d65 srdf-r2          <<< Side effect of the EMC SRDF R1 lun containing an EFI label
emc1_0073    auto:cdsdisk   -            (SRDFdg)    online               c1t5006048C536979A0d66s2 srdf-r2
emc1_0074    auto:cdsdisk   -            (SRDFdg)    online               c1t5006048C536979A0d67s2 srdf-r2
emc1_0075    auto:cdsdisk   -            (SRDFdg)    online               c1t5006048C536979A0d68s2 srdf-r2

 

References

Etrack : 2850929
