How to recover a mirrored volume from the out-of-date plex

Article: 100038010
Last Published: 2012-09-07
Product(s): InfoScale & Storage Foundation

Description

Background:
As a result of a cabling issue, both disks of a two-disk mirrored volume were detached. After the hardware issue is fixed, the volume must be recovered from the disk that detached first, in order to get back the older version of the data rather than the current data.

# mount /<mountpoint>     (assumes an entry exists in /etc/vfstab)

Status of disk group and disk :
# vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c2t0d0s2     auto       -            -            error
c2t1d0s2     auto       -            -            error
-            -         mydg01       mydg         failed was:c2t0d0s2
-            -         mydg02       mydg         failed was:c2t1d0s2

Error
Warning: VxVM vxio V-5-0-4 Plex detached from volume
Warning: VxVM vxio V-5-0-386 Subdisk failed in plex in vol
NOTICE: VxVM vxdmp V-5-0-0 i/o error occurred on dmpnode
V-5-1-7934 Disk group : Disabled by errors

Solution

Determine which disk detached first (this disk contains the outdated data), then use the associated plex to recover the volume.

1). Check the system log /var/adm/messages to identify which disk detached first, based on the timeline of the errors.

         In this case, c2t0d0 detached first and c2t1d0 detached some time later.
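
As a rough sketch of the log check in step 1, the helper below lists the VxVM detach and subdisk-failure messages in log order. The function name detach_events is hypothetical; the message IDs are the ones quoted in the Error section, and the exact wording of the log lines on a given system may differ.

```shell
# Hypothetical helper: print VxVM plex-detach / subdisk-failure messages
# from a system log, with line numbers, in the order they were logged.
# Message IDs V-5-0-4 and V-5-0-386 are taken from the Error section.
detach_events() {
    logfile="${1:-/var/adm/messages}"
    egrep -n 'V-5-0-4 |V-5-0-386 ' "$logfile"
}

# The earliest matching lines indicate which plex/subdisk detached first.
if [ -r /var/adm/messages ]; then
    detach_events /var/adm/messages | head -5
fi
```

Because the log is chronological, the first matches point to the disk that holds the older data.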

2). Check the ssb (Serial Split Brain) ID of both disks (for details, see article 000027459).

# vxdisk list c2t0d0 | egrep "update|ssb"
update:    time=1326389192 seqno=0.16
ssb:       actual_seqno=0.0                       << still at “0”

# vxdisk list c2t1d0 | egrep "update|ssb"
update:    time=1326389919 seqno=0.22
ssb:       actual_seqno=0.1                                         << incremented by 1

# /etc/vx/diag.d/vxprivutil dumpconfig /dev/vx/rdmp/c2t0d0s2 | egrep "dm|ssb|#config"
#config: tid=0.1058 nstpool=0 nrvg=0 nrlink=0 ncache=0 nvol=1 nplex=2 nsd=2 ndm=2 nda=0 nexp=0
dm   mydg01
  ssbid=0.0
dm   mydg02
  ssbid=0.0                                                    << less than the actual ssbid=0.1

# /etc/vx/diag.d/vxprivutil dumpconfig /dev/vx/rdmp/c2t1d0s2 | egrep "dm\ |ssb|#config"
#config: tid=0.1059 nstpool=0 nrvg=0 nrlink=0 ncache=0 nvol=1 nplex=2 nsd=2 ndm=2 nda=0 nexp=0
dm   mydg01
  ssbid=0.0
dm   mydg02
  ssbid=0.1                                                                     << expected ssbid=ssb actual_seqno
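
The comparison in step 2 can be sketched as two small filters that pull the update time and the ssb actual_seqno out of `vxdisk list <disk>` output. The function names ssb_seqno and update_time are hypothetical, and the sketch assumes the "update:" and "ssb:" line layout shown above.

```shell
# Hypothetical helpers: read `vxdisk list <disk>` output on stdin.
# Assumes the "update:" and "ssb:" line formats shown in this article.
ssb_seqno() {
    sed -n 's/^ssb:.*actual_seqno=\([0-9.]*\).*/\1/p'
}
update_time() {
    sed -n 's/^update:.*time=\([0-9]*\).*/\1/p'
}

# Print both values side by side for the two disks of the mirror.
if command -v vxdisk >/dev/null 2>&1; then
    for disk in c2t0d0 c2t1d0; do
        info=$(vxdisk list "$disk")
        printf '%s update_time=%s ssb=%s\n' "$disk" \
            "$(printf '%s\n' "$info" | update_time)" \
            "$(printf '%s\n' "$info" | ssb_seqno)"
    done
fi
```

In this case, the disk with the earlier update time and an ssb value still at 0.0 (c2t0d0) is the one that detached first.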

3). Disable access to disk c2t1d0s2 (which holds the up-to-date data) so that it cannot be used to resynchronize the other disk. Use vxdmpadm to disable all paths to the disk.

# vxdmpadm -f disable path=c2t1d0s2

Note: This step is mandatory. Otherwise, once the volume is started, automatic recovery uses the most up-to-date plex. The same happens even if you import the disk group by selecting the outdated disk (for example, vxdg -o selectcp=<id_of_disk_c2t0d0> import <diskgroup>).

4). Forcefully import the disk group.

In this case, the disk group name is mydg. Use the "-f" option to import the disk group forcefully.

# vxdg -f import mydg
VxVM vxdg WARNING V-5-1-560 Disk mydg02: Not found, last known location: c2t1d0s2

# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
c2t0d0s2     auto:cdsdisk    mydg01       mydg         online
c2t1d0s2     auto            -            -            error
-            -         mydg02       mydg         failed was:c2t1d0s2
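
As a quick check after the forced import, the `vxdisk list` output can be filtered to confirm that only the outdated disk is online and that the up-to-date disk is still reported as failed. This is a minimal sketch that assumes the five-column layout shown above; the function names are hypothetical.

```shell
# Hypothetical helpers: read `vxdisk list` output on stdin.
# Assumes the columns DEVICE TYPE DISK GROUP STATUS shown in this article.

# Disk media names whose STATUS column starts with "failed".
failed_disks() {
    awk '$5 == "failed" { print $3 }'
}

# Device names whose STATUS is exactly "online".
online_devices() {
    awk 'NF == 5 && $5 == "online" { print $1 }'
}

if command -v vxdisk >/dev/null 2>&1; then
    echo "online: $(vxdisk list | online_devices)"
    echo "failed: $(vxdisk list | failed_disks)"
fi
```

For the output above, the expectation is that c2t0d0s2 is listed as online and mydg02 as failed before continuing with step 5.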

5). Mount the file system (run fsck if necessary) and verify the data before proceeding.

# fsck -F vxfs /dev/vx/rdsk/mydg/myvol     (if -o full is required, please contact Veritas for assistance)
# mount -F vxfs /dev/vx/dsk/mydg/myvol /tmnt
# umount /tmnt                             (unmount the file system after checking the data)

Note: At this stage, the data may occasionally be corrupted and not recoverable, in which case you may need to revert to the latest copy of the data.

6). If everything is good, re-enable access to the disk to prepare for re-attaching it. In this case, enable the corresponding DMP path.

# vxdmpadm enable path=c2t1d0s2

7). Attach the newly configured disk and sync the plex.

In this case, attach disk c2t1d0 and synchronize it from the ENABLED plex, which resides on disk c2t0d0. Because the ssb IDs do not match when the disk is re-attached, the option to override the ssb ID must be used. During synchronization (progress can be observed with the command "vxtask"), the state of plex "myvol-02" changes from NODEVICE to STALE and finally to ACTIVE (ENABLED).

Note that this is a rare case: customers usually want to recover the more current data, which generally resides on the last available disk.

# vxreattach -br c2t1d0s2
VxVM vxdg ERROR V-5-1-10127 associating disk-media mydg02 with c2t1d0s2:
Serial Split Brain detected. Use -o overridessb to reattach the disk/site

# vxreattach -br -o overridessb c2t1d0s2

# vxtask list
TASKID  PTID TYPE/STATE    PCT   PROGRESS
   167     -     PARENT/R  0.00% 2/0(1) VXRECOVER mydg02 mydg
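
To wait for the resynchronization to finish before moving to the final check, the `vxtask list` output can be polled. The sketch below counts the lines that begin with a numeric task ID; the function name active_tasks and the 10-second polling interval are assumptions, and it waits for all listed tasks, not only the one for mydg02.

```shell
# Hypothetical helper: count task lines in `vxtask list` output on stdin.
# A task line is assumed to start with a numeric TASKID, as shown above.
active_tasks() {
    awk '$1 ~ /^[0-9]+$/ { n++ } END { print n + 0 }'
}

# Poll until no tasks remain (the interval is an arbitrary choice).
if command -v vxtask >/dev/null 2>&1; then
    while [ "$(vxtask list | active_tasks)" -gt 0 ]; do
        sleep 10
    done
    echo "no VxVM tasks running"
fi
```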

8). Finally, check the status of the fully recovered mirrored volume before remounting.
# vxprint -g mydg -ht
dm mydg01       c2t0d0s2     auto     65536    982672   -
dm mydg02       c2t1d0s2     auto     65536    982672   -
v  myvol        -          ENABLED  ACTIVE   262144   SELECT   -      fsgen
pl myvol-01     myvol      ENABLED  ACTIVE   262144   CONCAT   -        RW
sd mydg01-01    myvol-01   mydg01   0        262144   0        c2t0d0   ENA
pl myvol-02     myvol      ENABLED  ACTIVE   262144   CONCAT   -        RW
sd mydg02-01    myvol-02   mydg02   0        262144   0        c2t1d0   ENA
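
The final check can also be scripted. The sketch below reads `vxprint -g mydg -ht` output and reports any plex that is not ENABLED and ACTIVE; it assumes the "pl" record layout shown above (name, volume, kernel state, state), and the function name unhealthy_plexes is hypothetical.

```shell
# Hypothetical helper: read `vxprint -ht` output on stdin and print every
# plex whose kernel state / state is not ENABLED / ACTIVE.
# Assumes "pl" records laid out as: pl NAME VOLUME KSTATE STATE ...
unhealthy_plexes() {
    awk '$1 == "pl" && !($4 == "ENABLED" && $5 == "ACTIVE") { print $2, $4, $5 }'
}

if command -v vxprint >/dev/null 2>&1; then
    bad=$(vxprint -g mydg -ht | unhealthy_plexes)
    if [ -z "$bad" ]; then
        echo "all plexes ENABLED/ACTIVE"
    else
        echo "plexes needing attention:"
        echo "$bad"
    fi
fi
```

An empty result corresponds to the output above, where both myvol-01 and myvol-02 are ENABLED and ACTIVE.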
