How to replace a faulty Solaris boot disk (boot device)

Article: 100037917
Last Published: 2014-01-02
Product(s): InfoScale & Storage Foundation

Description


This article explains the steps required to replace a faulty boot disk (boot device) on Solaris when the disk is under Veritas Volume Manager (VxVM) control.


Figure 1.0
 



In the above example, disk media (dm) name "rootdg01" has failed and needs to be replaced.


Configuration details
 

# modinfo | grep vx
35  1347360  37f28 308   1  vxdmp (VxVM 5.0-2006-05-11a: DMP Drive)
37 7c002000 337840 309   1  vxio (VxVM 5.0-2006-05-11a I/O driver)
39  137b4e0    d48 310   1  vxspec (VxVM 5.0-2006-05-11a control/st)
188 7b7ff338    c30 311   1  vxportal (VxFS 5.0_REV-5.0A55_sol portal )
189 7ae00000 1ba6d0  21   1  vxfs (VxFS 5.0_REV-5.0A55_sol SunOS 5)

 
# uname -a
SunOS dopey 5.10 Generic_138888-01 sun4v sparc SUNW,T5140

 
# cat /etc/release
                     Solaris 10 10/08 s10s_u6wos_07b SPARC
          Copyright 2008 Sun Microsystems, Inc.  All Rights Reserved.
                       Use is subject to license terms.
                           Assembled 27 October 2008


Veritas Volume Manager (VxVM) disk content



# vxdisk -eo alldgs list
DEVICE       TYPE      DISK         GROUP        STATUS       OS_NATIVE_NAME
c1t0d0s2     auto      -             -            error        c1t0d0s2                 <<<<<<<<<<< disk to replace
c1t1d0s2     auto      -             -            online       c1t1d0s2
c1t2d0s2     auto      rootdg02      rootdg       online       c1t2d0s2
c1t3d0s2     auto      rootdg03      rootdg       online       c1t3d0s2
-            -         rootdg01     rootdg       failed was:c1t0d0s2
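
The failed record can also be picked out of this listing by a script. The following is a minimal sketch that assumes the column layout shown above; failed_dm is a hypothetical helper name, not a VxVM command.

```shell
# failed_dm: read "vxdisk list" style output on stdin and print
# "<disk media name> <last known device>" for every failed record.
# Assumes the column layout shown in this article.
failed_dm() {
    awk '$5 == "failed" && $6 ~ /^was:/ { sub(/^was:/, "", $6); print $3, $6 }'
}

# Example using two lines captured from the output above.
# prints: rootdg01 c1t0d0s2
failed_dm <<'EOF'
c1t2d0s2     auto      rootdg02      rootdg       online       c1t2d0s2
-            -         rootdg01     rootdg       failed was:c1t0d0s2
EOF
```

On a live system the same function would be fed from the real command, for example: vxdisk -eo alldgs list | failed_dm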



Disk Healthcheck


Boot disk c1t0d0s2 has failed; the disk label (VTOC) can no longer be read.


# prtvtoc /dev/rdsk/c1t0d0s2
prtvtoc: /dev/rdsk/c1t0d0s2: Unable to read Disk geometry errno = 0x5



Current boot device


The system is currently booted from c1t2d0, as shown by the Solaris prtconf command:


# prtconf -vp | grep boot
        bootarchive:  '/ramdisk-root'
        bootfs:  fe942968
        bootargs:  00
        bootpath:  '/pci@400/pci@0/pci@8/scsi@0/disk@2,0:a'   <<<<<<      The Solaris server is currently booted from c1t2d0s2 ( aka rootdg02 )
        reboot-command:
        auto-boot-on-error?:  'false'
        auto-boot?:  'false'
        network-boot-arguments:
        boot-command:  'boot'
        boot-file:
        boot-device:  '/pci@400/pci@0/pci@8/scsi@0/disk@0,0:a disk net'
        multipath-boot?:  'false'
        boot-device-index:  '0'
        error-reset-recovery:  'boot'
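
Note that bootpath (disk@2,0:a) differs from the first entry in boot-device (disk@0,0:a), which is what indicates that the server came up from the mirror. A minimal sketch for extracting these values follows; boot_path and boot_target are hypothetical helper names, and the sketch assumes the device path ends in the form disk@<target>,<lun>:<slice> as in the output above.

```shell
# boot_path: print the bootpath property from "prtconf -vp" output on stdin.
boot_path() {
    awk -F"'" '/bootpath:/ { print $2; exit }'
}

# boot_target: print the target field of a boot path read from stdin,
# for example ".../disk@2,0:a" gives 2.
boot_target() {
    sed -n 's/.*@\([0-9a-f]*\),[0-9a-f]*:[a-z]$/\1/p'
}

# Example using the bootpath line captured above.
# prints: 2
boot_path <<'EOF' | boot_target
        bootargs:  00
        bootpath:  '/pci@400/pci@0/pci@8/scsi@0/disk@2,0:a'
EOF
```

On a live system: prtconf -vp | boot_path | boot_target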




New in VxVM 6.0

From VxVM 6.0 onwards, the bootpath (the disk the server is actually booted from) can be displayed with the VxVM vxeeprom command:


Sample output


# vxeeprom bootpath
/pci@1c,600000/scsi@2/disk@1,0:a


This removes the need to run O/S-specific commands such as prtconf (Solaris SPARC only).



Diskgroup configuration prior to disk replacement


# vxprint -qhtg rootdg
dg rootdg       default      default  72000    1232444437.8.dopey

dm rootdg01     -            -        -        -        NODEVICE                            <<<<< disk needs to be replaced
dm rootdg02     c1t2d0s2     auto     101759   286596864 -
dm rootdg03     c1t3d0s2     auto     81151    286596864 SPARE

v  rootdg017vol -            ENABLED  ACTIVE   1444992  ROUND     -        gen
pl rootdg017vol-01 rootdg017vol ENABLED ACTIVE 1444992  CONCAT    -        RW
sd rootdg03-03  rootdg017vol-01 rootdg03 285151872 1444992 0      c1t3d0   ENA
pl rootdg017vol-02 rootdg017vol ENABLED ACTIVE 1444992  CONCAT    -        RW
sd rootdg02-03  rootdg017vol-02 rootdg02 285151872 1444992 0      c1t2d0   ENA

v  rootvol      -            ENABLED  ACTIVE   251693184 ROUND    -        root
pl rootvol-02   rootvol      ENABLED  ACTIVE   251693184 CONCAT   -        RW
sd rootdg02-02  rootvol-02   rootdg02 33458688 251693184 0        c1t2d0   ENA
pl rootvol-03   rootvol      ENABLED  ACTIVE   251693184 CONCAT   -        RW
sd rootdg03-01  rootvol-03   rootdg03 0        251693184 0        c1t3d0   ENA

v  swapvol      -            ENABLED  ACTIVE   33458688 ROUND     -        swap
pl swapvol-02   swapvol      ENABLED  ACTIVE   33458688 CONCAT    -        RW
sd rootdg02-01  swapvol-02   rootdg02 0        33458688 0         c1t2d0   ENA
pl swapvol-03   swapvol      ENABLED  ACTIVE   33458688 CONCAT    -        RW
sd rootdg03-02  swapvol-03   rootdg03 251693184 33458688 0        c1t3d0   ENA
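
Before the faulty disk is pulled, it is worth confirming that every volume still has healthy plexes on the surviving disks. The following is a minimal sketch that counts ENABLED/ACTIVE plexes per volume from the listing format shown above; active_plexes is a hypothetical helper name.

```shell
# active_plexes: read "vxprint -qhtg <diskgroup>" output on stdin and print
# "<volume> <count>" for plexes whose kernel state is ENABLED and whose
# state is ACTIVE. Volumes with no such plex are not printed at all.
active_plexes() {
    awk '$1 == "pl" && $4 == "ENABLED" && $5 == "ACTIVE" { n[$3]++ }
         END { for (v in n) print v, n[v] }' | sort
}

# Example using the rootvol records captured above.
# prints: rootvol 2
active_plexes <<'EOF'
v  rootvol      -            ENABLED  ACTIVE   251693184 ROUND    -        root
pl rootvol-02   rootvol      ENABLED  ACTIVE   251693184 CONCAT   -        RW
sd rootdg02-02  rootvol-02   rootdg02 33458688 251693184 0        c1t2d0   ENA
pl rootvol-03   rootvol      ENABLED  ACTIVE   251693184 CONCAT   -        RW
sd rootdg03-01  rootvol-03   rootdg03 0        251693184 0        c1t3d0   ENA
EOF
```

On a live system: vxprint -qhtg rootdg | active_plexes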



Steps


1.] As the disk is reported as "failed was:", the Veritas Disk Access (DA) name can be removed from VxVM's view.

Remove the Veritas Disk Access (DA) name of the disk to be replaced, i.e. c1t0d0s2 in this instance.


# vxdisk rm c1t0d0s2

# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
c1t1d0s2     auto:none       -            -            online invalid
c1t2d0s2     auto:sliced     rootdg02     rootdg       online
c1t3d0s2     auto:sliced     rootdg03     rootdg       online spare
-            -         rootdg01     rootdg       failed was:c1t0d0s2


2.] View the O/S device handles prior to removing the faulty disk.


# cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
c1                             scsi-bus     connected    configured   unknown
c1::dsk/c1t0d0                 disk         connected    configured   unknown   <<<<<< access path to be removed
c1::dsk/c1t1d0                 disk         connected    configured   unknown
c1::dsk/c1t2d0                 disk         connected    configured   unknown
c1::dsk/c1t3d0                 disk         connected    configured   unknown
usb0/1                         unknown      empty        unconfigured ok
usb0/2                         unknown      empty        unconfigured ok
usb0/3                         unknown      empty        unconfigured ok
usb1/1                         unknown      empty        unconfigured ok
usb1/2                         unknown      empty        unconfigured ok
usb2/1                         unknown      empty        unconfigured ok
usb2/2                         usb-storage  connected    configured   ok
usb2/3                         unknown      empty        unconfigured ok
usb2/4                         usb-hub      connected    configured   ok
usb2/4.1                       unknown      empty        unconfigured ok
usb2/4.2                       unknown      empty        unconfigured ok
usb2/4.3                       unknown      empty        unconfigured ok
usb2/4.4                       unknown      empty        unconfigured ok
usb2/5                         unknown      empty        unconfigured ok


3.] Disable all the paths relating to the faulty boot disk. In this instance, there is a single path to c1t0d0s2.


# vxdmpadm -f disable path=c1t0d0s2


4.] Unconfigure the O/S device handles.


In this instance, the cfgadm interface can be used to unconfigure the internal boot device instance.


# cfgadm -c unconfigure c1::dsk/c1t0d0

# cfgadm -al

Ap_Id                          Type         Receptacle   Occupant     Condition
c1                             scsi-bus     connected    configured   unknown
c1::dsk/c1t0d0                 disk         connected    unconfigured   unknown   <<<<<<<<<<<< unconfigured
c1::dsk/c1t1d0                 disk         connected    configured   unknown
c1::dsk/c1t2d0                 disk         connected    configured   unknown
c1::dsk/c1t3d0                 disk         connected    configured   unknown
usb0/1                         unknown      empty        unconfigured ok
usb0/2                         unknown      empty        unconfigured ok
usb0/3                         unknown      empty        unconfigured ok
usb1/1                         unknown      empty        unconfigured ok
usb1/2                         unknown      empty        unconfigured ok
usb2/1                         unknown      empty        unconfigured ok
usb2/2                         usb-storage  connected    configured   ok
usb2/3                         unknown      empty        unconfigured ok
usb2/4                         usb-hub      connected    configured   ok
usb2/4.1                       unknown      empty        unconfigured ok
usb2/4.2                       unknown      empty        unconfigured ok
usb2/4.3                       unknown      empty        unconfigured ok
usb2/4.4                       unknown      empty        unconfigured ok
usb2/5                         unknown      empty        unconfigured ok
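
The occupant state of a single attachment point can be read from this listing without scanning it by eye. A minimal sketch follows, assuming the five-column cfgadm -al layout shown above; occupant_state is a hypothetical helper name.

```shell
# occupant_state: read "cfgadm -al" output on stdin and print the Occupant
# column for the Ap_Id given as the first argument. Prints nothing if the
# attachment point is not listed.
occupant_state() {
    awk -v ap="$1" '$1 == ap { print $4 }'
}

# Example using lines captured from the output above.
# prints: unconfigured
occupant_state c1::dsk/c1t0d0 <<'EOF'
c1                             scsi-bus     connected    configured   unknown
c1::dsk/c1t0d0                 disk         connected    unconfigured   unknown
c1::dsk/c1t1d0                 disk         connected    configured   unknown
EOF
```

On a live system: cfgadm -al | occupant_state c1::dsk/c1t0d0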


5.] Clean up the stale O/S device handles.


# devfsadm -Cvc disk
devfsadm[27465]: verbose: removing file: /dev/dsk/c1t0d0s0
devfsadm[27465]: verbose: removing file: /dev/dsk/c1t0d0s1
devfsadm[27465]: verbose: removing file: /dev/dsk/c1t0d0s2
devfsadm[27465]: verbose: removing file: /dev/dsk/c1t0d0s3
devfsadm[27465]: verbose: removing file: /dev/dsk/c1t0d0s4
devfsadm[27465]: verbose: removing file: /dev/dsk/c1t0d0s5
devfsadm[27465]: verbose: removing file: /dev/dsk/c1t0d0s6
devfsadm[27465]: verbose: removing file: /dev/dsk/c1t0d0s7
devfsadm[27465]: verbose: removing file: /dev/rdsk/c1t0d0s0
devfsadm[27465]: verbose: removing file: /dev/rdsk/c1t0d0s1
devfsadm[27465]: verbose: removing file: /dev/rdsk/c1t0d0s2
devfsadm[27465]: verbose: removing file: /dev/rdsk/c1t0d0s3
devfsadm[27465]: verbose: removing file: /dev/rdsk/c1t0d0s4
devfsadm[27465]: verbose: removing file: /dev/rdsk/c1t0d0s5
devfsadm[27465]: verbose: removing file: /dev/rdsk/c1t0d0s6
devfsadm[27465]: verbose: removing file: /dev/rdsk/c1t0d0s7
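
The commands from steps 1, 3, 4 and 5 can be collected into a small wrapper that only prints them unless it is explicitly told to execute. This is a minimal sketch, not a supported tool: the run and detach_failed_disk names and the DA, AP and RUN variables are hypothetical, and the device names are the ones used in this article.

```shell
# Device names from this article; override them for another system.
DA=${DA:-c1t0d0s2}           # VxVM disk access name of the failed disk
AP=${AP:-c1::dsk/c1t0d0}     # cfgadm attachment point of the failed disk

# run: execute the command only when RUN=1, otherwise just print it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# detach_failed_disk: the commands from steps 1, 3, 4 and 5, in order.
detach_failed_disk() {
    run vxdisk rm "$DA"                  # step 1
    run vxdmpadm -f disable path="$DA"   # step 3
    run cfgadm -c unconfigure "$AP"      # step 4
    run devfsadm -Cvc disk               # step 5
}

# Dry run by default: prints the four commands prefixed with "+ ".
detach_failed_disk
```

Printing by default is a deliberate choice here, since these commands act on a boot disk and a wrong device name is costly. The cfgadm -al checks in steps 2 and 4 are still best reviewed by hand between commands.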


6.] Remove the faulty disk.


Faulty disk removed


# cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
c1                             scsi-bus     connected    configured   unknown
c1::dsk/c1t1d0                 disk         connected    configured   unknown
c1::dsk/c1t2d0                 disk         connected    configured   unknown
c1::dsk/c1t3d0                 disk         connected    configured   unknown
usb0/1                         unknown      empty        unconfigured ok
usb0/2                         unknown      empty        unconfigured ok
usb0/3                         unknown      empty        unconfigured ok
usb1/1                         unknown      empty        unconfigured ok
usb1/2                         unknown      empty        unconfigured ok
usb2/1                         unknown      empty        unconfigured ok
usb2/2                         usb-storage  connected    configured   ok
usb2/3                         unknown      empty        unconfigured ok
usb2/4                         usb-hub      connected    configured   ok
usb2/4.1                       unknown      empty        unconfigured ok
usb2/4.2                       unknown      empty        unconfigured ok
usb2/4.3                       unknown      empty        unconfigured ok
usb2/4.4                       unknown      empty        unconfigured ok
usb2/5                         unknown      empty        unconfigured ok


7.] Refresh the VxVM device details before the new disk is inserted.


# vxdctl enable


Note: Make sure the Veritas Disk Access (da) name (c1t0d0s2) relating to the faulty boot device is no longer listed by "vxdisk list", prior to inserting the replacement disk.


# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
c1t1d0s2     auto:none       -            -            online invalid
c1t2d0s2     auto:sliced     rootdg02     rootdg       online
c1t3d0s2     auto:sliced     rootdg03     rootdg       online spare
-            -         rootdg01     rootdg       failed was:c1t0d0s2
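
A plain grep for c1t0d0s2 is misleading for this check, because the name still appears in the "failed was:" record. The following minimal sketch tests only the DEVICE column; da_listed is a hypothetical helper name.

```shell
# da_listed: exit 0 if the DA name given as the first argument appears in
# the DEVICE column of "vxdisk list" output read from stdin, else exit 1.
da_listed() {
    awk -v da="$1" '$1 == da { found = 1 } END { exit !found }'
}

# Example using lines captured from the output above.
# prints: c1t0d0s2 is no longer listed
if da_listed c1t0d0s2 <<'EOF'
c1t2d0s2     auto:sliced     rootdg02     rootdg       online
-            -         rootdg01     rootdg       failed was:c1t0d0s2
EOF
then
    echo "c1t0d0s2 is still listed; do not insert the new disk yet"
else
    echo "c1t0d0s2 is no longer listed"
fi
```

On a live system: vxdisk list | da_listed c1t0d0s2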


8.] Insert the replacement disk


NEW disk inserted



# tail -f /var/adm/messages
<snippet>
Sep  2 21:16:53 dopey genunix: [ID 408114 kern.info] /pci@400/pci@0/pci@8/scsi@0/sd@0,0 (sd0) offline
Sep  2 21:27:36 dopey SC Alert: [ID 394168 daemon.notice] IPMI | minor: ID =    3 : 09/02/2011 : 11:46:55 : Entity Presence : /HDD0/PRSNT : Device Absent
Sep  2 21:27:48 dopey genunix: [ID 408114 kern.info] /pci@400/pci@0/pci@8/scsi@0/sd@0,0 (sd0) offline
Sep  2 21:28:29 dopey SC Alert: [ID 404314 daemon.notice] IPMI | minor: ID =    4 : 09/02/2011 : 11:48:00 : Entity Presence : /HDD0/PRSNT : Device Present
Sep  2 21:28:46 dopey scsi: [ID 193665 kern.info] sd0 at mpt0: target 0 lun 0
Sep  2 21:28:46 dopey genunix: [ID 936769 kern.info] sd0 is /pci@400/pci@0/pci@8/scsi@0/sd@0,0
Sep  2 21:28:46 dopey genunix: [ID 408114 kern.info] /pci@400/pci@0/pci@8/scsi@0/sd@0,0 (sd0) online
Sep  2 21:28:49 dopey SC Alert: [ID 624537 daemon.error] Chassis | major: Hot insertion of HDD0
Sep  2 21:28:51 dopey vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 308/0x18
Sep  2 21:28:51 dopey vxdmp: [ID 736771 kern.notice] NOTICE: VxVM vxdmp V-5-0-148 enabled path 32/0x0 belonging to the dmpnode 308/0x18
<snippet>
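
The state of the sd instance can be extracted from the log rather than followed interactively. This minimal sketch assumes the genunix message format shown above, where a line ends in "(sd0) online" or "(sd0) offline"; sd_state is a hypothetical helper name.

```shell
# sd_state: read /var/adm/messages style input on stdin and print the most
# recent online/offline state logged for the sd instance given as the
# first argument (for example sd0). Prints nothing if none is found.
sd_state() {
    awk -v inst="($1)" '
        $(NF-1) == inst && ($NF == "online" || $NF == "offline") { s = $NF }
        END { if (s != "") print s }'
}

# Example using two lines captured from the log above.
# prints: online
sd_state sd0 <<'EOF'
Sep  2 21:27:48 dopey genunix: [ID 408114 kern.info] /pci@400/pci@0/pci@8/scsi@0/sd@0,0 (sd0) offline
Sep  2 21:28:46 dopey genunix: [ID 408114 kern.info] /pci@400/pci@0/pci@8/scsi@0/sd@0,0 (sd0) online
EOF
```

On a live system: sd_state sd0 < /var/adm/messages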


9.] View the revised O/S device handle content, following the disk replacement.



# cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
c1                             scsi-bus     connected    configured   unknown
c1::dsk/c1t0d0                 disk         connected    configured   unknown          <<<<< New disk seen by cfgadm
c1::dsk/c1t1d0                 disk         connected    configured   unknown
c1::dsk/c1t2d0                 disk         connected    configured   unknown
c1::dsk/c1t3d0                 disk         connected    configured   unknown
usb0/1                         unknown      empty        unconfigured ok
usb0/2                         unknown      empty        unconfigured ok
usb0/3                         unknown      empty        unconfigured ok
usb1/1                         unknown      empty        unconfigured ok
usb1/2                         unknown      empty        unconfigured ok
usb2/1                         unknown      empty        unconfigured ok
usb2/2                         usb-storage  connected    configured   ok
usb2/3                         unknown      empty        unconfigured ok
usb2/4                         usb-hub      connected    configured   ok
usb2/4.1                       unknown      empty        unconfigured ok
usb2/4.2                       unknown      empty        unconfigured ok
usb2/4.3                       unknown      empty        unconfigured ok
usb2/4.4                       unknown      empty        unconfigured ok
usb2/5                         unknown      empty        unconfigured ok


9a.]  In some cases, you may need to run the cfgadm command manually to pick up the newly presented disk.

# cfgadm -c configure c1::dsk/c1t0d0

10.] Create the OS device handles for the replacement disk.


# devfsadm

 
# echo | format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>                     <<<<< new disk seen by format
          /pci@400/pci@0/pci@8/scsi@0/sd@0,0
       1. c1t1d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@1,0
       2. c1t2d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@2,0
       3. c1t3d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@3,0
Specify disk (enter its number): Specify disk (enter its number):
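
To confirm from a script that the replacement disk is visible, the disk names can be pulled out of the format listing. This is a minimal sketch that assumes the "N. cXtYdZ <...>" layout shown above; format_disks is a hypothetical helper name.

```shell
# format_disks: read "echo | format" output on stdin and print one
# cXtYdZ disk name per line.
format_disks() {
    awk '$1 ~ /^[0-9]+\.$/ && $2 ~ /^c[0-9]+t[0-9]+d[0-9]+$/ { print $2 }'
}

# Example using lines captured from the output above.
# prints c1t0d0 and c1t1d0 on separate lines
format_disks <<'EOF'
AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@0,0
       1. c1t1d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@1,0
EOF
```

On a live system: echo | format | format_disks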

 

11.] Label the new (replacement) disk (i.e. c1t0d0) using the Solaris format utility.






# format  c1t0d0
selecting c1t0d0
[disk formatted]


FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> p


PARTITION MENU:
        0      - change `0' partition
        1      - change `1' partition
        2      - change `2' partition
        3      - change `3' partition
        4      - change `4' partition
        5      - change `5' partition
        6      - change `6' partition
        7      - change `7' partition
        select - select a predefined table
        modify - modify a predefined partition table
        name   - name the current table
        print  - display the current table
        label  - write partition map and label to the disk
        !<cmd> - execute <cmd>, then return
        quit
partition> p
Current partition table (original):
Total disk cylinders available: 14087 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 -  2060       20.00GB    (2061/0/0)   41945472
  1       swap    wu    2061 -  3091       10.01GB    (1031/0/0)   20982912
  2     backup    wm       0 - 14086      136.71GB    (14087/0/0) 286698624
  3        usr    wm    3092 -  5152       20.00GB    (2061/0/0)   41945472
  4        var    wm    5153 -  7213       20.00GB    (2061/0/0)   41945472
  5 unassigned    wm    7214 -  9274       20.00GB    (2061/0/0)   41945472
  6       home    wm    9275 - 14086       46.70GB    (4812/0/0)   97933824
  7 unassigned    wm       0                0         (0/0/0)             0

partition> l
Ready to label disk, continue? yes

partition> q


FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> q


12.] Refresh VxVM with the new disk content.



# vxdisk scandisks

# vxdisk list

DEVICE       TYPE            DISK         GROUP        STATUS
c1t0d0s2     auto:none       -            -            online invalid                                <<<<<<<<< New disk seen by VxVM
c1t1d0s2     auto:none       -            -            online invalid
c1t2d0s2     auto:sliced     rootdg02     rootdg       online
c1t3d0s2     auto:sliced     rootdg03     rootdg       online spare
-            -         rootdg01     rootdg       failed was:c1t0d0s2


13.] Prepare the new disk for VxVM use.


# /etc/vx/bin/vxdisksetup -i c1t0d0 format=sliced noreserve


# vxdisk list

DEVICE       TYPE            DISK         GROUP        STATUS
c1t0d0s2     auto:sliced     -            -            online                                 <<<<<<<<< New disk ready for use
c1t1d0s2     auto:none       -            -            online invalid
c1t2d0s2     auto:sliced     rootdg02     rootdg       online
c1t3d0s2     auto:sliced     rootdg03     rootdg       online spare
-            -         rootdg01     rootdg       failed was:c1t0d0s2


# vxdg -g rootdg -k adddisk rootdg01=c1t0d0s2


# vxdisk list

DEVICE       TYPE            DISK         GROUP        STATUS
c1t0d0s2     auto:sliced     rootdg01     rootdg       online               <<<<<<<<< New disk assigned to rootdg diskgroup
c1t1d0s2     auto:none       -            -            online invalid
c1t2d0s2     auto:sliced     rootdg02     rootdg       online
c1t3d0s2     auto:sliced     rootdg03     rootdg       online spare
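
Whether the disk media name has been re-attached to the new device can be verified from the same listing. The following minimal sketch prints the DEVICE column for a given disk media name; dm_device is a hypothetical helper name.

```shell
# dm_device: read "vxdisk list" output on stdin and print the DEVICE that
# the disk media name given as the first argument is bound to. A "-" means
# the record is still detached (failed was:...).
dm_device() {
    awk -v dm="$1" '$3 == dm { print $1 }'
}

# Example using lines captured from the output above.
# prints: c1t0d0s2
dm_device rootdg01 <<'EOF'
c1t0d0s2     auto:sliced     rootdg01     rootdg       online
c1t1d0s2     auto:none       -            -            online invalid
c1t2d0s2     auto:sliced     rootdg02     rootdg       online
EOF
```

On a live system: vxdisk list | dm_device rootdg01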

Figure 2.0





14.] Change the spare flag status if applicable.

# vxprint -qhtg rootdg
dg rootdg       default      default  72000    1232444437.8.dopey

dm rootdg01     c1t0d0s2     auto     81407    286617216 -                       <<<<<< Make the new disk the spare disk
dm rootdg02     c1t2d0s2     auto     101759   286596864 -
dm rootdg03     c1t3d0s2     auto     81151    286596864 SPARE

v  rootdg017vol -            ENABLED  ACTIVE   1444992  ROUND     -        gen
pl rootdg017vol-01 rootdg017vol ENABLED ACTIVE 1444992  CONCAT    -        RW
sd rootdg03-03  rootdg017vol-01 rootdg03 285151872 1444992 0      c1t3d0   ENA
pl rootdg017vol-02 rootdg017vol ENABLED ACTIVE 1444992  CONCAT    -        RW
sd rootdg02-03  rootdg017vol-02 rootdg02 285151872 1444992 0      c1t2d0   ENA

v  rootvol      -            ENABLED  ACTIVE   251693184 ROUND    -        root
pl rootvol-02   rootvol      ENABLED  ACTIVE   251693184 CONCAT   -        RW
sd rootdg02-02  rootvol-02   rootdg02 33458688 251693184 0        c1t2d0   ENA
pl rootvol-03   rootvol      ENABLED  ACTIVE   251693184 CONCAT   -        RW
sd rootdg03-01  rootvol-03   rootdg03 0        251693184 0        c1t3d0   ENA

v  swapvol      -            ENABLED  ACTIVE   33458688 ROUND     -        swap
pl swapvol-02   swapvol      ENABLED  ACTIVE   33458688 CONCAT    -        RW
sd rootdg02-01  swapvol-02   rootdg02 0        33458688 0         c1t2d0   ENA
pl swapvol-03   swapvol      ENABLED  ACTIVE   33458688 CONCAT    -        RW
sd rootdg03-02  swapvol-03   rootdg03 251693184 33458688 0        c1t3d0   ENA



Toggle spare flag from rootdg03 to rootdg01


# vxedit -g rootdg set spare=off rootdg03

# vxedit -g rootdg set spare=on rootdg01

# vxprint -qhtg rootdg

dg rootdg       default      default  72000    1232444437.8.dopey

dm rootdg01     c1t0d0s2     auto     81407    286617216 SPARE                          <<<<< SPARE flag set against the newly replaced disk
dm rootdg02     c1t2d0s2     auto     101759   286596864 -
dm rootdg03     c1t3d0s2     auto     81151    286596864 -

v  rootdg017vol -            ENABLED  ACTIVE   1444992  ROUND     -        gen
pl rootdg017vol-01 rootdg017vol ENABLED ACTIVE 1444992  CONCAT    -        RW
sd rootdg03-03  rootdg017vol-01 rootdg03 285151872 1444992 0      c1t3d0   ENA
pl rootdg017vol-02 rootdg017vol ENABLED ACTIVE 1444992  CONCAT    -        RW
sd rootdg02-03  rootdg017vol-02 rootdg02 285151872 1444992 0      c1t2d0   ENA

v  rootvol      -            ENABLED  ACTIVE   251693184 ROUND    -        root
pl rootvol-02   rootvol      ENABLED  ACTIVE   251693184 CONCAT   -        RW
sd rootdg02-02  rootvol-02   rootdg02 33458688 251693184 0        c1t2d0   ENA
pl rootvol-03   rootvol      ENABLED  ACTIVE   251693184 CONCAT   -        RW
sd rootdg03-01  rootvol-03   rootdg03 0        251693184 0        c1t3d0   ENA

v  swapvol      -            ENABLED  ACTIVE   33458688 ROUND     -        swap
pl swapvol-02   swapvol      ENABLED  ACTIVE   33458688 CONCAT    -        RW
sd rootdg02-01  swapvol-02   rootdg02 0        33458688 0         c1t2d0   ENA
pl swapvol-03   swapvol      ENABLED  ACTIVE   33458688 CONCAT    -        RW
sd rootdg03-02  swapvol-03   rootdg03 251693184 33458688 0        c1t3d0   ENA
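
The result of toggling the flag can be confirmed by listing only the disks marked SPARE. This minimal sketch assumes the dm record layout shown above, with the flag in the seventh column; spare_disks is a hypothetical helper name.

```shell
# spare_disks: read "vxprint -qhtg <diskgroup>" output on stdin and print
# the disk media names whose dm record carries the SPARE flag.
spare_disks() {
    awk '$1 == "dm" && $7 == "SPARE" { print $2 }'
}

# Example using the dm records captured above.
# prints: rootdg01
spare_disks <<'EOF'
dm rootdg01     c1t0d0s2     auto     81407    286617216 SPARE
dm rootdg02     c1t2d0s2     auto     101759   286596864 -
dm rootdg03     c1t3d0s2     auto     81151    286596864 -
EOF
```

On a live system: vxprint -qhtg rootdg | spare_disks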



Process complete.
 
