
Failed mirrored drives -- how to rebuild mirror with replacement drive?



Hi,

We have a DS20E running Tru64 5.1.  There are two internal disks that
are mirrored by LSM.  Overnight one of them failed which brought the
server down (see my previous email).  We've rebooted off the one good
remaining drive.  In the meantime we've removed the bad drive and have
received a replacement for it (DS-RZ2ED-LS).  I'm told that this is
hot-swappable so I can put it in with the system up and rebuild the
mirror.

My questions are:

1) How do I make sure that once the new blank drive is installed the
system doesn't think that's the good drive and resync the existing drive
to this new, blank drive thereby wiping out the entire system disk?  Is
that even a possibility?

2) How do I rebuild the mirror?

Here's the volprint output:

root@odin==> volprint
Disk group: rootdg

TY NAME         ASSOC        KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0
dg rootdg       rootdg       -        -        -        -        -       -

dm dsk0f-AdvFS  -            -        -        -        NODEVICE -       -
dm dsk0h        -            -        -        -        NODEVICE -       -
dm dsk1d-AdvFS  dsk1d        -        4267761  -        -        -       -
dm dsk1f-AdvFS  dsk1f        -        12878154 -        -        -       -
dm dsk1h        dsk1h        -        4301507  -        -        -       -
dm root01       -            -        -        -        NODEVICE -       -
dm root02       dsk1a        -        636421   -        -        -       -
dm swap01       -            -        -        -        NODEVICE -       -
dm swap02       dsk1b        -        12354045 -        -        -       -
dm usr01        -            -        -        -        NODEVICE -       -

v  rootvol      root         ENABLED  636421   -        ACTIVE   -       -
pl rootvol-01   rootvol      DISABLED 636421   -        NODEVICE -       -
sd root01-01p   rootvol-01   DISABLED 16       0        NODEVICE -       -
sd root01-01    rootvol-01   DISABLED 636405   16       NODEVICE -       -
pl rootvol-02   rootvol      ENABLED  636421   -        ACTIVE   -       -
sd root02-02p   rootvol-02   ENABLED  16       0        -        -       -
sd root02-02    rootvol-02   ENABLED  636405   16       -        -       -

v  swapvol      swap         ENABLED  12354045 -        ACTIVE   -       -
pl swapvol-02   swapvol      ENABLED  12354045 -        ACTIVE   -       -
sd swap02-02    swapvol-02   ENABLED  12354045 0        -        -       -
pl swapvol-01   swapvol      DISABLED 12354045 -        NODEVICE -       -
sd swap01-01    swapvol-01   DISABLED 12354045 0        NODEVICE -       -

v  usrvol       fsgen        ENABLED  4267761  -        ACTIVE   -       -
pl usrvol-02    usrvol       ENABLED  4267761  -        ACTIVE   -       -
sd dsk1d-01     usrvol-02    ENABLED  4267761  0        -        -       -
pl usrvol-01    usrvol       DISABLED 4267761  -        NODEVICE -       -
sd usr01-01     usrvol-01    DISABLED 4267761  0        NODEVICE -       -

v  vol-dsk0f    fsgen        ENABLED  12878154 -        ACTIVE   -       -
pl vol-dsk0f-02 vol-dsk0f    ENABLED  12878154 -        ACTIVE   -       -
sd dsk1f-01     vol-dsk0f-02 ENABLED  12878154 0        -        -       -
pl vol-dsk0f-01 vol-dsk0f    DISABLED 12878154 -        NODEVICE -       -
sd dsk0f-01     vol-dsk0f-01 DISABLED 12878154 0        NODEVICE -       -


To my untrained eye it looks like dsk0 was the failed drive.
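
To double-check that reading, a small filter along these lines could pull
out just the records marked NODEVICE.  This is only a sketch: the function
name list_nodevice is made up here, and it assumes STATE is the seventh
whitespace-separated column, as in the volprint output above.

```shell
# Hypothetical helper (not part of LSM): reads volprint-style output on
# stdin and prints the type and name of every record whose STATE column
# (assumed to be field 7) is NODEVICE.
list_nodevice() {
    awk '$1 ~ /^(dm|v|pl|sd)$/ && $7 == "NODEVICE" { print $1, $2 }'
}

# Intended use on the live system:
#   volprint | list_nodevice
```

Fed the output above, it should list only objects that lived on the
missing disk (the dm, pl and sd records for root01, swap01, usr01, dsk0h
and dsk0f-AdvFS).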


Would the following be the right series of commands to rebuild the mirror?

1. Disassociate and remove all plexes associated with failed disk.

#> volplex -o rm dis rootvol-01 swapvol-01 usrvol-01

2. Remove the failed objects from LSM. 

#> voldg rmdisk root01 swap01 usr01 dsk0h ## what about 'dsk0f-AdvFS' ?

#> voldisk rm dsk0a dsk0b dsk0g dsk0d 

3. Replace disk and scan. 

#> hwmgr -scan scsi 

#> dsfmgr -e dskX dsk0 (where dskX=newly scanned disk) 

#> disklabel -rw dsk0

4. Remirror the boot disk. 

#> volrootmir -a dsk0
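
As a sanity check before step 1, something like the following might
confirm that every volume still has at least one ENABLED, ACTIVE plex, so
that removing the NODEVICE plexes cannot take away the last good copy.
Again only a sketch: volumes_without_active_plex is a made-up name, and it
assumes the column layout of the volprint output above (KSTATE in field 4,
STATE in field 7).

```shell
# Hypothetical helper (not part of LSM): reads volprint-style output on
# stdin and prints each volume that has no plex with KSTATE ENABLED and
# STATE ACTIVE.  Empty output means every volume has a usable plex.
volumes_without_active_plex() {
    awk '
        $1 == "v"  { vols[$2] = 1 }                      # remember each volume
        $1 == "pl" && $4 == "ENABLED" && $7 == "ACTIVE" { good[$3] = 1 }
        END { for (v in vols) if (!(v in good)) print v }
    '
}

# Intended use on the live system:
#   volprint | volumes_without_active_plex
```

If that prints nothing, each volume has a healthy plex on the surviving
disk.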


Anything here incorrect?  Did I miss anything?


Many, many thanks!
Andy