Showing posts with label VxVM - Problem Solving. Show all posts
Wednesday, December 12, 2012 at 12:12 PM | 5 comments
Here’s a collection of my VxVM documents. Some were from my own work and some were taken from the web.

These documents are posted mainly for my personal reference and for others who may find them useful.


Volume Administration

Veritas Cluster Filesystem

Veritas Disks Administration

Veritas DMP

Fixes, Tips and Tricks

Array Support Libraries

VxVM Manuals

Some Very Helpful Sites

Man Pages



I was trying to upgrade Veritas Volume Manager from 3.5 to 4.1 when the upgrade script stopped and complained about a volume not in sync. It was a swap volume.

# vxprint -ht swapvol
Disk group: rootdg

v swapvol - ENABLED NEEDSYNC 8395200 ROUND - swap
pl swapvol-01 swapvol ENABLED ACTIVE 8395200 CONCAT - RW
sd rootdisk-01 swapvol-01 rootdisk 5159231 8395200 0 c0t8d0 ENA
pl swapvol-02 swapvol ENABLED ACTIVE 8395200 CONCAT - RW
sd disk01-04 swapvol-02 disk01 19853376 8395200 0 c1t8d0 ENA

To fix this problem, all I did was issue a resync.

# vxvol resync swapvol

# vxprint -ht swapvol
Disk group: rootdg

v swapvol - ENABLED ACTIVE 8395200 ROUND - swap
pl swapvol-01 swapvol ENABLED ACTIVE 8395200 CONCAT - RW
sd rootdisk-01 swapvol-01 rootdisk 5159231 8395200 0 c0t8d0 ENA
pl swapvol-02 swapvol ENABLED ACTIVE 8395200 CONCAT - RW
sd disk01-04 swapvol-02 disk01 19853376 8395200 0 c1t8d0 ENA
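
Before an upgrade it can help to look for this state on every volume up front. Here is a minimal sketch, assuming `vxprint -ht` output in the layout shown above (record type in column 1, name in column 2, STATE in column 5). The function name `needsync_volumes` is my own and is not part of VxVM.

```shell
# List volumes whose STATE column reads NEEDSYNC.
# Reads `vxprint -ht` output on stdin; assumes the column layout shown above.
needsync_volumes() {
    awk '$1 == "v" && $5 == "NEEDSYNC" { print $2 }'
}

# Example use on a live system (not run here):
#   vxprint -ht | needsync_volumes
```

Each name it prints is a candidate for `vxvol resync` before the upgrade script is started.
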
What to do when "vxdisk list" shows status of 'online dgdisabled'.

Details:


aixsrv01:# vxdisk -o alldgs list
DEVICE TYPE DISK GROUP STATUS
EMC_CLARiiON0_0 auto:cdsdisk EMC_CLARiiON0_0 dygy2502 online
EMC_CLARiiON0_1 auto:cdsdisk - (dvgy2500) online
EMC_CLARiiON0_2 auto:cdsdisk EMC_CLARiiON0_4 dvgyappl online
EMC_CLARiiON0_3 auto:cdsdisk EMC_CLARiiON0_3 dvgy2503 online
EMC_CLARiiON0_4 auto:cdsdisk EMC_CLARiiON0_4 dvgy2504 online
EMC_CLARiiON0_5 auto:cdsdisk EMC_CLARiiON0_5 dvgy25 online
EMC_CLARiiON0_6 auto:cdsdisk EMC_CLARiiON0_9 dvgy26 online dgdisabled
EMC_CLARiiON0_7 auto:cdsdisk EMC_CLARiiON0_8 dygy2501 online
EMC_CLARiiON0_8 auto:cdsdisk - (dvgy2506) online
EMC_CLARiiON0_9 auto:cdsdisk - (dvgy2505) online
EMC_CLARiiON0_10 auto:cdsdisk - (dvgy2507) online
EMC_CLARiiON0_11 auto:cdsdisk EMC_CLARiiON0_11 dvgy25db2 online


This situation can happen when every disk in a disk group is lost, for example because of a bad power supply, power being turned off to the disk array, a disconnected cable, or zoning problems.
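
On a server with many LUNs, the affected disk groups can be picked out of the listing with a small filter. This is only a sketch, assuming the `vxdisk -o alldgs list` column layout shown above (DEVICE, TYPE, DISK, GROUP, then one or more status words). The function name `dgdisabled_groups` is hypothetical.

```shell
# Print each disk group that has at least one disk flagged dgdisabled.
# Reads `vxdisk -o alldgs list` output on stdin.
dgdisabled_groups() {
    awk '{
        # status words start at column 5; there may be more than one
        for (i = 5; i <= NF; i++)
            if ($i == "dgdisabled") { print $4; break }
    }' | sort -u
}

# Example use on a live system (not run here):
#   vxdisk -o alldgs list | dgdisabled_groups
```
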

The disk group will not show in the output from vxprint -ht.

aixsrv01:# vxprint -htg dvgy26
VxVM vxprint ERROR V-5-1-582 Disk group dvgy26: No such disk group


The disk group will show as disabled in vxdg list:

aixsrv01:# vxdg list
NAME STATE ID
dygy2501 enabled,cds 1189621899.78.aixsrv01
dvgyappl enabled,cds 1190904062.52.aixsrv01
dvgy25 enabled,cds 1189622068.88.aixsrv01
dvgy25db2 enabled,cds 1189622043.86.aixsrv01
dvgy26 disabled 1189538508.74.aixsrv01
dvgy2503 enabled,cds 1189621988.82.aixsrv01
dvgy2504 enabled,cds 1189622014.84.aixsrv01
dygy2502 enabled,cds 1189621955.80.aixsrv01


This is the output of vxdg list dvgy26:

aixsrv01:# vxdg list dvgy26
Group: dvgy26
dgid: 1189538508.74.aixsrv01
import-id: 1024.22
flags: disabled
version: 0
alignment: 0 (bytes)
local-activation: read-write
ssb: off
detach-policy: invalid
copies: nconfig=default nlog=default
config: seqno=0.1103 permlen=1280 free=1259 templen=11 loglen=192
config disk EMC_CLARiiON0_6 copy 1 len=1280 state=clean online
log disk EMC_CLARiiON0_6 copy 1 len=192


Your filesystems will of course fail, and the operating system will report I/O errors on them.

aixsrv01:# df -k > /dev/null
df: /db2/dwins26q: I/O error
df: /backup: I/O error
df: /db/dwdb26q/dwins25q/NODE0000: I/O error
df: /db/dwins26q/dwdb26q/syscatspace/NODE0000: I/O error
df: /db/dwins26q/dwdb26q/tempspace01/NODE0000: I/O error
df: /dba/dwins26q: I/O error
df: /db2/dwmysld: I/O error
df: /backup/wiminst: I/O error


Once you have confirmed that the disk storage is powered up and operational and, if the LUNs are on a SAN, that zoning is configured correctly, this problem can be remedied by deporting and then importing the disk group:

# vxdg deport dvgy26

# vxdg import dvgy26
VxVM vxdg ERROR V-5-1-587 Disk group dvgy26: import failed: No valid disk found containing disk group


If Volume Manager can't see the disks, and your SAN or storage administrator has confirmed that the LUNs are fine and presented to your server, rescan the disks and try the import again.

aixsrv01:# vxdisk scandisks
aixsrv01:# vxdctl enable
aixsrv01:# vxdg import dvgy26


Once the import succeeds, the disk group should show up as enabled.

aixsrv01:# vxdg list
NAME STATE ID
dygy2501 enabled,cds 1189621899.78.aixsrv01
dvgyappl enabled,cds 1190904062.52.aixsrv01
dvgy25 enabled,cds 1189622068.88.aixsrv01
dvgy25db2 enabled,cds 1189622043.86.aixsrv01
dvgy26 enabled,cds 1189538508.74.aixsrv01
dvgy2503 enabled,cds 1189621988.82.aixsrv01
dvgy2504 enabled,cds 1189622014.84.aixsrv01
dygy2502 enabled,cds 1189621955.80.aixsrv01


The disk group now shows in vxprint -ht with the volumes and plexes disabled:

aixsrv01:# vxprint -htg dvgy26
DG NAME NCONFIG NLOG MINORS GROUP-ID
ST NAME STATE DM_CNT SPARE_CNT APPVOL_CNT
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
RV NAME RLINK_CNT KSTATE STATE PRIMARY DATAVOLS SRL
RL NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK
CO NAME CACHEVOL KSTATE STATE
VT NAME NVOLUME KSTATE STATE
V NAME RVG/VSET/CO KSTATE STATE LENGTH READPOL PREFPLEX UTYPE
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE
SC NAME PLEX CACHE DISKOFFS LENGTH [COL/]OFF DEVICE MODE
DC NAME PARENTVOL LOGVOL
SP NAME SNAPVOL DCO

dg dvgy26 default default 9000 1189538508.74.aixsrv01

dm EMC_CLARiiON0_9 EMC_CLARiiON0_6 auto 2048 67102464 -

v backup - DISABLED ACTIVE 4194304 SELECT - fsgen
pl backup-01 backup DISABLED ACTIVE 4194304 CONCAT - RW
sd EMC_CLARiiON0_9-02 backup-01 EMC_CLARiiON0_9 8388608 4194304 0 EMC_CLARiiON0_6 ENA

v db - DISABLED ACTIVE 1048576 SELECT - fsgen
pl db-01 db DISABLED ACTIVE 1048576 CONCAT - RW
sd EMC_CLARiiON0_9-04 db-01 EMC_CLARiiON0_9 16777216 1048576 0 EMC_CLARiiON0_6 ENA

v dba - DISABLED ACTIVE 4194304 SELECT - fsgen
pl dba-01 dba DISABLED ACTIVE 4194304 CONCAT - RW
sd EMC_CLARiiON0_9-03 dba-01 EMC_CLARiiON0_9 12582912 4194304 0 EMC_CLARiiON0_6 ENA

v db2 - DISABLED ACTIVE 8388608 SELECT - fsgen
pl db2-01 db2 DISABLED ACTIVE 8388608 CONCAT - RW
sd EMC_CLARiiON0_9-01 db2-01 EMC_CLARiiON0_9 0 8388608 0 EMC_CLARiiON0_6 ENA

v dwmysld - DISABLED ACTIVE 2097152 SELECT - fsgen
pl dwmysld-01 dwmysld DISABLED ACTIVE 2097152 CONCAT - RW
sd EMC_CLARiiON0_9-09 dwmysld-01 EMC_CLARiiON0_9 55574528 2097152 0 EMC_CLARiiON0_6 ENA

v lg1 - DISABLED ACTIVE 10485760 SELECT - fsgen
pl lg1-01 lg1 DISABLED ACTIVE 10485760 CONCAT - RW
sd EMC_CLARiiON0_9-08 lg1-01 EMC_CLARiiON0_9 45088768 10485760 0 EMC_CLARiiON0_6 ENA

v syscat - DISABLED ACTIVE 2097152 SELECT - fsgen
pl syscat-01 syscat DISABLED ACTIVE 2097152 CONCAT - RW
sd EMC_CLARiiON0_9-05 syscat-01 EMC_CLARiiON0_9 17825792 2097152 0 EMC_CLARiiON0_6 ENA

v tp01 - DISABLED ACTIVE 4194304 SELECT - fsgen
pl tp01-01 tp01 DISABLED ACTIVE 4194304 CONCAT - RW
sd EMC_CLARiiON0_9-07 tp01-01 EMC_CLARiiON0_9 40894464 4194304 0 EMC_CLARiiON0_6 ENA

v ts01 - DISABLED ACTIVE 20971520 SELECT - fsgen
pl ts01-01 ts01 DISABLED ACTIVE 20971520 CONCAT - RW
sd EMC_CLARiiON0_9-06 ts01-01 EMC_CLARiiON0_9 19922944 20971520 0 EMC_CLARiiON0_6 ENA


Verify that the disks in the disk group are all online.

aixsrv01:# vxdisk -o alldgs list
DEVICE TYPE DISK GROUP STATUS
EMC_CLARiiON0_0 auto:cdsdisk EMC_CLARiiON0_0 dygy2502 online
EMC_CLARiiON0_1 auto:cdsdisk - (dvgy2500) online
EMC_CLARiiON0_2 auto:cdsdisk EMC_CLARiiON0_4 dvgyappl online
EMC_CLARiiON0_3 auto:cdsdisk EMC_CLARiiON0_3 dvgy2503 online
EMC_CLARiiON0_4 auto:cdsdisk EMC_CLARiiON0_4 dvgy2504 online
EMC_CLARiiON0_5 auto:cdsdisk EMC_CLARiiON0_5 dvgy25 online
EMC_CLARiiON0_6 auto:cdsdisk EMC_CLARiiON0_9 dvgy26 online
EMC_CLARiiON0_7 auto:cdsdisk EMC_CLARiiON0_8 dygy2501 online
EMC_CLARiiON0_8 auto:cdsdisk - (dvgy2506) online
EMC_CLARiiON0_9 auto:cdsdisk - (dvgy2505) online
EMC_CLARiiON0_10 auto:cdsdisk - (dvgy2507) online
EMC_CLARiiON0_11 auto:cdsdisk EMC_CLARiiON0_11 dvgy25db2 online


Now the volumes can be started:

aixsrv01:# vxvol -g dvgy26 startall

aixsrv01:# vxprint -htg dvgy26 | egrep '^v|^pl'
v backup - ENABLED ACTIVE 4194304 SELECT - fsgen
pl backup-01 backup ENABLED ACTIVE 4194304 CONCAT - RW
v db - ENABLED ACTIVE 1048576 SELECT - fsgen
pl db-01 db ENABLED ACTIVE 1048576 CONCAT - RW
v dba - ENABLED ACTIVE 4194304 SELECT - fsgen
pl dba-01 dba ENABLED ACTIVE 4194304 CONCAT - RW
v db2 - ENABLED ACTIVE 8388608 SELECT - fsgen
pl db2-01 db2 ENABLED ACTIVE 8388608 CONCAT - RW
v dwmysld - ENABLED ACTIVE 2097152 SELECT - fsgen
pl dwmysld-01 dwmysld ENABLED ACTIVE 2097152 CONCAT - RW
v lg1 - ENABLED ACTIVE 10485760 SELECT - fsgen
pl lg1-01 lg1 ENABLED ACTIVE 10485760 CONCAT - RW
v syscat - ENABLED ACTIVE 2097152 SELECT - fsgen
pl syscat-01 syscat ENABLED ACTIVE 2097152 CONCAT - RW
v tp01 - ENABLED ACTIVE 4194304 SELECT - fsgen
pl tp01-01 tp01 ENABLED ACTIVE 4194304 CONCAT - RW
v ts01 - ENABLED ACTIVE 20971520 SELECT - fsgen
pl ts01-01 ts01 ENABLED ACTIVE 20971520 CONCAT - RW
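
The commands used so far can be gathered into one helper. This is a sketch rather than a tested procedure: the names `run`, `recover_dg` and `DRY_RUN` are my own, and by default the function only prints the commands so they can be reviewed first. It assumes the storage side has already been confirmed healthy.

```shell
# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Deport, rescan, re-import and start all volumes of one disk group.
# Stops at the first command that fails.
recover_dg() {
    dg=$1
    [ -n "$dg" ] || { echo "usage: recover_dg <diskgroup>" >&2; return 1; }
    run vxdg deport "$dg"        || return 1
    run vxdisk scandisks         || return 1   # rescan for devices
    run vxdctl enable            || return 1   # let vxconfigd pick them up
    run vxdg import "$dg"        || return 1
    run vxvol -g "$dg" startall
}
```

Calling `recover_dg dvgy26` prints the five commands in order. Setting `DRY_RUN=0` first would execute them instead.
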


The filesystems on these volumes may not be in a consistent state, so run a filesystem check before mounting them.

aixsrv01:# for i in `grep dvgy26 /etc/filesystems | awk '{ print $3 }'`
> do
> fsck -y $i
> mount $i
> done
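
Before running fsck in a loop like this, it may be worth listing which mount points are involved. The sketch below assumes the AIX /etc/filesystems stanza format (a mount point followed by a colon, then indented `dev = ...` lines). The function name `dg_filesystems` is hypothetical.

```shell
# Print "mountpoint device" for every stanza whose dev attribute belongs to
# the given disk group. Reads AIX /etc/filesystems format on stdin.
dg_filesystems() {
    awk -v dg="$1" '
        # stanza header, e.g. "/backup:"
        NF == 1 && $1 ~ /^\/.*:$/ { mp = $1; sub(/:$/, "", mp); next }
        # attribute line, e.g. "dev = /dev/vx/dsk/dvgy26/backup"
        $1 == "dev" && $2 == "=" && index($3, "/" dg "/") { print mp, $3 }
    '
}

# Example use on AIX (not run here):
#   dg_filesystems dvgy26 < /etc/filesystems
```
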


Note: This example was taken from an AIX server, but the VxVM commands shown here work the same way on other UNIX platforms. The /etc/filesystems file and the fsck and mount syntax are AIX-specific.
How to recover and start a Veritas Volume Manager logical volume where the volume is DISABLED ACTIVE and has a plex that is DISABLED RECOVER

Details:


When a system encounters a problem with a volume or a plex, or if Veritas Volume Manager (VxVM) has any reason to believe that the data is not synchronized, VxVM changes the kernel state (KSTATE) and the state (STATE) of the volume and its plexes accordingly. The plex state can be STALE, EMPTY, NODEVICE, etc. A particular plex state does not necessarily mean that the data is good or bad; it represents VxVM's perception of the data in the plex.

The vxprint utility with the "-h" and "-t" switches (see the vxprint man page for these and all other switches) displays the records in a VxVM disk group configuration, including the KSTATE and STATE of each volume and plex, shown in columns 4 and 5 respectively of the output below. When these fields show DISABLED ACTIVE for the volume and DISABLED RECOVER for the plex, the recovery steps below are needed to bring the volume back to ENABLED ACTIVE so that it can be mounted and the file system made accessible again.

In the output below, the volume test is DISABLED ACTIVE and its plex test-01 is DISABLED RECOVER.

# vxprint -ht -g testdg


DG NAME NCONFIG NLOG MINORS GROUP-ID
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
RV NAME RLINK_CNT KSTATE STATE PRIMARY DATAVOLS SRL
RL NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK
V NAME RVG KSTATE STATE LENGTH USETYPE PREFPLEX RDPOL
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE



dg testdg default default 84000 970356463.1203.alu

dm testdg01 c1t4d0s2 sliced 2179 8920560 -
dm testdg02 c1t6d0s2 sliced 2179 8920560 -

v test - DISABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test DISABLED RECOVER 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA
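
In a larger disk group, the volume and plex records that need attention can be filtered out of the listing. This is a minimal sketch, assuming KSTATE and STATE sit in columns 4 and 5 of the `v` and `pl` records as in the output above. The function name `not_enabled_active` is my own.

```shell
# Print "type name KSTATE STATE" for every volume (v) and plex (pl) record
# that is not ENABLED ACTIVE. Reads `vxprint -ht` output on stdin.
# The upper-case header lines (V, PL) are skipped because awk's string
# comparison is case-sensitive.
not_enabled_active() {
    awk '($1 == "v" || $1 == "pl") && !($4 == "ENABLED" && $5 == "ACTIVE") {
        print $1, $2, $4, $5
    }'
}

# Example use on a live system (not run here):
#   vxprint -ht -g testdg | not_enabled_active
```
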




Follow these steps to change KSTATE and STATE of a plex that is DISABLED RECOVER to ENABLED ACTIVE so the volume can be recovered / started and the file system mounted:

1. Change the plex test-01 to the DISABLED STALE state:

vxmend -g <diskgroup> fix stale <plex_name>


For example:

# vxmend -g testdg fix stale test-01


This output shows the plex test-01 as DISABLED STALE:

# vxprint -ht -g testdg


DG NAME NCONFIG NLOG MINORS GROUP-ID
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
RV NAME RLINK_CNT KSTATE STATE PRIMARY DATAVOLS SRL
RL NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK
V NAME RVG KSTATE STATE LENGTH USETYPE PREFPLEX RDPOL
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE

dg testdg default default 84000 970356463.1203.alu

dm testdg01 c1t4d0s2 sliced 2179 8920560 -
dm testdg02 c1t6d0s2 sliced 2179 8920560 -

v test - DISABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test DISABLED STALE 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA



2. Change the plex test-01 to the DISABLED CLEAN state:

vxmend -g <diskgroup> fix clean <plex_name>


For example:

# vxmend -g testdg fix clean test-01


This output shows the plex test-01 as DISABLED CLEAN:

# vxprint -ht -g testdg


DG NAME NCONFIG NLOG MINORS GROUP-ID
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
RV NAME RLINK_CNT KSTATE STATE PRIMARY DATAVOLS SRL
RL NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK
V NAME RVG KSTATE STATE LENGTH USETYPE PREFPLEX RDPOL
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE

dg testdg default default 84000 970356463.1203.alu

dm testdg01 c1t4d0s2 sliced 2179 8920560 -
dm testdg02 c1t6d0s2 sliced 2179 8920560 -

v test - DISABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test DISABLED CLEAN 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA



3. Start the volume test:

vxvol -g <diskgroup> start <volume>


For example:

# vxvol -g testdg start test


This output shows that the volume test and its plex test-01 are both ENABLED ACTIVE:

# vxprint -ht -g testdg

DG NAME NCONFIG NLOG MINORS GROUP-ID
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
RV NAME RLINK_CNT KSTATE STATE PRIMARY DATAVOLS SRL
RL NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK
V NAME RVG KSTATE STATE LENGTH USETYPE PREFPLEX RDPOL
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE

dg testdg default default 84000 970356463.1203.alu

dm testdg01 c1t4d0s2 sliced 2179 8920560 -
dm testdg02 c1t6d0s2 sliced 2179 8920560 -

v test - ENABLED ACTIVE 17840128 fsgen - SELECT
pl test-01 test ENABLED ACTIVE 17841120 CONCAT - RW
sd testdg01-01 test-01 testdg01 0 8920560 0 c1t4d0 ENA
sd testdg02-01 test-01 testdg02 0 8920560 8920560 c1t6d0 ENA



4. If the file system is a Veritas File System (VxFS), mount the volume on its mount point (refer to the /etc/vfstab file if the mount point is not known):

mount -F vxfs /dev/vx/dsk/<diskgroup>/<volume> /<mount-point>


For example:

# mount -F vxfs /dev/vx/dsk/testdg/test /testvol


Note: The mount may fail with an error stating that the file system needs to be checked for consistency. If this occurs, run the VxFS-specific fsck utility (/usr/lib/fs/vxfs/fsck). By default it replays the intent log instead of performing a full structural file system check, which is usually sufficient to set the file system to CLEAN and allow the volume to be mounted.
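
Steps 1 to 3 can be sketched as a single helper. The names `recover_plex` and `DRY_RUN` are illustrative only, and by default the function just prints the commands. Keep in mind that `fix clean` tells VxVM to treat the plex as a good copy, so this only makes sense when you trust the data on that plex.

```shell
# Move a DISABLED RECOVER plex through STALE and CLEAN, then start the volume.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
recover_plex() {
    if [ $# -ne 3 ]; then
        echo "usage: recover_plex <diskgroup> <volume> <plex>" >&2
        return 1
    fi
    dg=$1
    vol=$2
    plex=$3
    for cmd in \
        "vxmend -g $dg fix stale $plex" \
        "vxmend -g $dg fix clean $plex" \
        "vxvol -g $dg start $vol"
    do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            # relies on word splitting; VxVM object names contain no spaces
            $cmd || return 1     # stop at the first failing step
        fi
    done
}
```

For the example above, `recover_plex testdg test test-01` prints the same three commands that were run by hand.
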