The import of a Cluster Volume Manager disk group in a Veritas Storage Foundation (tm) for Oracle RAC or Storage Foundation Cluster File System 4.0/4.1 environment fails with the error message "Error saving vxprint file"

Article: 100017031
Last Published: 2007-01-24
Product(s): InfoScale & Storage Foundation

Problem

The import of a Cluster Volume Manager disk group in a Veritas Storage Foundation (tm) for Oracle RAC or Storage Foundation Cluster File System 4.0/4.1 environment fails with the error message "Error saving vxprint file"

Error Message

VCS ERROR V-16-10001-1044 (node1) CVMVolDg:OraSrvm_dg:online: Error saving vxprint file

Solution

Summary:

In certain circumstances, Cluster Volume Manager fails to autoimport the shared disk group. This occurs when the disk group is brought online on the slave Cluster Volume Manager node after it was recently onlined and offlined on the master node while the slave node was down. If the slave node comes online while the master node is still online, the slave obtains the seqno from the master node and no issues are seen.

This issue occurs because the last master node to deport the Cluster Volume Manager disk group updates the "seqno" for the disk group and flushes this configuration information to disk on deport. When the other node, which was the slave the last time the disk group was imported, immediately tries to import the disk group, it uses stale in-memory information about the disk group's seqno from its previous import. This configuration mismatch causes the import to fail. The workaround is to "refresh" the in-memory copy of the disk group configuration from the on-disk copy by running vxdisk -o alldgs list or vxdisk -a online before onlining the disk group.
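For illustration, running either command lists the disks and, as a side effect, refreshes vxconfigd's in-memory copy of the on-disk configuration. The device name in this sample output is hypothetical; a deported disk group is shown in parentheses:

# vxdisk -o alldgs list
DEVICE       TYPE      DISK   GROUP     STATUS
c1t0d0s2     sliced    -      (cfsdg)   online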


Scenario:

Initially, node0 is the master and node1 is the slave.

1. Online Cluster Volume Manager on both nodes:
vxclustadm -m vcs -t gab startnode (on node0 and node1)
2. Offline Cluster Volume Manager on both nodes:
vxclustadm stopnode (on node0 and node1)
3. Online Cluster Volume Manager on node0:
vxclustadm -m vcs -t gab startnode (on node0)
4. Offline Cluster Volume Manager on node0:
vxclustadm stopnode (on node0)
5. Online Cluster Volume Manager on node1 -> this will fail to autoimport the shared disk group:
vxclustadm -m vcs -t gab startnode (on node1)

The problem is that the online of the disk group fails, and administrator intervention is required to clear the fault.
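Which node is the Cluster Volume Manager master at any step in the scenario can be checked with vxdctl; the output below is illustrative of a master node:

# vxdctl -c mode
mode: enabled: cluster active - MASTER
master: node0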


Symptoms:

A sample of the errors seen during import of a Cluster File System disk group performed under Veritas Cluster Server control:
/var/VRTSvcs/log/engine_A.log
2005/06/03 13:22:27 VCS NOTICE V-16-1-10301 Initiating Online of Resource vxfsckd (Owner: unknown, Group: cvm) on System node1
2005/06/03 13:22:29 VCS INFO V-16-1-10298 Resource vxfsckd (Owner: unknown, Group: cvm) is online on node1 (VCS initiated)
2005/06/03 13:22:30 VCS ERROR V-16-10001-1009 (node1) CVMVolDg:OraSrvm_dg:online: could not find diskgroup cfsdg imported. If it was previously deported, it will have to be manually imported
2005/06/03 13:22:31 VCS ERROR V-16-10001-1044 (node1) CVMVolDg:OraSrvm_dg:online: Error saving vxprint file
2005/06/03 13:22:31 VCS INFO V-16-2-13001 (node1) Resource(OraSrvm_dg): Output of the completed operation (online)
VxVM vxprint ERROR V-5-1-582 Disk group cfsdg: No such disk group

/var/adm/messages
Jun  3 13:22:30 node1 Had[12909]: [ID 702911 daemon.notice] VCS ERROR V-16-1-1009 (node1) CVMVolDg:OraSrvm_dg:online:could not find diskgroup cfsdg imported. If it was previously deported, it will have to be manually imported
Jun  3 13:22:31 node1 Had[12909]: [ID 702911 daemon.notice] VCS ERROR V-16-1-1044 (node1) CVMVolDg:OraSrvm_dg:online:Error saving vxprint file
Jun  3 13:23:56 node1 vxfen: [ID 214757 kern.notice] NOTICE: VCS FEN INFO V-11-1-34 The ioctl VXFEN_IOC_CLUSTSTAT returned 0
Jun  3 13:24:35 node1 Had[12909]: [ID 702911 daemon.notice] VCS ERROR V-16-1-13066 (node1) Agent is calling clean for resource(OraSrvm_dg) because the resource is not up even after online completed.

Debug logs from /etc/vxvm/vxconfigd.log (USE: vxconfigd -k -x 9 -x log)

06/03 13:22:18:  VxVM vxconfigd DEBUG V-5-1-5786 priv_read_toc_block: bad seqno
06/03 13:22:18:  VxVM vxconfigd DEBUG V-5-1-681 IOCTL VOLDIO_READ len=16 priv,drid=1024.10,offset=2272: start (thread=53)
06/03 13:22:18:  VxVM vxconfigd DEBUG V-5-1-682 IOCTL completion (thread 53): return 0(0x0)
06/03 13:22:18:  VxVM vxconfigd DEBUG V-5-1-5786 priv_read_toc_block: bad seqno
06/03 13:22:18:  VxVM vxconfigd DEBUG V-5-1-5310 da_join failed, thread 53: Disk private region contents are invalid
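As noted above, this level of detail is produced by restarting the Volume Manager configuration daemon with debugging enabled; -k restarts the running daemon, and the -x options set the debug verbosity and direct output to the log file:

# vxconfigd -k -x 9 -x log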


Workaround:

Before onlining a Cluster Volume Manager disk group using vxclustadm, or under Veritas Cluster Server when onlining the cvm_clust resource, the following commands must be run to rescan the disks so that the import succeeds (a combined sketch follows step 2):
 
1. Resync the Veritas Volume Manager objects on-disk and in-memory:
# vxdisk -o alldgs list
or
# vxdisk -a online
 

2. Import the disk group using the Cluster Volume Manager command vxclustadm or the Cluster Server command hares:
# vxclustadm -m vcs -t gab startnode
or
# hares -online cvm -sys node1
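A minimal sketch combining both steps for a node joining the cluster directly with vxclustadm, using only the commands shown above:

#!/bin/sh
# Hypothetical pre-online wrapper for the node joining the cluster.
# Refresh the in-memory disk group configuration from disk, then
# join the cluster so the shared disk group autoimports cleanly.
vxdisk -o alldgs list > /dev/null
vxclustadm -m vcs -t gab startnode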

Note: Under Veritas Cluster Server, if the Cluster Volume Manager service group has faulted after its default 60-second online timeout, the Cluster Server fault must be cleared and the associated group onlined again.
Example:
# hagrp -clear cvm
# hagrp -online cvm -sys node1
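To confirm the result, the group state can be queried with hagrp; the group and system names match the example above, and the state should report ONLINE once recovery completes:

# hagrp -state cvm -sys node1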

A permanent fix will be made available in a future patch release.