Veritas Access Appliance 8.2 Troubleshooting Guide
- Introduction
- General troubleshooting procedures
- Monitoring Access Appliance
- Common recovery procedures
- Bringing services online
- Speeding up episodic replication
- Troubleshooting the Access Appliance cloud as a tier feature
- Troubleshooting Access Appliance installation and configuration issues
- Troubleshooting Access Appliance CIFS issues
- Troubleshooting Access Appliance GUI startup issues
- Troubleshooting Veritas Data Deduplication issues
Replacing an Access Appliance node
In some cases, you may need to replace an Access Appliance node. This section describes the steps to replace an Access Appliance node.
It is not recommended to delete the node where the management console is running. However, if you want to delete this node, first switch the management console to another node and then delete the node.
To switch the management console node:
- Go to the Access command-line interface.
- Run the network ip addr show command and note the console IP address.
- Run the network ip addr online console_ip target_node command.
This command logs you out of the Access command-line interface, so you need to log in again.
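For example, a minimal command sequence in the Access command-line interface might look like the following, where the console IP address 10.20.30.40 and the target node fss7310_03 are hypothetical placeholders:
fss7310> network ip addr show
fss7310> network ip addr online 10.20.30.40 fss7310_03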
To replace an Access Appliance node
- Before you delete the node from the cluster, make sure that it is not the CVM master node. If you need to remove the CVM master node, first switch the CVM master role to another node by switching the management console to another node.
- If you do not want to trigger hot relocation, set the following tunable to -1 from the CVM master node.
# vxtune node_reloc_timeout -1
Note:
After you set node_reloc_timeout to -1, storage_reloc_timeout is automatically set to -1.
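To confirm the change, you can display both tunables again from the CVM master node. This is a minimal sketch that assumes vxtune prints the current value when it is run with only the tunable name; both tunables should report -1:
# vxtune node_reloc_timeout
# vxtune storage_reloc_timeout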
- Run the cluster del command for the node that is to be replaced.
fss7310> cluster del fss7310_02
- Verify that the plexes from the deleted node are in the NODEVICE/DISABLED state. You can use the vxprint -p command to check the plex states.
- Run the following command to detach the plexes of the volumes:
# vxplex -f -g <dg-name> -o rm dis <plex-name>
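If many plexes are affected, you can detach all of the plexes that are in the NODEVICE state in one pass. The following sketch uses the hypothetical disk group name sfsdg and mirrors the loop used in the worked example later in this section; it feeds the output of vxprint into vxplex:
# for i in `vxprint -p | grep -i NODEVICE | awk '{print $2}'`; do vxplex -f -g sfsdg -o rm dis $i; done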
- Remove all the disks that are in the failed was: state from the disk group by using the vxdg rmdisk command. This command needs to be run from the CVM master node.
# vxdg -g <dg-name> rmdisk <disk-name>
- Run the vxdisk rm command for the removed disks from all the nodes in the cluster.
# vxdisk rm <disk-name>
Note:
This command needs to be run for all the disks from all the nodes in the cluster.
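The disk cleanup can be scripted in a similar way. The following sketch, again with the hypothetical disk group name sfsdg, removes every disk that vxdisk list reports in the failed was: state from the disk group (run this on the CVM master node), and then clears the stale device records (repeat this second loop on every node in the cluster):
# for i in `vxdisk list | grep "failed was:" | awk '{print $3}'`; do vxdg -g sfsdg rmdisk $i; done
# for i in `vxdisk list | grep -i error | awk '{print $1}'`; do vxdisk rm $i; done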
- After all the affected plexes are disabled, add the new node to the cluster by using the following command:
fss7310> cluster add <node-ip>
- Run the storage disk format command from the Access Appliance management console node for all the disks from the newly added node.
fss7310> storage disk format <list-of-disks>
- Add all the disks from the newly added node to the existing Access Appliance pool by using the storage pool adddisk command.
fss7310> storage pool adddisk pool1 <list-of-devices>
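For example, both commands can be run in sequence from the management console node, here shown with two of the disk names that appear in the worked example later in this section:
fss7310> storage disk format emc0_2257,emc0_2265
fss7310> storage pool adddisk pool1 emc0_2257,emc0_2265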
- Run the storage fs addmirror command to mirror the file system.
fss7310> storage fs addmirror <fs-name> <pool-name>
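To confirm that the mirror was added, you can list the file systems again; the value in the MIRRORS column should increase for each mirrored file system:
fss7310> storage fs list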
- Run the vxassist command to mirror the _nlm_ volume as well.
# vxassist -b -g <dg-name> mirror _nlm_
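You can check the state of the new _nlm_ plex with vxprint. While the attach is still in progress, the new plex is shown in the TEMPRMSD state with the ATT flag, as in the example output at the end of this section:
# vxprint _nlm_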
The following is a detailed example of replacing an Access Appliance node:
- Change the value of the vxtune tunable to disable hot relocation:
# vxtune node_reloc_timeout -1
- Run the following command to remove the node from the cluster.
fss7310> cluster del fss7310_02

Access Appliance 8.2 Delete Node Program
fss7310_02

Copyright (c) 2017 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.

The Licensed Software and Documentation are deemed to be "commercial computer software" and "commercial computer software documentation" as defined in FAR Sections 12.212 and DFARS Section 227.7202.

Logs are being written to /var/tmp/installaccess-201803130635kXW while installaccess is in progress.

Access Appliance 8.2 Delete Node Program
fss7310_02

Checking communication on fss7310_01 ........................................... Done
Checking communication on fss7310_02 ........................................... Done
Checking communication on fss7310_03 ........................................... Done
Checking communication on fss7310_04 ........................................... Done
Checking VCS running state on fss7310_01 ....................................... Done
Checking VCS running state on fss7310_02 ....................................... Done
Checking VCS running state on fss7310_03 ....................................... Done
Checking VCS running state on fss7310_04 ....................................... Done

The following changes will be made on the cluster:
Failover service group VIPgroup4 will be switched to fss7310_01

Switching failover service group(s) ............................................ Done
Waiting for service group(s) to come online on the other sub-cluster ........... Done

All the online failover service group(s) that can be switched have been switched to the other sub-cluster.

The following parallel service group(s) in the sub-cluster will be offline:
fss7310_02: CanHostConsole CanHostNLM Phantomgroup_pubeth0 ReconfigGroup cvm iSCSI_INIT vrts_vea_cfs_int_cfsmount1 vrts_vea_cfs_int_cfsmount2 vrts_vea_cfs_int_cfsmount3 vrts_vea_cfs_int_cfsmount4 vrts_vea_cfs_int_cfsmount5 vrts_vea_cfs_int_cfsmount6

Offline parallel service group(s) .............................................. Done
Waiting for service group(s) to be taken offline on the sub-cluster ............ Done
Stopping VCS on fss7310_02 ..................................................... Done
Stopping Fencing on fss7310_02 ................................................. Done
Stopping gab on fss7310_02 ..................................................... Done
Stopping llt on fss7310_02 ..................................................... Done
Clean up deleted nodes information on the cluster .............................. Done
Clean up deleted nodes ......................................................... Done

Delete node completed successfully

installaccess log files and summary file are saved at:
/opt/VRTS/install/logs/installaccess-201803130635kXW
- Verify that the plex states are set to NODEVICE/DISABLED.
[root@fss7310_01 ~]# vxclustadm nidmap
Name                             CVM Nid    CM Nid     State
fss7310_01                       2          0          Joined: Master
fss7310_03                       3          2          Joined: Slave
fss7310_04                       1          3          Joined: Slave

[root@fss7310_01 ~]# vxprint -p | grep -i nodevice
pl test1_tier1-P02     test1_tier1-L01  DISABLED 699392  -       NODEVICE -      -
pl _nlm_-02            _nlm_            DISABLED 2097152 -       NODEVICE -      -
pl _nlm__dcl-02        _nlm__dcl        DISABLED 67840   -       NODEVICE -      -
pl test1_tier1-P04     test1_tier1-L02  DISABLED 699392  -       NODEVICE -      -
pl test1_tier1-P06     test1_tier1-L03  DISABLED 699392  -       NODEVICE -      -
pl test1_tier1_dcl-02  test1_tier1_dcl  DISABLED 67840   -       NODEVICE -      -
pl test2_tier1-P02     test2_tier1-L01  DISABLED 699392  -       NODEVICE -      -
pl test2_tier1-P04     test2_tier1-L02  DISABLED 699392  -       NODEVICE -      -
pl test2_tier1-P06     test2_tier1-L03  DISABLED 699392  -       NODEVICE -      -
pl test2_tier1_dcl-02  test2_tier1_dcl  DISABLED 67840   -       NODEVICE -      -
pl test3_tier1-P02     test3_tier1-L01  DISABLED 699392  -       NODEVICE -      -
pl test3_tier1-P04     test3_tier1-L02  DISABLED 699392  -       NODEVICE -      -
pl test3_tier1-P06     test3_tier1-L03  DISABLED 699392  -       NODEVICE -      -
pl test3_tier1_dcl-02  test3_tier1_dcl  DISABLED 67840   -       NODEVICE -      -
pl test4_tier1-P02     test4_tier1-L01  DISABLED 699392  -       NODEVICE -      -
pl test4_tier1-P04     test4_tier1-L02  DISABLED 699392  -       NODEVICE -      -
pl test4_tier1-P06     test4_tier1-L03  DISABLED 699392  -       NODEVICE -      -
pl test4_tier1_dcl-02  test4_tier1_dcl  DISABLED 67840   -       NODEVICE -      -
pl test5_tier1-P02     test5_tier1-L01  DISABLED 699392  -       NODEVICE -      -
pl test5_tier1-P04     test5_tier1-L02  DISABLED 699392  -       NODEVICE -      -
pl test5_tier1-P06     test5_tier1-L03  DISABLED 699392  -       NODEVICE -      -
pl test5_tier1_dcl-02  test5_tier1_dcl  DISABLED 67840   -       NODEVICE -      -

[root@fss7310_01 ~]# vxdisk list | grep "failed was:"
-            -            emc0_2256    sfsdg        failed was:emc0_2256
-            -            emc0_2264    sfsdg        failed was:emc0_2264
-            -            emc0_2272    sfsdg        failed was:emc0_2272
-            -            emc0_2280    sfsdg        failed was:emc0_2280
-            -            emc0_2288    sfsdg        failed was:emc0_2288
-            -            emc0_2296    sfsdg        failed was:emc0_2296
-            -            emc0_2304    sfsdg        failed was:emc0_2304
-            -            emc0_2312    sfsdg        failed was:emc0_2312
-            -            emc0_2320    sfsdg        failed was:emc0_2320
-            -            emc0_2328    sfsdg        failed was:emc0_2328
-            -            emc0_2336    sfsdg        failed was:emc0_2336
-            -            emc0_2344    sfsdg        failed was:emc0_2344
-            -            emc0_2352    sfsdg        failed was:emc0_2352
-            -            emc0_2360    sfsdg        failed was:emc0_2360
- Remove the affected mirrors for the volumes that are present on the system.
[root@fss7310_01 ~]# vxplex -f -g sfsdg -o rm dis test1_tier1-P02
[root@fss7310_01 ~]# for i in `vxprint -p | grep -i NODEVICE | awk '{print $2}'`
> do
> echo "vxplex -f -g sfsdg -o rm dis $i"
> vxplex -f -g sfsdg -o rm dis $i
> done
vxplex -f -g sfsdg -o rm dis _nlm_-02
vxplex -f -g sfsdg -o rm dis _nlm__dcl-02
vxplex -f -g sfsdg -o rm dis test1_tier1-P04
vxplex -f -g sfsdg -o rm dis test1_tier1-P06
vxplex -f -g sfsdg -o rm dis test1_tier1_dcl-02
vxplex -f -g sfsdg -o rm dis test2_tier1-P02
vxplex -f -g sfsdg -o rm dis test2_tier1-P04
vxplex -f -g sfsdg -o rm dis test2_tier1-P06
vxplex -f -g sfsdg -o rm dis test2_tier1_dcl-02
vxplex -f -g sfsdg -o rm dis test3_tier1-P02
vxplex -f -g sfsdg -o rm dis test3_tier1-P04
vxplex -f -g sfsdg -o rm dis test3_tier1-P06
vxplex -f -g sfsdg -o rm dis test3_tier1_dcl-02
vxplex -f -g sfsdg -o rm dis test4_tier1-P02
vxplex -f -g sfsdg -o rm dis test4_tier1-P04
vxplex -f -g sfsdg -o rm dis test4_tier1-P06
vxplex -f -g sfsdg -o rm dis test4_tier1_dcl-02
vxplex -f -g sfsdg -o rm dis test5_tier1-P02
vxplex -f -g sfsdg -o rm dis test5_tier1-P04
vxplex -f -g sfsdg -o rm dis test5_tier1-P06
vxplex -f -g sfsdg -o rm dis test5_tier1_dcl-02

[root@fss7310_01 ~]# vxprint -p
Disk group: sfsdg

TY NAME                ASSOC            KSTATE   LENGTH  PLOFFS  STATE    TUTIL0 PUTIL0
pl _nlm_-01            _nlm_            ENABLED  2097152 -       ACTIVE   -      -
pl _nlm__dcl-01        _nlm__dcl        ENABLED  67840   -       ACTIVE   -      -
pl test1_tier1-P01     test1_tier1-L01  ENABLED  699392  -       ACTIVE   -      -
pl test1_tier1-P03     test1_tier1-L02  ENABLED  699392  -       ACTIVE   -      -
pl test1_tier1-P05     test1_tier1-L03  ENABLED  699392  -       ACTIVE   -      -
pl test1_tier1-03      test1_tier1      ENABLED  2098176 -       ACTIVE   -      -
pl test1_tier1_dcl-01  test1_tier1_dcl  ENABLED  67840   -       ACTIVE   -      -
pl test2_tier1-P01     test2_tier1-L01  ENABLED  699392  -       ACTIVE   -      -
pl test2_tier1-P03     test2_tier1-L02  ENABLED  699392  -       ACTIVE   -      -
pl test2_tier1-P05     test2_tier1-L03  ENABLED  699392  -       ACTIVE   -      -
pl test2_tier1-03      test2_tier1      ENABLED  2098176 -       ACTIVE   -      -
pl test2_tier1_dcl-01  test2_tier1_dcl  ENABLED  67840   -       ACTIVE   -      -
pl test3_tier1-P01     test3_tier1-L01  ENABLED  699392  -       ACTIVE   -      -
pl test3_tier1-P03     test3_tier1-L02  ENABLED  699392  -       ACTIVE   -      -
pl test3_tier1-P05     test3_tier1-L03  ENABLED  699392  -       ACTIVE   -      -
pl test3_tier1-03      test3_tier1      ENABLED  2098176 -       ACTIVE   -      -
pl test3_tier1_dcl-01  test3_tier1_dcl  ENABLED  67840   -       ACTIVE   -      -
pl test4_tier1-P01     test4_tier1-L01  ENABLED  699392  -       ACTIVE   -      -
pl test4_tier1-P03     test4_tier1-L02  ENABLED  699392  -       ACTIVE   -      -
pl test4_tier1-P05     test4_tier1-L03  ENABLED  699392  -       ACTIVE   -      -
pl test4_tier1-03      test4_tier1      ENABLED  2098176 -       ACTIVE   -      -
pl test4_tier1_dcl-01  test4_tier1_dcl  ENABLED  67840   -       ACTIVE   -      -
pl test5_tier1-P01     test5_tier1-L01  ENABLED  699392  -       ACTIVE   -      -
pl test5_tier1-P03     test5_tier1-L02  ENABLED  699392  -       ACTIVE   -      -
pl test5_tier1-P05     test5_tier1-L03  ENABLED  699392  -       ACTIVE   -      -
pl test5_tier1-03      test5_tier1      ENABLED  2098176 -       ACTIVE   -      -
pl test5_tier1_dcl-01  test5_tier1_dcl  ENABLED  67840   -       ACTIVE   -      -
- Remove the affected disks from the disk group by using the vxdg rmdisk command, and then remove them from all the nodes in the cluster by using the vxdisk rm command.
[root@fss7310_01 bin]# vxdg -g sfsdg rmdisk emc0_2288
[root@fss7310_01 bin]# vxdg -g sfsdg rmdisk emc0_2272
[root@fss7310_01 bin]# vxdg -g sfsdg rmdisk emc0_2280
[root@fss7310_01 bin]# vxdg -g sfsdg rmdisk emc0_2296
[root@fss7310_01 bin]# vxdg -g sfsdg rmdisk emc0_2304
[root@fss7310_01 bin]# vxdg -g sfsdg rmdisk emc0_2312
[root@fss7310_01 bin]# vxdg -g sfsdg rmdisk emc0_2320
[root@fss7310_01 bin]# vxdg -g sfsdg rmdisk emc0_2328
[root@fss7310_01 bin]# vxdg -g sfsdg rmdisk emc0_2336
[root@fss7310_01 bin]# vxdg -g sfsdg rmdisk emc0_2344
[root@fss7310_01 bin]# vxdg -g sfsdg rmdisk emc0_2352
[root@fss7310_01 bin]# vxdg -g sfsdg rmdisk emc0_2360

[root@fss7310_01 bin]# for i in `vxdisk list | grep -i error | awk '{print $1}'`; do vxdisk rm $i; done
[root@fss7310_03 ~]# for i in `vxdisk list | grep -i error | awk '{print $1}'`; do vxdisk rm $i; done
[root@fss7310_04 ~]# for i in `vxdisk list | grep -i error | awk '{print $1}'`; do vxdisk rm $i; done
- Before adding the node to the cluster, verify that the IP address that you want to assign to the node does not exist in the cluster:
fss7310> network ip addr show
If the IP address is listed in the output, remove the IP address by running the following command:
fss7310> network ip addr del ip_of_new_node
- Add the node to the cluster by running the cluster add command with the IP address of the new node.
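For example, with a hypothetical IP address of 10.20.30.41 for the new node:
fss7310> cluster add 10.20.30.41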
- Change the sysadmin user (default IPMI user) password by using the following command:
Admin> ipmi passwd username old_password new_password
where username is sysadmin and old_password is the default password P@ssw0rd
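For example, where NewP@ssw0rd1 is a hypothetical replacement password:
Admin> ipmi passwd sysadmin P@ssw0rd NewP@ssw0rd1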
- Add the disks from the newly added node in the pool that is already present.
[root@fss7310_01 scripts]# /opt/VRTSnas/clish/bin/clish -u admin -c "storage disk format emc0_2257,emc0_2265,emc0_2273,emc0_2281,emc0_2289,emc0_2297,emc0_2305,emc0_2313,emc0_2321,emc0_2329,emc0_2337,emc0_2345,emc0_2353,emc0_2361"
You may lose all the data on the disk, do you want to continue (y/n, the default is n):y
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2257 has been formatted successfully.
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2265 has been formatted successfully.
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2273 has been formatted successfully.
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2281 has been formatted successfully.
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2289 has been formatted successfully.
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2297 has been formatted successfully.
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2305 has been formatted successfully.
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2313 has been formatted successfully.
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2321 has been formatted successfully.
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2329 has been formatted successfully.
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2337 has been formatted successfully.
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2345 has been formatted successfully.
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2353 has been formatted successfully.
ACCESS Disk SUCCESS V-493-10-4 disk format: emc0_2361 has been formatted successfully.

[root@fss7310_01 scripts]# /opt/VRTSnas/clish/bin/clish -u admin -c "storage pool adddisk pool1 emc0_2257,emc0_2265,emc0_2273,emc0_2281,emc0_2289,emc0_2297,emc0_2305,emc0_2313,emc0_2321,emc0_2329,emc0_2337,emc0_2345,emc0_2353,emc0_2361"
ACCESS Pool SUCCESS V-493-10-2914 Successfully added disks to pool
- Mirror the file systems by using the storage fs addmirror command.
fss7310> storage fs list
FS    STATUS SIZE  LAYOUT  MIRRORS COLUMNS USE% USED NFS    CIFS   FTP    SECONDARY
                                                     SHARED SHARED SHARED TIER
===== ====== ==== ======= ======= ======= ==== ==== ====== ====== ====== =========
test1 online 1.00G striped 1       3       10%  103M no     no     no     no
test2 online 1.00G striped 1       3       10%  103M no     no     no     no
test3 online 1.00G striped 1       3       10%  103M no     no     no     no
test4 online 1.00G striped 1       3       10%  103M no     no     no     no
test5 online 1.00G striped 1       3       10%  103M no     no     no     no

fss7310> storage fs addmirror test1 pool1
100% [#] Adding mirror to filesystem
ACCESS fs SUCCESS V-493-10-2131 Added mirror for fs test1

fss7310> storage fs addmirror test2 pool1
100% [#] Adding mirror to filesystem
ACCESS fs SUCCESS V-493-10-2131 Added mirror for fs test2

fss7310> storage fs addmirror test3 pool1
100% [#] Adding mirror to filesystem
ACCESS fs SUCCESS V-493-10-2131 Added mirror for fs test3

fss7310> storage fs addmirror test4 pool1
100% [#] Adding mirror to filesystem
ACCESS fs SUCCESS V-493-10-2131 Added mirror for fs test4
- Mirror the _nlm_ volume by using the vxassist mirror command.
[root@fss7310_01 bin]# vxassist -b -g sfsdg mirror _nlm_
[root@fss7310_01 bin]# vxprint _nlm_
Disk group: sfsdg

TY NAME                ASSOC            KSTATE   LENGTH  PLOFFS  STATE    TUTIL0 PUTIL0
v  _nlm_               fsgen            ENABLED  2097152 -       ACTIVE   ATT1   -
pl _nlm_-01            _nlm_            ENABLED  2097152 -       ACTIVE   -      -
sd emc0_2255-01        _nlm_-01         ENABLED  2097152 0       -        -      -
pl _nlm_-02            _nlm_            ENABLED  2097152 -       TEMPRMSD ATT    -
sd emc0_2257-01        _nlm_-02         ENABLED  2097152 0       -        -      -
dc _nlm__dco           _nlm_            -        -       -       -        -      -
v  _nlm__dcl           gen              ENABLED  67840   -       ACTIVE   -      -
pl _nlm__dcl-01        _nlm__dcl        ENABLED  67840   -       ACTIVE   -      -
sd emc0_2255-02        _nlm__dcl-01     ENABLED  67840   0       -        -      -
sp _nlm__cpmap         _nlm_            -        -       -       -        -      -