Veritas Access Troubleshooting Guide

Product(s): Access (7.4)
Platform: Linux

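Replacing a disk

In some cases, you may need to replace an existing disk. This section describes the steps for replacing a disk.

To replace a disk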

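  1. Remove the disk that needs to be replaced from the array side.
  2. Add the new disk to the system from the array side.
  3. Run the following command on all the nodes in the cluster to clear the old disk from the Veritas Volume Manager (VxVM) view.
    # vxdisk rm <old-disk-name>
  4. Run the following command on the node on which you want to replace the disk.
    # vxdisk scandisks
  5. Use the vxdisksetup command to initialize the new disk that was added to the cluster.
    # /etc/vx/bin/vxdisksetup -fi <new-disk-name>
  6. On the node where the failed disk resides, apply a tag that resembles the Veritas Access pool name to the newly added device.
    # vxdisk settag site=<pool-name> <new-disk-name>
  7. On the node where the failed disk resides, run the vxdiskadm command and select option #5 to replace the failed disk with the new disk.
    # vxdiskadm

    Note:

    If the disk replace operation is triggered from the slave node, the vxrecover command fails.

  8. If the disk replace operation was triggered from the slave node, run the following command from the slave node for all the affected volumes.
    # vxrecover -b -c -s <vol-name>
  9. Rename the newly added disk to the disk access name.
    # vxedit -g <dg-name> rename <old-disk-name> <new-disk-name>
  10. Rename the subdisks according to the newly added disk name.
    # vxedit -g <dg-name> rename <old-subdisk-name> <new-subdisk-name>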
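
The sequence above can be sketched as a dry-run helper that only prints the commands for a given disk pair, so that the order can be reviewed before anything is run. This is a minimal sketch under stated assumptions: print_replace_plan and its argument order are hypothetical names, and the command lines are copied from the steps above. It lists vxdiskadm but cannot drive its interactive dialog, and it leaves out the slave-only vxrecover step and the subdisk renames, whose names depend on the disk group layout.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands from the procedure above and runs none of them.
# print_replace_plan and its argument order are hypothetical, not part of Veritas Access.
print_replace_plan() {
    old_disk=$1   # disk being replaced, for example emc0_2255
    new_disk=$2   # replacement disk, for example emc0_2263
    pool=$3       # Veritas Access pool name used for the tag
    dg=$4         # VxVM disk group, for example sfsdg
    cat <<EOF
vxdisk rm $old_disk
vxdisk scandisks
/etc/vx/bin/vxdisksetup -fi $new_disk
vxdisk settag site=$pool $new_disk
vxdiskadm
vxedit -g $dg rename $old_disk $new_disk
EOF
}
```

Calling print_replace_plan emc0_2255 emc0_2263 pool1 sfsdg prints six lines, one per command, in the order of steps 3 through 9. Remember that step 3 applies to all the nodes in the cluster, while the later steps run on a single node.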

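Example: Replacing a disk from the master node

This example describes how to replace the emc0_2255 disk with the emc0_2263 disk. The emc0_2263 disk was excluded beforehand and is added later to simulate a disk addition.

To replace a disk from the master node

  1. Run the vxdmpadm exclude command to remove the emc0_2255 disk.
    # vxdmpadm exclude dmpnodename=emc0_2255
  2. Run the vxdmpadm include command to include the emc0_2263 disk.
    # vxdmpadm include dmpnodename=emc0_2263

    Note:

    You can run the vxdisk scandisks command to scan for the disks.

  3. Run the vxdisk settag command to apply the tag to the underlying disk.
    # vxdisk settag emc0_2263 tag=pool1
  4. Run the vxdiskadm command and select option #5 to replace the failed disk.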
    [root@fss7310_01 ~]# vxdiskadm
    Volume Manager Support Operations
    Menu:: VolumeManager/Disk
    1 Add or initialize one or more disks
    2 Encapsulate one or more disks
    3 Remove a disk
    4 Remove a disk for replacement
    5 Replace a failed or removed disk
    6 Mirror volumes on a disk
    7 Move volumes from a disk
    8 Enable access to (import) a disk group
    9 Remove access to (deport) a disk group
    10 Enable (online) a disk device
    11 Disable (offline) a disk device
    12 Mark a disk as a spare for a disk group
    13 Turn off the spare flag on a disk
    14 Unrelocate subdisks back to a disk
    15 Exclude a disk from hot-relocation use
    16 Make a disk available for hot-relocation use
    17 Prevent multipathing/Suppress devices from VxVM's view
    18 Allow multipathing/Unsuppress devices from VxVM's view
    19 List currently suppressed/non-multipathed devices
    20 Change the disk naming scheme
    21 Change/Display the default disk layouts
    22 Dynamic Reconfiguration Operations
    list List disk information
    
    Select an operation to perform: 5
    
    Replace a failed or removed disk
    Menu:: VolumeManager/Disk/ReplaceDisk
    
    Use this menu operation to specify a replacement disk for a disk
    that you removed with the "Remove a disk for replacement" menu
    operation, or that failed during use. You will be prompted for
    a disk name to replace and a disk device to use as a replacement.
    You can choose an uninitialized disk, in which case the disk will
    be initialized, or you can choose a disk that you have already
    initialized using the Add or initialize a disk menu operation.
    
    Select a removed or failed disk [<disk>,list,q,?] list
    Disk group: sfsdg
    DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
    dm emc0_2255 - - - - NODEVICE
    
    Select a removed or failed disk [<disk>,list,q,?] emc0_2255
    
    The following devices are available as replacements:
    emc0_2263
    You can choose one of these devices to replace emc0_2255. 
    Choose "none" to initialize another device to replace emc0_2255.
    Choose a device, or select "none" [<device>,none,q,?] 
    (default: emc0_2263) emc0_2263
    VxVM INFO V-5-2-382
    The requested operation is to use the initialized device emc0_2263
    to replace the removed or failed disk emc0_2255 in disk group sfsdg.
    Continue with operation? [y,n,q,?] (default: y) y
    Use FMR for plex resync? [y,n,q,?] (default: n)
    VxVM INFO V-5-2-282 Replacement of disk emc0_2255 in group 
    sfsdg with disk device emc0_2263 completed successfully.
    Replace another disk? [y,n,q,?] (default: n)
  5. Rename the disk to match the disk access name to avoid vxdg issues.
    # vxedit -g sfsdg rename emc0_2255 emc0_2263
    
    # vxdisk list | grep emc0_2263
    emc0_2263 auto:cdsdisk emc0_2263 sfsdg online shared
  6. Rename the subdisks to match the disk access name.
    # vxedit -g sfsdg rename emc0_2255-03 emc0_2263-03
    # vxedit -g sfsdg rename emc0_2255-02 emc0_2263-02
    
    [root@fss7310_01 ~]# vxprint -pvs | grep -i 2263
    sd emc0_2263-02 vol1-P01 ENABLED 699136 0 - - -
    sd emc0_2263-03 vol1_dcl-01 ENABLED 67840 0 - - -
    [root@fss7310_01 ~]# vxprint -pvs | grep -i 2255
    [root@fss7310_01 ~]#
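
The subdisk renames in step 6 follow a fixed pattern: each subdisk keeps its numeric suffix and only the disk part of the name changes. The following sketch generates the vxedit commands from vxprint-style input instead of typing them by hand. gen_subdisk_renames is a hypothetical helper, and it assumes that the subdisk records look like the "sd" lines shown above, with the subdisk name in the second column. It only prints commands and runs nothing.

```shell
#!/bin/sh
# Sketch: reads `vxprint -pvs`-style lines on stdin and prints one vxedit rename
# command per subdisk ("sd" record) whose name starts with "<old-disk>-".
# gen_subdisk_renames is a hypothetical helper; it prints commands and runs nothing.
gen_subdisk_renames() {
    dg=$1         # disk group, for example sfsdg
    old_disk=$2   # old disk name, for example emc0_2255
    new_disk=$3   # new disk name, for example emc0_2263
    awk -v dg="$dg" -v from="$old_disk" -v to="$new_disk" '
        $1 == "sd" && index($2, from "-") == 1 {
            suffix = substr($2, length(from) + 1)   # keeps the "-NN" part
            print "vxedit -g " dg " rename " $2 " " to suffix
        }'
}
```

A possible use is vxprint -pvs | gen_subdisk_renames sfsdg emc0_2255 emc0_2263, reviewing the printed commands before running them.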

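Example: Replacing a disk from the slave node

This example describes how to replace the emc0_2273 disk with the emc0_2305 disk. The emc0_2305 disk was excluded beforehand and is added later to simulate a disk addition.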

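To replace a disk from the slave node

  1. Run the vxdmpadm exclude command to remove the emc0_2273 disk.
    # vxdmpadm exclude dmpnodename=emc0_2273
  2. Run the vxdmpadm include command to include the emc0_2305 disk.
    # vxdmpadm include dmpnodename=emc0_2305

    Note:

    You can run the vxdisk scandisks command to scan for the disks.

  3. Run the vxdisk rm command from the remaining nodes in the cluster:
    [root@fss7310_02 ~]# vxdisk rm emc0_2273
    [root@fss7310_01 ~]# vxdisk rm emc0_2273
  4. Run the vxdisk settag command to apply the tag to the underlying disk:
    # vxdisk settag emc0_2305 tag=pool1
  5. Run the vxdiskadm command and select option #5 to replace the failed disk.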
    [root@fss7310_01 ~]# vxdiskadm
    
    Volume Manager Support Operations
    Menu:: VolumeManager/Disk
    
    1 Add or initialize one or more disks
    2 Encapsulate one or more disks
    3 Remove a disk
    4 Remove a disk for replacement
    5 Replace a failed or removed disk
    6 Mirror volumes on a disk
    7 Move volumes from a disk
    8 Enable access to (import) a disk group
    9 Remove access to (deport) a disk group
    10 Enable (online) a disk device
    11 Disable (offline) a disk device
    12 Mark a disk as a spare for a disk group
    13 Turn off the spare flag on a disk
    14 Unrelocate subdisks back to a disk
    15 Exclude a disk from hot-relocation use
    16 Make a disk available for hot-relocation use
    17 Prevent multipathing/Suppress devices from VxVM's view
    18 Allow multipathing/Unsuppress devices from VxVM's view
    19 List currently suppressed/non-multipathed devices
    20 Change the disk naming scheme
    21 Change/Display the default disk layouts
    22 Dynamic Reconfiguration Operations
    list List disk information
    
    ? Display help about menu
    ?? Display help about the menuing system
    q Exit from menus
    
    Select an operation to perform: 5
    Replace a failed or removed disk
    
    Menu:: VolumeManager/Disk/ReplaceDisk
    Use this menu operation to specify a replacement disk for a disk
    that you removed with the "Remove a disk for replacement" menu
    operation, or that failed during use. You will be prompted for
    a disk name to replace and a disk device to use as a replacement.
    You can choose an uninitialized disk, in which case the disk will
    be initialized, or you can choose a disk that you have already
    initialized using the Add or initialize a disk menu operation.
    Select a removed or failed disk [<disk>,list,q,?] list
    Disk group: sfsdg
    
    DM NAME      DEVICE TYPE PRIVLEN PUBLEN STATE
    dm emc0_2273 -      -    -       -      NODEVICE
    
    Select a removed or failed disk [<disk>,list,q,?] emc0_2273
    The following devices are available as replacements:
    emc0_2305
    You can choose one of these devices to replace emc0_2255. Choose 
    "none" to initialize another device to replace emc0_2255.
    Choose a device, or select "none" [<device>,none,q,?] 
    (default: emc0_2305) emc0_2305
    VxVM INFO V-5-2-382
    The requested operation is to use the initialized device emc0_2305
    to replace the removed or failed disk emc0_2273 in disk group sfsdg.
    
    Continue with operation? [y,n,q,?] (default: y)
    Use FMR for plex resync? [y,n,q,?] (default: n)
    VxVM vxrecover ERROR V-5-1-16084 Disk group: sfsdg is shared. The
    command can be executed only on the master. Use -c option to recover
    all the shared disk groups from slaves.
    VxVM INFO V-5-2-282 Replacement of disk emc0_2273 in group sfsdg 
    with disk device emc0_2305 completed successfully.
    Replace another disk? [y,n,q,?] (default: n)
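  6. Run the following command to recover the affected volumes.
    # vxrecover -b -c -s vol1
  7. Rename the disk to match the disk access name to avoid vxdg issues.
    # vxedit -g sfsdg rename emc0_2273 emc0_2305
    
    # vxdisk list | grep emc0_2305
    emc0_2305 auto:cdsdisk emc0_2305 sfsdg online shared
  8. Rename the subdisks according to the naming convention that the newly added disk follows.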
    # vxedit -g sfsdg rename emc0_2273-02 emc0_2305-02
    # vxedit -g sfsdg rename emc0_2273-03 emc0_2305-03
    
    # vxprint -pvs | grep -i emc0_2305
    sd emc0_2305-02 vol1-P02 ENABLED 699136 0 - - -
    sd emc0_2305-03 vol1_dcl-02 ENABLED 67840 0 - - -
    # vxprint -pvs | grep -i emc0_2273
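
The final grep returns nothing, which indicates that no subdisk still carries the old disk name. The same check can be sketched as a helper that reports the result through its exit status, which is easier to use in a script. no_stale_subdisks is a hypothetical name, and the sketch assumes the "sd" record layout shown above, with the subdisk name in the second column.

```shell
#!/bin/sh
# Sketch: reads `vxprint -pvs`-style lines on stdin. Exits 0 when no "sd" record
# still starts with "<old-disk>-"; otherwise prints the leftover names and exits 1.
# no_stale_subdisks is a hypothetical helper name.
no_stale_subdisks() {
    old_disk=$1   # old disk name, for example emc0_2273
    awk -v from="$old_disk" '
        $1 == "sd" && index($2, from "-") == 1 { print $2; found = 1 }
        END { exit (found ? 1 : 0) }'
}
```

A possible use is vxprint -pvs | no_stale_subdisks emc0_2273 && echo "rename complete", which prints the message only when every subdisk has been renamed.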