
7.4.2 Update3 patch for RHEL7 platform

Cumulative Patch

Description

SORT ID: 17405

 

Fixes the following incidents:

4020528,4046515,4046520,4046526,4045605,4045606,4046521,4046525,4048981,4007372,4007374,4012397,4019536,4007692,
4007763,4008986,4049572,4046423,4021055,4006982,4007375,4007376,4007677,4046524,4019533,4021057,4046415,4046419,
4021059,4013034,4039475,4046200,4046420,4019535,4021058,4013036,4013955,4023471,4014715,4019695,3999030,4007226,
4049693,4049416,4049522,4037283,4018173,4018178,4039517,4046906,4046907,4046908,4047592,4047595,4047695,4047722,
4018182,4020207,4020438,4021238,4021240,4021346,4021359,4021366,4021428,4021748,4023095,4018180,4008606,4010892,
4011866,4011971,4012730,4012848,4013155,4013169,4013718,4047510,4049440,4023468,4013985,4013420,4040238,4040608,
4042686,4044184,4046265,4046266,4046267,4046271,4046272,4046829,4047568,4049091,4049097,4012765,4014720,4015287,
4015835,4016721,4017282,4017818,4017820,4019877,4020055,4020056,4020912,4020337,4002850,4005220,4010353,4012061,
4012522,4012787,4012800,4012801,4012842,4012936,4013084,4013143,4013144,4013626,4013738,4014244

 

Patch ID:

VRTSamf-7.4.2.2100-RHEL7 for VRTSamf
  VRTSaslapm-7.4.2.2200-RHEL7 for VRTSaslapm
  VRTScavf-7.4.2.1300-GENERIC for VRTScavf
  VRTSgab-7.4.2.2100-RHEL7 for VRTSgab
  VRTSglm-7.4.2.1500-RHEL7 for VRTSglm
  VRTSgms-7.4.2.1200-RHEL7 for VRTSgms
  VRTSllt-7.4.2.2100-RHEL7 for VRTSllt
  VRTSodm-7.4.2.2200-RHEL7 for VRTSodm
  VRTSpython-3.7.4.35-RHEL7 for VRTSpython
  VRTSsfcpi-7.4.2.1100-GENERIC for VRTSsfcpi
  VRTSsfmh-7.4.2.501-0 for VRTSsfmh
  VRTSvcs-7.4.2.2100-RHEL7 for VRTSvcs
  VRTSvcsag-7.4.2.2100-RHEL7 for VRTSvcsag
  VRTSvcsea-7.4.2.1100-RHEL7 for VRTSvcsea
  VRTSvcswiz-7.4.2.2100-RHEL7 for VRTSvcswiz
  VRTSvlic-4.01.742.300-RHEL7 for VRTSvlic
  VRTSvxfen-7.4.2.2100-RHEL7 for VRTSvxfen
  VRTSvxfs-7.4.2.2200-RHEL7 for VRTSvxfs
  VRTSvxvm-7.4.2.2200-RHEL7 for VRTSvxvm

                          * * * READ ME * * *
                      * * * InfoScale 7.4.2 * * *
                         * * * Patch 1900 * * *
                         Patch Date: 2021-10-04


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH
   * KNOWN ISSUES


PATCH NAME
----------
InfoScale 7.4.2 Patch 1900


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
RHEL7 x86-64


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSamf
VRTSaslapm
VRTScavf
VRTSgab
VRTSglm
VRTSgms
VRTSllt
VRTSodm
VRTSpython
VRTSsfcpi
VRTSsfmh
VRTSvcs
VRTSvcsag
VRTSvcsea
VRTSvcswiz
VRTSvlic
VRTSvxfen
VRTSvxfs
VRTSvxvm


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * InfoScale Availability 7.4.2
   * InfoScale Enterprise 7.4.2
   * InfoScale Foundation 7.4.2
   * InfoScale Storage 7.4.2


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: VRTSvcsea-7.4.2.1100
* 4020528 (4001565) On Solaris 11.4, IMF fails to provide notifications when Oracle processes stop.
Patch ID: VRTSvcs-7.4.2.2100
* 4046515 (4040705) hacli hangs indefinitely when command exceeds character limit of 4096
* 4046520 (4040656) Gracefully restart HAD in occurrence of ENOMEM error
* 4046526 (4043700) While an online operation is in progress and the PreOnline trigger is already executing, multiple PreOnline triggers can be executed on the same or different nodes in the cluster for failover/parallel/hybrid service groups.
Patch ID: VRTSvcsag-7.4.2.2100
* 4045605 (4038906) In case of ESXi 6.7, the VMwareDisks agent fails to perform a failover on a peer node.
* 4045606 (4042944) In a hardware replicated environment, a disk group resource may fail to import when the HARDWARE_MIRROR flag is set
* 4046521 (4030215) Azure agents now support azure-identity based credential methods
* 4046525 (4046286) Azure Cloud agents do not handle generic exceptions
* 4048981 (4048164) Cloud agents may report incorrect resource state in case cloud API hangs.
Patch ID: VRTSvcsag-7.4.2.1400
* 4007372 (4016624) When a disk group is forcibly imported with ClearClone enabled, different DGIDs are assigned to the associated disks.
* 4007374 (1837967) Application agent falsely detects an application as faulted, due to corruption caused by non-redirected STDOUT or STDERR.
* 4012397 (4012396) AzureDisk agent fails to work with latest Azure Storage SDK.
* 4019536 (4009761) A lower NFSRestart resource fails to come online within the duration specified in OnlineTimeout when the share directory for NFSv4 lock state information contains millions of small files.
Patch ID: VRTSvcsag-7.4.2.1300
* 4007692 (4006979) When the AzureDisk resource comes online on a cluster node, it goes into the UNKNOWN state on all the other nodes.
* 4007763 (4007764) The NFS locks related log file is flooded with the "sync_dir:copy failed for link" error messages.
* 4008986 (3860766) HostMonitor agent shows incorrect swap space usage in the agent logs.
Patch ID: VRTSvcswiz-7.4.2.2100
* 4049572 (4049573) Veritas High Availability Configuration Wizard (HA-Plugin) is not supported on the VMware vCenter HTML-based UI.
Patch ID: VRTSvxfen-7.4.2.2100
* 4046423 (4043619) OCPR failed from SCSI3 fencing to Customized mode
Patch ID: VRTSvxfen-7.4.2.1300
* 4021055 (4010237) On Red Hat Enterprise Linux operating system, device files do not have correct SELinux label.
Patch ID: VRTSvxfen-7.4.2.1100
* 4006982 (3988184) The vxfen process cannot complete due to incomplete vxfentab file.
* 4007375 (4000745) The VxFEN process fails to start due to late discovery of the VxFEN disk group.
* 4007376 (3996218) In a customized fencing mode, the 'vxfenconfig -c' command creates a new vxfend process even if VxFen is already configured.
* 4007677 (3970753) Freeing uninitialized/garbage memory causes panic in vxfen.
Patch ID: VRTSamf-7.4.2.2100
* 4046524 (4041596) A cluster node panics when the arguments passed to a process that is registered with AMF exceeds 8K characters.
Patch ID: VRTSamf-7.4.2.1300
* 4019533 (4018791) A cluster node panics when the AMF module attempts to access an executable binary or a script using its absolute path.
* 4021057 (4010237) On Red Hat Enterprise Linux operating system, device files do not have correct SELinux label.
Patch ID: VRTSgab-7.4.2.2100
* 4046415 (4046413) gab node count/fencing quorum not getting updated properly
* 4046419 (4046418) gab startup does not fail even if llt is not configured
Patch ID: VRTSgab-7.4.2.1300
* 4021059 (4010237) On Red Hat Enterprise Linux operating system, device files do not have correct SELinux label.
Patch ID: VRTSgab-7.4.2.1100
* 4013034 (4011683) The GAB module failed to start and the system log messages indicate failures with the mknod command.
Patch ID: VRTSllt-7.4.2.2100
* 4039475 (4045607) Performance improvement of the UDP multiport feature of LLT on 1500 MTU-based networks.
* 4046200 (4046199) llt over udp configuration now accepts any link tag name
* 4046420 (3989372) When the CPU load and memory consumption is high in a VMware environment, some nodes in an InfoScale cluster may get fenced out.
Patch ID: VRTSllt-7.4.2.1300
* 4019535 (4018581) The LLT module fails to start and the system log messages indicate missing IP address.
* 4021058 (4010237) On Red Hat Enterprise Linux operating system, device files do not have correct SELinux label.
Patch ID: VRTSllt-7.4.2.1100
* 4013036 (3985775) Sometimes, the system log may get flooded with LLT heartbeat loss messages that do not necessarily indicate any actual issues with LLT.
* 4013955 (4013953) Veritas InfoScale Availability does not support Red Hat Enterprise Linux 7 Update 9 (RHEL7.9).
Patch ID: VRTSgms-7.4.2.1200
* 4023471 (4023472) Unable to load the vxgms module on linux.
Patch ID: VRTSglm-7.4.2.1500
* 4014715 (4011596) Multiple issues were observed during glmdump using hacli for communication
Patch ID: VRTSglm-7.4.2.1300
* 4019695 (4011596) man page changes for glmdump
Patch ID: VRTSglm-7.4.1.1700
* 3999030 (3999029) GLM module failed to unload because of VCS service hold.
Patch ID: VRTScavf-7.4.2.1300
* 4007226 (4007196) Potential fsck flag being set during cluster shutdown
Patch ID: VRTSpython-3.7.4.35
* 4049693 (4049692) VRTSpython package has been updated with more python modules to support Licensing component.
Patch ID: VRTSvlic-4.01.742.300
* 4049416 (4049416) Migrate Telemetry Collector from Java to Python
Patch ID: VRTSsfmh-vom-HF0742501
* 4049522 (4049521) VIOM Agent for InfoScale 7.4.2 Update3
Patch ID: VRTSvxvm-7.4.2.2200
* 4037283 (4021301) Data corruption issue observed in VxVM on RHEL8.
* 4018173 (3852146) Shared DiskGroup(DG) fails to import when "-c" and "-o noreonline" options are specified together
* 4018178 (3906534) After enabling DMP (Dynamic Multipathing) Native support, enable /boot to be mounted on the DMP device.
* 4039517 (4012763) IO hang may happen in VVR (Veritas Volume Replicator) configuration when the SRL overflows for one rlink while another rlink is in AUTOSYNC mode.
* 4046906 (3956607) vxdisk reclaim dumps core.
* 4046907 (4041001) In VxVM, the system hangs when some nodes are rebooted.
* 4046908 (4038865) System panic at the vxdmp module in the IRQ stack.
* 4047592 (3992040) bi_error - bi_status conversion map added for proper interpretation of errors at the FS side.
* 4047595 (4009353) After enabling DMP native support, the machine goes into maintenance mode.
* 4047695 (3911930) Provide a way to clear the PGR_FLAG_NOTSUPPORTED flag on the device instead of using exclude/include commands.
* 4047722 (4023390) vxconfigd keeps dumping core due to an invalid private region offset on a disk.
Patch ID: VRTSvxvm-7.4.2.1500
* 4018182 (4008664) System panic when signaling the vxlogger daemon that has ended.
* 4020207 (4018086) System hang was observed when the RVG was in DCM resync with SmartMove ON.
* 4020438 (4020046) DRL log plex gets detached unexpectedly.
* 4021238 (4008075) Observed with ASL changes for NVMe; this issue is observed in a reboot scenario, where the machine panics on every reboot, in a loop.
* 4021240 (4010612) This issue is observed for NVMe and SSD devices, where every disk has a separate enclosure (nvme0, nvme1, and so on), so every NVMe/SSD disk name would be hostprefix_enclosurname0_disk0, hostprefix_enclosurname1_disk0, and so on.
* 4021346 (4010207) System panicked due to a hard lockup caused by a spinlock not being released properly during vxstat collection.
* 4021359 (4010040) A security issue occurs during Volume Manager configuration.
* 4021366 (4008741) VxVM device files are not correctly labeled to prevent unauthorized modification - device_t
* 4021428 (4020166) VxVM support for RHEL8 Update 3.
* 4021748 (4020260) Failed to activate/set the tunable dmp_native_support on CentOS 8.
* 4023095 (4007920) Control auto snapshot deletion when cache obj is full.
Patch ID: VRTSvxvm-7.4.2.1400
* 4018180 (3958062) When the boot LUN is migrated, enabling and disabling dmp_native_support fails.
* 4018182 (4008664) System panic when signaling the vxlogger daemon that has ended.
* 4020207 (4018086) System hang was observed when the RVG was in DCM resync with SmartMove ON.
* 4021346 (4010207) System panicked due to a hard lockup caused by a spinlock not being released properly during vxstat collection.
* 4021428 (4020166) VxVM support for RHEL8 Update 3.
* 4021748 (4020260) Failed to activate/set the tunable dmp_native_support on CentOS 8.
Patch ID: VRTSvxvm-7.4.2.1300
* 4008606 (4004455) Instant restore failed for a snapshot created on older version DG.
* 4010892 (4009107) CA chain certificate verification fails in SSL context.
* 4011866 (3976678) vxvm-recover:  cat: write error: Broken pipe error encountered in syslog.
* 4011971 (3991668) Veritas Volume Replicator (VVR) configured with sec logging reports data inconsistency when the "No IBC message arrived" error is hit.
* 4012730 (4012728) VxVM support for RHEL 7.9
* 4012848 (4011394) Performance enhancement for cloud tiering.
* 4013155 (4010458) In VVR (Veritas Volume replicator), the rlink might inconsistently disconnect due to unexpected transactions.
* 4013169 (4011691) High CPU consumption on the VVR secondary nodes because of high pending IO load.
* 4013718 (4008942) The Docker InfoScale plugin fails to unmount the filesystem if the cache object is full.
Patch ID: VRTSaslapm-7.4.2.2200
* 4047510 (4042420) APM module creation fails as vxvm-startup tries to make a hardlink on a different partition.
Patch ID: VRTSodm-7.4.2.2200
* 4049440 (4049438) VRTSodm driver will not load with 7.4.2.2200 VRTSvxfs patch.
Patch ID: VRTSodm-7.4.2.1500
* 4023468 (4023469) Unable to load the vxodm module on linux.
Patch ID: VRTSodm-7.4.2.1300
* 4013985 (4013984) VRTSodm driver will not load with 7.4.2.1300 VRTSvxfs patch.
Patch ID: VRTSvxfs-7.4.2.2200
* 4013420 (4013139) The abort operation on an ongoing online migration from the native file system to VxFS fails on RHEL 8.x systems.
* 4040238 (4035040) The vfradmin stats command failed to show all the fields in the command output in case the job was paused and resumed.
* 4040608 (4008616) fsck command got hung.
* 4042686 (4042684) ODM resize fails for size 8192.
* 4044184 (3993140) Compclock was not giving accurate results.
* 4046265 (4037035) Added new tunable "vx_ninact_proc_threads" to control the number of inactive processing threads.
* 4046266 (4043084) panic in vx_cbdnlc_lookup
* 4046267 (4034910) Asynchronous access/update of the global list large_dirinfo can corrupt its values in multi-threaded execution.
* 4046271 (3993822) fsck stops running on a file system
* 4046272 (4017104) Deleting a lot of files can cause resource starvation, causing panic or momentary hangs.
* 4046829 (3993943) The fsck utility hit the coredump due to segmentation fault in get_dotdotlst()
* 4047568 (4046169) On RHEL8, while doing a directory move from one FS (ext4 or vxfs) to a migration VxFS, the migration can fail and the FS will be disabled.
* 4049091 (4035057) On RHEL8, IOs done on an FS while another FS-to-VxFS migration is in progress can cause a panic.
* 4049097 (4049096) Dalloc changes ctime in the background during extent allocation.
Patch ID: VRTSvxfs-7.4.2.1600
* 4012765 (4011570) WORM attribute replication support in VxFS.
* 4014720 (4011596) Multiple issues were observed during glmdump using hacli for communication
* 4015287 (4010255) "vfradmin promote" fails to promote target FS with selinux enabled.
* 4015835 (4015278) System panics during vx_uiomove_by_hand.
* 4016721 (4016927) In a multi-cloud-tier scenario, the system panics with a NULL pointer dereference when trying to remove the second cloud tier.
* 4017282 (4016801) File system marked for full fsck.
* 4017818 (4017817) VFR performance enhancement changes.
* 4017820 (4017819) Adding cloud tier operation fails while trying to add AWS GovCloud.
* 4019877 (4019876) Remove license library dependency from vxfsmisc.so library
* 4020055 (4012049) Documented "metasave" option and added one new option in fsck binary.
* 4020056 (4012049) Documented "metasave" option and added one new option in fsck binary.
* 4020912 (4020758) Filesystem mount or fsck with -y may see hang during log replay
Patch ID: VRTSvxfs-7.4.2.1400
* 4020337 (4020334) VxFS Dummy incidents for FLEX patch archival.
Patch ID: VRTSvxfs-7.4.2.1300
* 4002850 (3994123) Running fsck on a system may show LCT count mismatch errors
* 4005220 (4002222) Code changes have been done to prevent cluster-wide hang in a scenario where the cluster filesystem is FCL enabled and the disk layout version is greater than or equals to 14.
* 4010353 (3993935) Fsck command of vxfs may hit segmentation fault.
* 4012061 (4001378) VxFS module failed to load on RHEL8.2
* 4012522 (4012243) Read/Write performance improvement in VxFS
* 4012765 (4011570) WORM attribute replication support in VxFS.
* 4012787 (4007328) VFR source keeps processing file change log(FCL) records even after connection closure from target.
* 4012800 (4008123) VFR fails to replicate named extended attributes if the job is paused.
* 4012801 (4001473) VFR fails to replicate named extended attributes set on files
* 4012842 (4006192) system panic with NULL pointer de-reference.
* 4012936 (4000465) FSCK binary loops when it detects break in sequence of log ids.
* 4013084 (4009328) In cluster filesystem, unmount hang could be observed if smap is marked bad previously.
* 4013143 (4008352) Using VxFS mount binary inside container to mount any device might result in core generation.
* 4013144 (4008274) Race between compression thread and clone remove thread while allocating reorg inode.
* 4013626 (4004181) Read the value of VxFS compliance clock
* 4013738 (3830300) Degraded CPU performance during backup of Oracle archive logs on CFS vs local filesystem.
Patch ID: VRTSsfcpi-7.4.2.1100
* 4014244 (4014243) The patch installer does not provide the rolling upgrade option to apply a patch.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following incidents:

Patch ID: VRTSvcsea-7.4.2.1100

* 4020528 (Tracking ID: 4001565)

SYMPTOM:
On Solaris 11.4, IMF fails to provide notifications when Oracle processes stop.

DESCRIPTION:
On Solaris 11.4, when Oracle processes stop, IMF provides notification to the Oracle agent, but the monitor is not scheduled. As a result, the agent fails intelligent monitoring.

RESOLUTION:
Oracle agent now provides notifications when Oracle processes stop.

Patch ID: VRTSvcs-7.4.2.2100

* 4046515 (Tracking ID: 4040705)

SYMPTOM:
hacli hangs indefinitely when command exceeds character limit of 4096.

DESCRIPTION:
hacli hangs indefinitely when the '-cmd' option value exceeds the 4096-character limit. Instead of returning a proper error message, hacli waits indefinitely for a reply from the VCS engine.

RESOLUTION:
The character limit of the hacli '-cmd' option value has been increased to 7680. Validations of the different hacli options are also handled, so when the '-cmd' option value exceeds this new limit, hacli returns a proper error message instead of hanging.
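
For reference, a typical invocation that is subject to this limit; the command string and node name here are only illustrative:

$ hacli -cmd "df -h /var" -sys node01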

* 4046520 (Tracking ID: 4040656)

SYMPTOM:
As a result of an ENOMEM error, HAD restarts with the '-restart' option.

DESCRIPTION:
When an ENOMEM error occurs, HAD retries up to a maximum limit, and if the ENOMEM error persists, HAD exits. The hashadow daemon then restarts HAD with the '-restart' option. This prevents AutoStart of failover service groups in the cluster, because one of the nodes is considered to be in restarting mode.

RESOLUTION:
On occurrence of an ENOMEM error, HAD now exits gracefully and the hashadow daemon restarts HAD without the '-restart' option. The node is therefore not considered restarted, and AutoStart of failover service groups is triggered.

* 4046526 (Tracking ID: 4043700)

SYMPTOM:
While an online operation is in progress and the PreOnline trigger is already executing, multiple PreOnline triggers can be executed on the same or different nodes in the cluster for failover/parallel/hybrid service groups.

DESCRIPTION:
In-progress execution of the PreOnline trigger was not accounted for. Thus, subsequent online operations could be accepted while a PreOnline trigger was already executing, and multiple PreOnline trigger instances were executed.

RESOLUTION:
While validating an online operation, in-progress PreOnline triggers are now also considered, and subsequent online operations are rejected. This fix ensures only one execution of the PreOnline trigger for failover groups.

Patch ID: VRTSvcsag-7.4.2.2100

* 4045605 (Tracking ID: 4038906)

SYMPTOM:
In case of ESXi 6.7, the VMwareDisks agent fails to perform a failover on a peer node.

DESCRIPTION:
The VMwareDisks agent faults when you try to bring the related service group online or to fail over the service group on a peer node. This issue occurs due to the change in the behavior of the API on ESXi 6.7 that is used to attach VMware disks.

RESOLUTION:
The VMWareDisks agent is updated to support the changed behavior of the API on ESXi 6.7. The agent can now bring the service group online or perform a failover on a peer node successfully.

* 4045606 (Tracking ID: 4042944)

SYMPTOM:
In a hardware replicated environment, a disk group resource may fail to import when the HARDWARE_MIRROR flag is set

DESCRIPTION:
After the VCS hardware replication agent resource fails over control to the secondary site, the DiskGroup agent does not rescan all the required device paths in case of a multi-pathing configuration. The vxdg import operation fails, as the hardware device characteristics for all the paths are not refreshed.

RESOLUTION:
This hotfix introduces a new resource attribute for the DiskGroup agent called ScanDisks. The ScanDisks attribute enables the user to perform a selective device scan for all disk paths associated with a VxVM disk group. The VxVM and DMP disk attributes are refreshed before attempting to import hardware clone or replicated devices. The default value of ScanDisks is 0, which indicates that a selective device scan is not performed. Even when set to 0, if the first disk group import attempt fails with an error string containing HARDWARE MIRROR, the DiskGroup agent then performs a selective device scan to increase the chances of a successful import.
Sample resource configurations:
For Hardware Clone DiskGroups

DiskGroup tc_dg (
DiskGroup = datadg
DGOptions = "-o useclonedev=on -o updateid"
ForceImport = 0
ScanDisks = 1
)

For Hardware Replicated DiskGroups

DiskGroup tc_dg (
DiskGroup = datadg
ForceImport = 0
ScanDisks = 1
)

* 4046521 (Tracking ID: 4030215)

SYMPTOM:
Azure agents now support azure-identity based credential methods

DESCRIPTION:
The Azure credential system has been revamped. The new system is available in the azure-identity library.

RESOLUTION:
Azure agents now support the azure-identity based credential method. With this enhancement, Azure agents will support the following Azure Python SDK versions:

azure-common==1.1.25
azure-core==1.10.0
azure-identity==1.4.1
azure-mgmt-compute==19.0.0
azure-mgmt-core==1.2.2
azure-mgmt-dns==8.0.0
azure-mgmt-network==17.1.0
azure-storage-blob==12.8.0
msrestazure==0.6.4

* 4046525 (Tracking ID: 4046286)

SYMPTOM:
Azure Cloud agents do not handle generic exceptions

DESCRIPTION:
Azure agents handle only the CloudError exception of the Azure APIs, but other errors may occur during certain failure conditions.

RESOLUTION:
Azure agents are enhanced to handle API failure conditions.

* 4048981 (Tracking ID: 4048164)

SYMPTOM:
Cloud agents may report incorrect resource state in case cloud API hangs.

DESCRIPTION:
If a cloud SDK API/CLI call hangs, the monitor function of the cloud agents times out. This results in unwanted failover of the service group.

RESOLUTION:
The default value of the FaultOnMonitorTimeout attribute of all cloud agents is set to 0. This helps avoid unwanted failover caused by a cloud SDK API/CLI hang.
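
If faulting on monitor timeouts is still desired for a particular cloud resource type, the attribute can be raised again at the type level after making the configuration writable (haconf -makerw). An illustrative command, assuming the AzureDisk type and the conventional VCS default of 4:

# hatype -modify AzureDisk FaultOnMonitorTimeout 4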

Patch ID: VRTSvcsag-7.4.2.1400

* 4007372 (Tracking ID: 4016624)

SYMPTOM:
When a disk group is forcibly imported with ClearClone enabled, different DGIDs are assigned to the associated disks.

DESCRIPTION:
When the ForceImport option is used, a disk group gets imported with the available disks, regardless of whether all the required disks are available or not. In such a scenario, if the ClearClone attribute is enabled, the available disks are successfully imported, but their DGIDs are updated to new values. Thus, the disks within the same disk group end up with different DGIDs, which may cause issues with the functioning of the storage configuration.

RESOLUTION:
The DiskGroup agent is updated to allow the ForceImport and the ClearClone attributes to be set to the following values as per the configuration requirements. ForceImport can be set to 0 or 1. ClearClone can be set to 0, 1, or 2. ClearClone is disabled when set to 0 and enabled when set to 1 or 2. ForceImport is disabled when set to 0 and is ignored when ClearClone is set to 1. To enable both, ClearClone and ForceImport, set ClearClone to 2 and ForceImport to 1.
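
For instance, a main.cf snippet that enables both attributes, in the same style as the ScanDisks samples elsewhere in this document; the resource and disk group names are illustrative:

DiskGroup clone_dg (
DiskGroup = datadg
ForceImport = 1
ClearClone = 2
)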

* 4007374 (Tracking ID: 1837967)

SYMPTOM:
Application agent falsely detects an application as faulted, due to corruption caused by non-redirected STDOUT or STDERR.

DESCRIPTION:
This issue can occur when the STDOUT and STDERR file descriptors of the program to be started and monitored are not redirected to a specific file or to /dev/null. In this case, an application that is started by the Online entry point inherits the STDOUT and STDERR file descriptors from the entry point. Therefore, the entry point and the application, both, read from and write to the same file, which may lead to file corruption and cause the agent entry point to behave unexpectedly.

RESOLUTION:
The Application agent is updated to identify whether STDOUT and STDERR for the configured application are already redirected. If not, the agent redirects them to /dev/null.

* 4012397 (Tracking ID: 4012396)

SYMPTOM:
AzureDisk agent fails to work with latest Azure Storage SDK.

DESCRIPTION:
Latest Python SDK for Azure doesn't work with InfoScale AzureDisk agent.

RESOLUTION:
AzureDisk agent now supports latest Azure Storage Python SDK.

* 4019536 (Tracking ID: 4009761)

SYMPTOM:
A lower NFSRestart resource fails to come online within the duration specified in OnlineTimeout when the share directory for NFSv4 lock state information contains millions of small files.

DESCRIPTION:
As part of the Online operation, the NFSRestart agent copies the NFSv4 state data of clients from the shared storage to the local path. However, if the source location contains millions of files, some of which may be stale, their movement may not be completed before the operation times out.

RESOLUTION:
A new action entry point named "cleanup" is provided, which removes stale files. The usage of the entry point is as follows:
$ hares -action <resname> cleanup -actionargs <days> -sys <sys>
  <days>: number of days, deleting files that are <days> old
Example:
$ hares -action NFSRestart_L cleanup -actionargs 30 -sys <sys>
The cleanup action ensures that files older than the number of days specified in the -actionargs option are removed; the minimum expected duration is 30 days. Thus, only the relevant files to be moved remain, and the Online operation is completed in time.

Patch ID: VRTSvcsag-7.4.2.1300

* 4007692 (Tracking ID: 4006979)

SYMPTOM:
When the AzureDisk resource comes online on a cluster node, it goes into the UNKNOWN state on all the other nodes.

DESCRIPTION:
When an AzureDisk resource is online on one node, the status of that resource appears as UNKNOWN, instead of OFFLINE, on the other nodes in the cluster. Also, if the resource is brought online on a different node, its status on the remaining nodes appears as UNKNOWN. However, if the resource is not online on any node, its status correctly appears as OFFLINE on all the nodes.
This issue occurs when the VM name on the Azure portal does not match the local hostname of the cluster node. The monitor operation of the agent compares these two values to identify whether the VM to which the AzureDisk resource is attached is part of a cluster or not. If the values do not match, the agent incorrectly concludes that the resource is attached to a VM outside the cluster. Therefore, it displays the status of the resource as UNKNOWN.

RESOLUTION:
The AzureDisk agent is modified to compare the VM name with the appropriate attribute of the agent so that the status of an AzureDisk resource is reported correctly.

* 4007763 (Tracking ID: 4007764)

SYMPTOM:
The NFS locks related log file is flooded with the "sync_dir:copy failed for link" error messages.

DESCRIPTION:
The smsyncd daemon used by the NFSRestart agent copies the symbolic links and the NFS locks from the /var/statmon/sm directory to a specific directory. These files and links are used to track the clients that have set a lock on the NFS mount points. If this directory already has a symbolic link with the same name that the smsyncd daemon is trying to copy, the /bin/cp command fails and logs an error message.

RESOLUTION:
The smsyncd daemon is enhanced to copy the symbolic links even if a link with the same name is present.

* 4008986 (Tracking ID: 3860766)

SYMPTOM:
When the swap space is in terabytes, HostMonitor agent shows incorrect swap space usage in the agent logs.

DESCRIPTION:
When the swap space on the system is in terabytes, HostMonitor incorrectly calculated the available swap capacity. The incorrect swap space usage was displayed in the HostMonitor agent log.

RESOLUTION:
Veritas has modified the HostMonitor agent code to correctly calculate the swap space capacity on the system.

Patch ID: VRTSvcswiz-7.4.2.2100

* 4049572 (Tracking ID: 4049573)

SYMPTOM:
Veritas High Availability Configuration Wizard (HA-Plugin) is not supported on the VMware vCenter HTML-based UI.

DESCRIPTION:
Veritas HA-Plugin was based on Adobe Flex. HA-Plugin fails to work because Flex is now deprecated.

RESOLUTION:
Veritas HA-Plugin now supports the VMware vCenter HTML-based UI.

Patch ID: VRTSvxfen-7.4.2.2100

* 4046423 (Tracking ID: 4043619)

SYMPTOM:
OCPR failed from SCSI3 fencing to Customized mode

DESCRIPTION:
Online Coordination Point Replacement (OCPR) was broken for SCSI3 to Customized mode based fencing. This was a regression caused by a change in the vxfend invocation.

RESOLUTION:
OCPR from SCSI3 to Customized mode works again with this fix.

Patch ID: VRTSvxfen-7.4.2.1300

* 4021055 (Tracking ID: 4010237)

SYMPTOM:
On Red Hat Enterprise Linux operating system, device files do not have correct SELinux label.

DESCRIPTION:
On RHEL, device files should have correct SELinux labels to avoid unauthorized access.

RESOLUTION:
Code changes done to set correct SELinux labels.

Patch ID: VRTSvxfen-7.4.2.1100

* 4006982 (Tracking ID: 3988184)

SYMPTOM:
The vxfen process cannot complete due to incomplete vxfentab file.

DESCRIPTION:
When I/O fencing starts, the vxfen startup script creates the /etc/vxfentab file on each node. If the coordination disk discovery is slow, the vxfen startup script fails to include all the coordination points in the vxfentab file. As a result, the vxfen startup script gets stuck in a loop.

RESOLUTION:
The vxfen startup process is modified to exit from the loop if it gets stuck while running 'vxfenconfig -c'. On exiting from the loop, systemctl starts vxfen again and tries to use the updated vxfentab file.

* 4007375 (Tracking ID: 4000745)

SYMPTOM:
The VxFEN process fails to start due to late discovery of the VxFEN disk group.

DESCRIPTION:
When I/O fencing starts, the VxFEN startup script creates the /etc/vxfentab file on each node. During disk-based fencing, the VxVM module may take a longer time to discover the VxFEN disk group. Because of this delay, the 'generate disk list' operation times out. Therefore, the VxFEN process fails to start and reports the following error: 'ERROR: VxFEN cannot generate vxfentab because vxfendg does not exist'

RESOLUTION:
A new tunable, getdisks_timeout, is introduced to specify the timeout value for the VxFEN disk group discovery. The maximum and the default value for this tunable is 600 seconds. You can set the value of this tunable by adding a getdisks_timeout=<time_in_sec> entry in the /etc/vxfenmode file.
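
For example, to allow five minutes for the disk group discovery, the entry in /etc/vxfenmode would look like the following (the value is illustrative):

getdisks_timeout=300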

* 4007376 (Tracking ID: 3996218)

SYMPTOM:
In a customized fencing mode, the 'vxfenconfig -c' command creates a new vxfend process even if VxFen is already configured.

DESCRIPTION:
When you configure fencing in the customized mode and run the 'vxfenconfig -c' command, the vxfenconfig utility reports the 'VXFEN ERROR V-11-1-6 vxfen already configured...' error. Moreover, it also creates a new vxfend process even if VxFen is already configured. Such redundant processes may impact the performance of the system.

RESOLUTION:
The vxfenconfig utility is modified so that it does not create a new vxfend process when VxFen is already configured.

* 4007677 (Tracking ID: 3970753)

SYMPTOM:
Freeing uninitialized/garbage memory causes panic in vxfen.

DESCRIPTION:
Freeing uninitialized/garbage memory causes panic in vxfen.

RESOLUTION:
Veritas has modified the VxFen kernel module to fix the issue by initializing the object before attempting to free it.

Patch ID: VRTSamf-7.4.2.2100

* 4046524 (Tracking ID: 4041596)

SYMPTOM:
A cluster node panics when the arguments passed to a process that is registered with AMF exceeds 8K characters.

DESCRIPTION:
This issue occurs due to improper parsing and handling of argument lists that are passed to processes registered with AMF.

RESOLUTION:
AMF is updated to correctly parse and handle argument lists for processes.

Patch ID: VRTSamf-7.4.2.1300

* 4019533 (Tracking ID: 4018791)

SYMPTOM:
A cluster node panics when the AMF module attempts to access an executable binary or a script using its absolute path.

DESCRIPTION:
A cluster node panics and generates a core dump, which indicates an issue with the AMF module. The AMF module function that locates an executable binary or a script using its absolute path fails to handle NULL values.

RESOLUTION:
The AMF module is updated to handle NULL values when locating an executable binary or a script using its absolute path.

* 4021057 (Tracking ID: 4010237)

SYMPTOM:
On Red Hat Enterprise Linux operating system, device files do not have correct SELinux label.

DESCRIPTION:
On RHEL, device files should have correct SELinux labels to avoid unauthorized access.

RESOLUTION:
Code changes done to set correct SELinux labels.

Patch ID: VRTSgab-7.4.2.2100

* 4046415 (Tracking ID: 4046413)

SYMPTOM:
After node addition or deletion, the GAB node count is not updated properly.

DESCRIPTION:
The 'gabconfig -m <node count>' command displays an error despite a correct node count being provided.

RESOLUTION:
There was a parsing issue, which has been resolved by this fix.
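
For reference, updating the expected node count with the command named in the description above; the count is illustrative:

# gabconfig -m 4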

* 4046419 (Tracking ID: 4046418)

SYMPTOM:
gab startup does not fail even if llt is not configured

DESCRIPTION:
Since the GAB service depends on the LLT service, if the LLT service fails to start or is not configured, GAB should not start.

RESOLUTION:
This fix prevents GAB from starting if LLT is not configured.

Patch ID: VRTSgab-7.4.2.1300

* 4021059 (Tracking ID: 4010237)

SYMPTOM:
On Red Hat Enterprise Linux operating system, device files do not have correct SELinux label.

DESCRIPTION:
On RHEL, device files should have correct SELinux labels to avoid unauthorized access.

RESOLUTION:
Code changes done to set correct SELinux labels.

Patch ID: VRTSgab-7.4.2.1100

* 4013034 (Tracking ID: 4011683)

SYMPTOM:
The GAB module failed to start and the system log messages indicate failures with the mknod command.

DESCRIPTION:
The mknod command fails to start the GAB module because its format is invalid. If the names of multiple drivers in an environment contain the value "gab" as a substring, all their major device numbers get passed on to the mknod command. Instead, the command must contain the major device number for the GAB driver only.

RESOLUTION:
This hotfix addresses the issue so that the GAB module starts successfully even when other driver names in the environment contain "gab" as a substring.

Patch ID: VRTSllt-7.4.2.2100

* 4039475 (Tracking ID: 4045607)

SYMPTOM:
LLT over UDP support for transmission and reception of data over 1500 MTU networks.

DESCRIPTION:
The UDP multiport feature in LLT performs poorly in case of 1500 MTU-based networks. Data packets larger than 1500 bytes cannot be transmitted over 1500 MTU-based networks, so the IP layer fragments them appropriately for transmission. The loss of a single fragment from the set leads to a total packet (I/O) loss. LLT then retransmits the same packet repeatedly until the transmission is successful. Eventually, you may encounter issues with the Flexible Storage Sharing (FSS) feature. For example, the vxprint process or the disk group creation process may stop responding, or the I/O-shipping performance may degrade severely.

RESOLUTION:
The UDP multiport feature of LLT is updated to fragment the packets such that they can be accommodated in the 1500-byte network frame. The fragments are rearranged on the receiving node at the LLT layer. Thus, LLT can track every fragment to the destination, and in case of transmission failures, retransmit the lost fragments based on the current RTT time.

* 4046200 (Tracking ID: 4046199)

SYMPTOM:
llt over udp configuration now accepts any link tag name

DESCRIPTION:
Previously, for an LLT-over-UDP configuration, the tag field in the link definition had to be the Ethernet interface name. With this fix, any string can be used as the tag name.

RESOLUTION:
Any string can be used as the link tag name with this fix.
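
A sketch of what this allows in /etc/llttab, assuming the usual LLT-over-UDP link syntax; the tag 'lowpri1' no longer needs to match the NIC name, and the port and address are placeholders:

link lowpri1 udp - udp 50001 - 192.168.10.1 -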

* 4046420 (Tracking ID: 3989372)

SYMPTOM:
When the CPU load and memory consumption is high in a VMware environment, some nodes in an InfoScale cluster may get fenced out.

DESCRIPTION:
Occasionally, in a VMware environment, the operating system may not schedule LLT contexts on time. Consequently, heartbeats from some of the cluster nodes may be lost, and those nodes may get fenced out. This situation typically occurs when the CPU load or the memory usage is high or when the VMDK snapshot or vMotion operations are in progress.

RESOLUTION:
This fix attempts to make clusters more resilient to transient issues by heartbeating using threads bound to every vCPU.

Patch ID: VRTSllt-7.4.2.1300

* 4019535 (Tracking ID: 4018581)

SYMPTOM:
The LLT module fails to start and the system log messages indicate missing IP address.

DESCRIPTION:
When only the low priority LLT links are configured over UDP, UDPBurst mode must be disabled. UDPBurst mode must only be enabled when the high priority LLT links are configured over UDP. If the UDPBurst mode gets enabled while configuring the low priority links, the LLT module fails to start and logs the following error: "V-14-2-15795 missing ip address / V-14-2-15800 UDPburst:Failed to get link info".

RESOLUTION:
This hotfix updates the LLT module to not enable the UDPBurst mode when only the low priority LLT links are configured over UDP.

* 4021058 (Tracking ID: 4010237)

SYMPTOM:
On Red Hat Enterprise Linux operating system, device files do not have correct SELinux label.

DESCRIPTION:
On RHEL, device files should have correct SELinux labels to avoid unauthorized access.

RESOLUTION:
Code changes done to set correct SELinux labels.

Patch ID: VRTSllt-7.4.2.1100

* 4013036 (Tracking ID: 3985775)

SYMPTOM:
Sometimes, the system log may get flooded with LLT heartbeat loss messages that do not necessarily indicate any actual issues with LLT.

DESCRIPTION:
LLT heartbeat loss messages can appear in the system log either due to actual heartbeat drops in the network or due to heartbeat packets arriving out of order. In either case, these messages are only informative and do not indicate any issue in the LLT functionality. Sometimes, the system log may get flooded with these messages, which are not useful.

RESOLUTION:
The LLT module is updated to lower the frequency of printing LLT heartbeat loss messages. This is achieved by increasing the number of missed sequential HB packets required to print this informative message.

* 4013955 (Tracking ID: 4013953)

SYMPTOM:
Veritas InfoScale Availability does not support Red Hat Enterprise Linux 7 Update 9 (RHEL7.9).

DESCRIPTION:
Veritas InfoScale Availability does not support Red Hat Enterprise Linux versions later than RHEL7 Update 8.

RESOLUTION:
Veritas InfoScale Availability support for Red Hat Enterprise Linux 7 Update 9 (RHEL7.9) is now introduced.

Patch ID: VRTSgms-7.4.2.1200

* 4023471 (Tracking ID: 4023472)

SYMPTOM:
The VRTSgms module is not able to load on Linux.

DESCRIPTION:
Recompilation of VRTSgms is needed due to recent changes in VRTSgms, because of which some symbols are not being resolved.

RESOLUTION:
Recompiled VRTSgms so that the vxgms module loads.

Patch ID: VRTSglm-7.4.2.1500

* 4014715 (Tracking ID: 4011596)

SYMPTOM:
glmdump throws an error saying "No such file or directory present".

DESCRIPTION:
A bug was observed during parallel communication between all the nodes; some required temp files were not present on the other nodes.

RESOLUTION:
Fixed to maintain consistency during parallel node communication; hacp is now used for transferring the temp files.

Patch ID: VRTSglm-7.4.2.1300

* 4019695 (Tracking ID: 4011596)

SYMPTOM:
The man page is missing details about the feature we support.

DESCRIPTION:
The new "-h" option of glmdump, which uses the hacli utility for communicating across the nodes in the cluster, needs to be included in the man page.

RESOLUTION:
Added the details about the feature supported by glmdump to the man page.

Patch ID: VRTSglm-7.4.1.1700

* 3999030 (Tracking ID: 3999029)

SYMPTOM:
GLM module failed to unload because of VCS service hold.

DESCRIPTION:
The GLM module failed to unload during systemd shutdown because the glm service was racing with the vcs service. VCS takes a hold on GLM, which resulted in the module failing to unload.

RESOLUTION:
The code is modified to add a vcs service dependency in glm.service during systemd shutdown.

Patch ID: VRTScavf-7.4.2.1300

* 4007226 (Tracking ID: 4007196)

SYMPTOM:
The fsck flag gets set on the file system created on an FSS shared volume during cluster shutdown.

DESCRIPTION:
The fsck flag is set on the VxFS file system during cluster shutdown when the FSS shared volume deports/disables on one node before the CFSMount resource on another node goes down.

RESOLUTION:
Added code in the offline routine of the CVMVolDg agent to hold off the volume disable/deport if the file system is still mounted during cluster shutdown. This prevents the volume from becoming unavailable during cluster shutdown, so the fsck flag is not set on the file system.

Patch ID: VRTSpython-3.7.4.35

* 4049693 (Tracking ID: 4049692)

SYMPTOM:
To support the Licensing module, VRTSpython must include additional modules.

DESCRIPTION:
The Licensing module utilizes the VRTSpython package, which needs additional modules added to it.

RESOLUTION:
The VRTSpython package has been updated to include additional Python modules.

Patch ID: VRTSvlic-4.01.742.300

* 4049416 (Tracking ID: 4049416)

SYMPTOM:
Frequent Security vulnerabilities reported in JRE

DESCRIPTION:
Many vulnerabilities are reported in the JRE every quarter. To overcome this, the Telemetry Collector is migrated from Java to Python. All other behavior of the Telemetry Collector remains the same.

RESOLUTION:
Migrated Telemetry Collector from Java to Python.

Patch ID: VRTSsfmh-vom-HF0742501

* 4049522 (Tracking ID: 4049521)

SYMPTOM:
N/A

DESCRIPTION:
VIOM Agent for InfoScale 7.4.2 Update3

RESOLUTION:
N/A

Patch ID: VRTSvxvm-7.4.2.2200

* 4037283 (Tracking ID: 4021301)

SYMPTOM:
A data corruption issue occurred with large I/Os processed by the Linux kernel I/O split on RHEL8.

DESCRIPTION:
On RHEL8, or as of Linux kernel 3.13, changes were introduced in the Linux kernel block layer: a new member of the bio iterator structure is used to represent the start offset of a bio or its bio vectors after the I/O has been processed by the kernel I/O split functions. Also, recent versions of VxFS can generate bios larger than the size limitation defined within the Linux kernel block layer and VxVM, so I/O from VxFS can be split by the Linux kernel. For such split I/Os, VxVM did not take the new bio iterator member into account while processing them, which caused the data to be written to the wrong position on the volume/disk. Hence, data corruption.

RESOLUTION:
Code changes have been made to bypass the Linux kernel I/O split functions, which are redundant for VxVM I/O processing.

* 4018173 (Tracking ID: 3852146)

SYMPTOM:
In a CVM cluster, when importing a shared diskgroup specifying both -c and -o
noreonline options, the following error may be returned: 
VxVM vxdg ERROR V-5-1-10978 Disk group <dgname>: import failed: Disk for disk
group not found.

DESCRIPTION:
The -c option will update the disk ID and disk group ID on the private region
of the disks in the disk group being imported. Such updated information is not
yet seen by the slave because the disks have not been re-onlined (given that
noreonline option is specified). As a result, the slave cannot identify the
disk(s) based on the updated information sent from the master, causing the
import to fail with the error Disk for disk group not found.

RESOLUTION:
The code is modified to handle the working of the "-c" and "-o noreonline"
options together.

* 4018178 (Tracking ID: 3906534)

SYMPTOM:
After enabling DMP (Dynamic Multipathing) Native support, enable /boot to be
mounted on DMP device.

DESCRIPTION:
Currently, /boot is mounted on top of the OS (Operating System) device. When DMP Native support is enabled, only VGs (Volume Groups) are migrated from the OS device to the DMP device; this is the reason /boot is not migrated to the DMP device. As a result, if the OS device path is not available, the system becomes unbootable since /boot is not available. Thus it becomes necessary to mount /boot on the DMP device to provide multipathing and resiliency.

RESOLUTION:
Code changes have been done to migrate /boot on top of the DMP device when DMP Native support is enabled.
Note - The code changes are currently implemented for RHEL-6 only. For other Linux platforms, /boot will still not be mounted on the DMP device.

* 4039517 (Tracking ID: 4012763)

SYMPTOM:
IO hang may happen in a VVR (Veritas Volume Replicator) configuration when the SRL overflows for one rlink while another rlink is in AUTOSYNC mode.

DESCRIPTION:
In VVR, if the SRL overflow happens for one rlink (R1) while AUTOSYNC is in progress for another rlink (R2), then AUTOSYNC is aborted for R2, R2 gets detached, and DCM mode is activated on the R1 rlink.

However, due to a race condition in the code handling the AUTOSYNC abort and the DCM activation in parallel, the DCM could not be activated properly, and the I/O that caused the DCM activation got queued incorrectly; this results in an I/O hang.

RESOLUTION:
The code has been modified to fix the race issue in handling the AUTOSYNC abort and the DCM activation at the same time.

* 4046906 (Tracking ID: 3956607)

SYMPTOM:
When removing a VxVM disk using the 'vxdg rmdisk' operation, the following error occurs requesting a disk reclaim:
VxVM vxdg ERROR V-5-1-0 Disk <device_name> is used by one or more subdisks which are pending to be reclaimed.
Use "vxdisk reclaim <device_name>" to reclaim space used by these subdisks, and retry "vxdg rmdisk" command.
Note: reclamation is irreversible. But when issuing 'vxdisk reclaim' as advised, the command dumps core.

DESCRIPTION:
In the disk-reclaim code path, memory allocation can fail at realloc(), but the failure is not detected, causing an invalid address to be referenced, and a core dump results.

RESOLUTION:
The disk-reclaim code path now handles failure of realloc() properly.

* 4046907 (Tracking ID: 4041001)

SYMPTOM:
When some nodes in the system are rebooted, the nodes cannot join back because disk attach transactions are not happening.

DESCRIPTION:
In VxVM, when some nodes are rebooted, some plexes of a volume get detached. It may happen that all plexes of the volume are disabled. In this case, if all plexes of some DCO volume become inaccessible, that DCO volume's state should be marked as BADLOG.

If the state is not marked BADLOG, transactions fail with the following error:
VxVM ERROR V-5-1-10128  DCO experienced IO errors during the operation. Re-run the operation after ensuring that DCO is accessible

As the transactions fail, the system goes into a hang state and the nodes cannot join.

RESOLUTION:
The code is fixed to mark the DCO state as BADLOG when all the plexes of the DCO become inaccessible during I/O load.

* 4046908 (Tracking ID: 4038865)

SYMPTOM:
System panic at the vxdmp module with the following call trace in the IRQ stack.
native_queued_spin_lock_slowpath
queued_spin_lock_slowpath
_raw_spin_lock_irqsave
dmp_get_shared_lock
gendmpiodone
dmpiodone
bio_endio
blk_update_request
scsi_end_request
scsi_io_completion
scsi_finish_command
scsi_softirq_done
blk_done_softirq
__do_softirq
call_softirq
do_softirq
irq_exit
do_IRQ
 <IRQ stack>

DESCRIPTION:
A deadlock can happen between inode_hash_lock and the DMP shared lock when one process, holding inode_hash_lock, acquires the DMP shared lock in IRQ context, while in the meantime another process holding the DMP shared lock may acquire inode_hash_lock.

RESOLUTION:
Code changes have been done to avoid the deadlock issue.

* 4047592 (Tracking ID: 3992040)

SYMPTOM:
CFS-Stress-l2 hits assert f:vx_dio_bio_done:2

DESCRIPTION:
In the RHEL8.0/SLES15 kernel code, the value in bi_status isn't a standard error code; there is a completely separate set of values that are all small positive integers (for example, BLK_STS_OK and BLK_STS_IOERR), while the actual errors sent by VM are different. Hence, VM should send a proper bi_status to FS with the newer kernel.

RESOLUTION:
Code changes have been made to add a map for bi_status and bi_error conversion (as it is done in the Linux kernel code, blk-core.c).

* 4047595 (Tracking ID: 4009353)

SYMPTOM:
After the command 'vxdmpadm settune dmp_native_support=on', the machine goes into maintenance mode. The issue is reproduced on a physical setup with a root LVM disk.

DESCRIPTION:
If there is a '-' in the native VG name, the script picks up an inaccurate VG name.

RESOLUTION:
Code changes have been made to fix the issue.

* 4047695 (Tracking ID: 3911930)

SYMPTOM:
Valid PGR operations sometimes fail on a dmpnode.

DESCRIPTION:
As part of the PGR operations, if the inquiry command finds that PGR is not supported on the dmpnode, the flag PGR_FLAG_NOTSUPPORTED is set on the dmpnode. Further PGR operations check this flag and issue PGR commands only if this flag is NOT set. This flag remains set even if the hardware is changed so as to support PGR.

RESOLUTION:
A new command (namely enablepr) is provided in the vxdmppr utility to clear this flag on the specified dmpnode.
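
An illustrative invocation, assuming enablepr takes the dmpnode name as its argument; the device name is a placeholder:

# vxdmppr enablepr <dmpnode_name>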

* 4047722 (Tracking ID: 4023390)

SYMPTOM:
vxconfigd crashes because a disk contains an invalid privoffset (160), which is smaller than the minimum required offset (VTOC 265, GPT 208).

DESCRIPTION:
There may be disk label corruption, or stale information may reside in the disk header, which caused an unexpected label to be written.

RESOLUTION:
Added an assert when updating the CDS label to ensure that a valid privoffset is written to the disk header.

Patch ID: VRTSvxvm-7.4.2.1500

* 4018182 (Tracking ID: 4008664)

SYMPTOM:
System panic occurs with the following stack:

void genunix:psignal+4()
void vxio:vol_logger_signal_gen+0x40()
int vxio:vollog_logentry+0x84()
void vxio:vollog_logger+0xcc()
int vxio:voldco_update_rbufq_chunk+0x200()
int vxio:voldco_chunk_updatesio_start+0x364()
void vxio:voliod_iohandle+0x30()
void vxio:voliod_loop+0x26c((void *)0)
unix:thread_start+4()

DESCRIPTION:
vxio keeps the vxloggerd proc_t that is used to send a signal to vxloggerd. In case vxloggerd has ended for some reason, the signal may be sent to an unexpected process, which may cause a panic.

RESOLUTION:
Code changes have been made to correct the problem.

* 4020207 (Tracking ID: 4018086)

SYMPTOM:
vxiod with ID as 128 was stuck with below stack:

 #2 [] vx_svar_sleep_unlock at [vxfs]
 #3 [] vx_event_wait at [vxfs]
 #4 [] vx_async_waitmsg at [vxfs]
 #5 [] vx_msg_send at [vxfs]
 #6 [] vx_send_getemapmsg at [vxfs]
 #7 [] vx_cfs_getemap at [vxfs]
 #8 [] vx_get_freeexts_ioctl at [vxfs]
 #9 [] vxportalunlockedkioctl at [vxportal]
 #10 [] vxportalkioctl at [vxportal]
 #11 [] vxfs_free_region at [vxio]
 #12 [] vol_ru_start_replica at [vxio]
 #13 [] vol_ru_start at [vxio]
 #14 [] voliod_iohandle at [vxio]
 #15 [] voliod_loop at [vxio]

DESCRIPTION:
With the SmartMove feature ON, it can happen that the vxiod with ID 128 starts replication while the RVG is in DCM mode; this vxiod then waits for the file system's response on whether a given region is used by the file system or not. The file system triggers MDSHIP I/O on the logowner. Due to a bug in the code, the MDSHIP I/O always gets queued to the vxiod with ID 128. Hence, a deadlock situation.

RESOLUTION:
Code changes have been made to avoid handling MDSHIP I/O in vxiods whose IDs are bigger than 127.

* 4020438 (Tracking ID: 4020046)

SYMPTOM:
The following IO errors are reported on VxVM sub-disks result in DRL log detached without any SCSI errors detected.

VxVM vxio V-5-0-1276 error on Subdisk [xxxx] while writing volume [yyyy][log] offset 0 length [zzzz]
VxVM vxio V-5-0-145 DRL volume yyyy[log] is detached

DESCRIPTION:
The DRL plexes were detached because an atomic-write flag (BIT_ATOMIC) was unexpectedly set on the BIO. The BIT_ATOMIC flag gets set on a bio only if the VOLSIO_BASEFLAG_ATOMIC_WRITE flag is set on the SUBDISK SIO and on its parent MVWRITE SIO's sio_base_flags. When generating the MVWRITE SIO, its sio_base_flags were copied from a gio structure; because the gio structure memory isn't initialized, it may contain garbage values. Hence the issue.

RESOLUTION:
Code changes have been made to fix the issue.

* 4021238 (Tracking ID: 4008075)

SYMPTOM:
Observed with ASL changes for NVMe; this issue is observed in a reboot scenario, where the machine panics on every reboot, in a loop.

DESCRIPTION:
The panic was hit for such split bios. The root cause is that RHEL8 introduced a new field named __bi_remaining, which maintains the count of chained bios; for every endio, __bi_remaining gets atomically decreased in the bio_endio() function. While decreasing __bi_remaining, the OS checks that __bi_remaining 'should not be <= 0'; in our case, __bi_remaining was always 0, so we were hitting the OS BUG_ON.

RESOLUTION:
>>> For scsi devices maxsize is 4194304,
[   26.919333] DMP_BIO_SIZE(orig_bio) : 16384, maxsize: 4194304
[   26.920063] DMP_BIO_SIZE(orig_bio) : 262144, maxsize: 4194304

>>>and for NVMe devices maxsize is 131072
[  153.297387] DMP_BIO_SIZE(orig_bio) : 262144, maxsize: 131072
[  153.298057] DMP_BIO_SIZE(orig_bio) : 262144, maxsize: 131072

* 4021240 (Tracking ID: 4010612)

SYMPTOM:
$ vxddladm set namingscheme=ebn lowercase=no
This issue is observed for NVMe and SSD devices, where every disk has a separate enclosure (nvme0, nvme1, and so on), so every NVMe/SSD disk name would be hostprefix_enclosurname0_disk0, hostprefix_enclosurname1_disk0, and so on.

DESCRIPTION:
$ vxddladm set namingscheme=ebn lowercase=no
This issue is observed for NVMe and SSD devices, where every disk has a separate enclosure (nvme0, nvme1, and so on), so every NVMe/SSD disk name would be hostprefix_enclosurname0_disk0, hostprefix_enclosurname1_disk0, and so on.
e.g.:
smicro125_nvme0_0 <--- disk1
smicro125_nvme1_0 <--- disk2

With lowercase=no, the current code suppresses the suffix digit of the enclosure name, so multiple disks get the same name; udid_mismatch is then shown because the UDID in the private region does not match what DDL reports, and the DDL database shows wrong information because multiple disks get the same name.

smicro125_nvme_0 <--- disk1   <<<<<<<-----here the suffix digit of the nvme enclosure is suppressed
smicro125_nvme_0 <--- disk2

RESOLUTION:
The suffix integer is now appended while making the da_name.

* 4021346 (Tracking ID: 4010207)

SYMPTOM:
System panic occurred with the below stack:

native_queued_spin_lock_slowpath()
queued_spin_lock_slowpath()
_raw_spin_lock_irqsave()
volget_rwspinlock()
volkiodone()
volfpdiskiodone()
voldiskiodone_intr()
voldmp_iodone()
bio_endio()
gendmpiodone()
dmpiodone()
bio_endio()
blk_update_request()
scsi_end_request()
scsi_io_completion()
scsi_finish_command()
scsi_softirq_done()
blk_done_softirq()
__do_softirq()
call_softirq()

DESCRIPTION:
As part of the I/O statistics collection, the vxstat thread acquires a spinlock and tries to copy data to the user space. During the data copy, if a page fault happens, the thread relinquishes the CPU and provides it to some other thread. If the thread that gets scheduled on the CPU requests the same spinlock that the vxstat thread had acquired, this results in a hard lockup situation.

RESOLUTION:
Code has been changed to properly release the spinlock before copying out the data to the user space during vxstat collection.

* 4021359 (Tracking ID: 4010040)

SYMPTOM:
A security issue occurs during Volume Manager configuration.

DESCRIPTION:
This issue occurs during the configuration of the VRTSvxvm package.

RESOLUTION:
VVR daemon is updated so that this security issue no longer occurs.

* 4021366 (Tracking ID: 4008741)

SYMPTOM:
VxVM device files appear to have the device_t SELinux label.

DESCRIPTION:
If an unauthorized or modified device is allowed to exist on the system, there is the possibility the system may perform unintended or unauthorized operations.
eg: ls -LZ
...
...
/dev/vx/dsk/testdg/vol1   system_u:object_r:device_t:s0
/dev/vx/dmpconfig         system_u:object_r:device_t:s0
/dev/vx/vxcloud           system_u:object_r:device_t:s0

RESOLUTION:
Code changes made to change the device labels to misc_device_t, fixed_disk_device_t.

* 4021428 (Tracking ID: 4020166)

SYMPTOM:
Build failure because of changes to "struct request":

error: struct request has no member named next_rq
Linux has deprecated the member next_rq.

DESCRIPTION:
The issue was observed due to changes in the OS structure definition.

RESOLUTION:
Code changes are done in the required files.

* 4021748 (Tracking ID: 4020260)

SYMPTOM:
While enabling the DMP native support tunable (dmp_native_support) on CentOS 8, the below error was observed:

[root@dl360g9-4-vm2 ~]# vxdmpadm settune dmp_native_support=on
VxVM vxdmpadm ERROR V-5-1-15690 Operation failed for one or more volume groups

VxVM vxdmpadm ERROR V-5-1-15686 The following vgs could not be migrated as error in bootloader configuration file 

 cl
[root@dl360g9-4-vm2 ~]#

DESCRIPTION:
The issue was observed due to missing code check-ins for CentOS 8 in the required files.

RESOLUTION:
Changes are done in the required files for DMP native support on CentOS 8.

* 4023095 (Tracking ID: 4007920)

SYMPTOM:
Even when the vol_snap_fail_source tunable is set, the largest and oldest snapshot is automatically deleted when the cache object becomes full.

DESCRIPTION:
If the vol_snap_fail_source tunable is set, the oldest snapshot should not be deleted when the cache object is full; Flex requires these snapshots for rollback.

RESOLUTION:
Added a fix to stop automatic snapshot deletion in vxcached.

Patch ID: VRTSvxvm-7.4.2.1400

* 4018180 (Tracking ID: 3958062)

SYMPTOM:
After migrating boot lun, disabling dmp_native_support fails with following error.

VxVM vxdmpadm ERROR V-5-1-15883 check_bosboot open failed /dev/r errno 2
VxVM vxdmpadm ERROR V-5-1-15253 bosboot would not succeed, please run  
manually to find the cause of failure
VxVM vxdmpadm ERROR V-5-1-15251 bosboot check failed
VxVM vxdmpadm INFO V-5-1-18418 restoring protofile
+ final_ret=18
+ f_exit 18
VxVM vxdmpadm ERROR V-5-1-15690 Operation failed for one or more volume 
groups

VxVM vxdmpadm ERROR V-5-1-15686 The following VG(s) could not be migrated as 
could not disable DMP support for LVM bootability -
        rootvg

DESCRIPTION:
After performing the boot lun migration, while enabling/disabling the DMP native support,
VxVM was performing the 'bosboot' verification with the old boot disk name instead of the migrated disk.
The reason was that an AIX OS command was returning the old boot disk name.

RESOLUTION:
The code is changed to use the correct OS command to get the boot disk name after migration.

* 4018182 (Tracking ID: 4008664)

SYMPTOM:
System panic occurs with the following stack:

void genunix:psignal+4()
void vxio:vol_logger_signal_gen+0x40()
int vxio:vollog_logentry+0x84()
void vxio:vollog_logger+0xcc()
int vxio:voldco_update_rbufq_chunk+0x200()
int vxio:voldco_chunk_updatesio_start+0x364()
void vxio:voliod_iohandle+0x30()
void vxio:voliod_loop+0x26c((void *)0)
unix:thread_start+4()

DESCRIPTION:
The vxio driver keeps the proc_t of vxloggerd, which is used to send a signal to vxloggerd. If vxloggerd has exited for some reason, the signal may be sent to an unexpected process, which may cause a panic.

RESOLUTION:
Code changes have been made to correct the problem.
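
A hedged userspace analogue of the hazard (the actual fix is in the vxio kernel code; the helper below is hypothetical and Linux-specific): a cached process identifier can go stale once the daemon exits, so it is re-validated before a signal is sent.

    /* cc signal_guard.c */
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Re-check via /proc that the cached PID still belongs to vxloggerd. */
    static int pid_is_vxloggerd(pid_t pid)
    {
        char path[64], comm[64] = "";
        FILE *f;

        snprintf(path, sizeof(path), "/proc/%d/comm", (int)pid);
        if ((f = fopen(path, "r")) == NULL)
            return 0;                        /* process is gone */
        if (fgets(comm, sizeof(comm), f) == NULL)
            comm[0] = '\0';
        fclose(f);
        return strncmp(comm, "vxloggerd", 9) == 0;
    }

    int signal_logger(pid_t cached_pid)
    {
        if (!pid_is_vxloggerd(cached_pid))
            return -1;                       /* stale handle: do not signal */
        return kill(cached_pid, SIGUSR1);
    }

    int main(void)
    {
        /* PID 1 is not vxloggerd, so the signal is suppressed. */
        if (signal_logger(1) != 0)
            fprintf(stderr, "stale or foreign pid; signal suppressed\n");
        return 0;
    }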

* 4020207 (Tracking ID: 4018086)

SYMPTOM:
vxiod with ID as 128 was stuck with below stack:

 #2 [] vx_svar_sleep_unlock at [vxfs]
 #3 [] vx_event_wait at [vxfs]
 #4 [] vx_async_waitmsg at [vxfs]
 #5 [] vx_msg_send at [vxfs]
 #6 [] vx_send_getemapmsg at [vxfs]
 #7 [] vx_cfs_getemap at [vxfs]
 #8 [] vx_get_freeexts_ioctl at [vxfs]
 #9 [] vxportalunlockedkioctl at [vxportal]
 #10 [] vxportalkioctl at [vxportal]
 #11 [] vxfs_free_region at [vxio]
 #12 [] vol_ru_start_replica at [vxio]
 #13 [] vol_ru_start at [vxio]
 #14 [] voliod_iohandle at [vxio]
 #15 [] voliod_loop at [vxio]

DESCRIPTION:
With the SmartMove feature ON, the vxiod with ID 128 may start replication while the RVG is in DCM mode. This vxiod then waits for the filesystem's response on whether a given region is used by the filesystem or not. The filesystem triggers MDSHIP IO on the logowner. Due to a bug in the code, the MDSHIP IO always gets queued to the vxiod with ID 128, hence a deadlock situation.

RESOLUTION:
Code changes have been made to avoid handling MDSHIP IO in vxiod whose ID is bigger than 127.

* 4021346 (Tracking ID: 4010207)

SYMPTOM:
System panic occurred with the below stack:

native_queued_spin_lock_slowpath()
queued_spin_lock_slowpath()
_raw_spin_lock_irqsave()
volget_rwspinlock()
volkiodone()
volfpdiskiodone()
voldiskiodone_intr()
voldmp_iodone()
bio_endio()
gendmpiodone()
dmpiodone()
bio_endio()
blk_update_request()
scsi_end_request()
scsi_io_completion()
scsi_finish_command()
scsi_softirq_done()
blk_done_softirq()
__do_softirq()
call_softirq()

DESCRIPTION:
As part of IO statistics collection, the vxstat thread acquires a spinlock and tries to copy data to the user space. During the data copy, if a page fault happens, the thread relinquishes the CPU to some other thread. If the thread that gets scheduled on the CPU requests the same spinlock that the vxstat thread had acquired, this results in a hard lockup situation.

RESOLUTION:
Code has been changed to properly release the spinlock before copying out the data to the user space during vxstat collection.

* 4021428 (Tracking ID: 4020166)

SYMPTOM:
Build failure because of changes to "struct request":

error: struct request has no member named next_rq
Linux has deprecated the member next_rq.

DESCRIPTION:
The issue was observed due to changes in the OS structure definition.

RESOLUTION:
Code changes are done in the required files.

* 4021748 (Tracking ID: 4020260)

SYMPTOM:
While enabling the DMP native support tunable (dmp_native_support) on CentOS 8, the below error was observed:

[root@dl360g9-4-vm2 ~]# vxdmpadm settune dmp_native_support=on
VxVM vxdmpadm ERROR V-5-1-15690 Operation failed for one or more volume groups

VxVM vxdmpadm ERROR V-5-1-15686 The following vgs could not be migrated as error in bootloader configuration file 

 cl
[root@dl360g9-4-vm2 ~]#

DESCRIPTION:
The issue was observed due to missing code check-ins for CentOS 8 in the required files.

RESOLUTION:
Changes are done in the required files for DMP native support on CentOS 8.

Patch ID: VRTSvxvm-7.4.2.1300

* 4008606 (Tracking ID: 4004455)

SYMPTOM:
Snapshot restore failed on an instant snapshot created on a DG with an older version.

DESCRIPTION:
Create a DG with an older version and create an instant snapshot.
Perform some IOs on the source volume, and then try to restore the snapshot.
The snapshot restore fails in this scenario.

RESOLUTION:
The root cause of this issue is that flag values were conflicting.
The issue is fixed and the code has been checked in.

* 4010892 (Tracking ID: 4009107)

SYMPTOM:
CA chain certificate verification fails in VVR when the number of intermediate certificates is greater than the verification depth, so an error occurs during SSL initialization.

DESCRIPTION:
CA chain certificate verification fails in VVR when the number of intermediate certificates is greater than the verification depth. The SSL_CTX_set_verify_depth() API decides the depth of certificates (in the /etc/vx/vvr/cacert file) to be verified, and this depth was limited to 1 in the code. Thus the intermediate CA certificate present first in /etc/vx/vvr/cacert (the depth-1 CA/issuer certificate for the server certificate) could be obtained and verified during connection, but the root CA certificate (the depth-2, higher CA certificate) could not be verified while connecting, hence the error.

RESOLUTION:
Removed the call of SSL_CTX_set_verify_depth() API so as to handle the depth automatically.
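
A sketch of the fix's idea using the public OpenSSL API (assuming OpenSSL 1.1 or later; the surrounding context setup is illustrative): load the full CA bundle and do not cap the verification depth, so both the intermediate and the root CA can be verified.

    /* cc verify_ctx.c -lssl -lcrypto */
    #include <openssl/ssl.h>

    SSL_CTX *make_verifying_ctx(void)
    {
        SSL_CTX *ctx = SSL_CTX_new(TLS_client_method());
        if (ctx == NULL)
            return NULL;

        /* CA bundle holding the intermediate and root certificates. */
        if (SSL_CTX_load_verify_locations(ctx, "/etc/vx/vvr/cacert", NULL) != 1) {
            SSL_CTX_free(ctx);
            return NULL;
        }

        /* SSL_CTX_set_verify_depth(ctx, 1);  <- the removed call that
         * capped verification at the first intermediate certificate. */
        SSL_CTX_set_verify(ctx, SSL_VERIFY_PEER, NULL);
        return ctx;
    }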

* 4011866 (Tracking ID: 3976678)

SYMPTOM:
vxvm-recover:  cat: write error: Broken pipe error encountered in syslog multiple times.

DESCRIPTION:
Due to a bug in the vxconfigbackup script, which is started by vxvm-recover, "cat: write error: Broken pipe" is encountered in syslog
and is reported under vxvm-recover. In the vxconfigbackup code, multiple subshells are created in a function call, and the first subshell is for the cat command. When a particular if condition is satisfied, return is called, exiting the later subshells even when there is data to be read in the created cat subshell, which results in the broken pipe error.

RESOLUTION:
Changes are done in VxVM code to handle the broken pipe error.

* 4011971 (Tracking ID: 3991668)

SYMPTOM:
Configured with sec logging, VVR reports data inconsistency when hit "No IBC message arrived" error.

DESCRIPTION:
It might happen that the secondary node has served updates with larger sequence IDs when the In-Band Control (IBC) update arrives. In this case, VVR drops the IBC update. Any updates whose sequence IDs are larger cannot start data volume writes, so they get queued. Data loss happens when the secondary receives the atomic commit and clears the queue. Hence vradmin verifydata reports data inconsistency.

RESOLUTION:
Code changes have been made to trigger updates in order to start data volume writes.

* 4012730 (Tracking ID: 4012728)

SYMPTOM:
VxVM support for RHEL 7.9

DESCRIPTION:
RHEL 7.9 is a new release, and hence the VxVM module is tested with the RHEL 7.9 kernel.

RESOLUTION:
The VxVM module is compatible with RHEL 7.9.

* 4012848 (Tracking ID: 4011394)

SYMPTOM:
While verifying the performance of CFS cloud tiering versus scale-out file system tiering in Access, it was found that the CFS cloud tiering performance was degraded.

DESCRIPTION:
On verifying the performance of CFS cloud tiering versus scale-out file system tiering in Access, it was found that the CFS cloud tiering performance was degraded because the design was single threaded, which caused a bottleneck and performance issues.

RESOLUTION:
The code changes introduce multiple IO queues in the kernel and a multithreaded request loop to fetch IOs from the kernel queues into a userland global queue, and they allow the curl threads to work in parallel.
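
A minimal pthread sketch of the fan-in design described above (all names, counts, and queue sizes are illustrative): several fetcher threads drain per-queue sources into one shared userland queue, from which parallel workers, the curl threads in the actual fix, would consume.

    /* cc fanin.c -lpthread */
    #include <pthread.h>
    #include <stdio.h>

    #define NFETCHERS 4
    #define QCAP 64

    static pthread_mutex_t glock = PTHREAD_MUTEX_INITIALIZER;
    static int global_queue[QCAP];
    static int gtail;

    /* Each fetcher drains one kernel-side queue; here it just enqueues
     * a few dummy request IDs into the shared userland queue. */
    static void *fetcher(void *arg)
    {
        int qid = (int)(long)arg;

        for (int i = 0; i < 4; i++) {
            pthread_mutex_lock(&glock);
            if (gtail < QCAP)
                global_queue[gtail++] = qid * 100 + i;
            pthread_mutex_unlock(&glock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t[NFETCHERS];

        for (long i = 0; i < NFETCHERS; i++)
            pthread_create(&t[i], NULL, fetcher, (void *)i);
        for (int i = 0; i < NFETCHERS; i++)
            pthread_join(t[i], NULL);

        /* Parallel upload workers would consume from here. */
        for (int i = 0; i < gtail; i++)
            printf("io request %d\n", global_queue[i]);
        return 0;
    }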

* 4013155 (Tracking ID: 4010458)

SYMPTOM:
In VVR (Veritas Volume replicator), the rlink might inconsistently disconnect due to unexpected transactions with below messages:
VxVM VVR vxio V-5-0-114 Disconnecting rlink <rlink_name> to permit transaction to proceed

DESCRIPTION:
In VVR (Veritas Volume replicator), a transaction is triggered when a change in the VxVM/VVR objects needs 
to be persisted on disk. 

In some scenarios, a few unnecessary transactions were getting triggered in a loop. This was causing multiple rlink
disconnects with below message logged frequently:
VxVM VVR vxio V-5-0-114 Disconnecting rlink <rlink_name> to permit transaction to proceed

One such unexpected transaction was happening due to open/close on volume as part of SmartIO caching.
Additionally, vradmind daemon was also issuing some open/close on volumes as part of IO statistics collection,
which was causing unnecessary transactions. 

Additionally some unexpected transactions were happening due to incorrect checks in code related
to some temporary flags on volume.

RESOLUTION:
The code is fixed to disable the SmartIO caching on the volumes if the SmartIO caching is not configured on the system.
Additionally, the code is fixed to avoid the unexpected transactions caused by incorrect checks on the temporary flags
on the volume.

* 4013169 (Tracking ID: 4011691)

SYMPTOM:
Observed high CPU consumption on the VVR secondary nodes because of high pending IO load.

DESCRIPTION:
High replication-related IO load on the VVR secondary and the requirement of maintaining write-order fidelity with limited memory pools created contention. This resulted in multiple VxVM kernel threads contending for shared resources, thereby increasing the CPU consumption.

RESOLUTION:
Limited the way in which VVR consumes its resources so that a high pending IO load does not result in high CPU consumption.

* 4013718 (Tracking ID: 4008942)

SYMPTOM:
The file system gets disabled when the cache object becomes full, and hence unmount fails.

DESCRIPTION:
When the cache object gets full, IO errors occur on the volume.
Because IOs are not served while the cache object is full, the IOs become inconsistent.
Because of this IO inconsistency, VxFS gets disabled and unmount fails.

RESOLUTION:
Fixed the issue and the code has been checked in.

Patch ID: VRTSaslapm-7.4.2.2200

* 4047510 (Tracking ID: 4042420)

SYMPTOM:
APM modules fail to load because the hard link does not get created.

DESCRIPTION:
A symlink needs to be created so that the APM can be created in a different partition.

RESOLUTION:
The code is changed to create symlinks instead of hard links, in order to facilitate link creation when the source and destination are on different partitions, and to honor the general script flow.
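
A minimal C sketch of why the symlink is needed (the paths are illustrative): link(2) fails with EXDEV when the source and destination are on different filesystems, while symlink(2) works across them.

    /* cc install_link.c */
    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    static int install_link(const char *src, const char *dst)
    {
        if (link(src, dst) == 0)
            return 0;
        if (errno == EXDEV)              /* cross-partition: hard link impossible */
            return symlink(src, dst);    /* symlink spans filesystems */
        return -1;
    }

    int main(void)
    {
        if (install_link("/opt/pkg/module.apm", "/etc/vx/apm/module.apm") != 0)
            perror("install_link");
        return 0;
    }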

Patch ID: VRTSodm-7.4.2.2200

* 4049440 (Tracking ID: 4049438)

SYMPTOM:
VRTSodm driver will not load with 7.4.2.2200 VRTSvxfs patch.

DESCRIPTION:
Need recompilation of VRTSodm due to recent changes in VRTSvxfs.

RESOLUTION:
Recompiled the VRTSodm with new changes in VRTSvxfs.

Patch ID: VRTSodm-7.4.2.1500

* 4023468 (Tracking ID: 4023469)

SYMPTOM:
The VRTSodm module is not able to load on Linux.

DESCRIPTION:
Recompilation of VRTSodm is needed due to recent changes in VRTSodm,
because of which some symbols are not being resolved.

RESOLUTION:
Recompiled the VRTSodm to load vxodm module.

Patch ID: VRTSodm-7.4.2.1300

* 4013985 (Tracking ID: 4013984)

SYMPTOM:
VRTSodm driver will not load with 7.4.2.1300 VRTSvxfs patch.

DESCRIPTION:
Recompilation of VRTSodm is needed due to recent changes in the VRTSvxfs
header files, because of which some symbols are not being resolved.

RESOLUTION:
Recompiled the VRTSodm with new changes in VRTSvxfs header files.

Patch ID: VRTSvxfs-7.4.2.2200

* 4013420 (Tracking ID: 4013139)

SYMPTOM:
The abort operation on an ongoing online migration from the native file system to VxFS fails on RHEL 8.x systems.

DESCRIPTION:
The following error messages are logged when the abort operation fails:
umount: /mnt1/lost+found/srcfs: not mounted
UX:vxfs fsmigadm: ERROR: V-3-26835:  umount of source device: /dev/vx/dsk/testdg/vol1 failed, with error: 32

RESOLUTION:
The fsmigadm utility is updated to address the issue with the abort operation on an ongoing online migration.

* 4040238 (Tracking ID: 4035040)

SYMPTOM:
After a replication job is paused and resumed, some fields go missing in the stats command output and never show up in subsequent runs.

DESCRIPTION:
rs_start for the current stat is initialized to the start time of the replication, and the default value of rs_start is zero.
The stat output does not show some fields in case rs_start is zero:

        if (rs->rs_start && dis_type == VX_DIS_CURRENT) {
                if (!rs->rs_done) {
                        diff = rs->rs_update - rs->rs_start;
                }
                else {
                        diff = rs->rs_done - rs->rs_start;
                }

                /*
                 * The unit of time is in seconds, hence
                 * assigning 1 if the amount of data
                 * was too small
                 */

                diff = diff ? diff : 1;
                rate = rs->rs_file_bytes_synced /
                        (diff - rs->rs_paused_duration);
                printf("\t\tTransfer Rate: %s/sec\n", fmt_bytes(h,rate));
        }

In replication, we initialize rs_start to zero and update it with the start time, but we don't save the stats to disk. That small window leaves a case where,
if we pause the replication and start it again, we always see rs_start as zero.

Now, after initializing rs_start, we write it to disk in the same function. In the resume case, if rs_start is found to be zero, we re-initialize the rs_start
field to the current replication start time.

RESOLUTION:
Write rs_start to disk, and add a check in the resume case to initialize the rs_start value in case it is found to be 0.

* 4040608 (Tracking ID: 4008616)

SYMPTOM:
The fsck command hung.

DESCRIPTION:
fsck got stuck in a deadlock: a thread that marked a buffer as aliased was waiting on itself for the reference count to drain, while
the get-block code was called with the NOBLOCK flag.

RESOLUTION:
The code is changed to honour the NOBLOCK flag.

* 4042686 (Tracking ID: 4042684)

SYMPTOM:
Command fails to resize the file.

DESCRIPTION:
There is a window in which a parallel thread can clear the IDELXWRI flag when it should not.

RESOLUTION:
The delayed extending write flag is set again in case any parallel thread has cleared it.

* 4044184 (Tracking ID: 3993140)

SYMPTOM:
Every 60 seconds, compclock was lagging behind the actual elapsed time by approximately 1.44 seconds.

DESCRIPTION:
Every 60 seconds, compclock was lagging behind the actual elapsed time by approximately 1.44 seconds.

RESOLUTION:
Made adjustment to logic responsible for calculating and updating compclock timer.

* 4046265 (Tracking ID: 4037035)

SYMPTOM:
Added new tunable "vx_ninact_proc_threads" to control the number of inactive processing threads.

DESCRIPTION:
On high-end servers, heavy lock contention was seen during inactive removal processing, caused by the large number of inactive worker threads spawned by VxFS. To avoid the contention, the new tunable "vx_ninact_proc_threads" was added so that customers can adjust the number of inactive processing threads based on their server configuration and workload.

RESOLUTION:
Added new tunable "vx_ninact_proc_threads" to control the number of inactive processing threads.

* 4046266 (Tracking ID: 4043084)

SYMPTOM:
Panic in vx_cbdnlc_lookup().

DESCRIPTION:
Panic observed in the following stack trace:
vx_cbdnlc_lookup+000140 ()
vx_int_lookup+0002C0 ()
vx_do_lookup2+000328 ()
vx_do_lookup+0000E0 ()
vx_lookup+0000A0 ()
vnop_lookup+0001D4 (??, ??, ??, ??, ??, ??)
getFullPath+00022C (??, ??, ??, ??)
getPathComponents+0003E8 (??, ??, ??, ??, ??, ??, ??)
svcNameCheck+0002EC (??, ??, ??, ??, ??, ??, ??)
kopen+000180 (??, ??, ??)
syscall+00024C ()

RESOLUTION:
Code changes are done to handle memory pressure while changing FC connectivity.

* 4046267 (Tracking ID: 4034910)

SYMPTOM:
Garbage values are seen inside the global list large_dirinfo.

DESCRIPTION:
Garbage values inside the global list large_dirinfo lead to fsck failure.

RESOLUTION:
Access to and updates of the global list large_dirinfo are made synchronous throughout the fsck binary, so that garbage values due to the race condition are avoided.

* 4046271 (Tracking ID: 3993822)

SYMPTOM:
Running fsck on a file system dumps core.

DESCRIPTION:
A buffer was marked as busy without taking the buffer lock while getting a buffer from the freelist in one thread, while another thread
was accessing the same buffer through its local variable.

RESOLUTION:
The buffer is now marked busy within the buffer lock while getting a free buffer.
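
A minimal pthread sketch of the corrected pattern (names are illustrative, not the fsck internals): the buffer is marked busy while the freelist lock is still held, so no other thread can observe it half-claimed.

    /* cc buf_claim.c -lpthread */
    #include <pthread.h>
    #include <stddef.h>

    struct buf {
        struct buf *next;
        int busy;
    };

    static pthread_mutex_t buf_lock = PTHREAD_MUTEX_INITIALIZER;
    static struct buf *freelist;

    /* Claim a buffer from the freelist; the busy flag is set inside the
     * lock, never after it is dropped. */
    static struct buf *get_free_buf(void)
    {
        struct buf *bp;

        pthread_mutex_lock(&buf_lock);
        bp = freelist;
        if (bp != NULL) {
            freelist = bp->next;
            bp->busy = 1;
        }
        pthread_mutex_unlock(&buf_lock);
        return bp;
    }

    int main(void)
    {
        static struct buf b1, b2;

        b1.next = &b2;
        freelist = &b1;
        return get_free_buf() == &b1 ? 0 : 1;
    }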

* 4046272 (Tracking ID: 4017104)

SYMPTOM:
Deleting a huge number of inodes can consume a lot of system resources during inactivation, which causes hangs or even panics.

DESCRIPTION:
Delicache inactivation dumps all the inodes in its inventory, all at once, for inactivation. This causes a surge in resource consumption, due to which other processes can starve.

RESOLUTION:
Gradually process the inode inactivation.
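
A minimal sketch of gradual processing (batch size and names are illustrative): the inventory is drained in small batches, yielding the CPU between batches so other processes are not starved.

    /* cc batched_inactivation.c */
    #include <sched.h>
    #include <stdio.h>

    #define BATCH 8

    static void inactivate(int ino) { printf("inactivating inode %d\n", ino); }

    int main(void)
    {
        int inodes[100];

        for (int i = 0; i < 100; i++)
            inodes[i] = i + 1;

        /* Process in small batches instead of dumping all at once. */
        for (int i = 0; i < 100; i++) {
            inactivate(inodes[i]);
            if ((i + 1) % BATCH == 0)
                sched_yield();   /* let other work run between batches */
        }
        return 0;
    }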

* 4046829 (Tracking ID: 3993943)

SYMPTOM:
The fsck utility dumped core due to a segmentation fault in get_dotdotlst().

Below is the stack trace of the issue.

get_dotdotlst 
check_dotdot_tbl 
iproc_do_work
start_thread 
clone ()

DESCRIPTION:
Due to a bug in the fsck utility, a core dump was generated while running fsck on the filesystem, and the fsck operation aborted midway.

RESOLUTION:
Code changes are done to fix this issue.

* 4047568 (Tracking ID: 4046169)

SYMPTOM:
On RHEL8, while doing a directory move from one FS (ext4 or VxFS) to a migration VxFS, the migration can fail and the FS gets disabled. In debug testing, the issue was caught by an internal assert, with the following stack trace.

panic
ted_call_demon
ted_assert
vx_msgprint
vx_mig_badfile
vx_mig_linux_removexattr_int
__vfs_removexattr
__vfs_removexattr_locked
vfs_removexattr
removexattr
path_removexattr
__x64_sys_removexattr
do_syscall_64

DESCRIPTION:
Due to a different implementation of the "mv" operation in RHEL8 (as compared to RHEL7), there is a removexattr call on the target FS, which in the migration case is the migration VxFS. In this removexattr call, the kernel asks for the "system.posix_acl_default" attribute to be removed from the directory being moved. But since the directory is not present on the target side yet (and hence has no extended attributes), the code returns ENODATA. When the code in vx_mig_linux_removexattr_int() encounters this error, it disables the FS, and in the debug package it calls assert.

RESOLUTION:
The fix is to ignore the ENODATA error and not assert or disable the FS.
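
A sketch of the fix's idea at the system call level (the helper is hypothetical; the actual change is inside vx_mig_linux_removexattr_int()): ENODATA from removing an absent attribute is treated as benign rather than escalated to a failure.

    /* cc remove_acl.c */
    #include <errno.h>
    #include <sys/xattr.h>

    static int remove_default_acl(const char *path)
    {
        if (removexattr(path, "system.posix_acl_default") == 0)
            return 0;
        if (errno == ENODATA)    /* attribute absent: nothing to remove */
            return 0;
        return -1;               /* any other error is still reported */
    }

    int main(void)
    {
        return remove_default_acl("/tmp") == 0 ? 0 : 1;
    }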

* 4049091 (Tracking ID: 4035057)

SYMPTOM:
On RHEL8, IOs done on FS, while other FS to VxFS migration is in progress can cause panic, with following stack trace.
 machine_kexec
 __crash_kexec
 crash_kexec
 oops_end
 no_context
 do_page_fault
 page_fault
 [exception RIP: memcpy+18]
 _copy_to_iter
 copy_page_to_iter
 generic_file_buffered_read
 new_sync_read
 vfs_read
 kernel_read
 vx_mig_read
 vfs_read
 ksys_read
 do_syscall_64

DESCRIPTION:
- As part of the RHEL8 support changes, the vfs_read and vfs_write calls were replaced with kernel_read and kernel_write, as the vfs_ calls are no longer exported. The kernel_read and kernel_write calls internally set the memory segment of the thread to KERNEL_DS and expect the buffer passed to have been allocated in kernel space.
- In the migration code, if the read/write operation cannot be completed using the target FS (VxFS), the IO is redirected to the source FS. In doing so, the code passes the same buffer, which is a user buffer, to the kernel call. This worked well with the vfs_read and vfs_write calls, but it does not work with kernel_read and kernel_write, causing a panic.

RESOLUTION:
- The fix is to use the vfs_iter_read and vfs_iter_write calls, which work with user buffers. To use these methods, the user buffer needs to be passed as part of struct iovec.iov_base.
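
A userspace analogue of the iovec-based approach (illustrative; the kernel fix uses vfs_iter_read/vfs_iter_write, while readv(2) is shown here, and the file path is arbitrary): the caller's buffer is wrapped in a struct iovec and handed to the iovec-based read path.

    /* cc iovec_read.c */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/uio.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[4096];
        struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
        int fd = open("/etc/hostname", O_RDONLY);

        if (fd < 0)
            return 1;

        /* The buffer travels as iov_base, as in the vfs_iter_* fix. */
        ssize_t n = readv(fd, &iov, 1);
        if (n > 0)
            fwrite(buf, 1, (size_t)n, stdout);
        close(fd);
        return 0;
    }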

* 4049097 (Tracking ID: 4049096)

SYMPTOM:
The tar command errors out with exit status 1, throwing warnings.

DESCRIPTION:
This happens because dalloc changes the ctime of the file after allocating the extents ((worklist thread) -> vx_dalloc_flush -> vx_dalloc_off) in between the two fsstat calls made by tar.

RESOLUTION:
The ctime is no longer changed while allocating delayed extents in the background.

Patch ID: VRTSvxfs-7.4.2.1600

* 4012765 (Tracking ID: 4011570)

SYMPTOM:
WORM attribute replication support in VxFS.

DESCRIPTION:
WORM attribute replication is not supported in VFR. Modified code to replicate WORM attribute during attribute processing in VFR.

RESOLUTION:
Code is modified to replicate WORM attributes in VFR.

* 4014720 (Tracking ID: 4011596)

SYMPTOM:
An error is thrown saying "No such file or directory present".

DESCRIPTION:
A bug was observed during parallel communication between all the nodes. Some required temp files were not present on the other nodes.

RESOLUTION:
Fixed to maintain consistency during parallel node communication. hacp is now used for transferring the temp files.

* 4015287 (Tracking ID: 4010255)

SYMPTOM:
"vfradmin promote" fails to promote target FS with selinux enabled.

DESCRIPTION:
During the promote operation, VxFS remounts the FS at the target. When remounting the FS to remove the "protected on" flag from the target, VxFS first fetches the current mount options. With SELinux enabled (either permissive or enforcing), the OS adds the default "seclabel" option to the mount. When VxFS fetched the current mount options, "seclabel" was not recognized by VxFS. Hence it failed to mount the FS.

RESOLUTION:
Code is modified to remove "seclabel" mount option during mount processing on target.

* 4015835 (Tracking ID: 4015278)

SYMPTOM:
System panics in vx_uiomove_by_hand.

DESCRIPTION:
During uiomove, VxFS gets the pages from the OS through get_user_pages() to copy user data. Oracle uses hugetlbfs internally for performance reasons, which can allocate hugepages. Under low memory conditions, it is possible that get_user_pages() returns compound pages to VxFS. In a compound page, only the head page has a valid mapping set, and all other pages are marked as TAIL_MAPPING. During uiomove, if VxFS gets a compound page, it tries to check the writable mapping for all pages of that compound page. This can result in dereferencing an illegal address (TAIL_MAPPING), which was causing the panic. VxFS does not support huge pages, but it is possible that a compound page is present on the system and VxFS gets one through get_user_pages().

RESOLUTION:
Code is modified to get head page in case of tail pages from compound page when VxFS checks writeable mapping.

* 4016721 (Tracking ID: 4016927)

SYMPTOM:
The remove tier command panics the system; the crash has panic reason "BUG: unable to handle kernel NULL pointer dereference at 0000000000000150".

DESCRIPTION:
When fsvoladm removes a device, not all devices are moved, and the device count remains the same unless it is the last device in the array. So a check for a free slot is needed before trying to access a device.

RESOLUTION:
In the device list check for free slot before accessing the device in that slot.

* 4017282 (Tracking ID: 4016801)

SYMPTOM:
The file system gets marked for full fsck.

DESCRIPTION:
In a cluster environment, some operations can be performed on the primary node only. When such operations are executed from a secondary node, a message is
passed to the primary node. During this, it is possible that the sender node has a transaction that has not yet reached the disk. In such a scenario, if the sender node reboots,
the primary node can see stale data.

RESOLUTION:
Code is modified to make sure transactions are flushed to the log disk before sending the message to the primary.

* 4017818 (Tracking ID: 4017817)

SYMPTOM:
NA

DESCRIPTION:
In order to increase the overall throughput of VFR, code changes have been done
to replicate files in parallel.

RESOLUTION:
Code changes have been done to replicate a file's data and metadata in parallel over
multiple socket connections.

* 4017820 (Tracking ID: 4017819)

SYMPTOM:
Cloud tier add operation fails when user is trying to add the AWS GovCloud.

DESCRIPTION:
Adding AWS GovCloud as a cloud tier was not supported in InfoScale. With these changes, the user is able to add the AWS GovCloud type of cloud.

RESOLUTION:
Added support for AWS GovCloud

* 4019877 (Tracking ID: 4019876)

SYMPTOM:
vxfsmisc.so is a publicly shared library for Samba and does not require an InfoScale license for its usage.

DESCRIPTION:
vxfsmisc.so is a publicly shared library for Samba and does not require an InfoScale license for its usage.

RESOLUTION:
Removed license dependency in vxfsmisc library

* 4020055 (Tracking ID: 4012049)

SYMPTOM:
"fsck" supports the "metasave" option but it was not documented anywhere.

DESCRIPTION:
"fsck" supports the "metasave" option while executing with the "-y" option. but it is not documented anywhere. Also, it tries to store metasave in a particular location. The user doesn't have the option to specify the location. If that location doesn't have enough space, "fsck" fails to take the metasave and it continues to change filesystem state.

RESOLUTION:
Code changes have been done to add a new option with which the user can specify the location to store the metasave. The "metasave" and "target" options have been added in the "usage" message of the "fsck" binary.

* 4020056 (Tracking ID: 4012049)

SYMPTOM:
"fsck" supports the "metasave" option but it was not documented anywhere.

DESCRIPTION:
"fsck" supports the "metasave" option while executing with the "-y" option. but it is not documented anywhere. Also, it tries to store metasave in a particular location. The user doesn't have the option to specify the location. If that location doesn't have enough space, "fsck" fails to take the metasave and it continues to change filesystem state.

RESOLUTION:
Code changes have been done to add a new option with which the user can specify the location to store the metasave. The "metasave" and "target" options have been added in the "usage" message of the "fsck" binary.

* 4020912 (Tracking ID: 4020758)

SYMPTOM:
Filesystem mount or fsck with -y may hang during log replay.

DESCRIPTION:
The fsck utility is used to perform the log replay. This log replay is performed during the mount operation or, if needed, during a filesystem check with the -y option. In certain cases, if there are a lot of logs that need to be replayed, it ends up consuming the entire buffer cache. This results in an out-of-buffer scenario and a hang.

RESOLUTION:
Code is modified to make sure enough buffers are always available.

Patch ID: VRTSvxfs-7.4.2.1400

* 4020337 (Tracking ID: 4020334)

SYMPTOM:
VxFS Dummy incidents for FLEX patch archival.

DESCRIPTION:
Incident included e4009779 for the FLEX team patch.

RESOLUTION:
Incident included e4009779 for the FLEX team patch.

Patch ID: VRTSvxfs-7.4.2.1300

* 4002850 (Tracking ID: 3994123)

SYMPTOM:
Running fsck on a system may show LCT count mismatch errors

DESCRIPTION:
For multi-block merged extents in IFIAT inodes, only the first block of the extent may be processed, thus leaving some references unprocessed. This leads to LCT counts not matching. Resolving the issue requires a full fsck.

RESOLUTION:
Code changes added to process merged multi-block extents in IFIAT inodes correctly.

* 4005220 (Tracking ID: 4002222)

SYMPTOM:
The cluster can hang if the cluster filesystem is FCL enabled and its disk layout version is greater than or equal to 14.

DESCRIPTION:
VxFS worker threads that are responsible for handling "File Change Log" feature related operations can be stuck in a deadlock if the disk layout version of the FCL enabled cluster filesystem is greater than or equal to 14.

RESOLUTION:
Code changes have been done to prevent a cluster-wide hang in a scenario where the cluster filesystem is FCL enabled and the disk layout version is greater than or equal to 14.

* 4010353 (Tracking ID: 3993935)

SYMPTOM:
The fsck command of VxFS may hit a segmentation fault with the following stack.
#0  get_dotdotlst ()
#1  find_dotino ()
#2  dir_sanity ()
#3  pass2 ()
#4  iproc_do_work ()
#5  start_thread ()
#6  sysctl ()

DESCRIPTION:
TURNON_CHUNK() and TURNOFF_CHUNK() were modifying the values of their arguments.

RESOLUTION:
Code has been modified to fix the issue.

* 4012061 (Tracking ID: 4001378)

SYMPTOM:
The VxFS module failed to load on RHEL8.2.

DESCRIPTION:
RHEL8.2 is a new release, and it has some kernel changes that caused the VxFS module to fail to load
on it.

RESOLUTION:
Added code to support VxFS on RHEL8.2

* 4012522 (Tracking ID: 4012243)

SYMPTOM:
During IO, MM semaphore lock contention may reduce performance.

DESCRIPTION:
The mmap locks taken during IO may introduce lock contention and reduce IO performance.

RESOLUTION:
A new VxFS API is introduced to skip these locks whenever required on a specific file.

* 4012765 (Tracking ID: 4011570)

SYMPTOM:
WORM attribute replication support in VxFS.

DESCRIPTION:
WORM attribute replication is not supported in VFR. Modified code to replicate WORM attribute during attribute processing in VFR.

RESOLUTION:
Code is modified to replicate WORM attributes in VFR.

* 4012787 (Tracking ID: 4007328)

SYMPTOM:
After the replication service is stopped on the target, the job fails at the source only after processing all the FCL records.

DESCRIPTION:
After the replication service is stopped on the target, the job fails at the source only after processing all the FCL records, whereas it should fail immediately. When the target breaks the connection, the source should receive the error, and the job could then fail while reading the FCL records. However, although the source detects that the connection is closed, the thread processing the FCL does not receive the signal to stop and ends only after the processing is complete.

RESOLUTION:
If the replication service is stopped at the target while FCL records are being processed, the job now fails immediately based on the return status of the connection.

* 4012800 (Tracking ID: 4008123)

SYMPTOM:
If a file has more than one named extended attribute set and the job is paused, it fails
to replicate the remaining named extended attributes. (This behaviour is intermittent.)

DESCRIPTION:
During a VFR replication, if the job is paused while a file's nxattrs are being replicated, the next
time the job is resumed, the seqno triplet received from the target side causes the source to miss
the remaining nxattrs.

RESOLUTION:
Handling of named extended attributes is re-worked to make sure it doesn't miss the remaining
attributes on resume.

* 4012801 (Tracking ID: 4001473)

SYMPTOM:
If a file has named extended attributes set, VFR fails to replicate the job, and
the job goes into the failed state.

DESCRIPTION:
VFR tries to use open(2) on nxattr files; since these files are not visible outside,
the call fails with ENOTDIR.

RESOLUTION:
The internal VxFS-specific API is now used to get a valid file descriptor for nxattr files.

* 4012842 (Tracking ID: 4006192)

SYMPTOM:
System panic with a NULL pointer dereference.

DESCRIPTION:
VxFS supports checkpoints, i.e., point-in-time image copies of a filesystem. For this it needs to keep a copy of some metadata for the checkpoint. In some cases it
missed making the copy. Later, while processing files corresponding to this missed metadata, it got empty extent information. Extent information is the block map for a
given file. This empty extent information caused the NULL pointer dereference.

RESOLUTION:
Code changes are made to fix this issue.

* 4012936 (Tracking ID: 4000465)

SYMPTOM:
The fsck binary loops when it detects a break in the sequence of log IDs.

DESCRIPTION:
When the FS is not cleanly unmounted, it ends up with an unflushed intent log. This intent log is flushed either during the next mount or when fsck is run on the FS. To build the transaction list that needs to be replayed, VxFS used a binary search to find the head and tail. But if there are breaks in the intent log, that code is susceptible to looping. To avoid this loop, VxFS now uses a sequential search instead of a binary search to find the replayable transaction range.

RESOLUTION:
Code is modified to incorporate sequential search instead of binary search to find out replayable transaction range.

* 4013084 (Tracking ID: 4009328)

SYMPTOM:
In a cluster filesystem, if smap corruption is seen and the smap is marked bad then it could cause hang while unmounting the filesystem.

DESCRIPTION:
While freeing an extent in vx_extfree1() for logversion >= VX_LOGVERSION13, if we are freeing whole AUs, we set the VX_AU_SMAPFREE flag for those AUs. This ensures that revocation of the delegation for that AU is delayed until the AU has an SMAP free transaction in progress. This flag gets cleared either in the post-commit/undo processing of the transaction or during error handling in vx_extfree1(). In one scenario, when we are trying to free a whole AU whose smap is marked bad, we do not return any error to vx_extfree1(), and neither do we add the subfunction to free the extent to the transaction. So the VX_AU_SMAPFREE flag is not cleared and remains set even though there is no SMAP free transaction in progress. This could lead to a hang while unmounting the cluster filesystem.

RESOLUTION:
Code changes have been done to add error handling in vx_extfree1 to clear VX_AU_SMAPFREE flag in case where error is returned due to bad smap.

* 4013143 (Tracking ID: 4008352)

SYMPTOM:
Using the VxFS mount binary inside a container to mount any device might result in core generation.

DESCRIPTION:
Using the VxFS mount binary inside a container to mount any device might result in core generation.
This issue occurs because of improper initialisation of a local pointer, which is later dereferenced with a garbage value.

RESOLUTION:
This fix properly initialises all the pointers before dereferencing them.

* 4013144 (Tracking ID: 4008274)

SYMPTOM:
Race between the compression thread and the clone remove thread while allocating the reorg inode.

DESCRIPTION:
The compression thread allocates the reorg inode without setting i_inreuse and takes HLOCK in exclusive mode; later this lock is downgraded to shared mode. While this processing is happening, the clone delete thread can do an iget on this inode and call vx_getownership without hold. If the inode is of type IFEMR or IFPTI, or is FREE, success is returned after the ownership call. Later in the same function, getownership is called with hold set before doing the processing (truncate or mark the inode as IFPTI). The first, redundant ownership call is removed.

RESOLUTION:
Taking ownership of the inode is delayed until the inode mode is checked.

* 4013626 (Tracking ID: 4004181)

SYMPTOM:
VxFS internally maintains a compliance clock; without this API, the user is not able to read its value.

DESCRIPTION:
VxFS internally maintains a compliance clock; without this API, the user is not able to read its value.

RESOLUTION:
Provide an API on the mount point to read the compliance clock for that filesystem.

* 4013738 (Tracking ID: 3830300)

SYMPTOM:
Heavy CPU usage while Oracle archive processes are running on a clustered
FS.

DESCRIPTION:
The cause of the poor read performance in this case was fragmentation.
Fragmentation mainly happens when there are multiple archivers running on the
same node. The allocation pattern of the Oracle archiver processes is:

1. Write the header with O_SYNC.
2. ftruncate-up the file to its final size (a few GBs typically).
3. Do lio_listio with 1MB iocbs.

The problem occurs because all the allocations done in this manner go through
internal allocations, i.e., allocations below the file size instead of allocations
past the file size. Internal allocations are done at most 8 pages at once. So if
there are multiple processes doing this, they all get these 8 pages alternately,
and the FS becomes very fragmented.

RESOLUTION:
Added a tunable which allocates ZFOD extents when ftruncate
tries to increase the size of the file, instead of creating a hole. This
eliminates the allocations internal to the file size and thus the fragmentation. Fixed
the earlier implementation of the same fix, which ran into
locking issues. Also fixed the performance issue while writing from the secondary node.
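
For reference, the following is a minimal C sketch of the archiver allocation pattern from steps 1 and 2 above (the file name and sizes are illustrative, and the lio_listio step is omitted): the O_SYNC header write followed by an ftruncate-up leaves a hole, so the later 1MB writes become internal allocations below the file size.

    /* cc archiver_pattern.c */
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        char header[512] = { 0 };
        int fd = open("arch_0001.log", O_CREAT | O_WRONLY | O_SYNC, 0644);

        if (fd < 0)
            return 1;

        (void)write(fd, header, sizeof(header));   /* step 1: O_SYNC header */
        (void)ftruncate(fd, 64L * 1024 * 1024);    /* step 2: size up, creating a
                                                      hole (a few GBs in practice) */
        /* Step 3 (lio_listio with 1MB iocbs) would then fill the hole with
         * internal, at-most-8-page allocations, causing fragmentation. */
        close(fd);
        return 0;
    }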

Patch ID: VRTSsfcpi-7.4.2.1100

* 4014244 (Tracking ID: 4014243)

SYMPTOM:
The patch installer does not provide the rolling upgrade option to apply a patch.

DESCRIPTION:
Before installing a patch, the patch installer stops the InfoScale services on all the cluster nodes, and then starts them again afterwards. Thus, a full patch upgrade involves cluster downtime. The patch installer should also allow installation of a patch by using the rolling upgrade method. In a rolling upgrade, the patch is installed on one node at a time, thereby avoiding cluster downtime.

RESOLUTION:
The patch installer is enhanced to support installation of a patch using the rolling upgrade method as well.



INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
Please note that the installation of this P-Patch will cause downtime.

To install the patch perform the following steps on at least one node in the cluster:
1. Copy the patch infoscale-rhel7_x86_64-Patch-7.4.2.1900.tar.gz to /tmp
2. Untar infoscale-rhel7_x86_64-Patch-7.4.2.1900.tar.gz to /tmp/hf
    # mkdir /tmp/hf
    # cd /tmp/hf
    # gunzip /tmp/infoscale-rhel7_x86_64-Patch-7.4.2.1900.tar.gz
    # tar xf /tmp/infoscale-rhel7_x86_64-Patch-7.4.2.1900.tar
3. Install the hotfix. (Please note that the installation of this P-Patch will cause downtime.)
    # pwd /tmp/hf
    # ./installVRTSinfoscale742P1900 [<host1> <host2>...]

You can also install this patch together with 7.4.2 base release using Install Bundles
1. Download this patch and extract it to a directory
2. Change to the Veritas InfoScale 7.4.2 directory and invoke the installer script
   with -patch_path option where -patch_path should point to the patch directory
    # ./installer -patch_path [<path to this patch>] [<host1> <host2>...]

Install the patch manually:
--------------------------
Manual installation is not recommended.


REMOVING THE PATCH
------------------
Manual uninstallation is not recommended.


KNOWN ISSUES
------------
* Tracking ID: 4052877

SYMPTOM: After an upgrade from InfoScale 7.4.1 to InfoScale 7.4.2 Update3 on RHEL 8.4, I/O operations may become unresponsive at the secondary site.

WORKAROUND: Stop the workload that is being run at the primary site, reboot the nodes at the secondary site, and then resynchronize the RVG.



SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE

