7.4.2 Update3 patch for SLES12 platform

Cumulative Patch

Description

SORT ID: 17408

 

Fixes the following incidents:

4037283,4018173,4018178,4039517,4046906,4046907,4046908,4047588,4047590,4047592,4047595,4047695,4047722,4018182,
4020207,4020438,4021238,4021240,4021346,4021366,4023095,4046515,4046520,4046526,4045605,4045606,4046521,4046525,
4048981,4007372,4007374,4012397,4019536,4046423,4006982,4007375,4007376,4007677,4046524,4046415,4046419,4013034,
4039475,4046200,4046420,4019535,4049572,4049693,4049416,4049522,4049440,4023556,4013420,4040238,4040608,4042686,
4044184,4046265,4046266,4046267,4046271,4046272,4046829,4047568,4049091,4049097,4012765,4014720,4015287,4015835,
4016721,4017282,4017818,4017820,4019877,4020055,4020056,4020912,4023553,4014719,4020528

 

Patch ID:

VRTSamf-7.4.2.2100-SLES12 for VRTSamf
  VRTSaslapm-7.4.2.2200-SLES12 for VRTSaslapm
  VRTSgab-7.4.2.2100-SLES12 for VRTSgab
  VRTSglm-7.4.2.1500-SLES12 for VRTSglm
  VRTSgms-7.4.2.1200-SLES12 for VRTSgms
  VRTSllt-7.4.2.2100-SLES12 for VRTSllt
  VRTSodm-7.4.2.2200-SLES12 for VRTSodm
  VRTSpython-3.7.4.35-SLES12 for VRTSpython
  VRTSsfmh-7.4.2.501-0 for VRTSsfmh
  VRTSvcs-7.4.2.2100-SLES12 for VRTSvcs
  VRTSvcsag-7.4.2.2100-SLES12 for VRTSvcsag
  VRTSvcsea-7.4.2.1100-SLES12 for VRTSvcsea
  VRTSvcswiz-7.4.2.2100-SLES12 for VRTSvcswiz
  VRTSvlic-4.01.742.300-SLES for VRTSvlic
  VRTSvxfen-7.4.2.2100-SLES12 for VRTSvxfen
  VRTSvxfs-7.4.2.2200-SLES12 for VRTSvxfs
  VRTSvxvm-7.4.2.2200-SLES12 for VRTSvxvm

                          * * * READ ME * * *
                      * * * InfoScale 7.4.2 * * *
                         * * * Patch 1900 * * *
                         Patch Date: 2021-10-04


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH
   * KNOWN ISSUES


PATCH NAME
----------
InfoScale 7.4.2 Patch 1900


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
SLES12 x86-64


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSamf
VRTSaslapm
VRTSgab
VRTSglm
VRTSgms
VRTSllt
VRTSodm
VRTSpython
VRTSsfmh
VRTSvcs
VRTSvcsag
VRTSvcsea
VRTSvcswiz
VRTSvlic
VRTSvxfen
VRTSvxfs
VRTSvxvm


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * InfoScale Availability 7.4.2
   * InfoScale Enterprise 7.4.2
   * InfoScale Foundation 7.4.2
   * InfoScale Storage 7.4.2


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: VRTSvxvm-7.4.2.2200
* 4037283 (4021301) Data corruption issue observed in VxVM on RHEL8.
* 4018173 (3852146) Shared DiskGroup (DG) fails to import when "-c" and "-o noreonline" options
are specified together.
* 4018178 (3906534) After enabling DMP (Dynamic Multipathing) Native support, enable /boot to be
mounted on DMP device.
* 4039517 (4012763) IO hang may happen in VVR (Veritas Volume Replicator) configuration when SRL overflows for one rlink while another rlink is in AUTOSYNC mode.
* 4046906 (3956607) vxdisk reclaim dumps core.
* 4046907 (4041001) In VxVM, system is getting hung when some nodes are rebooted.
* 4046908 (4038865) System panics in the vxdmp module in the IRQ stack.
* 4047588 (4044072) I/Os fail for NVMe disks with 4K block size on the RHEL 8.4 kernel.
* 4047590 (4045501) The VRTSvxvm and the VRTSaslapm packages fail to install on Centos 8.4 systems.
* 4047592 (3992040) bi_error - bi_status conversion map added for proper interpretation of errors at FS side.
* 4047595 (4009353) After DMP native support is enabled, the machine goes into maintenance mode.
* 4047695 (3911930) Provide a way to clear the PGR_FLAG_NOTSUPPORTED on the device instead of using
exclude/include commands
* 4047722 (4023390) Vxconfigd keeps dumping core due to an invalid private region offset on a disk.
Patch ID: VRTSvxvm-7.4.2.1500
* 4018182 (4008664) System panics when a signal is sent to the vxlogger daemon after the daemon has ended.
* 4020207 (4018086) system hang was observed when RVG was in DCM resync with SmartMove as ON.
* 4020438 (4020046) DRL log plex gets detached unexpectedly.
* 4021238 (4008075) With the ASL changes for NVMe, the machine panics on every reboot, which results in a reboot loop.
* 4021240 (4010612) On NVMe and SSD devices, where every disk has a separate enclosure (nvme0, nvme1, and so on), multiple disks get the same name when the enclosure-based naming scheme is set with lowercase=no.
* 4021346 (4010207) System panicked due to hard-lockup due to a spinlock not released properly during the vxstat collection.
* 4021366 (4008741) VxVM device files are not correctly labeled to prevent unauthorized modification - device_t
* 4023095 (4007920) Control auto snapshot deletion when cache obj is full.
Patch ID: VRTSvcs-7.4.2.2100
* 4046515 (4040705) hacli hangs indefinitely when command exceeds character limit of 4096
* 4046520 (4040656) Gracefully restart HAD in occurrence of ENOMEM error
* 4046526 (4043700) While an online operation is in progress and the PreOnline trigger is already executing, multiple PreOnline triggers can be executed on the same or different nodes in the cluster for failover, parallel, or hybrid service groups.
Patch ID: VRTSvcsag-7.4.2.2100
* 4045605 (4038906) In case of ESXi 6.7, the VMwareDisks agent fails to perform a failover on a peer node.
* 4045606 (4042944) In a hardware replicated environment, a disk group resource may fail to import when the HARDWARE_MIRROR flag is set
* 4046521 (4030215) Azure agents now support azure-identity based credential methods
* 4046525 (4046286) Azure Cloud agents do not handle generic exceptions
* 4048981 (4048164) Cloud agents may report incorrect resource state in case cloud API hangs.
Patch ID: VRTSvcsag-7.4.2.1400
* 4007372 (4016624) When a disk group is forcibly imported with ClearClone enabled, different DGIDs are assigned to the associated disks.
* 4007374 (1837967) Application agent falsely detects an application as faulted, due to corruption caused by non-redirected STDOUT or STDERR.
* 4012397 (4012396) AzureDisk agent fails to work with latest Azure Storage SDK.
* 4019536 (4009761) A lower NFSRestart resource fails to come online within the duration specified in OnlineTimeout when the share directory for NFSv4 lock state information contains millions of small files.
Patch ID: VRTSvxfen-7.4.2.2100
* 4046423 (4043619) OCPR failed from SCSI3 fencing to Customized mode
Patch ID: VRTSvxfen-7.4.2.1300
* 4006982 (3988184) The vxfen process cannot complete due to incomplete vxfentab file.
* 4007375 (4000745) The VxFEN process fails to start due to late discovery of the VxFEN disk group.
* 4007376 (3996218) In a customized fencing mode, the 'vxfenconfig -c' command creates a new vxfend process even if VxFen is already configured.
* 4007677 (3970753) Freeing uninitialized/garbage memory causes panic in vxfen.
Patch ID: VRTSamf-7.4.2.2100
* 4046524 (4041596) A cluster node panics when the arguments passed to a process that is registered with AMF exceed 8K characters.
Patch ID: VRTSgab-7.4.2.2100
* 4046415 (4046413) GAB node count/fencing quorum is not updated properly
* 4046419 (4046418) GAB startup does not fail even if LLT is not configured
Patch ID: VRTSgab-7.4.2.1300
* 4013034 (4011683) The GAB module failed to start and the system log messages indicate failures with the mknod command.
Patch ID: VRTSllt-7.4.2.2100
* 4039475 (4045607) Performance improvement of the UDP multiport feature of LLT on 1500 MTU-based networks.
* 4046200 (4046199) LLT over UDP configuration now accepts any link tag name
* 4046420 (3989372) When the CPU load and memory consumption is high in a VMware environment, some nodes in an InfoScale cluster may get fenced out.
Patch ID: VRTSllt-7.4.2.1300
* 4019535 (4018581) The LLT module fails to start and the system log messages indicate missing IP address.
Patch ID: VRTSvcswiz-7.4.2.2100
* 4049572 (4049573) Veritas High Availability Configuration Wizard (HA-Plugin) is not supported on VMWare vCenter HTML based UI.
Patch ID: VRTSpython-3.7.4.35
* 4049693 (4049692) VRTSpython package has been updated with more python modules to support Licensing component.
Patch ID: VRTSvlic-4.01.742.300
* 4049416 (4049416) Migrate Telemetry Collector from Java to Python.
Patch ID: VRTSsfmh-vom-HF0742501
* 4049522 (4049521) VIOM Agent for InfoScale 7.4.2 Update3
Patch ID: VRTSodm-7.4.2.2200
* 4049440 (4049438) VRTSodm driver will not load with 7.4.2.2200 VRTSvxfs patch.
Patch ID: VRTSodm-7.4.2.1500
* 4023556 (4023555) Unable to load the vxodm module on linux.
Patch ID: VRTSvxfs-7.4.2.2200
* 4013420 (4013139) The abort operation on an ongoing online migration from the native file system to VxFS on RHEL 8.x systems.
* 4040238 (4035040) The vfradmin stats command fails to show all the fields in the command output when a job is paused and resumed.
* 4040608 (4008616) fsck command got hung.
* 4042686 (4042684) ODM resize fails for size 8192.
* 4044184 (3993140) Compclock was not giving accurate results.
* 4046265 (4037035) Added new tunable "vx_ninact_proc_threads" to control the number of inactive processing threads.
* 4046266 (4043084) panic in vx_cbdnlc_lookup
* 4046267 (4034910) Asynchronous access/update of the global list large_dirinfo can corrupt its values in multi-threaded execution.
* 4046271 (3993822) fsck stops running on a file system
* 4046272 (4017104) Deleting a lot of files can cause resource starvation, causing panic or momentary hangs.
* 4046829 (3993943) The fsck utility hit the coredump due to segmentation fault in get_dotdotlst()
* 4047568 (4046169) On RHEL8, while doing a directory move from one FS (ext4 or vxfs) to migration VxFS, the migration can fail and the FS will be disabled.
* 4049091 (4035057) On RHEL8, IOs done on an FS while another FS to VxFS migration is in progress can cause a panic.
* 4049097 (4049096) Dalloc changes ctime in the background during extent allocation.
Patch ID: VRTSvxfs-7.4.2.1600
* 4012765 (4011570) WORM attribute replication support in VxFS.
* 4014720 (4011596) Multiple issues were observed during glmdump using hacli for communication
* 4015287 (4010255) "vfradmin promote" fails to promote target FS with selinux enabled.
* 4015835 (4015278) System panics during vx_uiomove_by_hand.
* 4016721 (4016927) For multi cloud tier scenario, system panic with NULL pointer dereference when we try to remove second cloud tier
* 4017282 (4016801) filesystem mark for fullfsck
* 4017818 (4017817) VFR performance enhancement changes.
* 4017820 (4017819) Adding cloud tier operation fails while trying to add AWS GovCloud.
* 4019877 (4019876) Remove license library dependency from vxfsmisc.so library
* 4020055 (4012049) Documented "metasave" option and added one new option in fsck binary.
* 4020056 (4012049) Documented "metasave" option and added one new option in fsck binary.
* 4020912 (4020758) Filesystem mount or fsck with -y may see hang during log replay
Patch ID: VRTSgms-7.4.2.1200
* 4023553 (4023552) Unable to load the vxgms module on linux.
Patch ID: VRTSglm-7.4.2.1500
* 4014719 (4011596) Multiple issues were observed during glmdump using hacli for communication
Patch ID: VRTSvcsea-7.4.2.1100
* 4020528 (4001565) On Solaris 11.4, IMF fails to provide notifications when Oracle processes stop.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following incidents:

Patch ID: VRTSvxvm-7.4.2.2200

* 4037283 (Tracking ID: 4021301)

SYMPTOM:
Data corruption occurs on RHEL8 when large I/Os are split by the Linux kernel I/O split functions.

DESCRIPTION:
On RHEL8, or as of Linux kernel 3.13, the Linux kernel block layer uses a new field in the bio iterator structure to represent the start offset of a bio or its bio vectors after the I/O has been processed by the kernel I/O split functions. In addition, recent versions of VxFS can generate a bio that is larger than the size limit defined in the Linux kernel block layer and in VxVM, so I/O from VxFS can be split by the Linux kernel. When processing such split I/Os, VxVM does not take the new bio iterator field into account, so data is written to the wrong position on the volume or disk, which results in data corruption.

RESOLUTION:
Code changes have been made to bypass the Linux kernel I/O split functions, which appear to be redundant for VxVM I/O processing.

* 4018173 (Tracking ID: 3852146)

SYMPTOM:
In a CVM cluster, when importing a shared diskgroup specifying both -c and -o
noreonline options, the following error may be returned: 
VxVM vxdg ERROR V-5-1-10978 Disk group <dgname>: import failed: Disk for disk
group not found.

DESCRIPTION:
The -c option will update the disk ID and disk group ID on the private region
of the disks in the disk group being imported. Such updated information is not
yet seen by the slave because the disks have not been re-onlined (given that
noreonline option is specified). As a result, the slave cannot identify the
disk(s) based on the updated information sent from the master, causing the
import to fail with the error Disk for disk group not found.

RESOLUTION:
The code is modified to handle the working of the "-c" and "-o noreonline"
options together.

* 4018178 (Tracking ID: 3906534)

SYMPTOM:
After enabling DMP (Dynamic Multipathing) Native support, enable /boot to be
mounted on DMP device.

DESCRIPTION:
Currently, /boot is mounted on top of an OS (Operating System) device. When DMP
Native support is enabled, only VGs (Volume Groups) are migrated from the OS
device to the DMP device, so /boot is not migrated to the DMP device.
As a result, if the OS device path is not available, the system becomes unbootable
because /boot is not available. It is therefore necessary to mount /boot on a DMP
device to provide multipathing and resiliency.

RESOLUTION:
Code changes have been done to migrate /boot on top of DMP device when DMP
Native support is enabled.
Note - The code changes are currently implemented for RHEL-6 only. For other
Linux platforms, /boot will still not be mounted on the DMP device.

* 4039517 (Tracking ID: 4012763)

SYMPTOM:
IO hang may happen in VVR (Veritas Volume Replicator) configuration when SRL overflows for one rlink while another rlink is in AUTOSYNC mode.

DESCRIPTION:
In VVR, if an SRL overflow happens for one rlink (R1) while AUTOSYNC is in progress on another rlink (R2), then AUTOSYNC is aborted for R2, R2 gets detached, and DCM mode is activated on R1.

However, due to a race condition in the code that handles the AUTOSYNC abort and the DCM activation in parallel, DCM could not be activated properly and the IO that caused the DCM activation was queued incorrectly. This results in an IO hang.

RESOLUTION:
The code has been modified to fix the race condition in handling the AUTOSYNC abort and the DCM activation at the same time.

* 4046906 (Tracking ID: 3956607)

SYMPTOM:
When removing a VxVM disk using the "vxdg rmdisk" operation, the following error occurs, requesting a disk reclaim:
VxVM vxdg ERROR V-5-1-0 Disk <device_name> is used by one or more subdisks which are pending to be reclaimed.
Use "vxdisk reclaim <device_name>" to reclaim space used by these subdisks, and retry "vxdg rmdisk" command.
Note: reclamation is irreversible.
However, when "vxdisk reclaim" is issued as advised, the command dumps core.

DESCRIPTION:
In the disk-reclaim code path, memory allocation can fail at realloc(), but the failure
is not detected. As a result, an invalid address is referenced and the command dumps core.

RESOLUTION:
The disk-reclaim code path now handles failure of realloc properly.

* 4046907 (Tracking ID: 4041001)

SYMPTOM:
When some nodes in the system are rebooted, the nodes cannot join back because disk attach transactions do not
happen.

DESCRIPTION:
In VxVM, when some nodes are rebooted, some plexes of a volume are detached. It may happen that all plexes
of a volume are disabled. In this case, if all plexes of a DCO volume become inaccessible, the state of that DCO
volume should be marked as BADLOG.

If the state is not marked as BADLOG, transactions fail with the following error:
VxVM ERROR V-5-1-10128  DCO experienced IO errors during the operation. Re-run the operation after ensuring that DCO is accessible

Because the transactions fail, the system hangs and the nodes cannot join.

RESOLUTION:
The code is fixed to mark the DCO state as BADLOG when all the plexes of the DCO become inaccessible during IO load.

* 4046908 (Tracking ID: 4038865)

SYMPTOM:
System panics in the vxdmp module with the following call trace in the IRQ stack:
native_queued_spin_lock_slowpath
queued_spin_lock_slowpath
_raw_spin_lock_irqsave
dmp_get_shared_lock
gendmpiodone
dmpiodone
bio_endio
blk_update_request
scsi_end_request
scsi_io_completion
scsi_finish_command
scsi_softirq_done
blk_done_softirq
__do_softirq
call_softirq
do_softirq
irq_exit
do_IRQ
 <IRQ stack>

DESCRIPTION:
A deadlock can occur between inode_hash_lock and the DMP shared lock: one process holds inode_hash_lock and tries to acquire the DMP shared lock in IRQ context, while another process that holds the DMP shared lock tries to acquire inode_hash_lock.

RESOLUTION:
Code changes have been done to avoid the deadlock issue.

* 4047588 (Tracking ID: 4044072)

SYMPTOM:
I/Os fail for NVMe disks with 4K block size on the RHEL 8.4 kernel.

DESCRIPTION:
This issue occurs only in the case of disks of the 4K block size. I/Os complete successfully when the disks of 512 block size are used. If disks of the 4K block size are used, the following error messages are logged:
[ 51.228908] VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x206) on dmpnode 201/0x10
[ 51.230070] blk_update_request: operation not supported error, dev nvme1n1, sector 240 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
[ 51.240861] blk_update_request: operation not supported error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0

RESOLUTION:
Updated the VxVM and the VxDMP modules to address this issue. The logical block size is now set to 4096 bytes, which is the same as the physical block size.

* 4047590 (Tracking ID: 4045501)

SYMPTOM:
The following errors occur during the installation of the VRTSvxvm and the VRTSaslapm packages on CentOS 8.4 systems:
~
Verifying packages...
Preparing packages...
This release of VxVM is for Red Hat Enterprise Linux 8
and CentOS Linux 8.
Please install the appropriate OS
and then restart this installation of VxVM.
error: %prein(VRTSvxvm-7.4.1.2500-RHEL8.x86_64) scriptlet failed, exit status 1
error: VRTSvxvm-7.4.1.2500-RHEL8.x86_64: install failed
cat: 9: No such file or directory
~

DESCRIPTION:
The product installer reads the /etc/centos-release file to identify the Linux distribution. This issue occurs because the file has changed for CentOS 8.4.

RESOLUTION:
The product installer is updated to identify the correct Linux distribution so that the VRTSvxvm and the VRTSaslapm packages get installed successfully.

* 4047592 (Tracking ID: 3992040)

SYMPTOM:
CFS-Stress-l2 hits assert f:vx_dio_bio_done:2

DESCRIPTION:
In the RHEL8.0/SLES15 kernel code, the value in bi_status is not a standard error code. Instead, it is taken from a completely separate set of values that are all small positive integers (for example, BLK_STS_OK and BLK_STS_IOERR), whereas the actual errors sent by VM are different. Hence, with the newer kernels, VM should send a proper bi_status to FS.

RESOLUTION:
Code changes have been made to add a conversion map between bi_status and bi_error (similar to the one in the Linux kernel code, blk-core.c).
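
As an illustration only, a conversion map of this kind can be sketched as a pair of lookup tables. The status names, their numeric values, and the errno pairings below are an illustrative subset recalled from the Linux kernel headers and should be treated as assumptions; the function names are hypothetical and this is not the VxVM code.

```python
import errno

# Illustrative subset of block-layer status codes (small positive integers).
# The numeric values are assumptions for this sketch, not taken from the patch.
BLK_STS_OK = 0
BLK_STS_NOTSUPP = 1
BLK_STS_TIMEOUT = 2
BLK_STS_NOSPC = 3
BLK_STS_RESOURCE = 9
BLK_STS_IOERR = 10

# bi_status -> negative errno, in the style of the table in blk-core.c.
_STATUS_TO_ERRNO = {
    BLK_STS_OK: 0,
    BLK_STS_NOTSUPP: -errno.EOPNOTSUPP,
    BLK_STS_TIMEOUT: -errno.ETIMEDOUT,
    BLK_STS_NOSPC: -errno.ENOSPC,
    BLK_STS_RESOURCE: -errno.ENOMEM,
    BLK_STS_IOERR: -errno.EIO,
}

# Reverse direction: negative errno -> bi_status.
_ERRNO_TO_STATUS = {err: sts for sts, err in _STATUS_TO_ERRNO.items()}


def status_to_errno(status):
    """Translate a bi_status value; unknown values fall back to -EIO."""
    return _STATUS_TO_ERRNO.get(status, -errno.EIO)


def errno_to_status(err):
    """Translate a negative errno; unknown values fall back to BLK_STS_IOERR."""
    return _ERRNO_TO_STATUS.get(err, BLK_STS_IOERR)
```

Falling back to a generic I/O error for unmapped values keeps an unexpected code from being misread as success.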

* 4047595 (Tracking ID: 4009353)

SYMPTOM:
After the command "vxdmpadm settune dmp_native_support=on" is run, the machine goes into maintenance mode. The issue is reproducible on a physical setup with a root LVM disk.

DESCRIPTION:
If the native VG name contains a '-' character, the script picks up an incorrect VG name.

RESOLUTION:
Code changes have been made to fix the issue.
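
One way a '-' in a VG name can mislead a script is sketched below. It assumes, without confirmation from this README, that the VG name is derived from a device-mapper style name, in which LVM doubles any '-' inside the VG or LV name and uses a single '-' as the separator. The function name and the sample names are hypothetical.

```python
def split_dm_name(dm_name):
    """Split a device-mapper LVM name into (vg_name, lv_name).

    A '--' sequence stands for a literal '-' inside a name; the first
    remaining single '-' separates the VG name from the LV name.
    """
    placeholder = "\0"  # cannot appear in a VG or LV name
    escaped = dm_name.replace("--", placeholder)
    vg, sep, lv = escaped.partition("-")
    if not sep:
        raise ValueError("not a VG-LV device-mapper name: %r" % dm_name)
    return vg.replace(placeholder, "-"), lv.replace(placeholder, "-")


# A naive split on the first '-' truncates the VG name "rhel-root" to "rhel".
naive_vg = "rhel--root-lv_root".split("-")[0]
correct_vg, correct_lv = split_dm_name("rhel--root-lv_root")
```

Here naive_vg is "rhel", while split_dm_name returns the VG name "rhel-root" and the LV name "lv_root".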

* 4047695 (Tracking ID: 3911930)

SYMPTOM:
Valid PGR operations sometimes fail on a dmpnode.

DESCRIPTION:
As part of the PGR operations, if the inquiry command finds that PGR is not
supported on the dmpnode, a flag PGR_FLAG_NOTSUPPORTED is set on the
dmpnode.
Further PGR operations check this flag and issue PGR commands only if this flag
is NOT set.
This flag remains set even if the hardware is changed so as to support PGR.

RESOLUTION:
A new command (namely enablepr) is provided in the vxdmppr utility to clear this
flag on the specified dmpnode.

* 4047722 (Tracking ID: 4023390)

SYMPTOM:
Vxconfigd crashes because a disk contains an invalid privoffset (160), which is smaller than the minimum required offset (VTOC 265, GPT 208).

DESCRIPTION:
Disk label corruption or stale information residing in the disk header may cause an unexpected label to be written.

RESOLUTION:
An assert is added when updating the CDS label to ensure that a valid privoffset is written to the disk header.
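
The kind of check described above can be sketched as follows. Only the minimum values (VTOC 265, GPT 208) come from this README; the function name, the table, and the use of an exception in place of an assert are assumptions made for illustration.

```python
# Minimum private region offsets quoted in the symptom above.
MIN_PRIVOFFSET = {"VTOC": 265, "GPT": 208}


def check_privoffset(label_type, privoffset):
    """Return privoffset if it is valid for the label type, else raise."""
    minimum = MIN_PRIVOFFSET[label_type]
    if privoffset < minimum:
        raise ValueError(
            "invalid privoffset %d for %s label (minimum is %d)"
            % (privoffset, label_type, minimum)
        )
    return privoffset
```

With this check, the privoffset of 160 from the symptom is rejected for either label type before it can be written.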

Patch ID: VRTSvxvm-7.4.2.1500

* 4018182 (Tracking ID: 4008664)

SYMPTOM:
System panic occurs with the following stack:

void genunix:psignal+4()
void vxio:vol_logger_signal_gen+0x40()
int vxio:vollog_logentry+0x84()
void vxio:vollog_logger+0xcc()
int vxio:voldco_update_rbufq_chunk+0x200()
int vxio:voldco_chunk_updatesio_start+0x364()
void vxio:voliod_iohandle+0x30()
void vxio:voliod_loop+0x26c((void *)0)
unix:thread_start+4()

DESCRIPTION:
Vxio keeps the vxloggerd proc_t, which is used to send a signal to vxloggerd. If vxloggerd has ended for some reason, the signal may be sent to an unexpected process, which may cause a panic.

RESOLUTION:
Code changes have been made to correct the problem.

* 4020207 (Tracking ID: 4018086)

SYMPTOM:
The vxiod with ID 128 was stuck with the following stack:

 #2 [] vx_svar_sleep_unlock at [vxfs]
 #3 [] vx_event_wait at [vxfs]
 #4 [] vx_async_waitmsg at [vxfs]
 #5 [] vx_msg_send at [vxfs]
 #6 [] vx_send_getemapmsg at [vxfs]
 #7 [] vx_cfs_getemap at [vxfs]
 #8 [] vx_get_freeexts_ioctl at [vxfs]
 #9 [] vxportalunlockedkioctl at [vxportal]
 #10 [] vxportalkioctl at [vxportal]
 #11 [] vxfs_free_region at [vxio]
 #12 [] vol_ru_start_replica at [vxio]
 #13 [] vol_ru_start at [vxio]
 #14 [] voliod_iohandle at [vxio]
 #15 [] voliod_loop at [vxio]

DESCRIPTION:
With the SmartMove feature set to ON, the vxiod with ID 128 may start replication while the RVG is in DCM mode. This vxiod then waits for the file system to respond whether a given region is used by the file system. The file system triggers an MDSHIP IO on the logowner. Due to a bug in the code, the MDSHIP IO always gets queued to the vxiod with ID 128, which results in a deadlock.

RESOLUTION:
Code changes have been made to avoid handling MDSHIP IO in a vxiod whose ID is greater than 127.

* 4020438 (Tracking ID: 4020046)

SYMPTOM:
The following IO errors reported on VxVM subdisks result in the DRL log being detached, even though no SCSI errors are detected:

VxVM vxio V-5-0-1276 error on Subdisk [xxxx] while writing volume [yyyy][log] offset 0 length [zzzz]
VxVM vxio V-5-0-145 DRL volume yyyy[log] is detached

DESCRIPTION:
DRL plexes were detached because an atomic write flag (BIT_ATOMIC) was unexpectedly set on the BIO. The BIT_ATOMIC flag gets set on a bio only if the VOLSIO_BASEFLAG_ATOMIC_WRITE flag is set in the sio_base_flags of the SUBDISK SIO and of its parent MVWRITE SIO. When the MVWRITE SIO is generated, its sio_base_flags is copied from a gio structure. Because the gio structure memory is not initialized, it may contain garbage values, which causes the issue.

RESOLUTION:
Code changes have been made to fix the issue.

* 4021238 (Tracking ID: 4008075)

SYMPTOM:
With the ASL changes for NVMe, the machine panics on every reboot, which results in a reboot loop.

DESCRIPTION:
The panic occurs for split bios. The root cause is that RHEL8 introduced a new field named __bi_remaining,
which maintains the count of chained bios. For every endio, __bi_remaining is atomically decremented in the bio_endio() function.
While decrementing __bi_remaining, the OS checks that __bi_remaining is not <= 0. In this case, __bi_remaining was always 0, which triggered the OS
BUG_ON.

RESOLUTION:
For SCSI devices, maxsize is 4194304:
[   26.919333] DMP_BIO_SIZE(orig_bio) : 16384, maxsize: 4194304
[   26.920063] DMP_BIO_SIZE(orig_bio) : 262144, maxsize: 4194304

For NVMe devices, maxsize is 131072:
[  153.297387] DMP_BIO_SIZE(orig_bio) : 262144, maxsize: 131072
[  153.298057] DMP_BIO_SIZE(orig_bio) : 262144, maxsize: 131072
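
The log lines above show a 262144 bio that fits within maxsize on SCSI devices but exceeds it on NVMe devices. The following sketch only illustrates the arithmetic of cutting such an I/O into pieces no larger than maxsize. It assumes both values are in bytes; the function name is hypothetical, and this is not the DMP splitting code.

```python
def split_io(size, maxsize):
    """Return (offset, length) chunks that cover size, each <= maxsize."""
    if size < 0 or maxsize <= 0:
        raise ValueError("size must be >= 0 and maxsize must be > 0")
    chunks = []
    offset = 0
    while offset < size:
        length = min(maxsize, size - offset)
        chunks.append((offset, length))
        offset += length
    return chunks


scsi_chunks = split_io(262144, 4194304)  # fits in one piece
nvme_chunks = split_io(262144, 131072)   # needs two pieces
```

With the values from the log, the SCSI case yields a single chunk and the NVMe case yields two chunks of 131072 each.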

* 4021240 (Tracking ID: 4010612)

SYMPTOM:
With the following naming scheme setting, multiple NVMe or SSD disks get the same name and udid_mismatch is shown:
$ vxddladm set namingscheme=ebn lowercase=no

DESCRIPTION:
This issue is observed for NVMe and SSD devices, where every disk has a separate enclosure such as nvme0, nvme1, and so on.
As a result, the NVMe/SSD disk names are expected to be hostprefix_enclosurename0_disk0, hostprefix_enclosurename1_disk0, and so on.
For example:
smicro125_nvme0_0 <--- disk1
smicro125_nvme1_0 <--- disk2

With lowercase=no, the current code suppresses the suffix digit of the enclosure name. Hence, multiple disks get the same name, and udid_mismatch is shown
because the UDID in the private region does not match the one in DDL. The DDL database shows wrong information because multiple disks get the same name.

smicro125_nvme_0 <--- disk1   <<<<<<<-----here suffix digit of nvme enclosure suppressed
smicro125_nvme_0 <--- disk2

RESOLUTION:
The suffix integer is now appended while constructing the da_name.
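
The name collision can be reproduced with a small sketch. Both functions are hypothetical stand-ins written for illustration, not the DDL code; the sample host prefix and enclosure names are the ones used in the description above.

```python
import re


def da_name_suppressed(host_prefix, enclosure, disk_index):
    """Mimic the old behaviour: drop the trailing digits of the enclosure."""
    base = re.sub(r"\d+$", "", enclosure)
    return "{}_{}_{}".format(host_prefix, base, disk_index)


def da_name_with_suffix(host_prefix, enclosure, disk_index):
    """Mimic the fixed behaviour: keep the enclosure suffix digit."""
    return "{}_{}_{}".format(host_prefix, enclosure, disk_index)


enclosures = ["nvme0", "nvme1"]
old_names = [da_name_suppressed("smicro125", e, 0) for e in enclosures]
new_names = [da_name_with_suffix("smicro125", e, 0) for e in enclosures]
```

Here old_names contains the same name twice, while new_names contains two distinct names.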

* 4021346 (Tracking ID: 4010207)

SYMPTOM:
System panic occurred with the below stack:

native_queued_spin_lock_slowpath()
queued_spin_lock_slowpath()
_raw_spin_lock_irqsave()
volget_rwspinlock()
volkiodone()
volfpdiskiodone()
voldiskiodone_intr()
voldmp_iodone()
bio_endio()
gendmpiodone()
dmpiodone()
bio_endio()
blk_update_request()
scsi_end_request()
scsi_io_completion()
scsi_finish_command()
scsi_softirq_done()
blk_done_softirq()
__do_softirq()
call_softirq()

DESCRIPTION:
As part of collecting the IO statistics, the vxstat thread acquires a spinlock and tries to copy data to the user space. If a page fault happens during the data copy, the thread relinquishes the CPU to another thread. If the thread that gets scheduled on the CPU requests the same spinlock that the vxstat thread has acquired, a hard lockup results.

RESOLUTION:
Code has been changed to properly release the spinlock before copying out the data to the user space during vxstat collection.
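
The general pattern, snapshot under the lock and copy out only after releasing it, can be sketched in user-space Python. This is an analogy under stated assumptions: a threading.Lock stands in for the kernel spinlock, the class and method names are hypothetical, and the code is not the vxstat implementation.

```python
import threading


class IoStats:
    """Toy I/O statistics holder; a stand-in for the kernel-side counters."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counters = {"reads": 0, "writes": 0}

    def record(self, kind):
        # Updates are short and never block while the lock is held.
        with self._lock:
            self._counters[kind] += 1

    def copy_out(self, sink):
        # Take a private snapshot while holding the lock ...
        with self._lock:
            snapshot = dict(self._counters)
        # ... and hand it to the (possibly slow or blocking) consumer only
        # after the lock has been released.
        sink(snapshot)
```

Because the consumer runs without the lock, a delay in the copy step cannot stall other threads that need the same lock.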

* 4021366 (Tracking ID: 4008741)

SYMPTOM:
VxVM device files appear to have the device_t SELinux label.

DESCRIPTION:
If an unauthorized or modified device is allowed to exist on the system, there is the possibility the system may perform unintended or unauthorized operations.
eg: ls -LZ
...
...
/dev/vx/dsk/testdg/vol1   system_u:object_r:device_t:s0
/dev/vx/dmpconfig         system_u:object_r:device_t:s0
/dev/vx/vxcloud           system_u:object_r:device_t:s0

RESOLUTION:
Code changes have been made to change the device labels to misc_device_t and fixed_disk_device_t.

* 4023095 (Tracking ID: 4007920)

SYMPTOM:
Even when the vol_snap_fail_source tunable is set, the largest and oldest snapshot is automatically deleted when the cache object becomes full.

DESCRIPTION:
If the vol_snap_fail_source tunable is set, the oldest snapshot should not be deleted when the cache object is full. Flex requires these snapshots for rollback.

RESOLUTION:
A fix has been added to stop automatic snapshot deletion in vxcached.

Patch ID: VRTSvcs-7.4.2.2100

* 4046515 (Tracking ID: 4040705)

SYMPTOM:
hacli hangs indefinitely when command exceeds character limit of 4096.

DESCRIPTION:
hacli hangs indefinitely when the '-cmd' option value exceeds the character limit of 4096. Instead of returning a proper error message, hacli waits indefinitely for a reply from the VCS engine.

RESOLUTION:
The character limit of the hacli '-cmd' option value has been increased to 7680. Validations of the different hacli options are also handled. When the '-cmd' option value exceeds the new limit, hacli now returns a proper error message instead of hanging.
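
The behaviour described in the resolution, rejecting an over-long value up front rather than waiting for a reply, can be sketched as follows. Only the limit of 7680 comes from this README; the function name and the error text are hypothetical, and this is not the hacli code.

```python
HACLI_CMD_MAX_CHARS = 7680  # new limit stated in the resolution above


def validate_cmd_value(cmd):
    """Return cmd unchanged, or raise if it exceeds the character limit."""
    if len(cmd) > HACLI_CMD_MAX_CHARS:
        raise ValueError(
            "'-cmd' value is %d characters; the limit is %d"
            % (len(cmd), HACLI_CMD_MAX_CHARS)
        )
    return cmd
```

A value of exactly 7680 characters passes this check, and a value of 7681 characters is rejected with an error.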

* 4046520 (Tracking ID: 4040656)

SYMPTOM:
When an ENOMEM error occurs, HAD restarts with the '-restart' option.

DESCRIPTION:
When an ENOMEM error occurs, HAD retries up to a maximum limit. If the ENOMEM error persists, HAD exits, and the hashadow daemon restarts HAD with the '-restart' option. This prevents Autostart of failover service groups (SGs) in the cluster, because one of the nodes is considered to be in the restarting mode.

RESOLUTION:
When an ENOMEM error occurs, HAD now exits gracefully and the hashadow daemon restarts HAD without the '-restart' option. The node is therefore not considered as restarted, and Autostart of failover SGs is triggered.

* 4046526 (Tracking ID: 4043700)

SYMPTOM:
While an online operation is in progress and the PreOnline trigger is already executing, multiple PreOnline triggers can be executed on the same or different nodes in the cluster for failover, parallel, or hybrid service groups.

DESCRIPTION:
In-progress execution of the PreOnline trigger was not accounted for. As a result, subsequent online operations could be accepted while a PreOnline trigger was already executing, and multiple PreOnline trigger instances were executed.

RESOLUTION:
While validating an online operation, in-progress PreOnline triggers are now also considered, and subsequent online operations are rejected. This fix ensures only one execution of the PreOnline trigger for failover groups.
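
A simplified model of this validation is sketched below. The class, its state flags, and its method names are hypothetical and are not taken from the VCS engine. The sketch only shows that a new online request is refused while either a PreOnline trigger or the online operation itself is still in progress.

```python
class FailoverGroupState:
    """Toy per-group state used to validate online requests."""

    def __init__(self):
        self.preonline_running = False
        self.online_in_progress = False

    def request_online(self):
        # Reject the request if a PreOnline trigger is already executing
        # or an online operation is already under way.
        if self.preonline_running or self.online_in_progress:
            return False
        self.preonline_running = True
        return True

    def preonline_finished(self):
        # The trigger has completed; the online operation proper starts.
        self.preonline_running = False
        self.online_in_progress = True

    def online_finished(self):
        self.online_in_progress = False
```

In this model, a second request made while the first trigger is still running returns False, so only one trigger instance is started.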

Patch ID: VRTSvcsag-7.4.2.2100

* 4045605 (Tracking ID: 4038906)

SYMPTOM:
In case of ESXi 6.7, the VMwareDisks agent fails to perform a failover on a peer node.

DESCRIPTION:
The VMwareDisks agent faults when you try to bring the related service group online or to fail over the service group on a peer node. This issue occurs due to the change in the behavior of the API on ESXi 6.7 that is used to attach VMware disks.

RESOLUTION:
The VMwareDisks agent is updated to support the changed behavior of the API on ESXi 6.7. The agent can now bring the service group online or perform a failover on a peer node successfully.

* 4045606 (Tracking ID: 4042944)

SYMPTOM:
In a hardware replicated environment, a disk group resource may fail to import when the HARDWARE_MIRROR flag is set

DESCRIPTION:
After the VCS hardware replication agent resource fails over control to the secondary site, the DiskGroup agent does not rescan all the required device paths in 
case of a multi-pathing configuration. 
The vxdg import operation fails, as the hardware device characteristics for all the paths are not refreshed.

RESOLUTION:
This hotfix introduces a new resource attribute for the DiskGroup agent called ScanDisks. The ScanDisks attribute enables the user to perform a selective
device scan for all disk paths associated with a VxVM disk group. The VxVM and DMP disk attributes are refreshed before attempting to import hardware clone
or replicated devices. The default value of ScanDisks is 0, which indicates that a selective device scan is not performed. Even when ScanDisks is set to 0, if the
disk group import fails with an error string containing HARDWARE MIRROR during the first import attempt, the DiskGroup agent then performs a selective device scan to
increase the chances of a successful import.
Sample resource configurations:
For Hardware Clone DiskGroups

DiskGroup tc_dg (
DiskGroup = datadg
DGOptions = "-o useclonedev=on -o updateid"
ForceImport = 0
ScanDisks = 1
)

For Hardware Replicated DiskGroups

DiskGroup tc_dg (
DiskGroup = datadg
ForceImport = 0
ScanDisks = 1
)
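
The ScanDisks behaviour described in the resolution can be summarized as two small decisions, sketched below. The function names and the sample error strings are hypothetical. The matched text "HARDWARE MIRROR" is taken as written in the resolution, and the assumption that any non-zero ScanDisks value enables the scan is made only for this sketch.

```python
def scan_before_import(scan_disks):
    """A selective device scan precedes the import when ScanDisks is set."""
    return scan_disks != 0


def scan_and_retry(scan_disks, first_import_error):
    """With ScanDisks = 0, scan only after a HARDWARE MIRROR import failure."""
    if scan_disks != 0 or first_import_error is None:
        return False
    return "HARDWARE MIRROR" in first_import_error


# Hypothetical error strings, used only to exercise the sketch.
retry_needed = scan_and_retry(0, "import failed: HARDWARE MIRROR flag set")
no_retry = scan_and_retry(0, "import failed: disk not found")
```

In this sketch, the first call returns True and the second returns False.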

* 4046521 (Tracking ID: 4030215)

SYMPTOM:
Azure agents now support azure-identity based credential methods

DESCRIPTION:
Azure credential system is revamped. The new system is available in azure-identity library.

RESOLUTION:
Azure agents now support the azure-identity based credential method. With this enhancement, Azure agents support the following Azure Python SDK versions:

azure-common==1.1.25
azure-core==1.10.0
azure-identity==1.4.1
azure-mgmt-compute==19.0.0
azure-mgmt-core==1.2.2
azure-mgmt-dns==8.0.0
azure-mgmt-network==17.1.0
azure-storage-blob==12.8.0
msrestazure==0.6.4
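
One way to compare an environment against the versions listed above is sketched below. The helper names are hypothetical and the installed versions are passed in as a plain dictionary, so the sketch makes no assumption about how they are collected.

```python
from typing import Dict, List

# Versions copied from the list above.
SUPPORTED_PINS = """\
azure-common==1.1.25
azure-core==1.10.0
azure-identity==1.4.1
azure-mgmt-compute==19.0.0
azure-mgmt-core==1.2.2
azure-mgmt-dns==8.0.0
azure-mgmt-network==17.1.0
azure-storage-blob==12.8.0
msrestazure==0.6.4
"""


def parse_pins(text: str) -> Dict[str, str]:
    """Parse 'name==version' lines into a {name: version} mapping."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            raise ValueError("not a 'name==version' pin: " + line)
        pins[name] = version
    return pins


def find_mismatches(installed: Dict[str, str],
                    pins: Dict[str, str]) -> List[str]:
    """List every package whose installed version differs from its pin."""
    problems = []
    for name, wanted in sorted(pins.items()):
        found = installed.get(name)
        if found != wanted:
            problems.append("%s: expected %s, found %s"
                            % (name, wanted, found or "not installed"))
    return problems
```

On Python 3.8 or later, the installed versions could be gathered with importlib.metadata.version(name) before calling find_mismatches.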

* 4046525 (Tracking ID: 4046286)

SYMPTOM:
Azure Cloud agents do not handle generic exceptions.

DESCRIPTION:
Azure agents handle only the CloudError exception of the Azure APIs, but other errors may occur during certain failure conditions.

RESOLUTION:
Azure agents are enhanced to handle API failure conditions.
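
The pattern can be illustrated with a minimal sketch. The CloudError class below is a local stand-in so that the example runs without the Azure SDK, and call_cloud_api is a hypothetical helper, not an agent function.

```python
from typing import Any, Callable, Tuple


class CloudError(Exception):
    """Stand-in for the SDK's CloudError (assumption: defined locally here)."""


def call_cloud_api(api_call: Callable[[], Any],
                   log: Callable[[str], None]) -> Tuple[bool, Any]:
    """Run api_call and report (succeeded, result) without raising."""
    try:
        return True, api_call()
    except CloudError as exc:
        # The error type that was already handled before the fix.
        log("cloud API error: %s" % exc)
    except Exception as exc:
        # Any other failure is logged instead of escaping the entry point.
        log("unexpected error: %s: %s" % (type(exc).__name__, exc))
    return False, None
```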

* 4048981 (Tracking ID: 4048164)

SYMPTOM:
Cloud agents may report incorrect resource state in case cloud API hangs.

DESCRIPTION:
If a Cloud SDK API or CLI call hangs, the monitor function of the cloud agents times out. This results in an unwanted failover of the service group.

RESOLUTION:
The default value of the FaultOnMonitorTimeout attribute of all cloud agents is set to 0. This helps avoid an unwanted failover caused by a Cloud SDK API/CLI hang.

Patch ID: VRTSvcsag-7.4.2.1400

* 4007372 (Tracking ID: 4016624)

SYMPTOM:
When a disk group is forcibly imported with ClearClone enabled, different DGIDs are assigned to the associated disks.

DESCRIPTION:
When the ForceImport option is used, a disk group gets imported with the available disks, regardless of whether all the required disks are available or not. In such a scenario, if the ClearClone attribute is enabled, the available disks are successfully imported, but their DGIDs are updated to new values. Thus, the disks within the same disk group end up with different DGIDs, which may cause issues with the functioning of the storage configuration.

RESOLUTION:
The DiskGroup agent is updated to allow the ForceImport and the ClearClone attributes to be set to the following values as per the configuration requirements. ForceImport can be set to 0 or 1. ClearClone can be set to 0, 1, or 2. ClearClone is disabled when set to 0 and enabled when set to 1 or 2. ForceImport is disabled when set to 0 and is ignored when ClearClone is set to 1. To enable both, ClearClone and ForceImport, set ClearClone to 2 and ForceImport to 1.
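
The combinations described above can be summarized in a small sketch. The function and type names are hypothetical; only the value rules come from the text.

```python
from typing import NamedTuple


class ImportBehavior(NamedTuple):
    clear_clone: bool    # whether ClearClone takes effect
    force_import: bool   # whether ForceImport takes effect


def effective_import_behavior(force_import: int,
                              clear_clone: int) -> ImportBehavior:
    """Map the two attribute values to the behavior the text describes."""
    if force_import not in (0, 1):
        raise ValueError("ForceImport must be 0 or 1")
    if clear_clone not in (0, 1, 2):
        raise ValueError("ClearClone must be 0, 1, or 2")
    clear = clear_clone in (1, 2)
    # ForceImport is ignored when ClearClone is 1; both are active only
    # with ClearClone = 2 and ForceImport = 1.
    force = force_import == 1 and clear_clone != 1
    return ImportBehavior(clear_clone=clear, force_import=force)
```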

* 4007374 (Tracking ID: 1837967)

SYMPTOM:
Application agent falsely detects an application as faulted, due to corruption caused by non-redirected STDOUT or STDERR.

DESCRIPTION:
This issue can occur when the STDOUT and STDERR file descriptors of the program to be started and monitored are not redirected to a specific file or to /dev/null. In this case, an application that is started by the Online entry point inherits the STDOUT and STDERR file descriptors from the entry point. Therefore, the entry point and the application, both, read from and write to the same file, which may lead to file corruption and cause the agent entry point to behave unexpectedly.

RESOLUTION:
The Application agent is updated to identify whether STDOUT and STDERR for the configured application are already redirected. If not, the agent redirects them to /dev/null.
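
As a rough illustration of the redirection, the sketch below starts a program with its standard streams pointed at /dev/null. Unlike the agent, it does not detect whether the streams are already redirected; start_program is a hypothetical name.

```python
import subprocess
import sys
from typing import List


def start_program(argv: List[str]) -> subprocess.Popen:
    """Start argv with stdin, stdout, and stderr attached to /dev/null.

    The child therefore does not inherit the caller's file descriptors
    for its standard streams.
    """
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


if __name__ == "__main__":
    child = start_program([sys.executable, "-c", "print('not shown')"])
    print("exit status:", child.wait())
```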

* 4012397 (Tracking ID: 4012396)

SYMPTOM:
AzureDisk agent fails to work with latest Azure Storage SDK.

DESCRIPTION:
Latest Python SDK for Azure doesn't work with InfoScale AzureDisk agent.

RESOLUTION:
AzureDisk agent now supports latest Azure Storage Python SDK.

* 4019536 (Tracking ID: 4009761)

SYMPTOM:
A lower NFSRestart resource fails to come online within the duration specified in OnlineTimeout when the share directory for NFSv4 lock state information contains millions of small files.

DESCRIPTION:
As part of the Online operation, the NFSRestart agent copies the NFSv4 state data of clients from the shared storage to the local path. However, if the source location contains millions of files, some of which may be stale, their movement may not be completed before the operation times out.

RESOLUTION:
A new action entry point named "cleanup" is provided, which removes stale files. The usage of the entry point is as follows:
$ hares -action <resname> cleanup -actionargs <days> -sys <sys>
  <days>: age threshold in days; files older than <days> days are deleted
Example:
$ hares -action NFSRestart_L cleanup -actionargs 30 -sys <sys>
The cleanup action ensures that files older than the number of days specified in the -actionargs option are removed; the minimum expected duration is 30 days. Thus, only the relevant files to be moved remain, and the Online operation is completed in time.
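
The age-based selection can be sketched as follows. This is not the agent's cleanup code: it assumes that file age is judged by modification time and that only the top level of the directory is examined, and it only lists candidates instead of deleting them.

```python
import os
import time
from pathlib import Path
from typing import List, Optional

MIN_DAYS = 30            # minimum duration stated for the cleanup action
SECONDS_PER_DAY = 86400


def find_stale_files(directory: str, days: int,
                     now: Optional[float] = None) -> List[Path]:
    """Return files in directory last modified more than `days` days ago."""
    if days < MIN_DAYS:
        raise ValueError("days must be at least %d" % MIN_DAYS)
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.stat().st_mtime < cutoff
    )
```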

Patch ID: VRTSvxfen-7.4.2.2100

* 4046423 (Tracking ID: 4043619)

SYMPTOM:
OCPR failed from SCSI3 fencing to Customized mode

DESCRIPTION:
Online Coordination Point Replacement (OCPR) was broken for SCSI3 to Customized mode based fencing because of a regression introduced by a change in the vxfend invocation.

RESOLUTION:
OCPR from SCSI3 to Customized mode is working again with this fix

Patch ID: VRTSvxfen-7.4.2.1300

* 4006982 (Tracking ID: 3988184)

SYMPTOM:
The vxfen process cannot complete due to an incomplete vxfentab file.

DESCRIPTION:
When I/O fencing starts, the vxfen startup script creates the /etc/vxfentab file on each node. If the coordination disk discovery is slow, the vxfen startup script fails to include all the coordination points in the vxfentab file. As a result, the vxfen startup script gets stuck in a loop.

RESOLUTION:
The vxfen startup process is modified to exit from the loop if it gets stuck while configuring 'vxfenconfig -c'. On exiting from the loop, systemctl starts vxfen again and tries to use the updated vxfentab file.

* 4007375 (Tracking ID: 4000745)

SYMPTOM:
The VxFEN process fails to start due to late discovery of the VxFEN disk group.

DESCRIPTION:
When I/O fencing starts, the VxFEN startup script creates the /etc/vxfentab file on each node. During disk-based fencing, the VxVM module may take a longer time to discover the VxFEN disk group. Because of this delay, the 'generate disk list' operation times out. Therefore, the VxFEN process fails to start and reports the following error: 'ERROR: VxFEN cannot generate vxfentab because vxfendg does not exist'

RESOLUTION:
A new tunable, getdisks_timeout, is introduced to specify the timeout value for the VxFEN disk group discovery. The maximum and the default value for this tunable is 600 seconds. You can set the value of this tunable by adding a getdisks_timeout=<time_in_sec> entry in the /etc/vxfenmode file.
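
A hedged sketch of reading such an entry is shown below. It is not the VxFEN startup script; it assumes that '#' starts a comment, that the last matching entry wins, and that out-of-range values are rejected, none of which is stated in this document.

```python
MAX_GETDISKS_TIMEOUT = 600   # documented maximum and default, in seconds


def read_getdisks_timeout(vxfenmode_text: str) -> int:
    """Return the getdisks_timeout value found in vxfenmode-style text."""
    value = MAX_GETDISKS_TIMEOUT          # default when no entry exists
    for raw in vxfenmode_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # assumption: '#' comments
        key, sep, val = line.partition("=")
        if sep and key.strip() == "getdisks_timeout":
            value = int(val.strip())
    if not 0 < value <= MAX_GETDISKS_TIMEOUT:
        raise ValueError("getdisks_timeout must be between 1 and %d"
                         % MAX_GETDISKS_TIMEOUT)
    return value


# Illustrative file content only.
sample = "vxfen_mode=scsi3\ngetdisks_timeout=300\n"
print(read_getdisks_timeout(sample))
```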

* 4007376 (Tracking ID: 3996218)

SYMPTOM:
In a customized fencing mode, the 'vxfenconfig -c' command creates a new vxfend process even if VxFen is already configured.

DESCRIPTION:
When you configure fencing in the customized mode and run the 'vxfenconfig -c' command, the vxfenconfig utility reports the 'VXFEN ERROR V-11-1-6 vxfen already configured...' error. Moreover, it also creates a new vxfend process even if VxFen is already configured. Such redundant processes may impact the performance of the system.

RESOLUTION:
The vxfenconfig utility is modified so that it does not create a new vxfend process when VxFen is already configured.

* 4007677 (Tracking ID: 3970753)

SYMPTOM:
Freeing uninitialized/garbage memory causes panic in vxfen.

DESCRIPTION:
Freeing uninitialized/garbage memory causes panic in vxfen.

RESOLUTION:
Veritas has modified the VxFen kernel module to fix the issue by initializing the object before attempting to free it.

Patch ID: VRTSamf-7.4.2.2100

* 4046524 (Tracking ID: 4041596)

SYMPTOM:
A cluster node panics when the arguments passed to a process that is registered with AMF exceed 8K characters.

DESCRIPTION:
This issue occurs due to improper parsing and handling of argument lists that are passed to processes registered with AMF.

RESOLUTION:
AMF is updated to correctly parse and handle argument lists for processes.

Patch ID: VRTSgab-7.4.2.2100

* 4046415 (Tracking ID: 4046413)

SYMPTOM:
After node addition/node deletion gab node count is not updated properly

DESCRIPTION:
The 'gabconfig -m <node count>' command displays an error even when a correct node count is provided.

RESOLUTION:
There was a parsing issue which has been resolved by this fix

* 4046419 (Tracking ID: 4046418)

SYMPTOM:
gab startup does not fail even if llt is not configured

DESCRIPTION:
Since gab service depends on llt service, if llt service fails to start/is not configured, gab should not start

RESOLUTION:
This fix prevents gab from starting if llt is not configured.

Patch ID: VRTSgab-7.4.2.1300

* 4013034 (Tracking ID: 4011683)

SYMPTOM:
The GAB module failed to start and the system log messages indicate failures with the mknod command.

DESCRIPTION:
The mknod command fails to start the GAB module because its format is invalid. If the names of multiple drivers in an environment contain the value "gab" as a substring, all their major device numbers get passed on to the mknod command. Instead, the command must contain the major device number for the GAB driver only.

RESOLUTION:
This hotfix addresses the issue so that the GAB module starts successfully even when other driver names in the environment contain "gab" as a substring.
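
The difference between substring and exact matching of driver names can be illustrated as follows. The sketch parses text in the layout of /proc/devices; the major numbers and the second driver name ("megabus") are made up for the example, and the functions are not taken from the GAB startup script.

```python
from typing import Iterator, List, Tuple


def _entries(proc_devices_text: str) -> Iterator[Tuple[int, str]]:
    """Yield (major, name) pairs, skipping section headers and blanks."""
    for line in proc_devices_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit():
            yield int(parts[0]), parts[1]


def majors_by_substring(text: str, name: str) -> List[int]:
    """Problematic lookup: every driver whose name contains `name`."""
    return [major for major, dev in _entries(text) if name in dev]


def major_exact(text: str, name: str) -> int:
    """Intended lookup: the single driver named exactly `name`."""
    matches = [major for major, dev in _entries(text) if dev == name]
    if len(matches) != 1:
        raise LookupError("expected one entry for %r, found %d"
                          % (name, len(matches)))
    return matches[0]


SAMPLE = """Character devices:
  1 mem
241 gab
242 megabus
"""
print(majors_by_substring(SAMPLE, "gab"))   # two majors, invalid for mknod
print(major_exact(SAMPLE, "gab"))
```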

Patch ID: VRTSllt-7.4.2.2100

* 4039475 (Tracking ID: 4045607)

SYMPTOM:
LLT over UDP support for transmission and reception of data over 1500 MTU networks.

DESCRIPTION:
The UDP multiport feature in LLT performs poorly in case of 1500 MTU-based networks. Data packets larger than 1500 bytes cannot be transmitted over 1500 MTU-based networks, so the IP layer fragments them appropriately for transmission. The loss of a single fragment from the set leads to a total packet (I/O) loss. LLT then retransmits the same packet repeatedly until the transmission is successful. Eventually, you may encounter issues with the Flexible Storage Sharing (FSS) feature. For example, the vxprint process or the disk group creation process may stop responding, or the I/O-shipping performance may degrade severely.

RESOLUTION:
The UDP multiport feature of LLT is updated to fragment the packets such that they can be accommodated in the 1500-byte network frame. The fragments are rearranged on the receiving node at the LLT layer. Thus, LLT can track every fragment to the destination, and in case of transmission failures, retransmit the lost fragments based on the current RTT time.
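
The general idea of fragmenting at the sender, reassembling at the receiver, and retransmitting only what was lost can be sketched as follows. This is a simplified illustration, not the LLT wire format: the fragment tuple layout and the 1400-byte chunk size (chosen to leave room for headers within a 1500-byte frame) are assumptions.

```python
from typing import List, Tuple

Fragment = Tuple[int, int, bytes]   # (index, total fragments, chunk)


def fragment(payload: bytes, max_chunk: int) -> List[Fragment]:
    """Split payload into chunks of at most max_chunk bytes."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    total = max(1, -(-len(payload) // max_chunk))   # ceiling division
    return [(i, total, payload[i * max_chunk:(i + 1) * max_chunk])
            for i in range(total)]


def missing_indexes(received: List[Fragment]) -> List[int]:
    """Return the fragment indexes that still need retransmission."""
    if not received:
        return []
    total = received[0][1]
    seen = {index for index, _, _ in received}
    return [i for i in range(total) if i not in seen]


def reassemble(received: List[Fragment]) -> bytes:
    """Rebuild the payload from fragments that arrived in any order."""
    if not received or missing_indexes(received):
        raise ValueError("cannot reassemble: fragments are missing")
    by_index = {index: chunk for index, _, chunk in received}
    return b"".join(by_index[i] for i in range(received[0][1]))
```

With this scheme, a receiver that is missing one fragment asks for that index alone, instead of the sender resending the whole packet.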

* 4046200 (Tracking ID: 4046199)

SYMPTOM:
llt over udp configuration now accepts any link tag name

DESCRIPTION:
Previously for llt over udp configuration, the tag field in link definition had to be the ethernet interface name. With this fix any string can be used as a tag name

RESOLUTION:
Any string can be used as link tag name with this fix

* 4046420 (Tracking ID: 3989372)

SYMPTOM:
When the CPU load and memory consumption are high in a VMware environment, some nodes in an InfoScale cluster may get fenced out.

DESCRIPTION:
Occasionally, in a VMware environment, the operating system may not schedule LLT contexts on time. Consequently, heartbeats from some of the cluster nodes may be lost, and those nodes may get fenced out. This situation typically occurs when the CPU load or the memory usage is high or when the VMDK snapshot or vMotion operations are in progress.

RESOLUTION:
This fix attempts to make clusters more resilient to transient issues by heartbeating using threads bound to every vCPU.

Patch ID: VRTSllt-7.4.2.1300

* 4019535 (Tracking ID: 4018581)

SYMPTOM:
The LLT module fails to start and the system log messages indicate missing IP address.

DESCRIPTION:
When only the low priority LLT links are configured over UDP, UDPBurst mode must be disabled. UDPBurst mode must only be enabled when the high priority LLT links are configured over UDP. If the UDPBurst mode gets enabled while configuring the low priority links, the LLT module fails to start and logs the following error: "V-14-2-15795 missing ip address / V-14-2-15800 UDPburst:Failed to get link info".

RESOLUTION:
This hotfix updates the LLT module to not enable the UDPBurst mode when only the low priority LLT links are configured over UDP.

Patch ID: VRTSvcswiz-7.4.2.2100

* 4049572 (Tracking ID: 4049573)

SYMPTOM:
Veritas High Availability Configuration Wizard (HA-Plugin) is not supported on VMWare vCenter HTML based UI.

DESCRIPTION:
Veritas HA-Plugin was based on Adobe Flex. HA-Plugin fails to work because Flex is now deprecated.

RESOLUTION:
Veritas HA-Plugin now supports VMWare vCenter HTML based UI.

Patch ID: VRTSpython-3.7.4.35

* 4049693 (Tracking ID: 4049692)

SYMPTOM:
In order to support Licensing module, VRTSpython must include additional modules in it.

DESCRIPTION:
Licensing module utilizes VRTSpython package which needs additional modules to be added in VRTSpython.

RESOLUTION:
The VRTSpython package has been updated to include additional Python modules.

Patch ID: VRTSvlic-4.01.742.300

* 4049416 (Tracking ID: 4049416)

SYMPTOM:
Frequent Security vulnerabilities reported in JRE.

DESCRIPTION:
Many vulnerabilities are reported in JRE every quarter. To address this, the Telemetry Collector is migrated from Java to Python.
All other behavior of the Telemetry Collector remains the same.

RESOLUTION:
Migrated Telemetry Collector from Java to Python.

Patch ID: VRTSsfmh-vom-HF0742501

* 4049522 (Tracking ID: 4049521)

SYMPTOM:
N/A

DESCRIPTION:
VIOM Agent for InfoScale 7.4.2 Update3

RESOLUTION:
N/A

Patch ID: VRTSodm-7.4.2.2200

* 4049440 (Tracking ID: 4049438)

SYMPTOM:
VRTSodm driver will not load with 7.4.2.2200 VRTSvxfs patch.

DESCRIPTION:
Need recompilation of VRTSodm due to recent changes in VRTSvxfs.

RESOLUTION:
Recompiled the VRTSodm with new changes in VRTSvxfs.

Patch ID: VRTSodm-7.4.2.1500

* 4023556 (Tracking ID: 4023555)

SYMPTOM:
VRTSodm module is not able to load on linux.

DESCRIPTION:
Need recompilation of VRTSodm due to recent changes in VRTSodm 
due to which some symbols are not being resolved.

RESOLUTION:
Recompiled the VRTSodm to load vxodm module.

Patch ID: VRTSvxfs-7.4.2.2200

* 4013420 (Tracking ID: 4013139)

SYMPTOM:
The abort operation fails on an ongoing online migration from the native file system to VxFS on RHEL 8.x systems.

DESCRIPTION:
The following error messages are logged when the abort operation fails:
umount: /mnt1/lost+found/srcfs: not mounted
UX:vxfs fsmigadm: ERROR: V-3-26835:  umount of source device: /dev/vx/dsk/testdg/vol1 failed, with error: 32

RESOLUTION:
The fsmigadm utility is updated to address the issue with the abort operation on an ongoing online migration.

* 4040238 (Tracking ID: 4035040)

SYMPTOM:
After a replication job is paused and resumed, some fields are missing from the stats command output, and they remain missing on subsequent runs.

DESCRIPTION:
rs_start for the current stat is initialized to the start time of the replication; the default value of rs_start is zero.
The stats command does not show some fields if rs_start is zero.

        if (rs->rs_start && dis_type == VX_DIS_CURRENT) {
                if (!rs->rs_done) {
                        diff = rs->rs_update - rs->rs_start;
                }
                else {
                        diff = rs->rs_done - rs->rs_start;
                }

                /*
                 * The unit of time is in seconds, hence
                 * assigning 1 if the amount of data
                 * was too small
                 */

                diff = diff ? diff : 1;
                rate = rs->rs_file_bytes_synced /
                        (diff - rs->rs_paused_duration);
                printf("\t\tTransfer Rate: %s/sec\n", fmt_bytes(h,rate));
        }

During replication, rs_start is initialized to zero and then updated with the start time, but the stats are not saved to disk at that point. This leaves a small
window in which, if the replication is paused and started again, rs_start is always read as zero.

With this fix, rs_start is written to disk in the same function that initializes it. If rs_start is found to be zero on resume, it is initialized again
to the current replication start time.

RESOLUTION:
Write rs_start to disk and added a check in resume case to initialize rs_start value in-case found 0.

* 4040608 (Tracking ID: 4008616)

SYMPTOM:
fsck command got hung.

DESCRIPTION:
fsck got stuck in a deadlock: a thread that had marked a buffer as aliased was waiting on itself for the reference count to drain, even though
the block-get code was called with the NOBLOCK flag.

RESOLUTION:
honour NOBLOCK flag

* 4042686 (Tracking ID: 4042684)

SYMPTOM:
Command fails to resize the file.

DESCRIPTION:
There is a window where a parallel thread can clear IDELXWRI flag which it should not.

RESOLUTION:
Set the delayed extending write flag again in case a parallel thread has cleared it.

* 4044184 (Tracking ID: 3993140)

SYMPTOM:
Every 60 seconds, compclock lagged approximately 1.44 seconds behind the actual elapsed time.

DESCRIPTION:
Every 60 seconds, compclock lagged approximately 1.44 seconds behind the actual elapsed time.

RESOLUTION:
Made adjustment to logic responsible for calculating and updating compclock timer.

* 4046265 (Tracking ID: 4037035)

SYMPTOM:
Added new tunable "vx_ninact_proc_threads" to control the number of inactive processing threads.

DESCRIPTION:
On high end servers, heavy lock contention was seen during inactive removal processing, which was caused by the large number of inactive worker threads spawned by VxFS. To avoid the contention, new tunable "vx_ninact_proc_threads" was added so that customer can adjust the number of inactive processing threads based on their server config and workload.

RESOLUTION:
Added new tunable "vx_ninact_proc_threads" to control the number of inactive processing threads.

* 4046266 (Tracking ID: 4043084)

SYMPTOM:
panic in vx_cbdnlc_lookup

DESCRIPTION:
Panic observed in the following stack trace:
vx_cbdnlc_lookup+000140 ()
vx_int_lookup+0002C0 ()
vx_do_lookup2+000328 ()
vx_do_lookup+0000E0 ()
vx_lookup+0000A0 ()
vnop_lookup+0001D4 (??, ??, ??, ??, ??, ??)
getFullPath+00022C (??, ??, ??, ??)
getPathComponents+0003E8 (??, ??, ??, ??, ??, ??, ??)
svcNameCheck+0002EC (??, ??, ??, ??, ??, ??, ??)
kopen+000180 (??, ??, ??)
syscall+00024C ()

RESOLUTION:
Code changes to handle memory pressure while changing FC connectivity

* 4046267 (Tracking ID: 4034910)

SYMPTOM:
Garbage values inside global list large_dirinfo.

DESCRIPTION:
Garbage values inside global list large_dirinfo, which will lead to fsck failure.

RESOLUTION:
Make accesses and updates to the global list large_dirinfo synchronous throughout the fsck binary, so that garbage values due to a race condition can be avoided.

* 4046271 (Tracking ID: 3993822)

SYMPTOM:
running fsck on a file system core dumps

DESCRIPTION:
A buffer was marked as busy without taking the buffer lock while one thread was getting it from the free list, and another thread
was accessing the same buffer through its local variable.

RESOLUTION:
marking buffer busy within the buffer lock while getting free buffer.

* 4046272 (Tracking ID: 4017104)

SYMPTOM:
Deleting a huge number of inodes can consume a lot of system resources during inactivations, which can cause hangs or even a panic.

DESCRIPTION:
Delicache inactivation hands over all the inodes in its inventory for inactivation at once. This causes a surge in resource consumption, due to which other processes can starve.

RESOLUTION:
Gradually process the inode inactivation.

* 4046829 (Tracking ID: 3993943)

SYMPTOM:
The fsck utility dumped core due to a segmentation fault in get_dotdotlst().

Below is stack trace of the issue.

get_dotdotlst 
check_dotdot_tbl 
iproc_do_work
start_thread 
clone ()

DESCRIPTION:
Due to a bug in fsck utility the coredump was generated while running the fsck on the filesystem. The fsck operation aborted in between due to the coredump.

RESOLUTION:
Code changes are done to fix this issue

* 4047568 (Tracking ID: 4046169)

SYMPTOM:
On RHEL8, while doing a directory move from one FS (ext4 or vxfs) to migration VxFS, the migration can fail and the FS is disabled. In debug testing, the issue was caught by an internal assert, with the following stack trace.

panic
ted_call_demon
ted_assert
vx_msgprint
vx_mig_badfile
vx_mig_linux_removexattr_int
__vfs_removexattr
__vfs_removexattr_locked
vfs_removexattr
removexattr
path_removexattr
__x64_sys_removexattr
do_syscall_64

DESCRIPTION:
Due to a different implementation of the "mv" operation in RHEL8 (as compared to RHEL7), there is a removexattr call on the target FS, which in the migration case is the migration VxFS. In this removexattr call, the kernel asks for the "system.posix_acl_default" attribute to be removed from the directory to be moved. But since the directory is not present on the target side yet (and hence has no extended attributes), the code returns ENODATA. When the code in vx_mig_linux_removexattr_int() encounters this error, it disables the FS and, in the debug package, calls assert.

RESOLUTION:
The fix is to ignore ENODATA error and not assert or disable the FS.
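
A user-space analogy of the same idea is sketched below: a missing attribute (ENODATA) is treated as "nothing to remove" while every other error is still reported. The actual fix is in the VxFS kernel module; the helper name and the injectable remover parameter are hypothetical and exist only to keep the sketch testable.

```python
import errno
import os
from typing import Callable, Optional


def remove_xattr_if_present(
        path: str, name: str,
        remover: Optional[Callable[[str, str], None]] = None) -> bool:
    """Remove extended attribute `name` from `path`.

    Returns True if the attribute was removed and False if it was not
    set (ENODATA). Any other OSError is raised to the caller.
    """
    if remover is None:
        remover = os.removexattr   # available on Linux only
    try:
        remover(path, name)
        return True
    except OSError as exc:
        if exc.errno == errno.ENODATA:
            return False           # attribute absent: not a failure
        raise
```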

* 4049091 (Tracking ID: 4035057)

SYMPTOM:
On RHEL8, I/Os performed on an FS while a migration from another FS to VxFS is in progress can cause a panic, with the following stack trace.
 machine_kexec
 __crash_kexec
 crash_kexec
 oops_end
 no_context
 do_page_fault
 page_fault
 [exception RIP: memcpy+18]
 _copy_to_iter
 copy_page_to_iter
 generic_file_buffered_read
 new_sync_read
 vfs_read
 kernel_read
 vx_mig_read
 vfs_read
 ksys_read
 do_syscall_64

DESCRIPTION:
- As part of RHEL8 support changes, vfs_read, vfs_write calls were replaced with kernel_read, kernel_write as the vfs_ calls are no longer exported. The kernel_read, kernel_write calls internally set the memory segment of the thread to KERNEL_DS and expects the buffer passed to have been allocated in kernel space.
- In the migration code, if the read/write operation cannot be completed using the target FS (VxFS), the IO is redirected to the source FS. In doing so, the code passes the same buffer, which is a user buffer, to the kernel call. This worked well with the vfs_read and vfs_write calls, but it does not work with the kernel_read and kernel_write calls, causing a panic.

RESOLUTION:
- The fix is to use the vfs_iter_read and vfs_iter_write calls, which work with a user buffer. To use these methods, the user buffer needs to be passed as part of struct iovec.iov_base.

* 4049097 (Tracking ID: 4049096)

SYMPTOM:
The tar command exits with status 1 and displays warnings.

DESCRIPTION:
This is happening due to dalloc which is changing the ctime of the file after allocating the extents `(worklist thread)->vx_dalloc_flush -> vx_dalloc_off` in between the 2 fsstat calls in tar.

RESOLUTION:
Avoiding changing ctime while allocating delayed extents in background.

Patch ID: VRTSvxfs-7.4.2.1600

* 4012765 (Tracking ID: 4011570)

SYMPTOM:
WORM attribute replication support in VxFS.

DESCRIPTION:
WORM attribute replication is not supported in VFR. Modified code to replicate WORM attribute during attribute processing in VFR.

RESOLUTION:
Code is modified to replicate WORM attributes in VFR.

* 4014720 (Tracking ID: 4011596)

SYMPTOM:
It throws error saying "No such file or directory present"

DESCRIPTION:
Bug observed during parallel communication between all the nodes. Some required temp files were not present on other nodes.

RESOLUTION:
Fixed to maintain consistency during parallel node communication. hacp is now used for transferring temp files.

* 4015287 (Tracking ID: 4010255)

SYMPTOM:
"vfradmin promote" fails to promote target FS with selinux enabled.

DESCRIPTION:
During the promote operation, VxFS remounts the FS at the target. When remounting the FS to remove the "protected on" flag from the target, VxFS first fetches the current mount options. With SELinux enabled (either in permissive mode or enabled), the OS adds the default "seclabel" option to the mount. When VxFS fetched the current mount options, "seclabel" was not recognized by VxFS. Hence, it failed to mount the FS.

RESOLUTION:
Code is modified to remove "seclabel" mount option during mount processing on target.
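
The filtering step can be sketched in a few lines. The function name and the sample option string are illustrative only; the actual change is in the VxFS mount processing code.

```python
from typing import Iterable


def strip_mount_options(options: str,
                        unsupported: Iterable[str] = ("seclabel",)) -> str:
    """Drop unsupported entries from a comma-separated mount option string."""
    drop = set(unsupported)
    kept = [opt for opt in options.split(",") if opt and opt not in drop]
    return ",".join(kept)


# Illustrative option string, not captured from a real system.
print(strip_mount_options("rw,seclabel,relatime"))
```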

* 4015835 (Tracking ID: 4015278)

SYMPTOM:
System panics during vx_uiomove_by_hand

DESCRIPTION:
During uiomove, VxFS gets the pages from the OS through get_user_pages() to copy user data. Oracle uses hugetlbfs internally for performance reasons, which can allocate huge pages. Under low memory conditions, get_user_pages() might return compound pages to VxFS. In a compound page, only the head page has a valid mapping set; all other pages are mapped as TAIL_MAPPING. During uiomove, if VxFS gets a compound page, it tries to check the writable mapping for all pages of that compound page. This can result in dereferencing an illegal address (TAIL_MAPPING), which caused the panic. VxFS does not support huge pages, but a compound page may be present on the system, and VxFS might get one through get_user_pages().

RESOLUTION:
Code is modified to get head page in case of tail pages from compound page when VxFS checks writeable mapping.

* 4016721 (Tracking ID: 4016927)

SYMPTOM:
Remove tier command panics the system, crash has panic reason "BUG: unable to handle kernel NULL pointer dereference at 0000000000000150"

DESCRIPTION:
When fsvoladm removes a device, the remaining devices are not moved within the device array, and the device count stays the same unless the removed device is the last one in the array. Therefore, each slot must be checked for being free before the device in it is accessed.

RESOLUTION:
In the device list check for free slot before accessing the device in that slot.

* 4017282 (Tracking ID: 4016801)

SYMPTOM:
The file system is marked for a full fsck.

DESCRIPTION:
In a cluster environment, some operations can be performed on the primary node only. When such operations are executed from a secondary node, a message is
passed to the primary node. At that point, the sender node may have transactions that have not yet reached the disk. In such a scenario, if the sender node
reboots, the primary node can see stale data.

RESOLUTION:
Code is modified to make sure transactions are flushed to the log disk before sending the message to the primary.

* 4017818 (Tracking ID: 4017817)

SYMPTOM:
NA

DESCRIPTION:
To increase the overall throughput of VFR, code changes have been made
to replicate files in parallel.

RESOLUTION:
Code changes have been made to replicate file data and metadata in parallel over
multiple socket connections.

* 4017820 (Tracking ID: 4017819)

SYMPTOM:
Cloud tier add operation fails when user is trying to add the AWS GovCloud.

DESCRIPTION:
Adding AWS GovCloud as a cloud tier was not supported in InfoScale. With these changes, user will be able to add AWS GovCloud type of cloud.

RESOLUTION:
Added support for AWS GovCloud

* 4019877 (Tracking ID: 4019876)

SYMPTOM:
vxfsmisc.so is publicly shared library for samba and doesn't require infoscale license for its usage

DESCRIPTION:
vxfsmisc.so is publicly shared library for samba and doesn't require infoscale license for its usage

RESOLUTION:
Removed license dependency in vxfsmisc library

* 4020055 (Tracking ID: 4012049)

SYMPTOM:
"fsck" supports the "metasave" option but it was not documented anywhere.

DESCRIPTION:
"fsck" supports the "metasave" option while executing with the "-y" option. but it is not documented anywhere. Also, it tries to store metasave in a particular location. The user doesn't have the option to specify the location. If that location doesn't have enough space, "fsck" fails to take the metasave and it continues to change filesystem state.

RESOLUTION:
Code changes have been done to add one new option with which the user can specify the location to store metasave. "metasave" and "target", these two options have been added in the "usage" message of "fsck" binary.

* 4020056 (Tracking ID: 4012049)

SYMPTOM:
"fsck" supports the "metasave" option but it was not documented anywhere.

DESCRIPTION:
"fsck" supports the "metasave" option while executing with the "-y" option. but it is not documented anywhere. Also, it tries to store metasave in a particular location. The user doesn't have the option to specify the location. If that location doesn't have enough space, "fsck" fails to take the metasave and it continues to change filesystem state.

RESOLUTION:
Code changes have been done to add one new option with which the user can specify the location to store metasave. "metasave" and "target", these two options have been added in the "usage" message of "fsck" binary.

* 4020912 (Tracking ID: 4020758)

SYMPTOM:
Filesystem mount or fsck with -y may see hang during log replay

DESCRIPTION:
The fsck utility is used to perform the log replay. This log replay is performed during the mount operation or during a file system check with the -y option, if needed. In certain cases, if many logs need to be replayed, the replay ends up consuming the entire buffer cache. This results in an out-of-buffers scenario and causes a hang.

RESOLUTION:
Code is modified to make sure enough buffers are always available.

Patch ID: VRTSgms-7.4.2.1200

* 4023553 (Tracking ID: 4023552)

SYMPTOM:
VRTSgms module is not able to load on linux.

DESCRIPTION:
Need recompilation of VRTSgms due to recent changes in VRTSgms 
due to which some symbols are not being resolved.

RESOLUTION:
Recompiled the VRTSgms to load vxgms module.

Patch ID: VRTSglm-7.4.2.1500

* 4014719 (Tracking ID: 4011596)

SYMPTOM:
It throws error saying "No such file or directory present"

DESCRIPTION:
Bug observed during parallel communication between all the nodes. Some required temp files were not present on other nodes.

RESOLUTION:
Fixed to maintain consistency during parallel node communication. hacp is now used for transferring temp files.

Patch ID: VRTSvcsea-7.4.2.1100

* 4020528 (Tracking ID: 4001565)

SYMPTOM:
On Solaris 11.4, IMF fails to provide notifications when Oracle processes stop.

DESCRIPTION:
On Solaris 11.4, when Oracle processes stop, IMF provides notification to the Oracle agent, but the monitor is not scheduled. As a result, the agent fails intelligent monitoring.

RESOLUTION:
Oracle agent now provides notifications when Oracle processes stop.



INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
Note that the installation of this P-Patch will cause downtime.

To install the patch perform the following steps on at least one node in the cluster:
1. Copy the patch infoscale-sles12_x86_64-Patch-7.4.2.1900.tar.gz to /tmp
2. Untar infoscale-sles12_x86_64-Patch-7.4.2.1900.tar.gz to /tmp/hf
    # mkdir /tmp/hf
    # cd /tmp/hf
    # gunzip /tmp/infoscale-sles12_x86_64-Patch-7.4.2.1900.tar.gz
    # tar xf /tmp/infoscale-sles12_x86_64-Patch-7.4.2.1900.tar
3. Install the hotfix. (Note that the installation of this P-Patch will cause downtime.)
    # pwd /tmp/hf
    # ./installVRTSinfoscale742P1900 [<host1> <host2>...]

You can also install this patch together with 7.4.2 base release using Install Bundles
1. Download this patch and extract it to a directory
2. Change to the Veritas InfoScale 7.4.2 directory and invoke the installer script
   with -patch_path option where -patch_path should point to the patch directory
    # ./installer -patch_path [<path to this patch>] [<host1> <host2>...]

Install the patch manually:
--------------------------
Manual installation is not recommended.


REMOVING THE PATCH
------------------
Manual uninstallation is not recommended.


KNOWN ISSUES
------------
* Tracking ID: 4052877

SYMPTOM: After an upgrade from InfoScale 7.4.1 to InfoScale 7.4.2 Update3 on RHEL 8.4, I/O operations may become unresponsive at the secondary site.

WORKAROUND: Stop the workload that is being run at the primary site, reboot the nodes at the secondary site, and then resynchronize the RVG.



SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE

