7.4.2 U8 Component Patch on RHEL8

Patch

Abstract

InfoScale 7.4.2 U8 Component Patch for the RHEL8 platform

Description

This is a 7.4.2 U8 component patch for the RHEL8 platform.

 

SORT ID: 22180

 

Pre-requisite:

This patch must be installed on top of IS-7.4.2 GA plus the latest cumulative patch released for IS-7.4.2

(IS-7.4.2 GA + IS-7.4.2.5600)

 

PATCH NAME:

InfoScale 7.4.2 Patch 5800
(RHEL8 Support on IS 7.4.2)

 

Patch IDs:

VRTSspt-7.4.2.1600-0038_RHEL8 for VRTSspt
VRTSvxvm-7.4.2.5600-RHEL8 for VRTSvxvm 

 

SPECIAL NOTES:

1. If internet access is not available, installation of this patch must be performed together with the latest CPI patch downloaded from the Download Center.
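For reference only, a possible sequence, assuming the patch bundle ships the usual installer script and that the downloaded CPI patch is passed to it with the installer's -require option (all paths and file names below are placeholders):

# cd /tmp/infoscale-7.4.2.5800-rhel8-patch
# ./installer -require /tmp/CPI_patch/CPI_7.4.2.pl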

 

                          * * * READ ME * * *
                      * * * InfoScale 7.4.2 * * *
                         * * * Patch 5800 * * *
                         Patch Date: 2025-05-26


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH


PATCH NAME
----------
InfoScale 7.4.2 Patch 5800


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
RHEL8 x86-64


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSspt
VRTSvxvm


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * InfoScale Availability 7.4.2
   * InfoScale Enterprise 7.4.2
   * InfoScale Foundation 7.4.2
   * InfoScale Storage 7.4.2


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: VRTSvxvm-7.4.2.5600
* 4189446 (4183777) System log is flooded with false alarms: "VxVM vxio V-5-0-0 read/write on disk: xxx took longer to complete".
* 4189726 (4189725) DMP reported a false warning 'read or write on disk: xxx took longer to complete' when there was no path to serve IO.
Patch ID: VRTSvxvm-7.4.2.5500
* 4189294 (4189295) Inconsistent build environments are causing module compatibility issues on RHEL8.
Patch ID: VRTSvxvm-7.4.2.5400
* 4188895 (4188763) Stale and incorrect symbolic links to VxDMP devices in "/dev/disk/by-uuid".
Patch ID: VRTSvxvm-7.4.2.5300
* 4157992 (4154121) Added a new tunable, use_hw_replicatedev, to enable Volume Manager to import hardware replicated disk groups.
* 4164820 (4159403) The clearclone option is now added automatically when importing a hardware replicated disk group.
* 4164822 (4160883) clone_flag was set on srdf-r1 disks after reboot.
* 4168114 (4161827) RHEL8.10 Platform Support in VxVM
Patch ID: VRTSvxvm-7.4.2.5200
* 4160884 (4160883) clone_flag was set on srdf-r1 disks after reboot.
Patch ID: VRTSvxvm-7.4.2.5100
* 4150574 (4077944) In VVR environment, application I/O operation may get hung.
* 4152117 (4142054) The primary master node panicked with a TED assert during the run.
* 4152732 (4111978) Replication failed to start due to vxnetd threads not running on secondary site.
* 4155720 (4154921) The system is stuck in zio_wait() in an FC-IOV environment after rebooting the primary control domain when dmp_native_support is on.
* 4157891 (4130393) vxencryptd crashed repeatedly due to segfault.
* 4157992 (4154121) Added a new tunable, use_hw_replicatedev, to enable Volume Manager to import hardware replicated disk groups.
* 4158080 (4106254) Nodes crashed in shared-nothing (Flexible Shared Storage) environment if node reboot followed by NVME disk failure is executed
* 4158081 (4085477) Settag operation fails due to an incorrect disk getting picked up for operation.
* 4158082 (3989340) EC: Volume state Tutil flag not getting cleared for cascaded disk fail / cluster reboot
* 4158083 (4005719) For encrypted volumes, the disk reclaim operation gets hung.
* 4158084 (4024140) In VVR environments, in case of disabled volumes, the DCM read operation does not complete, resulting in application IO hang.
* 4158085 (4046560) vxconfigd aborts on Solaris if device's hardware path is too long.
* 4158086 (4142772) Error mask NM_ERR_DCM_ACTIVE on rlink may not be cleared resulting in the rlink being unable to get into DCM again.
* 4158087 (4011582) Display minimum and maximum read/write time it takes for the I/O under VxVM layer using vxstat utility.
* 4158088 (4089801) Cluster went into a hung state after rebooting 6 slave nodes.
* 4158089 (3972770) Longevity:RHEL7.6:DV_adaptive_sync: Primary master node panicked during hastop -all/hastart, "voldco_get_mapid+0x5b/0xd0 [vxio]".
* 4158090 (4019380) vxcloudd daemon dumps core.
* 4158091 (4058266) Add an option to ignore 0 stats entries for objects.
* 4158092 (4100037) Error in vxstat statistics display.
Patch ID: VRTSvxvm-7.4.2.4800
* 4076321 (4076320) AVID, reclaim_cmd_nv, extattr_nv, old_udid_nv are not generated for HPE 3PAR/Primera/Alletra 9000 ALUA array.
* 4134887 (4020942) Data corruption/loss on erasure code (EC) volumes post rebalance/disk movement operations while active application IO in progress.
* 4135142 (4040043) Warnings in dmesg/kernel logs for violating memory usage/handling protocols.
* 4135248 (4129663) Generate and add changelog in vxvm and aslapm rpm
* 4136240 (4040695) vxencryptd getting coredump because of static buffer size.
* 4140562 (4134305) Collecting ilock stats for admin SIO causes buffer overrun.
* 4140572 (4080124) Data corruption on mirrored volume in shared-nothing (Flexible Shared Storage) environment during failure of VxVM configuration update.
* 4140589 (4120068) A standard disk was added to a cloned diskgroup successfully which is not expected.
* 4140690 (4100547) Full volume resync happens(~9hrs) post last node reboot at secondary site in a NBFS DR cluster.
Patch ID: VRTSvxvm-7.4.2.4700
* 4134888 (4105204) Node not able to join the cluster after iLO "press and hold" scenario in loop
* 4134889 (4107401) Replication stopped after VVR logowner reboot
* 4136239 (4069940) FS mount failed during Cluster configuration on 24-node physical HP BOM2 setup.
* 4136316 (4098144) vxtask list shows the parent process without any sub-tasks which never progresses for SRL volume
* 4136482 (4132799) No detailed error message is printed when joining CVM fails.
* 4137008 (4133793) vxsnap restore failed with DCO IO errors during the operation when run in loop for multiple VxVM volumes.
* 4139447 (4139448) RHEL8.9 Platform Support in VxVM
Patch ID: VRTSvxvm-7.4.2.4600
* 4069525 (4065490) VxVM udev rules consumes more CPU and appears in "top" output when system has thousands of storage devices attached.
* 4074816 (4066785) create new option usereplicatedev=only to import the replicated LUN only.
* 4084386 (4073653) VxVM commands get hung after pause-resume and resync operation in CVR setup.
* 4116576 (3972344) vxrecover returns an error - 'ERROR V-5-1-11150'  Volume <vol_name> not found'
* 4128868 (4128867) Security vulnerabilities exist in the third-party component OpenSSL.
* 4128885 (4115193) Data corruption observed after the node fault and cluster restart in DR environment
* 4131718 (4088941) Panic observed at scsi_queue_rq in SLES15SP3.
* 4134702 (4122396) When using KillMode=control-group, stopping the vxvm-recover.service results in a failed state.
* 4134875 (4130642) node failed to rejoin the cluster after this node switched from master to slave due to the failure of the replicated diskgroup import.
* 4134877 (4128451) A hardware replicated disk group fails to be auto-imported after reboot.
* 4134885 (4134023) vxconfigrestore(Diskgroup configuration restoration) for H/W Replicated diskgroup failed.
* 4135150 (4114867) systemd-udevd[2224]: invalid key/value pair in file /etc/udev/rules.d/41-VxVM-selinux.rules on line 20, starting at character 103 ('D')
Patch ID: VRTSvxvm-7.4.2.4500
* 4128868 (4128867) Security vulnerabilities exist in the third-party component OpenSSL.
Patch ID: VRTSvxvm-7.4.2.4400
* 4092002 (4081740) vxdg flush command slow due to too many luns needlessly access /proc/partitions.
* 4111010 (4108475) vxfentsthdw script failed with "Expect no writes for disks ... "
* 4113327 (4102439) Volume Manager Encryption EKM Key Rotation (vxencrypt rekey) Operation Fails on IS 7.4.2/rhel7
* 4115231 (4090772) vxconfigd/vx commands hung if fdisk opened secondary volume and secondary logowner panic'd
* 4116422 (4111254) vradmind dumps core while associating a rlink to rvg because of NULL pointer reference.
* 4116427 (4108913) Vradmind dumps core because of memory corruption.
* 4116435 (4034741) The current fix that limits IO load on the secondary causes a deadlock situation.
* 4116437 (4072862) Stop cluster hang because of RVGLogowner and CVMClus resources fail to offline.
* 4116576 (3972344) vxrecover returns an error - 'ERROR V-5-1-11150'  Volume <vol_name> not found'
* 4117899 (4055159) vxdisk list showing incorrect value of LUN_SIZE for nvme disks
* 4117989 (4085145) EBSvol agent error in attach disk : RHEL 7.9 + Infoscale 8.0 on AWS instance type c6i.large with NVME devices.
* 4118256 (4028439) Updating mediatype tages through disk online event.
* 4119951 (4119950) Security vulnerabilities exist in third party components [curl and libxml].
* 4120540 (4102532) /etc/default/vxsf file gets world write permission when "vxtune storage_connectivity asymmetric" is run.
* 4120545 (4090826) system panic at vol_page_offsetlist_sort
* 4120547 (4093067) System panic occurs because of NULL pointer in block device structure.
* 4120720 (4086063) semodule policy is installed in %post stage during vxvm upgrade and then gets removed in %preun stage.
* 4120722 (4021816) semodule of upgraded VxVM package gets removed in %preun stage of install script during package upgrade.
* 4120724 (3995831) System hung: A large number of SIOs got queued in FMR.
* 4120728 (4090476) SRL is not draining to secondary.
* 4120769 (4014894) Disk attach is done one by one for each disk creating transactions for each disk
* 4120876 (4081434) VVR kernel panic during processing of the ACK message on the VVR Primary side.
* 4120899 (4116024) Machine panic due to an illegal address access.
* 4120903 (4100775) vxconfigd was hung as VxDMP doesn't support chained BIO on rhel7.
* 4120916 (4112687) DLE (Dynamic Lun Expansion) of single path GPT disk may corrupt disk public region.
* 4121075 (4100069) One of standard disk groups fails to auto-import with 'Disk for disk group not found' error when those disk groups co-exist with the cloned disk group.
* 4121081 (4098965) Crash at memset function due to invalid memory access.
* 4121083 (4105953) system panic due to VVR accessed a NULL pointer.
* 4121222 (4095718) Some tasks kept waiting for IO drain, causing a system IO hang.
* 4121243 (4101588) vxtune shows incorrect vol_rvio_maxpool_sz and some other tunables when they're over 4g.
* 4121254 (4115078) vxconfigd hang was observed when rebooting all nodes of the primary site.
* 4121681 (3995731) vxconfigd dumping core due to NULL pointer.
* 4121763 (3995308) vxtask status hang due to incorrect values getting copied into task status information.
* 4121767 (4117568) vradmind dumps core due to invalid memory access.
* 4121875 (4090943) VVR Primary RLink cannot connect as secondary reports SRL log is full.
* 4123313 (4114927) Failed to mount /boot on dmp device after enabling dmp_native_support.
* 4124324 (4098582) Permissions of ddl.log change to 644 after log rotation; customers need it to be 640 as per security compliance.
* 4126041 (4124223) Core dump is generated for vxconfigd in TC execution.
* 4127473 (4089626) Create XFS on VxDMP devices hang as VxDMP doesn't support chained BIO.
* 4127475 (4114601) Panic: in dmp_process_errbp() for disk pull scenario.
Patch ID: VRTSvxvm-7.4.2.4300
* 4119951 (4119950) Security vulnerabilities exist in third party components [curl and libxml].
Patch ID: VRTSvxvm-7.4.2.4100
* 4116348 (4112433) Security vulnerabilities exist in third party components [openssl, curl and libxml].
Patch ID: VRTSvxvm-7.4.2.3900
* 4110560 (4104927) Changing the attributes in vxvm-boot.service for SLES15 is causing regression in RHEL versions.
* 4113324 (4113323) VxVM Support on RHEL 8.8
* 4113661 (4091076) SRL gets into pass-thru mode because of head error.
* 4113663 (4095163) system panic due to a race freeing VVR update.
* 4113664 (4091390) The vradmind service dumped core and stopped on a few nodes.
* 4113666 (4064772) After enabling slub debug, system could hang with IO load.
Patch ID: VRTSvxvm-7.4.2.3800
* 4110666 (4110665) A security vulnerability exists in the third-party component libcurl.
* 4110766 (4112033) A security vulnerability exists in the third-party component libxml2.
Patch ID: VRTSvxvm-7.4.2.3700
* 4105752 (4107924) VxVM rpm Support on RHEL 8.7 minor kernel 4.18.0-425.10.1.el8_7.x86_64
* 4106001 (4102501) A security vulnerability exists in the third-party component libcurl.
* 4107223 (4107802) Fix for calculating best-fit module for upcoming RHEL8.7 minor kernels (higher than 4.18.0-425.10.1.el8_7.x86_64).
Patch ID: VRTSvxvm-7.4.2.3600
* 4102424 (4103350) vxvm-encrypted.service going into failed state on secondary site on performing "vradmind -g <dg> -encrypted addsec <rvg> <prim_ip> <sec_ip>" command.
Patch ID: VRTSvxvm-7.4.2.3500
* 4012176 (3996206) Update Lun Serial Number for 3par disks
* 4013169 (4011691) High CPU consumption on the VVR secondary nodes because of high pending IO load.
* 4052119 (4045871) vxconfigd crashed at ddl_get_disk_given_path.
* 4086043 (4072241) vxdiskadm functionality is failing due to changes in dmpdr script
* 4090311 (4039690) Change the logger files size and do the gzip on logger files.
* 4090411 (4054685) In case of CVR environment, RVG recovery gets hung in linux platforms.
* 4090415 (4071345) Unplanned fallback synchronisation is unresponsive
* 4090442 (4078537) Connection to s3-fips bucket is failing
* 4090541 (4058166) Increase DCM log size based on volume size without exceeding region size limit of 4mb.
* 4090599 (4080897) Performance drop on raw VxVM volume in RHEL 8.x compared to RHEL7.X
* 4090604 (4044529) DMP is unable to display PWWN details for some LUNs by "vxdmpadm getportids".
* 4090932 (3996634) System boots slowly because the Linux lsblk command takes a long time to return.
* 4090946 (4023297) Smartmove functionality was not being used after VVR Rlink was paused and resumed during VVR initial sync or DCM resync operation.
* 4090960 (4087770) NBFS: Data corruption due to skipped full-resync of detached mirrors of volume after DCO repair operation
* 4090970 (4017036) After enabling DMP (Dynamic Multipathing) Native support, enable /boot to be
mounted on DMP device when Linux is booting with systemd.
* 4091248 (4040808) df command hung in clustered environment
* 4091588 (3966157) SRL batching feature is broken
* 4091910 (4090321) Increase timeout for vxvm-boot systemd service
* 4091911 (4090192) Increase number of DDL threads for faster discovery
* 4091912 (4090234) Volume Manager Boot service fails after rebooting the system.
* 4091963 (4067191) In CVR environment after rebooting Slave node, Master node may panic
* 4091989 (4090930) [NBFS-3.1]: MASTER FS corruption is seen in loop reboot (-f) test
* 4092002 (4081740) vxdg flush command slow due to too many luns needlessly access /proc/partitions.
* 4092838 (4101128) VxVM rpm Support on RHEL 8.7 kernel
* 4099550 (4065145) multivolume and vset not able to overwrite encryption tags on secondary.
Patch ID: VRTSvxvm-7.4.2.3300
* 4083792 (4082799) A security vulnerability exists in the third-party component libcurl.
Patch ID: VRTSvxvm-7.4.2.3200
* 4011971 (3991668) In a Veritas Volume Replicator (VVR) configuration where secondary logging is enabled, data inconsistency is reported after the "No IBC message arrived" error is encountered.
* 4013169 (4011691) High CPU consumption on the VVR secondary nodes because of high pending IO load.
* 4037288 (4034857) VxVM support on SLES 15 SP2
* 4054311 (4040701) Some warnings are observed while installing vxvm package.
* 4056919 (4056917) Import of disk group in Flexible Storage Sharing (FSS) with missing disks can lead to data corruption.
* 4058873 (4057526) Adding check for init while accessing /var/lock/subsys/ path in vxnm-vxnetd.sh script.
* 4060839 (3975667) Softlock in vol_ioship_sender kernel thread
* 4060962 (3915202) Reporting repeated disk failures & DCPA events for other internal disks
* 4060966 (3959716) System may panic with sync replication with VVR configuration, when the RVG is in DCM mode.
* 4061004 (3993242) vxsnap prepare command when run on vset sometimes fails.
* 4061036 (4031064) Master switch operation is hung in VVR secondary environment.
* 4061055 (3999073) The file system corrupts when the cfsmount group goes into offline state.
* 4061057 (3931583) Node may panic while unloading the vxio module due to race condition.
* 4061298 (3982103) I/O hang is observed in VVR.
* 4061317 (3925277) DLE (Dynamic Lun Expansion) of single path GPT disk may corrupt disk public region.
* 4061509 (4043337) logging fixes for VVR
* 4062461 (4066785) create new option usereplicatedev=only to import the replicated LUN only.
* 4062577 (4062576) hastop -local never finishes on Rhel8.4 and RHEL8.5 servers with latest minor kernels due to hang in vxdg deport command.
* 4062746 (3992053) Data corruption may happen with layered volumes due to some data not re-synced while attaching a plex.
* 4062747 (3943707) vxconfigd reconfig hang when joining a cluster.
* 4062751 (3989185) In a Veritas Volume Replicator (VVR) environment, the vxrecover command can hang.
* 4062755 (3978453) Reconfig hang during master takeover
* 4063374 (4005121) Application IOPS drop in DCM mode with DCO-integrated DCM
* 4064523 (4049082) I/O read error is displayed when remote FSS node rebooting.
* 4066930 (3951527) Data loss on DR site seen while upgrading from Infoscale 7.3.1 or before to 7.4.x or later versions.
* 4067706 (4060462) Nidmap information is not cleared after a node leaves, resulting in add node failure subsequently.
* 4067710 (4064208) Node failed to join the existing cluster after bits are upgraded to a newer version.
* 4067712 (3868140) VVR primary site node might panic if the rlink disconnects while some data is getting replicated to secondary.
* 4067713 (3997531) Failed to start VVR replication because vxnetd threads are not running.
* 4067715 (4008740) Access to freed memory
* 4067717 (4009151) Auto-import of diskgroup on system reboot fails with error 'Disk for diskgroup not found'.
* 4067914 (4037757) Add a tunable to control auto start VVR services on boot up.
* 4067915 (4059134) Resync takes too long on raid-5 volume
* 4069522 (4043276) vxattachd is onlining previously offlined disks.
* 4069523 (4056751) Import read only cloned disk corrupts private region
* 4069524 (4056954) Vradmin addsec failures when encryption is enabled over wire
* 4070099 (3159650) Implemented vol_vvr_use_nat tunable support for vxtune.
* 4070253 (3911930) Provide a way to clear the PGR_FLAG_NOTSUPPORTED flag on the device instead of using exclude and include commands.
* 4071131 (4071605) A security vulnerability exists in the third-party component libxml2.
* 4072874 (4046786) FS becomes NOT MOUNTED after powerloss/poweron on all nodes.
Patch ID: VRTSvxvm-7.4.2.2400
* 4018181 (3995474) VxVM sub-disks IO error occurs unexpectedly on SLES12SP3.
* 4051701 (4031597) vradmind generates a core dump in __strncpy_sse2_unaligned.
* 4051702 (4019182) In case of a VxDMP configuration, an InfoScale server panics when applying a patch.
* 4051705 (4049371) DMP unable to display all WWN details when running "vxdmpadm getctlr all".
* 4051706 (4046007) disk private region gets corrupted after cluster name change in FSS environment.
* 4053228 (4053230) VxVM support for RHEL 8.5
* 4055211 (4052191) Unexpected scripts or commands got launched from the vxvm-configure script because of incorrect comment format.
Patch ID: VRTSvxvm-7.4.2.2200
* 4018173 (3852146) Shared DiskGroup (DG) fails to import when "-c" and "-o noreonline" options are specified together.
* 4018178 (3906534) After enabling DMP (Dynamic Multipathing) Native support, enable /boot to be
mounted on DMP device.
* 4031342 (4031452) vxesd core dump in esd_write_fc()
* 4037283 (4021301) Data corruption issue observed in VxVM on RHEL8.
* 4042038 (4040897) Add support for HPE MSA 2060 arrays in the current ASL.
* 4046906 (3956607) vxdisk reclaim dumps core.
* 4046907 (4041001) In VxVM, system is getting hung when some nodes are rebooted.
* 4046908 (4038865) System panic at vxdmp module in IRQ stack.
* 4047588 (4044072) I/Os fail for NVMe disks with 4K block size on the RHEL 8.4 kernel.
* 4047590 (4045501) The VRTSvxvm and the VRTSaslapm packages fail to install on Centos 8.4 systems.
* 4047592 (3992040) bi_error - bi_status conversion map added for proper interpretation of errors at FS side.
* 4047695 (3911930) Provide a way to clear the PGR_FLAG_NOTSUPPORTED on the device instead of using
exclude/include commands
* 4047722 (4023390) vxconfigd keeps dumping core due to an invalid private region offset on a disk.
* 4049268 (4044583) A system goes into the maintenance mode when DMP is enabled to manage native devices.
Patch ID: VRTSvxvm-7.4.2.1900
* 4020207 (4018086) system hang was observed when RVG was in DCM resync with SmartMove as ON.
* 4039510 (4037915) VxVM 7.4.1 support for RHEL 8.4 compilation errors
* 4039511 (4037914) BUG: unable to handle kernel NULL pointer dereference
* 4039512 (4017334) vxio stack trace warning message kmsg_mblk_to_msg can be seen in systemlog
* 4039517 (4012763) IO hang may happen in VVR (Veritas Volume Replicator) configuration when SRL overflows for one rlink while another one rlink is in AUTOSYNC mode.
Patch ID: VRTSvxvm-7.4.2.1500
* 4018182 (4008664) System panic when signal vxlogger daemon that has ended.
* 4020207 (4018086) system hang was observed when RVG was in DCM resync with SmartMove as ON.
* 4021238 (4008075) Observed with ASL changes for NVMe in a reboot scenario: on every reboot the machine hit a panic, and this happened in a loop.
* 4021240 (4010612) This issue is observed for NVMe and SSD devices, where every disk has a separate enclosure (nvme0, nvme1, and so on), which means every NVMe/SSD disk name would be hostprefix_enclosurname0_disk0, hostprefix_enclosurname1_disk0, and so on.
* 4021346 (4010207) System panicked due to a hard lockup caused by a spinlock not being released properly during vxstat collection.
* 4021359 (4010040) A security issue occurs during Volume Manager configuration.
* 4021366 (4008741) VxVM device files are not correctly labeled to prevent unauthorized modification - device_t
* 4021428 (4020166) Vxvm Support on RHEL8 Update3
* 4021748 (4020260) Failed to activate/set tunable dmp native support for Centos 8
Patch ID: VRTSvxvm-7.4.2.1400
* 4018182 (4008664) System panic when signal vxlogger daemon that has ended.
* 4020207 (4018086) system hang was observed when RVG was in DCM resync with SmartMove as ON.
* 4021346 (4010207) System panicked due to a hard lockup caused by a spinlock not being released properly during vxstat collection.
* 4021428 (4020166) Vxvm Support on RHEL8 Update3
* 4021748 (4020260) Failed to activate/set tunable dmp native support for Centos 8
Patch ID: VRTSvxvm-7.4.2.1300
* 4008606 (4004455) Instant restore failed for a snapshot created on older version DG.
* 4010892 (4009107) CA chain certificate verification fails in SSL context.
* 4011866 (3976678) vxvm-recover:  cat: write error: Broken pipe error encountered in syslog.
* 4011971 (3991668) Veritas Volume Replicator (VVR) configured with secondary logging reports data inconsistency when the "No IBC message arrived" error is hit.
* 4012485 (4000387) VxVM support on RHEL 8.2
* 4012848 (4011394) Performance enhancement for cloud tiering.
* 4013155 (4010458) In VVR (Veritas Volume replicator), the rlink might inconsistently disconnect due to unexpected transactions.
* 4013169 (4011691) High CPU consumption on the VVR secondary nodes because of high pending IO load.
* 4013718 (4008942) Docker infoscale plugin is failing to unmount the filesystem, if the cache object is full
Patch ID: VRTSspt-7.4.2.1600
* 4189440 (4189526) Firstlook to honor collection of VVR logs in a VVR environment
* 4189700 (4189723) Log Collection improvement in FirstLook related to threadlist collection.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following incidents:

Patch ID: VRTSvxvm-7.4.2.5600

* 4189446 (Tracking ID: 4183777)

SYMPTOM:
The system log is flooded with false alarms: "VxVM vxio V-5-0-0 read/write on disk: xxx took longer to complete".

DESCRIPTION:
When vol_ioship_stats_enable is disabled, the volume layer uses jiffies to initialize the IO's start time. Later, DMP uses the current time of day to reset the start time. Comparing these different time formats causes a large discrepancy, hence the issue.

RESOLUTION:
Code changes have been done to set the IO's start and end time using the same format.

* 4189726 (Tracking ID: 4189725)

SYMPTOM:
DMP reported a false warning as below when there was no path to serve IO.
VxVM vxio V-5-0-0 write on disk: xxxx  took longer to complete (off: 256 len: 8 error: -2147483648 latency: 18444996496765 threshold: 40)

DESCRIPTION:
When calculating the latency of an I/O, the end time is expected to be set in the iodone callback function. However, when DMP fails to find a path to deliver the I/O, it does not set the end time. As a result, a false warning is generated later, leading to the issue.

RESOLUTION:
Code changes have been made to avoid the false warning.

Patch ID: VRTSvxvm-7.4.2.5500

* 4189294 (Tracking ID: 4189295)

SYMPTOM:
Symbol mismatch errors when loading VxVM and Veki modules on RHEL8.4.

DESCRIPTION:
Certain older VxVM build environments for RHEL8 include a custom modification to the kernel structure used during module compilation. This modification is no longer relevant, leading to inconsistencies and compatibility issues.

RESOLUTION:
Remove the outdated customization from the affected VxVM build environments to align with standard configurations and ensure module compatibility.

Patch ID: VRTSvxvm-7.4.2.5400

* 4188895 (Tracking ID: 4188763)

SYMPTOM:
Stale and incorrect symbolic links to VxDMP devices in "/dev/disk/by-uuid".

DESCRIPTION:
On some systems with InfoScale installed, there can be stale symbolic links for /boot and /boot/efi pointing to "VxDMP" devices instead of "SD" devices.
DMP uses the "blkid" command to get the OS device based on UUID, but on some systems the "blkid" command takes a long time to complete.
In this scenario there can be a stale symbolic link to the VxDMP device.

RESOLUTION:
Code changes have been made to use the "udevadm info" command instead of "blkid".
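For context, a minimal sketch of the two lookups on a typical RHEL8 system (the UUID and device name below are placeholders); "udevadm info" reads the udev database directly, while "blkid" probes the devices and can be slow:

# blkid -U 1234abcd-5678-ef90-1234-56789abcdef0
# udevadm info --query=property --name=/dev/sda2 | grep ID_FS_UUID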

Patch ID: VRTSvxvm-7.4.2.5300

* 4157992 (Tracking ID: 4154121)

SYMPTOM:
When the replicated disks are in SPLIT mode, importing their disk group on the target node failed with "Device is a hardware mirror".

DESCRIPTION:
When the replicated disks are in SPLIT mode, they are readable and writable, yet importing their disk group on the target node failed with "Device is a hardware mirror". The third-party array doesn't expose a disk attribute that shows when it is in SPLIT mode. With this new enhancement, the replicated disk group can be imported when use_hw_replicatedev is enabled.

RESOLUTION:
The code is enhanced to import the replicated disk group on the target node when use_hw_replicatedev is enabled.
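As an illustration only, a possible sequence on the target node, assuming the new tunable is set through vxtune and using a placeholder disk group name:

# vxtune use_hw_replicatedev on
# vxdg import <hw_replicated_dg>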

* 4164820 (Tracking ID: 4159403)

SYMPTOM:
When the replicated disks are in SPLIT mode and use_hw_replicatedev is on, disks are marked as cloned disks after the hardware replicated disk group gets imported.

DESCRIPTION:
The clearclone option is added automatically when importing the hardware replicated disk group, in order to clear the cloned flag on the disks.

RESOLUTION:
The code is enhanced to import the replicated disk group with clearclone option.
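To confirm the result, the clone state can be checked in the flags field of the disk listing; a sketch with a placeholder disk name:

# vxdisk list <diskname> | grep -i flags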

* 4164822 (Tracking ID: 4160883)

SYMPTOM:
clone_flag was set on srdf-r1 disks after reboot.

DESCRIPTION:
The clean-clone state got reset in the AUTOIMPORT case, which ultimately led to the clone_flag being set on the disk.

RESOLUTION:
Code change has been made to correct the behavior of setting clone_flag on a disk.

* 4168114 (Tracking ID: 4161827)

SYMPTOM:
RHEL8.10 Platform Support in VxVM

DESCRIPTION:
A few changes and compilation with the RHEL8.10 kernel are required.

RESOLUTION:
Necessary changes have been made to make VxVM compatible with RHEL8.10.

Patch ID: VRTSvxvm-7.4.2.5200

* 4160884 (Tracking ID: 4160883)

SYMPTOM:
clone_flag was set on srdf-r1 disks after reboot.

DESCRIPTION:
The clean-clone state got reset in the AUTOIMPORT case, which ultimately led to the clone_flag being set on the disk.

RESOLUTION:
Code change has been made to correct the behavior of setting clone_flag on a disk.

Patch ID: VRTSvxvm-7.4.2.5100

* 4150574 (Tracking ID: 4077944)

SYMPTOM:
In VVR environment, when I/O throttling gets activated and deactivated by VVR, it may result in an application I/O hang.

DESCRIPTION:
When VVR throttles and unthrottles I/O, the driving of throttled I/O is not done in one of the cases.

RESOLUTION:
Resolved the issue by making sure the application throttled I/Os get driven in all the cases.

* 4152117 (Tracking ID: 4142054)

SYMPTOM:
System panicked in the following stack:

[ 9543.195915] Call Trace:
[ 9543.195938]  dump_stack+0x41/0x60
[ 9543.195954]  panic+0xe7/0x2ac
[ 9543.195974]  vol_rv_inactive+0x59/0x790 [vxio]
[ 9543.196578]  vol_rvdcm_flush_done+0x159/0x300 [vxio]
[ 9543.196955]  voliod_iohandle+0x294/0xa40 [vxio]
[ 9543.197327]  ? volted_getpinfo+0x15/0xe0 [vxio]
[ 9543.197694]  voliod_loop+0x4b6/0x950 [vxio]
[ 9543.198003]  ? voliod_kiohandle+0x70/0x70 [vxio]
[ 9543.198364]  kthread+0x10a/0x120
[ 9543.198385]  ? set_kthread_struct+0x40/0x40
[ 9543.198389]  ret_from_fork+0x1f/0x40

DESCRIPTION:
- From the SIO stack, we can see that it is a case of the done routine being called twice.
- Looking at vol_rvdcm_flush_start(), we can see that when a child SIO is created, it is directly added to the global SIO queue.
- This can cause a child SIO to start while vol_rvdcm_flush_start() is still in the process of generating other child SIOs.
- This means that when the first child SIO gets done, it can find the children count going to zero and call done.
- The next child SIO also independently finds the children count to be zero and calls done.

RESOLUTION:
The code changes have been done to fix the problem.

* 4152732 (Tracking ID: 4111978)

SYMPTOM:
Replication failed to start due to vxnetd threads not running on secondary site.

DESCRIPTION:
vxnetd was waiting to start the "nmcomudpsrv" and "nmcomlistenserver" threads. Due to a race condition on a resource shared between those two threads, vxnetd was stuck in a loop until the maximum retry count was reached.

RESOLUTION:
Code changes have been made to add lock protection to avoid the race condition.

* 4155720 (Tracking ID: 4154921)

SYMPTOM:
The system is stuck in zio_wait() in an FC-IOV environment after rebooting the primary control domain when dmp_native_support is on.

DESCRIPTION:
For various reasons, DMP might disable its subpaths. In a particular scenario, DMP might fail to reset the IO QUIESCE flag on its subpaths, which caused IOs to be queued in the DMP defer queue. If the upper layer, such as ZFS, keeps waiting for those IOs to complete, this bug can hang the whole system.

RESOLUTION:
Code changes have been made to reset the IO quiesce flag properly after a DMP path is disabled.

* 4157891 (Tracking ID: 4130393)

SYMPTOM:
vxencryptd crashed repeatedly due to segfault.

DESCRIPTION:
Linux can pass large IOs of up to 2MB to the VxVM layer; however, vxencryptd only expects IOs with a maximum size of 1MB from the kernel and pre-allocates only a 1MB buffer for encryption/decryption. This caused vxencryptd to crash when processing large IOs.

RESOLUTION:
Code changes have been made to allocate enough buffer.

* 4157992 (Tracking ID: 4154121)

SYMPTOM:
When the replicated disks are in SPLIT mode, importing their disk group on the target node failed with "Device is a hardware mirror".

DESCRIPTION:
When the replicated disks are in SPLIT mode, they are readable and writable, yet importing their disk group on the target node failed with "Device is a hardware mirror". The third-party array doesn't expose a disk attribute that shows when it is in SPLIT mode. With this new enhancement, the replicated disk group can be imported when use_hw_replicatedev is enabled.

RESOLUTION:
The code is enhanced to import the replicated disk group on the target node when use_hw_replicatedev is enabled.

* 4158080 (Tracking ID: 4106254)

SYMPTOM:
Nodes crashed in shared-nothing (Flexible Shared Storage) environment if node reboot followed by NVME disk failure is executed

DESCRIPTION:
If congested functions are registered in a Linux driver, they are called to check whether the next set of IOs can be issued on the devices and whether the devices can handle them.
In this case, for a given volume, the vset-related congestion function was getting called, which caused the node to panic.

RESOLUTION:
Congestion functions are deprecated in newer Linux kernel versions; they are required for MD/DM devices, not for VxVM.
So the explicit callback functions have been removed, and congestion control now relies on the standard Linux mechanism.

* 4158081 (Tracking ID: 4085477)

SYMPTOM:
Operations dependent on settag operation are unresponsive.

DESCRIPTION:
The disk group already contains a DA record with the same name as the DM name, so an incorrect disk was getting picked up for the operation.

RESOLUTION:
Added a fix to correctly identify disk when da or dm name is provided as an argument.

* 4158082 (Tracking ID: 3989340)

SYMPTOM:
Recovery of volume is not triggered post reboot in shared nothing environment

DESCRIPTION:
In shared-nothing environments, i.e. FSS (Flexible Shared Storage), a node reboot makes the storage associated with that node unavailable for IO operations. This results in IO failures on those disks, leading to the mirrors of volumes coming from those nodes getting DETACHED. When the faulted node(s) rejoin the cluster, the mirrors associated with them come back online, and the faulted mirrors need to be recovered once storage connectivity is back. Under certain cascaded-reboot and disk-failure conditions, this automated recovery of mirrors did not start, leaving volumes with reduced fault tolerance. The volume recovery operations tag a temporary field on the volume configuration to avoid multiple parallel commands trying to recover the same volume. This field did not get cleaned up properly in the cascaded reboot sequence, which left subsequent cluster reconfigurations unable to start volume recovery.

RESOLUTION:
Code changes made to properly cleanup the temporary fields on volume to ensure subsequent recovery operations are triggered when node/storage is back online.

* 4158083 (Tracking ID: 4005719)

SYMPTOM:
For encrypted volumes, the disk reclaim operation gets hung.

DESCRIPTION:
Reclaim request is not correctly handled for encrypted volumes resulting in a hang.

RESOLUTION:
Skip the encryption IO request path for reclaim requests.

* 4158084 (Tracking ID: 4024140)

SYMPTOM:
In VVR environments, in case of disabled volumes, DCM read operation does not complete, resulting in application IO hang.

DESCRIPTION:
If all volumes in the RVG have been disabled, then the read on the DCM does not complete. This results in an IO hang and blocks other operations such as transactions and diskgroup delete.

RESOLUTION:
If all the volumes in the RVG are found disabled, then fail the DCM read.

* 4158085 (Tracking ID: 4046560)

SYMPTOM:
vxconfigd aborts on Solaris if device's hardware path is more than 128 characters.

DESCRIPTION:
When vxconfigd starts, it claims the devices that exist on the node and updates the VxVM device database. During this process, devices which are excluded from VxVM get excluded from the VxVM device database. To check whether a device is to be excluded, the device's full hardware path is considered. If the hardware path is longer than 128 characters, vxconfigd aborts, because the code is unable to handle hardware path strings beyond 128 characters.

RESOLUTION:
Required code changes have been made to handle long hardware path strings.

* 4158086 (Tracking ID: 4142772)

SYMPTOM:
In case SRL overflow frequently happens, SRL reaches 99% filled but the rlink is unable to get into DCM mode.

DESCRIPTION:
When starting DCM mode, a check is needed on whether the error mask NM_ERR_DCM_ACTIVE has been set, to prevent duplicate triggers. This flag should have been reset after DCM mode was activated by reconnecting the rlink. Because of a race condition, the rlink reconnect may complete before DCM is activated, so the flag cannot be cleared.

RESOLUTION:
The code changes have been made to fix the issue.

* 4158087 (Tracking ID: 4011582)

SYMPTOM:
In VxVM, minimum and maximum read/write time for the IO workload is not captured using vxstat utility.

DESCRIPTION:
Currently the vxstat utility displays only the average read/write time it takes for the IO workload to complete under the VxVM layer.

RESOLUTION:
Changes are done to existing vxstat utility to capture and display minimum and maximum read/write time.
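For illustration, a typical invocation with standard vxstat options (disk group and volume names are placeholders); with this change the per-interval output also reports the minimum and maximum read/write times alongside the averages:

# vxstat -g <dgname> -i 5 -c 10 <volname>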

* 4158088 (Tracking ID: 4089801)

SYMPTOM:
Cluster went in hanged state after rebooting 6 slave nodes

DESCRIPTION:
For a shared DG transaction, cluster-wide FMR-related cleanup happens. This issues a DCO metadata read on the DCO volume, which has a connectivity issue, resulting in a read failure. This marks certain in-core flags on the DCO, but in a certain part of the transaction the flag is ignored before issuing the TOC read, leading to a disk detach transaction failure with the error "DCO experienced IO errors during the operation. Re-run the operation after ensuring that DCO is accessible". This causes subsequent node join failures.

RESOLUTION:
A fix is added to check the flag at the appropriate stages of the transaction.

* 4158089 (Tracking ID: 3972770)

SYMPTOM:
System panic with voldco_get_mapid() function in kernel stack trace during cluster stop/start operation.

DESCRIPTION:
VxVM triggers a config change operation for any kernel-initiated or user-initiated changes through a transaction. When fast mirror resync (FMR) is configured on volumes, transactions like mirror attach/detach require bitmap manipulation. This bitmap manipulation accesses the in-core metadata of FMR objects. During node stop/start operations, the FMR metadata of a volume was getting updated in parallel by multiple kernel threads processing the transaction. This led to two threads incorrectly accessing the metadata, causing the panic in the voldco_get_mapid() function.

RESOLUTION:
Code changes are done in FMR transaction code path in kernel to avoid parallel processing of DCO object related information to avoid inconsistent information.

* 4158090 (Tracking ID: 4019380)

SYMPTOM:
vxcloudd daemon dumps core with below mentioned stack:
raise ()
abort ()
__libc_message ()
malloc_printerr ()
_int_free ()
CRYPTO_free ()
engine_pkey_meths_free ()
engine_free_util ()
ENGINE_finish ()
ssl_create_cipher_list ()
SSL_CTX_new ()
ossl_connect_step1 ()
ossl_connect_common ()
Curl_ssl_connect_nonblocking ()
https_connecting ()
Curl_http_connect ()
multi_runsingle ()
curl_multi_perform ()
curl_easy_perform ()
curl_send_request ()
curl_request_perform ()
amz_request_perform ()
amz_download_object ()
cloud_read ()
handle_s3_request ()
cloud_io_thread ()
start_thread ()
clone ()

DESCRIPTION:
The vxcloudd daemon is a multi-threaded application. OpenSSL is not completely thread safe; it requires thread callbacks to be set for consistency of some of the shared data structures.

RESOLUTION:
Implement the thread call back functions for Curl and OpenSSL.

* 4158091 (Tracking ID: 4058266)

SYMPTOM:
vxstat output was flooded with 0 entries when there was no IO activity on objects.

DESCRIPTION:
When there are too many objects in a DG, printing stats using the -i option generates too many entries. If some objects have no IO activity in a given interval, 0-filled entries are still printed for them. An option to exclude those entries would be helpful; the Linux iostat command has a "-z" option which does the same, so a similar option should be implemented in vxstat.

RESOLUTION:
Added a "-Z" option which can be used with other vxstat options to ignore 0 entries.
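A sketch of combining the new option with interval reporting (disk group name is a placeholder), so that objects with all-zero counters are skipped:

# vxstat -g <dgname> -i 5 -Z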

* 4158092 (Tracking ID: 4100037)

SYMPTOM:
Entries printed by vxstat were not displayed properly.

DESCRIPTION:
While displaying statistics using vxstat, entries were not displayed correctly. In a few cases headers were printed multiple times, and in a few others the headers and the stats were not in sync.

RESOLUTION:
The impacted options have been handled with code changes.

Patch ID: VRTSvxvm-7.4.2.4800

* 4076321 (Tracking ID: 4076320)

SYMPTOM:
Not able to get ARRAY_VOLUME_ID or old_udid:
# vxdisk -p list 3pardata1_3 | grep -i ARRAY_VOLUME_ID
# vxdisk -p list 3pardata1_3 | grep -i old_udid

DESCRIPTION:
AVID, reclaim_cmd_nv, extattr_nv, and old_udid_nv are not generated for the HPE 3PAR/Primera/Alletra 9000 ALUA array.

RESOLUTION:
Code changes have been made to generate AVID, reclaim_cmd_nv, extattr_nv, and old_udid_nv for the HPE 3PAR/Primera/Alletra 9000 ALUA array.

* 4134887 (Tracking ID: 4020942)

SYMPTOM:
Data corruption/loss on erasure code (EC) volumes post rebalance/disk movement operations while active application IO in progress.

DESCRIPTION:
In the erasure coded (EC) layout, an operation to move a column from one disk to another new disk as part of a data rebalance operation uses VxFS smart move. This ensures that only the blocks of the disk in use by the file system are moved. During this operation, a bug in the IO code path of the EC volume undergoing the move caused new IOs spanning the columns under move to be updated only on the old column and not written to the new column. This caused data corruption after the data movement operation completed. The corruption is detected when the application tries to access the written data and finds that it is incorrect on the new columns.

RESOLUTION:
Bug in erasure coded (EC) layout IO path during rebalance operation is fixed to ensure IO on the column under move is updated properly on both source (old) and destination (new) disk to ensure consistency of data post move operation.

* 4135142 (Tracking ID: 4040043)

SYMPTOM:
Warnings in dmesg/ kernel logs  for violating memory usage/handling  protocols.

DESCRIPTION:
Using kmem_cache_alloc and copying this memory to userspace produces warnings such as: "kernel: Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLUB object 'sgpool-128' (offset 0, size 4096)!"

RESOLUTION:
The earlier caches were created using kmem_cache_create(); Linux has since introduced a new API, kmem_cache_create_usercopy(), for caches whose contents need to be copied to userspace.

Code has been implemented to allocate usercopy-friendly memory so as to avoid the kernel warnings about memory access violations, which could also be converted to a PANIC in future kernel versions.

* 4135248 (Tracking ID: 4129663)

SYMPTOM:
vxvm and aslapm rpm do not have changelog

DESCRIPTION:
Changelog in rpm will help to find missing incidents with respect to other version.

RESOLUTION:
Changelog is generated and added to vxvm and aslapm rpm.

* 4136240 (Tracking ID: 4040695)

SYMPTOM:
vxencryptd getting a coredump.

DESCRIPTION:
Due to the static buffer size in the vxencryptd code, there was no handling for IOs larger than the buffer size, and vxencryptd hit a coredump.

RESOLUTION:
BUFFER_SIZE is made dynamic, depending on the current value of the vol_maxio tunable.
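For reference, a sketch of checking the tunable the buffer size is now derived from, assuming vxtune manages vol_maxio on this release:

# vxtune vol_maxio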

* 4140562 (Tracking ID: 4134305)

SYMPTOM:
Illegal memory access is detected when an admin SIO is trying to lock a volume.

DESCRIPTION:
While locking a volume, an admin SIO is converted to an incompatible SIO, on which collecting ilock stats causes memory overrun.

RESOLUTION:
The code changes have been made to fix the problem.

* 4140572 (Tracking ID: 4080124)

SYMPTOM:
Data corruption on mirrored volume in shared-nothing (Flexible Shared Storage) environment during failure of VxVM configuration update.

DESCRIPTION:
In a shared-nothing environment, a node failure leads to disk IO failures for the disks connected to the failed host. Error handling in the volume manager layer records the failures of the mirrors/columns of a volume in this situation. In cascaded reboot scenarios, the last disks holding the VxVM configuration failed and hence a configuration update failure occurred. The VxVM layer is expected to fail IO cluster-wide in such a situation. A bug in the data change object (DCO) error handling failed IO on only a few nodes instead of all the nodes of the cluster, which allowed additional IOs on the volume after this condition; those IOs were not considered when the mirrors were re-attached. This caused data loss when the application read the data from the recovered mirrors.

RESOLUTION:
Error handling code inside DCO object is modified to ensure IO are failed across all nodes in cluster and no further IOs allowed on volume from any node, hence preventing corruption to happen in those cases.

* 4140589 (Tracking ID: 4120068)

SYMPTOM:
A standard disk was added to a cloned diskgroup successfully which is not expected.

DESCRIPTION:
When adding a disk to a disk group, a pre-check is made to avoid ending up with a mixed diskgroup. In a cluster, the local node might fail to use the latest record to do the pre-check, which caused a mixed diskgroup in the cluster and further caused a node join failure.

RESOLUTION:
Code changes have been made to use latest record to do a mixed diskgroup pre-check.

* 4140690 (Tracking ID: 4100547)

SYMPTOM:
Full volume resync happens(~9hrs) post last node reboot at secondary site in a NBFS DR cluster.

DESCRIPTION:
Sub-volumes were getting marked for rwbk SYNC during node reboots or during plex re-attach for layered volumes in VVR environments (both primary and secondary), which is not expected. This caused the long resync time.

RESOLUTION:
Code changes have been made to avoid marking sub-volumes for rwbk sync if they are under VVR config.

Patch ID: VRTSvxvm-7.4.2.4700

* 4134888 (Tracking ID: 4105204)

SYMPTOM:
Node not able to join the cluster after iLO "press and hold" scenario in loop

DESCRIPTION:
- The node is not able to join the cluster because the newly elected master and the surviving slaves are stuck in the previous reconfiguration.
- This is one of the quorum loss / DG disable scenarios.
- During VCS cleanup of the disabled DG, a dg deport is triggered, which gets stuck.
- Since the DG is in any case disabled due to quorum loss, a cluster reboot is needed to come out of the situation.

- The following vxreconfd stack will be seen on the new master and the surviving slaves:
PID: 8135   TASK: ffff9d3e32b05230  CPU: 5   COMMAND: "vxreconfd"
 #0 [ffff9d3e33c43748] __schedule at ffffffff8f1858da
 #1 [ffff9d3e33c437d0] schedule at ffffffff8f185d89
 #2 [ffff9d3e33c437e0] volsync_wait at ffffffffc349415f [vxio]
 #3 [ffff9d3e33c43848] _vol_syncwait at ffffffffc3939d44 [vxio]
 #4 [ffff9d3e33c43870] vol_rwsleep_rdlock_hipri at ffffffffc360e2ab [vxio]
 #5 [ffff9d3e33c43898] volopenter_hipri at ffffffffc361ae45 [vxio]
 #6 [ffff9d3e33c438a8] volcvm_ktrans_openter at ffffffffc33ba1e6 [vxio]
 #7 [ffff9d3e33c438c8] cvm_send_mlocks at ffffffffc33863f8 [vxio]
 #8 [ffff9d3e33c43910] volmvcvm_cluster_reconfig_exit at ffffffffc3407d1d [vxio]
 #9 [ffff9d3e33c43940] volcvm_master at ffffffffc33da1b8 [vxio]
#10 [ffff9d3e33c439c0] volcvm_vxreconfd_thread at ffffffffc33df481 [vxio]
#11 [ffff9d3e33c43ec8] kthread at ffffffff8eac6691
#12 [ffff9d3e33c43f50] ret_from_fork_nospec_begin at ffffffff8f192d24

RESOLUTION:
A cluster reboot is needed to come out of this situation.

* 4134889 (Tracking ID: 4107401)

SYMPTOM:
SRL goes into passthru mode which causes system to run without replication

DESCRIPTION:
The issue is seen in an FSS environment when the new logowner selected after a reconfiguration is not contributing any storage. If SRL and data volume recovery use different plexes, an inconsistency is seen while reading SRL data.

RESOLUTION:
When the SRL is recovered, a read-writeback is done so that all plexes are consistent.

* 4136239 (Tracking ID: 4069940)

SYMPTOM:
FS mount failed during Cluster configuration on 24-node physical BOM setup.

DESCRIPTION:
FS mount failed during cluster configuration on a 24-node physical BOM setup because VxVM transactions were taking more time than the VCS timeouts.

RESOLUTION:
A fix is added to reduce unnecessary transaction time on large-node setups.

* 4136316 (Tracking ID: 4098144)

SYMPTOM:
vxtask list shows the parent process without any sub-tasks which never progresses for SRL volume

DESCRIPTION:
vxtask remains stuck because the parent process doesn't exit. It was seen that all children had completed, but the parent was not able to exit.
(gdb) p active_jobs
$1 = 1
Active jobs are decremented as and when children complete. Somehow one count remains pending, and it is not known which child exited without decrementing the count. Instrumentation messages are added to capture the issue.

RESOLUTION:
Code was added that creates a log file in /etc/vx/log/. This file is deleted when vxrecover exits successfully, and it will be present when the vxtask parent hang issue is seen.

* 4136482 (Tracking ID: 4132799)

SYMPTOM:
If GLM is not loaded, start CVM fails with the following errors:
# vxclustadm -m gab startnode
VxVM vxclustadm INFO V-5-2-9687 vxclustadm: Fencing driver is in disabled mode - 
VxVM vxclustadm ERROR V-5-1-9743 errno 3

DESCRIPTION:
Only the error number, not the error message, is printed when joining CVM fails.

RESOLUTION:
The code changes have been made to fix the issue.

* 4137008 (Tracking ID: 4133793)

SYMPTOM:
DCO experience IO Errors while doing a vxsnap restore on vxvm volumes.

DESCRIPTION:
The dirty flag was getting set in the context of an SIO with the VOLSIO_AUXFLAG_NO_FWKLOG flag set. This led to transaction errors while running the vxsnap restore command in a loop for VxVM volumes, causing a transaction abort. As a result, VxVM tries to clean up by removing the newly added BMs, and then tries to access the deleted BMs, which it cannot do since they were deleted previously. This ultimately leads to the DCO IO error.

RESOLUTION:
Skip first write klogging in the context of an IO with flag VOLSIO_AUXFLAG_NO_FWKLOG being set.

* 4139447 (Tracking ID: 4139448)

SYMPTOM:
RHEL8.9 Platform Support in VxVM

DESCRIPTION:
RHEL8.9 Platform Support in VxVM

RESOLUTION:
RHEL8.9 Platform Support in VxVM

Patch ID: VRTSvxvm-7.4.2.4600

* 4069525 (Tracking ID: 4065490)

SYMPTOM:
systemd-udev threads consumes more CPU during system bootup or device discovery.

DESCRIPTION:
During disk discovery when new storage devices are discovered, VxVM udev rules are invoked for creating hardware path
symbolic link and setting SELinux security context on Veritas device files. For creating hardware path symbolic link to each
storage device, "find" command is used internally which is CPU intensive operation. If too many storage devices are attached to
system, then usage of "find" command causes high CPU consumption.

Also, for setting the appropriate SELinux security context on VxVM device files, restorecon is done irrespective of whether SELinux is enabled or disabled.

RESOLUTION:
Usage of "find" command is replaced with "udevadm" command. SELinux security context on VxVM device files is being set
only when SELinux is enabled on system.
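As an illustration of the replacement (not the exact rule text), udevadm can return a device's hardware (sysfs) path directly from the udev database instead of searching with "find"; the device name below is a placeholder:

# udevadm info --query=path --name=/dev/sdb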

* 4074816 (Tracking ID: 4066785)

SYMPTOM:
When the replicated disks are in SPLIT mode, importing its disk group failed with "Device is a hardware mirror".

DESCRIPTION:
When the replicated disks are in SPLIT mode, they are readable and writable, yet importing their disk group failed with "Device is a hardware mirror". The third-party array doesn't expose a disk attribute that shows when it is in SPLIT mode. With this new enhancement, the replicated disk group can be imported with the option `-o usereplicatedev=only`.

RESOLUTION:
The code is enhanced to import the replicated disk group with option `-o usereplicatedev=only`.
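A sketch of the import using the new option (disk group name is a placeholder):

# vxdg -o usereplicatedev=only import <dgname>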

* 4084386 (Tracking ID: 4073653)

SYMPTOM:
After configuring RVGs in async mode on a CVR setup with shared storage, it is observed that startrep for the RVG fails and vxconfigd hangs on the primary master node.

DESCRIPTION:
0

RESOLUTION:
0

* 4116576 (Tracking ID: 3972344)

SYMPTOM:
After reboot of a node on a setup where multiple diskgroups / Volumes within diskgroups are present, sometimes in /var/log/messages an error 'vxrecover ERROR V-5-1-11150  Volume <volume_name> does not exist' is logged.

DESCRIPTION:
In volume_startable function (volrecover.c), dgsetup is called to set the current default diskgroup. This does not update the current_group variable leading to inappropriate mappings. Volumes are searched in an incorrect diskgroup which is logged in the error message.
The vxrecover command works fine if the diskgroup name associated with volume is specified. [vxrecover -g <dg_name> -s]

RESOLUTION:
Changed the code to use switch_diskgroup() instead of dgsetup. Current_group is updated and the current_dg is set. Thus vxrecover finds the Volume correctly.

* 4128868 (Tracking ID: 4128867)

SYMPTOM:
Vulnerabilities have been reported in third party component, OpenSSL that is used by VxVM.

DESCRIPTION:
The third-party component OpenSSL, in the versions currently used by VxVM, has been reported with security vulnerabilities which need to be addressed.

RESOLUTION:
OpenSSL has been upgraded to newer versions in which the reported security vulnerabilities have been addressed.

* 4128885 (Tracking ID: 4115193)

SYMPTOM:
Data corruption on VVR primary with storage loss beyond fault tolerance level in replicated environment.

DESCRIPTION:
In a Flexible Storage Sharing (FSS) environment, any node fault can lead to storage failure. On the VVR primary, when the last mirror of the SRL (Storage Replicator Log) volume faulted while application writes were in progress, replication is expected to go into pass-through mode.
This information is persistently recorded in the kernel log (KLOG). In the event of cascaded storage node failures, the KLOG update protocol could not update a quorum number of copies. This mismatch between the on-disk and in-core state of VVR objects led to data loss due to missing recovery when all storage faults were resolved.

RESOLUTION:
Code changes have been made to handle the KLOG update failure in SRL IO failure handling, ensuring the on-disk and in-core configuration stay consistent; subsequent application IO is not allowed, to avoid data corruption.

* 4131718 (Tracking ID: 4088941)

SYMPTOM:
While running the DMP test suite, the setup panics with the below stack:
#7 [] scsi_queue_rq at [scsi_mod]
#8 [] blk_mq_dispatch_rq_list at 
#9 [] __blk_mq_sched_dispatch_requests at 
#10 [] blk_mq_sched_dispatch_requests at 
#11 [] __blk_mq_run_hw_queue at 
#12 [] __blk_mq_delay_run_hw_queue at 
#13 [] blk_mq_sched_insert_request at 
#14 [] blk_execute_rq at 
#15 [] dmp_send_scsi_work_fn at [vxdmp]
#16 [] process_one_work at 
#17 [] worker_thread at ffffffff8b8c1a9d

DESCRIPTION:
The kernel function used to create a request from bios does not consider max_segment_size at the time of appending, hence the issue is observed.

RESOLUTION:
Appropriate logic has been added in the code to set the number of physical segments correctly.

* 4134702 (Tracking ID: 4122396)

SYMPTOM:
vxvm-recover.service fails to start on linux platforms.

DESCRIPTION:
When using KillMode=control-group, stopping the vxvm-recover.service results in a failed state.
# systemctl status vxvm-boot.service
? vxvm-boot.service - VERITAS Volume Manager Boot service
     Loaded: loaded (/usr/lib/systemd/system/vxvm-boot.service; enabled; vendor preset: disabled)
     Active: failed (Result: timeout) since Thu 2023-06-15 12:41:47 IST; 52s ago

RESOLUTION:
Required code changes have been made to rectify the problem.

* 4134875 (Tracking ID: 4130642)

SYMPTOM:
A node failed to rejoin the cluster after it switched from master to slave, due to the failure of the replicated diskgroup import.
The below error message could be found in /var/VRTSvcs/log/CVMCluster_A.log.
CVMCluster:cvm_clus:monitor:vxclustadm nodestate return code:[101] with output: [state: out of cluster
reason: Replicated dg record is found: retry to add a node failed]

DESCRIPTION:
The flag which indicates that the diskgroup was imported with usereplicatedev=only failed to be set when the diskgroup was last imported.
The missing flag caused the failure of the replicated diskgroup import, which further caused the node rejoin failure.

RESOLUTION:
The code changes have been done to flag the diskgroup after it got imported with usereplicatedev=only.

* 4134877 (Tracking ID: 4128451)

SYMPTOM:
A hardware replicated disk group fails to be auto-imported after reboot.

DESCRIPTION:
Currently the standard diskgroup and cloned diskgroup are supported with auto-import. Hardware replicated disk group isn't supported yet.

RESOLUTION:
Code changes have been made to support hardware replicated disk groups with autoimport.

* 4134885 (Tracking ID: 4134023)

SYMPTOM:
vxconfigrestore(Diskgroup configuration restoration) for H/W Replicated diskgroup failed with below error:
# vxconfigrestore -p LINUXSRDF
VxVM vxconfigrestore INFO V-5-2-6198 Diskgroup LINUXSRDF configuration restoration started ......
VxVM vxdg ERROR V-5-1-0 Disk group LINUXSRDF: import failed:
Replicated dg record is found.
Did you want to import hardware replicated LUNs?
Try vxdg [-o usereplicatedev=only] import option with -c[s]
Please refer to system log for details.
... ...
VxVM vxconfigrestore ERROR V-5-2-3706 Diskgroup configuration restoration for LINUXSRDF failed.

DESCRIPTION:
A H/W replicated disk group can be imported only with the option "-o usereplicatedev=only". vxconfigrestore did not check for H/W replicated disk groups, so the import was attempted without the proper option and failed.

RESOLUTION:
Code changes have been made to add the H/W replicated disk group check in vxconfigrestore.

* 4135150 (Tracking ID: 4114867)

SYMPTOM:
The following error messages are seen while adding new disks:
[root@server101 ~]# cat /etc/udev/rules.d/41-VxVM-selinux.rules | tail -1
KERNEL=="VxVM*", SUBSYSTEM=="block", ACTION=="add", RUN+="/bin/sh -c 'if [ `/usr/sbin/getenforce` != "Disabled" -a `/usr/sbin/
[root@server101 ~]#
[root@server101 ~]# systemctl restart systemd-udevd.service
[root@server101 ~]# udevadm test /block/sdb 2>&1 | grep "invalid"
invalid key/value pair in file /etc/udev/rules.d/41-VxVM-selinux.rules on line 20, starting at character 104 ('D')

DESCRIPTION:
In /etc/udev/rules.d/41-VxVM-selinux.rules, the double quotation marks around "Disabled" and "disable" inside the already double-quoted RUN value are the issue.

RESOLUTION:
Code changes have been made to correct the problem.
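
After updating the rules file, the check from the symptom above can be repeated to confirm that no invalid key/value pairs remain (no output is expected from the grep once the quoting is fixed):

# systemctl restart systemd-udevd.service
# udevadm test /block/sdb 2>&1 | grep "invalid"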

Patch ID: VRTSvxvm-7.4.2.4500

* 4128868 (Tracking ID: 4128867)

SYMPTOM:
Vulnerabilities have been reported in third party component, OpenSSL that is used by VxVM.

DESCRIPTION:
The version of the third-party component OpenSSL used by VxVM has reported security vulnerabilities that need to be addressed.

RESOLUTION:
OpenSSL has been upgraded to a newer version in which the reported security vulnerabilities have been addressed.

Patch ID: VRTSvxvm-7.4.2.4400

* 4092002 (Tracking ID: 4081740)

SYMPTOM:
The vxdg flush command is slow because too many LUNs needlessly access /proc/partitions.

DESCRIPTION:
Linux BLOCK_EXT_MAJOR (block major 259) is used as the extended devt for block devices. When the partition number of a device is more than 15, the partition device gets assigned under major 259 to work around the sd limitation of 16 minors per device, so that more partitions are allowed for one sd device. During "vxdg flush", for each LUN in the disk group, vxconfigd reads /proc/partitions line by line through fgets() to find all the partition devices with major number 259, which causes vxconfigd to respond sluggishly if there are a large number of LUNs in the disk group.

RESOLUTION:
The code has been changed to remove the needless access to /proc/partitions for LUNs that do not use the extended devt.
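
As a rough way to gauge how many extended-devt partition entries a system exposes, the major-259 lines in /proc/partitions can be counted, for example:

# awk '$1 == 259' /proc/partitions | wc -l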

* 4111010 (Tracking ID: 4108475)

SYMPTOM:
vxfentsthdw script failed with "Expect no writes for disks.."

DESCRIPTION:
In the dmp_return_io() function, the DMP_SET_BP_ERROR() macro sets the DKE_EACCES error on errbp, but the error is not reflected in errbp->orig_bp because orig_bp is not a VxIO buffer (the I/O does not come from VxIO here).
DMP_BIODONE() is a macro that checks whether the I/O buffer (errbp->orig_bp) is a VxIO buffer; if it is not, it returns success even though an I/O error occurred.

RESOLUTION:
To handle this condition, two more iodone functions were added as VxIO signatures so that the vxdmp driver can identify VxIO buffers.
The non-VxIO buffer case is handled by setting the proper error code on the I/O buffer.

* 4113327 (Tracking ID: 4102439)

SYMPTOM:
A failure is observed when running the vxencrypt rekey operation on an encrypted volume (to perform key rotation).

DESCRIPTION:
The KMS token is 64 bytes, but the code restricted the token size to 63 bytes and threw an error if the token was longer than 63 bytes.

RESOLUTION:
The issue is resolved by setting the assumed token size to the size of the KMS token, which is 64 bytes.

* 4115231 (Tracking ID: 4090772)

SYMPTOM:
vxconfigd/vx commands hang on secondary site in a CVR environment.

DESCRIPTION:
Due to a window with unmatched SRL positions, any application (e.g. fdisk) trying to open the secondary RVG volume
acquires a lock and waits for the SRL positions to match.
Any VxVM transaction started during this time also has to wait for the same lock.
The logowner node then panicked, which triggered the logowner change protocol; that protocol hung
because the earlier transaction was stuck. As the logowner change protocol could not complete,
there was no valid logowner, the SRL positions could not match, and a deadlock resulted. That led
to the vxconfigd and vx command hang.

RESOLUTION:
Changes have been added to allow read operations on the volume even if the SRL positions are
unmatched. Write I/Os are still blocked and only the open() call for read-only
operations is allowed, so there are no data consistency or integrity issues.

* 4116422 (Tracking ID: 4111254)

SYMPTOM:
vradmind dumps core with the following stack:

#3  0x00007f3e6e0ab3f6 in __assert_fail () from /root/cores/lib64/libc.so.6
#4  0x000000000045922c in RDS::getHandle ()
#5  0x000000000056ec04 in StatsSession::addHost ()
#6  0x000000000045d9ef in RDS::addRVG ()
#7  0x000000000046ef3d in RDS::createDummyRVG ()
#8  0x000000000044aed7 in PriRunningState::update ()
#9  0x00000000004b3410 in RVG::update ()
#10 0x000000000045cb94 in RDS::update ()
#11 0x000000000042f480 in DBMgr::update ()
#12 0x000000000040a755 in main ()

DESCRIPTION:
vradmind was trying to access a NULL pointer (Remote Host Name) in a rlink object, as the Remote Host attribute of the rlink hasn't been set.

RESOLUTION:
The issue has been fixed by making code changes.

* 4116427 (Tracking ID: 4108913)

SYMPTOM:
Vradmind dumps core with the following stacks:
#3  0x00007f2c171be3f6 in __assert_fail () from /root/coredump/lib64/libc.so.6
#4  0x00000000005d7a90 in VList::concat () at VList.C:1017
#5  0x000000000059ae86 in OpMsg::List2Msg () at Msg.C:1280
#6  0x0000000000441bf6 in OpMsg::VList2Msg () at ../../include/Msg.h:389
#7  0x000000000043ec33 in DBMgr::processStatsOpMsg () at DBMgr.C:2764
#8  0x00000000004093e9 in process_message () at srvmd.C:418
#9  0x000000000040a66d in main () at srvmd.C:733

#0  0x00007f4d23470a9f in raise () from /root/core.Jan18/lib64/libc.so.6
#1  0x00007f4d23443e05 in abort () from /root/core.Jan18/lib64/libc.so.6
#2  0x00007f4d234b3037 in __libc_message () from /root/core.Jan18/lib64/libc.so.6
#3  0x00007f4d234ba19c in malloc_printerr () from /root/core.Jan18/lib64/libc.so.6
#4  0x00007f4d234bba9c in _int_free () from /root/core.Jan18/lib64/libc.so.6
#5  0x00000000005d5a0a in ValueElem::_delete_val () at Value.C:491
#6  0x00000000005d5990 in ValueElem::~ValueElem () at Value.C:480
#7  0x00000000005d7244 in VElem::~VElem () at VList.C:480
#8  0x00000000005d8ad9 in VList::~VList () at VList.C:1167
#9  0x000000000040a71a in main () at srvmd.C:743

#0  0x000000000040b826 in DList::head () at ../include/DList.h:82
#1  0x00000000005884c1 in IpmHandle::send () at Ipm.C:1318
#2  0x000000000056e101 in StatsSession::sendUCastStatsMsgToPrimary () at StatsSession.C:1157
#3  0x000000000056dea1 in StatsSession::sendStats () at StatsSession.C:1117
#4  0x000000000046f610 in RDS::collectStats () at RDS.C:6011
#5  0x000000000043f2ef in DBMgr::collectStats () at DBMgr.C:2799
#6  0x00007f98ed9131cf in start_thread () from /root/core.Jan26/lib64/libpthread.so.0
#7  0x00007f98eca4cdd3 in clone () from /root/core.Jan26/lib64/libc.so.6

DESCRIPTION:
There is a race condition in vradmind that may cause memory corruption and unpredictable result. Vradmind periodically forks a child thread to collect VVR statistic data and send them to the remote site. While the main thread may also be sending data using the same handler object, thus member variables in the handler object are accessed in parallel from multiple threads and may become corrupted.

RESOLUTION:
The code changes have been made to fix the issue.

* 4116435 (Tracking ID: 4034741)

SYMPTOM:
Due to a common RVIOMEM pool being used by multiple RVGs, a deadlock scenario is created, causing a high load average and a system hang.

DESCRIPTION:
The current fix limits I/O load on the secondary by retaining updates in the NMCOM pool until the data volume write is done. As a result, the RVIOMEM pool fills up easily and a deadlock may occur, especially under heavy workload on multiple RVGs or cross-direction RVGs. All RVGs share the same RVIOMEM pool, while the NMCOM pool, RDBACK pool, and network/DV update lists are all per-RVG, so the RVIOMEM pool becomes the bottleneck on the secondary, which fills up easily and runs into a deadlock.

RESOLUTION:
Code changes have been made to use a per-RVG RVIOMEM pool to resolve the deadlock issue.

* 4116437 (Tracking ID: 4072862)

SYMPTOM:
If RVGLogowner resources are onlined on slave nodes, stopping the whole cluster may fail and the RVGLogowner resources go into the offline_propagate state.

DESCRIPTION:
While stopping the whole cluster, a race may occur between CVM reconfiguration and the RVGLogowner change SIO.

RESOLUTION:
Code changes have been made to fix these race conditions.

* 4116576 (Tracking ID: 3972344)

SYMPTOM:
After reboot of a node on a setup where multiple diskgroups / Volumes within diskgroups are present, sometimes in /var/log/messages an error 'vxrecover ERROR V-5-1-11150  Volume <volume_name> does not exist' is logged.

DESCRIPTION:
In volume_startable function (volrecover.c), dgsetup is called to set the current default diskgroup. This does not update the current_group variable leading to inappropriate mappings. Volumes are searched in an incorrect diskgroup which is logged in the error message.
The vxrecover command works fine if the diskgroup name associated with volume is specified. [vxrecover -g <dg_name> -s]

RESOLUTION:
The code has been changed to use switch_diskgroup() instead of dgsetup. current_group is updated and current_dg is set, so vxrecover finds the volume correctly.
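
The workaround mentioned above remains valid and can still be used explicitly, for example:

# vxrecover -g <dg_name> -s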

* 4117899 (Tracking ID: 4055159)

SYMPTOM:
vxdisk list showing incorrect value of LUN_SIZE for nvme disks

DESCRIPTION:
vxdisk list showing incorrect value of LUN_SIZE for nvme disks.

RESOLUTION:
Code changes have been made to show the correct LUN_SIZE for NVMe devices.

* 4117989 (Tracking ID: 4085145)

SYMPTOM:
The issue occurs in AWS environments; on-premises physical/VM hosts are not affected (ioctl and sysfs return the same values there).

DESCRIPTION:
The UDID value for Amazon EBS devices exceeded its limit (the value was read from sysfs because ioctl is not supported by AWS).

RESOLUTION:
Code changes have been made to fetch the LSN through ioctl, as a fix is available for the intermittent ioctl failure.

* 4118256 (Tracking ID: 4028439)

SYMPTOM:
Not able to create cached volume due to SSD tag missing

DESCRIPTION:
The disk mediatype flag was not propagated previously; it is now updated during disk online.

RESOLUTION:
Code changes have been made to make mediatype tags visible during disk online.

* 4119951 (Tracking ID: 4119950)

SYMPTOM:
Vulnerabilities have been reported in third party components, [curl and libxml] that are used by VxVM.

DESCRIPTION:
The versions of the third-party components [curl and libxml] used by VxVM have reported security vulnerabilities that need to be addressed.

RESOLUTION:
[curl and libxml] have been upgraded to newer versions in which the reported security vulnerabilities have been addressed.

* 4120540 (Tracking ID: 4102532)

SYMPTOM:
/etc/default/vxsf file gets world write permission when "vxtune storage_connectivity asymmetric" is run.

DESCRIPTION:
umask for daemon process vxconfigd is 0 and not 0022. This is required for functionality to run properly. For this reason, any file created by vxconfigd gets world-write permission. When "vxtune storage_connectivity asymmetric" is run, a temporary file is created and then it is renamed to vxsf. So vxsf gets world write permission.

RESOLUTION:
Code changes have been made so that specific permissions, instead of the default permissions, are set on the file when it is created, so vxsf no longer gets world-write permission.
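
After applying the fix, the permissions can be verified by re-running the tuning command from the symptom and checking the file mode (the exact mode bits may differ by release; the key point is the absence of world write):

# vxtune storage_connectivity asymmetric
# ls -l /etc/default/vxsf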

* 4120545 (Tracking ID: 4090826)

SYMPTOM:
system panic at vol_page_offsetlist_sort with below stack:

vpanic()
kmem_error+0x5f0()
vol_page_offsetlist_sort+0x164()
volpage_freelist+0x278()
vol_cvol_shadow2_done+0xb8()

DESCRIPTION:
Due to a bug in sorting the large offset, the code overwrote the boundary of the allocated memory and caused the panic.

RESOLUTION:
The code change has been made to sort the large offset correctly.

* 4120547 (Tracking ID: 4093067)

SYMPTOM:
System panicked in the following stack:

#9  [] page_fault at  [exception RIP: bdevname+26]
#10 [] get_dip_from_device  [vxdmp]
#11 [] dmp_node_to_dip at [vxdmp]
#12 [] dmp_check_nonscsi at [vxdmp]
#13 [] dmp_probe_required at [vxdmp]
#14 [] dmp_check_disabled_policy at [vxdmp]
#15 [] dmp_initiate_restore at [vxdmp]
#16 [] dmp_daemons_loop at [vxdmp]

DESCRIPTION:
After getting the block_device from the OS, DMP did not perform a NULL pointer check on block_device->bd_part. The NULL pointer then caused a system panic when bdevname() was called.

RESOLUTION:
The code changes have been done to fix the problem.

* 4120720 (Tracking ID: 4086063)

SYMPTOM:
VxVM package uninstallation fails as no semodule policy is installed.

DESCRIPTION:
The semodule policy is loaded in the %post stage of the new package. After a package upgrade, no semodule policy is loaded, because the %preun stage of the old package removes the policy. When the package is later uninstalled, the %preun stage fails because it tries to remove a policy that was already removed during the upgrade.

RESOLUTION:
The policy installation has been moved to the %posttrans stage of the install script, so policy installation happens in the last stage of the package upgrade and the subsequent uninstallation succeeds.

* 4120722 (Tracking ID: 4021816)

SYMPTOM:
VxVM package uninstallation fails after upgrade as semodule policy was removed during package upgrade.

DESCRIPTION:
After a VxVM package upgrade, no semodule policy is loaded: the %preun stage uninstalls the semodule policy and runs both on upgrade and on uninstall of the package. The %preun stage should remove the policy only on uninstallation, because the stage belongs to the old package.

RESOLUTION:
The code has been changed so that the %preun stage removes the policy only when the package is being uninstalled.
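
For illustration only, the standard RPM scriptlet convention assumed here is that $1 is 0 in %preun on a true uninstall and 1 on an upgrade; a minimal sketch of such a guard (not the actual VxVM scriptlet, and <policy_name> is a placeholder) looks like:

%preun
if [ "$1" -eq 0 ] ; then
    # Package is being erased, not upgraded: remove the SELinux policy module.
    semodule -r <policy_name> || :
fi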

* 4120724 (Tracking ID: 3995831)

SYMPTOM:
System hung: A large number of SIOs got queued in FMR.

DESCRIPTION:
When the I/O load is high, there may not be enough chunks available. In that case, DRL flushsio needs to drive the fwait queue, which may yield some available chunks. Due to a race condition and a bug inside DRL, DRL may queue the flushsio and fail to trigger it again; DRL then ends up in a permanently hung state, unable to flush the dirty regions. The queued SIOs cannot be driven further, so the system hangs.

RESOLUTION:
Code changes have been made to drive SIOs which got queued in FMR.

* 4120728 (Tracking ID: 4090476)

SYMPTOM:
The Storage Replicator Log (SRL) is not draining to the secondary. The rlink status shows that the outstanding writes have not reduced over several hours.

VxVM VVR vxrlink INFO V-5-1-4640 Rlink xxx has 239346 outstanding writes, occupying 2210892 Kbytes (0%) on the SRL
VxVM VVR vxrlink INFO V-5-1-4640 Rlink xxx has 239346 outstanding writes, occupying 2210892 Kbytes (0%) on the SRL
VxVM VVR vxrlink INFO V-5-1-4640 Rlink xxx has 239346 outstanding writes, occupying 2210892 Kbytes (0%) on the SRL
VxVM VVR vxrlink INFO V-5-1-4640 Rlink xxx has 239346 outstanding writes, occupying 2210892 Kbytes (0%) on the SRL
VxVM VVR vxrlink INFO V-5-1-4640 Rlink xxx has 239346 outstanding writes, occupying 2210892 Kbytes (0%) on the SRL
VxVM VVR vxrlink INFO V-5-1-4640 Rlink xxx has 239346 outstanding writes, occupying 2210892 Kbytes (0%) on the SRL

DESCRIPTION:
In a poor network environment, VVR appears not to be syncing. Another reconfiguration happened before the VVR state became clean, and the VVR atomic window was set to a large size. VVR could not complete all the atomic updates before the next reconfiguration and kept sending atomic updates from the VVR pending position. Hence VVR appears to be stuck.

RESOLUTION:
Code changes have been made to update VVR pending position accordingly.

* 4120769 (Tracking ID: 4014894)

SYMPTOM:
Disk attach taking long time with reboot/hastop in FSS environment.

DESCRIPTION:
The current vxattachd code calls the 'vxdg -k adddisk' command for each disk separately, and the calls are serialised. This means multiple transactions are initiated to add the disks, which can impact application I/O multiple times due to I/O quiesce/drain activity.

RESOLUTION:
Code changes have been made to add all disks in a single command, generating fewer transactions and reducing execution time.

* 4120876 (Tracking ID: 4081434)

SYMPTOM:
VVR panic with below stack:

 #2 [ffff9683fa0efcc8] panic at ffffffff90d802cc
 #3 [ffff9683fa0efd48] vol_rv_service_message_start at ffffffffc2eeae0c [vxio]
 #4 [ffff9683fa0efe48] voliod_iohandle at ffffffffc2d2e276 [vxio]
 #5 [ffff9683fa0efe88] voliod_loop at ffffffffc2d2e68c [vxio]
 #6 [ffff9683fa0efec8] kthread at ffffffff906c5e61

DESCRIPTION:
On the VVR primary, when a data ack is received, the code searches for the corresponding nio in rp_ack_waitq to call its done function. However, the nio may already have been freed within vol_rp_flush_ack_waitq() while the rlink was being disconnected, which caused a panic when the nio was accessed. In the replica connection change handling, for the rp disconnect case, the flag VOL_RPFLAG_ACK_WAITQ_FLUSHING is set within vol_rp_flush_ack_waitq() to avoid such an issue, but the flag was cleared too early, just after the rp ports were created during rlink connect.

RESOLUTION:
The fix clears the flag symmetrically while handling replica connection changes, in the rp connect case.

* 4120899 (Tracking ID: 4116024)

SYMPTOM:
kernel panicked at gab_ifreemsg with following stack:
gab_ifreemsg
gab_freemsg
kmsg_gab_send
vol_kmsg_sendmsg
vol_kmsg_sender

DESCRIPTION:
In a CVR environment with an RVG of more than 600 data volumes, the vxvvrstatd daemon is enabled through the vxvm-recover service. vxvvrstatd calls ioctl(VOL_RV_APPSTATS), which generates a kmsg longer than 64k and triggers a kernel panic, because GAB/LLT does not support messages longer than 64k.

RESOLUTION:
Code changes have been made to limit the maximum number of data volumes for which ioctl(VOL_RV_APPSTATS) can request VVR statistics.

* 4120903 (Tracking ID: 4100775)

SYMPTOM:
vxconfigd kept waiting for I/O drain when dmpnodes were removed. It hung with the below stack:
[] dmpsync_wait+0xa7/0xf0 [vxdmp]
[] dmp_destroy_mp_node+0x98/0x120 [vxdmp]
[] dmp_decode_destroy_dmpnode+0xd3/0x100 [vxdmp]
[] dmp_decipher_instructions+0x2d7/0x390 [vxdmp]
[] dmp_process_instruction_buffer+0x1be/0x1e0 [vxdmp]
[] dmp_reconfigure_db+0x5b/0xe0 [vxdmp]
[] gendmpioctl+0x76c/0x950 [vxdmp]
[] dmpioctl+0x39/0x80 [vxdmp]
[] dmp_ioctl+0x3a/0x70 [vxdmp]
[] blkdev_ioctl+0x28a/0xa20
[] block_ioctl+0x41/0x50
[] do_vfs_ioctl+0x3a0/0x5b0
[] SyS_ioctl+0xa1/0xc0

DESCRIPTION:
XFS utilizes the chained BIO feature to send BIOs to VxDMP. Since chained BIOs were not supported by VxDMP, VxDMP kept waiting for a BIO that never completed.

RESOLUTION:
Code changes have been made to support chained BIOs on RHEL7.

* 4120916 (Tracking ID: 4112687)

SYMPTOM:
vxdisk resize corrupts disk public region and causes file system mount fail.

DESCRIPTION:
For a single-path disk, between the two transactions of the resize operation, the private region I/Os could be incorrectly sent to partition 3 of the GPT disk, causing a shift of 48 sectors. This may cause private region data to be written to the public region and result in corruption.

RESOLUTION:
Code changes have been made to fix the problem.

* 4121075 (Tracking ID: 4100069)

SYMPTOM:
One of the standard disk groups fails to auto-import when the standard disk groups co-exist with a cloned disk group. It fails with the below error in syslog.
vxvm:vxconfigd[xxx]: V-5-1-569 Disk group <disk group name>, Disk <dmpnode name> Cannot auto-import group:
vxvm:vxconfigd[xxx]: #011Disk for disk group not found

DESCRIPTION:
The importflags were not reset before starting the next disk group import, so the next import inherited all the flags from the previous round of disk group import. The improper importflags caused the failure.

RESOLUTION:
Code changes have been made to reset importflags in every round of disk group import.

* 4121081 (Tracking ID: 4098965)

SYMPTOM:
vxconfigd dumps core when scanning IBM XIV LUNs, with the following stack.

#0  0x00007fe93c8aba54 in __memset_sse2 () from /lib64/libc.so.6
#1  0x000000000061d4d2 in dmp_getenclr_ioctl ()
#2  0x00000000005c54c7 in dmp_getarraylist ()
#3  0x00000000005ba4f2 in update_attr_list ()
#4  0x00000000005bc35c in da_identify ()
#5  0x000000000053a8c9 in find_devices_in_system ()
#6  0x000000000053aab5 in mode_set ()
#7  0x0000000000476fb2 in ?? ()
#8  0x00000000004788d0 in main ()

DESCRIPTION:
The incorrect memory address could cause two issues if more than one disk array is connected:

1. If the incorrect memory address exceeds the range of valid virtual memory, it triggers a segmentation fault and crashes vxconfigd.
2. If the incorrect memory address does not exceed the range of valid virtual memory, it causes memory corruption but may not trigger a vxconfigd crash.

RESOLUTION:
Code changes have been made to correct the problem.

* 4121083 (Tracking ID: 4105953)

SYMPTOM:
System panic with below stack in CVR environment.

 #9 [] page_fault at 
    [exception RIP: vol_ru_check_update_done+183]
#10 [] vol_rv_write2_done at [vxio]
#11 [] voliod_iohandle at [vxio]
#12 [] voliod_loop at [vxio]
#13 [] kthread at

DESCRIPTION:
In a CVR environment, when I/O is issued in writeack sync mode, the application is acknowledged once the data volume write is done on either the log client or the logowner, depending on
where the I/O was issued. In writeack sync mode, VVR could free the metadata I/O update after the SRL write was done and then access the update again after freeing it, resulting in a NULL pointer dereference.

RESOLUTION:
Code changes have been made to avoid accessing the NULL pointer.

* 4121222 (Tracking ID: 4095718)

SYMPTOM:
vxesd kept waiting for I/O drain with the below stack, and other tasks such as vxpath_links were hung as well.

#0 [] __schedule at
#1 [] schedule at
#2 [] dmpsync_wait at [vxdmp]
#3 [] dmp_drain_path at [vxdmp]
#4 [] dmp_disable_path at [vxdmp]
#5 [] dmp_change_path_state at [vxdmp]
#6 [] gendmpioctl at [vxdmp]
#7 [] dmpioctl at [vxdmp]
#8 [] dmp_ioctl at [vxdmp]

DESCRIPTION:
Due to storage-related activities, some subpaths were changed to DISABLED, triggered by vxesd. All the subpaths belonging to the same dmpnode were marked as QUIESCED too. If any subpaths are handling error I/O, vxesd needs to wait until the error processing is finished. Due to a bug, the error processing could fail to wake up vxesd. The hung vxesd further caused all incoming I/Os against those dmpnodes to be queued in the DMP defer queue. As a result, all tasks waiting for I/O to complete on those dmpnodes hung permanently as well.

RESOLUTION:
The code changes have been made to wake up the tasks who are waiting for DMP error handling to be done.

* 4121243 (Tracking ID: 4101588)

SYMPTOM:
vxtune displays vol_rvio_maxpool_sz as zero when it's over 4g.

DESCRIPTION:
The tunable vol_rvio_maxpool_sz is defined as size_t, which is 64 bits long in a 64-bit binary, while vxtune displays it as a 32-bit
unsigned int, so the value is shown as zero when it exceeds the maximum unsigned int (4 GB).

RESOLUTION:
The issue has been fixed by the code changes.
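
With the fix, querying the tunable directly should report the full value, for example:

# vxtune vol_rvio_maxpool_sz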

* 4121254 (Tracking ID: 4115078)

SYMPTOM:
A vxconfigd hang was observed when rebooting all nodes of the primary site.

DESCRIPTION:
When the VVR logowner node was not configured on the master, VVR recovery was triggered by a node leaving; if a data volume was in recovery, the VVR logowner sent an ilock request to the master node. The master granted the ilock request and sent a response to the VVR logowner, but due to a bug, the VVR logowner detected an ilock-requesting node ID mismatch. The VVR logowner treated the ilock grant as failed, mdship I/O went into a permanent hang, and vxconfigd got stuck waiting for I/O drain.

RESOLUTION:
Code changes have been made to correct the ilock requesting node id in the ilock request in such case.

* 4121681 (Tracking ID: 3995731)

SYMPTOM:
vxconfigd died with below stack info:

#0  in vfprintf () from /lib64/libc.so.6
#1  in vsnprintf () from /lib64/libc.so.6
#2  in msgbody_va ()
#3  in msg () at misc.c:1430
#4  in krecover_mirrorvol () at krecover.c:1244
#5  krecover_dg_objects_20 () at krecover.c:515
#6  krecover_dg_objects () at krecover.c:303
#7  in dg_import_start () at dgimport.c:7721
#8  in dg_reimport () at dgimport.c:3337
#9  in dg_recover_all () at dgimport.c:4885

DESCRIPTION:
The VVR objects were removed before the upgrade. vxconfigd accessed the NULL object and died.

RESOLUTION:
Code changes have been made to access the valid record during recovery.

* 4121763 (Tracking ID: 3995308)

SYMPTOM:
vxtask status hang due to incorrect values getting copied into task status information.

DESCRIPTION:
When performing an atomic-copy admin task, VxVM copies the entire request structure, passed as the task status response, into a local copy. This causes incorrect copying/overwriting of pointers.

RESOLUTION:
Code changes have been made to fix the problem.

* 4121767 (Tracking ID: 4117568)

SYMPTOM:
Vradmind dumps core with the following stack:

#1  std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (this=0x7ffdc380d810,
    __str=<error reading variable: Cannot access memory at address 0x3736656436303563>)
#2  0x000000000040e02b in ClientMgr::closeStatsSession
#3  0x000000000040d0d7 in ClientMgr::client_ipm_close
#4  0x000000000058328e in IpmHandle::~IpmHandle
#5  0x000000000057c509 in IpmHandle::events
#6  0x0000000000409f5d in main

DESCRIPTION:
After vrstat was terminated, the StatSession in vradmind was closed and the corresponding Client object was deleted. When closing the IPM object of vrstat, the code tried to access the removed Client, hence the core dump.

RESOLUTION:
Code changes have been made to fix the issue.

* 4121875 (Tracking ID: 4090943)

SYMPTOM:
On Primary, RLink is continuously getting connected/disconnected with below message seen in secondary syslog:
  VxVM VVR vxio V-5-3-0 Disconnecting replica <rlink_name> since log is full on secondary.

DESCRIPTION:
When the RVG logowner node panics, RVG recovery happens in 3 phases.
At the end of the 2nd phase of recovery, the in-memory and on-disk SRL positions remain incorrect,
and if a logowner change occurs during this time, the RLink will not get connected.

RESOLUTION:
Handled in-memory and on-disk SRL positions correctly.

* 4123313 (Tracking ID: 4114927)

SYMPTOM:
After enabling dmp_native_support and rebooting, /boot is not mounted on the VxDMP node.

DESCRIPTION:
When dmp_native_support is enabled, vxdmproot script is expected to modify the /etc/fstab entry for /boot so that on next boot up, /boot is mounted on dmp device instead of OS device. Also, this operation modifies SELinux context of file /etc/fstab. This causes the machine to go into maintenance mode because of a read permission denied error for /etc/fstab on boot up.

RESOLUTION:
Code changes have been done to make sure SELinux context is preserved for /etc/fstab file and /boot is mounted on dmp device when dmp_native_support is enabled.
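
On a system that has already hit the issue, the SELinux context of /etc/fstab can be inspected and restored with standard SELinux tools, for example:

# ls -Z /etc/fstab
# restorecon -v /etc/fstab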

* 4124324 (Tracking ID: 4098582)

SYMPTOM:
The log file /var/log/vx/ddl.log breaks one of the security policies, as it keeps being created with 644 permissions.

DESCRIPTION:
The file permissions of /var/log/vx/ddl.log after log rotation should be 640, as per security compliance.

RESOLUTION:
Appropriate code changes have been made to handle this scenario.
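
After log rotation, the permissions can be verified with, for example (mode 640, i.e. -rw-r-----, is expected):

# ls -l /var/log/vx/ddl.log*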

* 4126041 (Tracking ID: 4124223)

SYMPTOM:
Core dump is generated for vxconfigd in TC execution.

DESCRIPTION:
The TC creates a scenario where zeroes are written to the first block of the disk. In such a case, a NULL check is necessary before a certain variable is accessed. This NULL check was missing, which caused the vxconfigd core dump during TC execution.

RESOLUTION:
The necessary NULL checks have been added to the code to avoid the vxconfigd core dump.

* 4127473 (Tracking ID: 4089626)

SYMPTOM:
On RHEL8.5, an I/O hang occurs when creating XFS on VxDMP devices or writing a file on XFS mounted from VxDMP devices.

DESCRIPTION:
XFS utilizes the chained BIO feature to send BIOs to VxDMP. Since chained BIOs were not supported by VxDMP, the BIOs may get stuck in the SCSI disk driver.

RESOLUTION:
Code changes have been made to support chained BIO.

* 4127475 (Tracking ID: 4114601)

SYMPTOM:
The system panics and reboots.

DESCRIPTION:
RCA:
Start I/O on the volume device and pull its disk out of the machine; the below panic is hit on RHEL8.

 dmp_process_errbp
 dmp_process_errbuf.cold.2+0x328/0x429 [vxdmp]
 dmpioctl+0x35/0x60 [vxdmp]
 dmp_flush_errbuf+0x97/0xc0 [vxio]
 voldmp_errbuf_sio_start+0x4a/0xc0 [vxio]
 voliod_iohandle+0x43/0x390 [vxio]
 voliod_loop+0xc2/0x330 [vxio]
 ? voliod_iohandle+0x390/0x390 [vxio]
 kthread+0x10a/0x120
 ? set_kthread_struct+0x50/0x50

As the disk is pulled out of the machine, VxIO hits an I/O error and routes that I/O to the DMP layer via a kernel-to-kernel ioctl for error analysis.
The code path for the I/O routing is:

voldmp_errbuf_sio_start()-->dmp_flush_errbuf()--->dmpioctl()--->dmp_process_errbuf()

dmp_process_errbuf() retrieves the device number of the underlying path (OS device)
and tries to get the bdev (block_device) pointer from the path device number.
As the path/OS device has been removed by the disk pull, Linux returns a fake bdev for the path device number.
This fake bdev has no gendisk associated with it (bdev->bd_disk is NULL).

This NULL bdev->bd_disk is set on the I/O buffer routed from vxio,
which leads to a panic in dmp_process_errbp.

RESOLUTION:
If bdev->bd_disk is found to be NULL, the DMP_CONN_FAILURE error is set on the I/O buffer and DKE_ENXIO is returned to the vxio driver.

Patch ID: VRTSvxvm-7.4.2.4300

* 4119951 (Tracking ID: 4119950)

SYMPTOM:
Vulnerabilities have been reported in third party components, [curl and libxml] that are used by VxVM.

DESCRIPTION:
The versions of the third-party components [curl and libxml] used by VxVM have reported security vulnerabilities that need to be addressed.

RESOLUTION:
[curl and libxml] have been upgraded to newer versions in which the reported security vulnerabilities have been addressed.

Patch ID: VRTSvxvm-7.4.2.4100

* 4116348 (Tracking ID: 4112433)

SYMPTOM:
Vulnerabilities have been reported in third party components, [openssl, curl and libxml] that are used by VxVM.

DESCRIPTION:
The versions of the third-party components [openssl, curl and libxml] used by VxVM have reported security vulnerabilities that need to be addressed.

RESOLUTION:
[openssl, curl and libxml] have been upgraded to newer versions in which the reported security vulnerabilities have been addressed.

Patch ID: VRTSvxvm-7.4.2.3900

* 4110560 (Tracking ID: 4104927)

SYMPTOM:
vxvm-boot.service fails to start on linux platforms other than SLES15

DESCRIPTION:
SLES15-specific attribute changes cause vxvm-boot.service to fail to start on other Linux platforms.

RESOLUTION:
A new vxvm-boot.service file has been added for SLES15; the existing vxvm-boot.service file serves the other Linux platforms.

* 4113324 (Tracking ID: 4113323)

SYMPTOM:
Existing package failed to load on RHEL 8.8 server.

DESCRIPTION:
RHEL 8.8 is a new release; the VxVM module has therefore been compiled with the new kernel, along with a few other changes.

RESOLUTION:
The VxVM code has been compiled against the RHEL 8.8 kernel and changes have been made for compatibility.

* 4113661 (Tracking ID: 4091076)

SYMPTOM:
SRL gets into pass-thru mode when it's about to overflow.

DESCRIPTION:
The primary initiated a log search for the requested update sent from the secondary. The search aborted with a head error because a check condition was not set correctly.

RESOLUTION:
Fixed the check condition to resolve the issue.

* 4113663 (Tracking ID: 4095163)

SYMPTOM:
System panic with below stack:
 #6 [] invalid_op at 
    [exception RIP: __slab_free+414]
 #7 [] kfree at 
 #8 [] vol_ru_free_update at [vxio]
 #9 [] vol_ru_free_updateq at  [vxio]
#10 [] vol_rv_write2_done at [vxio]
#11 [] voliod_iohandle at [vxio]
#12 [] voliod_loop at [vxio]

DESCRIPTION:
The update gets freed as part of VVR recovery. At the same time, the update also gets freed in the second phase of the VVR write. This race in freeing the updates caused the system panic.

RESOLUTION:
Code changes have been made to avoid the race while freeing the updates.

* 4113664 (Tracking ID: 4091390)

SYMPTOM:
vradmind dumped core while accessing pHdr, which had already been freed.

DESCRIPTION:
While processing the config message CFG_UPDATE, the existing config message objects were incorrectly freed. The objects were later accessed again, which caused the vradmind core dump.

RESOLUTION:
Changes are done to access the correct configuration objects.

* 4113666 (Tracking ID: 4064772)

SYMPTOM:
After enabling slub debug, system could hang with IO load.

DESCRIPTION:
When creating VxVM I/O memory, VxVM does not align the cache size. The unaligned length is treated as an invalid I/O length in the SCSI layer, which causes some I/O requests to get stuck in an invalid state so that the I/Os can never complete. Thus a system hang can be observed, especially after cache slub debug is enabled.

RESOLUTION:
Code changes have been done to align the cache size.

Patch ID: VRTSvxvm-7.4.2.3800

* 4110666 (Tracking ID: 4110665)

SYMPTOM:
A security vulnerability exists in the third-party component libcurl.

DESCRIPTION:
VxVM uses a third-party component named libcurl in which a security vulnerability exists.

RESOLUTION:
VxVM is updated to use a newer version of libcurl in which the security vulnerability has been addressed.

* 4110766 (Tracking ID: 4112033)

SYMPTOM:
A security vulnerability exists in the third-party component libxml2.

DESCRIPTION:
VxVM uses a third-party component named libxml2 in which a security vulnerability exists.

RESOLUTION:
VxVM is updated to use a newer version of libxml2 in which the security vulnerability has been addressed.

Patch ID: VRTSvxvm-7.4.2.3700

* 4105752 (Tracking ID: 4107924)

SYMPTOM:
Old VxVM rpm fails to load on RHEL8.7 minor kernel 4.18.0-425.10.1.el8_7.x86_64

DESCRIPTION:
Red Hat made some critical changes in the latest kernel, which caused a soft-lockup issue in the VxVM kernel modules during installation.

RESOLUTION:
As suggested by Red Hat (https://access.redhat.com/solutions/6985596), the VxVM modules have been compiled with the RHEL 8.7 minor kernel.

* 4106001 (Tracking ID: 4102501)

SYMPTOM:
A security vulnerability exists in the third-party component libcurl.

DESCRIPTION:
VxVM uses a third-party component named libcurl in which a security vulnerability exists.

RESOLUTION:
VxVM is updated to use a newer version of libcurl in which the security vulnerability has been addressed.

* 4107223 (Tracking ID: 4107802)

SYMPTOM:
vxdmp fails to load and system hangs.

DESCRIPTION:
This issue occurs due to changes in the RHEL8.7 minor kernel; an incorrect module is selected as the best fit.

RESOLUTION:
The existing modinst-vxvm script has been modified to calculate the correct best-fit module.

Patch ID: VRTSvxvm-7.4.2.3600

* 4102424 (Tracking ID: 4103350)

SYMPTOM:
Following error message is seen on running vradmin -encrypted addsec command.

# vradmin -g enc_dg2 -encrypted addsec enc_dg2_rvg1 123.123.123.123 234.234.234.234
Message from Host 234.234.234.234:
Job for vxvm-encrypt.service failed.
See "systemctl status vxvm-encrypt.service" and "journalctl -xe" for details.
VxVM vxvol ERROR V-5-1-18863 Failed to start vxvm-encrypt service. Error:1.

DESCRIPTION:
"vradmin -encrypted addsec" command fails on primary because vxvm-encrypt.service goes into failed state on secondary site. On secondary master, vxvm-encrypt.service tries to restart 5 times and goes into failed state.

RESOLUTION:
Code changes have been done to prevent vxvm-encrypt.service from going into failed state.

Patch ID: VRTSvxvm-7.4.2.3500

* 4012176 (Tracking ID: 3996206)

SYMPTOM:
Paths from different 3PAR LUNs shown under single Volume Manager device (VxVM disk).

DESCRIPTION:
VxVM uses "LUN serial number" to identify a LUN unique. This "LUN serial number" is fetched by doing SCSI inquiry
on VPD page 0. The "LUN serial number" obtained from VPD page 0 doesn't always guarantee uniqueness of LUN. Due to this
when paths from 2 diffent 3PAR LUNs have same "LUN serial number" DMP adds them under a single device/disk.

RESOLUTION:
Changes have been made in the 3PAR ASL to fetch the "LUN serial number" from VPD page 0x83, which guarantees a unique number for the LUN.
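
For reference, the device identification VPD page (0x83) of a path can be inspected with the sg3_utils package, assuming it is installed (the device name is illustrative):

# sg_vpd -p di /dev/sdX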

* 4013169 (Tracking ID: 4011691)

SYMPTOM:
Observed high CPU consumption on the VVR secondary nodes because of high pending IO load.

DESCRIPTION:
High replication related IO load on the VVR secondary and the requirement of maintaining write order fidelity with limited memory pools created  contention. This resulted in multiple VxVM kernel threads contending for shared resources and there by increasing the CPU consumption.

RESOLUTION:
Limited the way in which VVR consumes its resources so that a high pending IO load would not result into high CPU consumption.

* 4052119 (Tracking ID: 4045871)

SYMPTOM:
vxconfigd crashed at ddl_get_disk_given_path with following stacks:
ddl_get_disk_given_path
ddl_reconfigure_all
ddl_find_devices_in_system
find_devices_in_system
mode_set
setup_mode
startup
main
_start

DESCRIPTION:
Under some situations, duplicate paths can be added in one dmpnode in vxconfigd. If the duplicate paths are removed then the empty path entry can be generated for that dmpnode. Thus, later when vxconfigd accesses the empty path entry, it crashes due to NULL pointer reference.

RESOLUTION:
Code changes have been made to avoid adding the duplicate paths.

* 4086043 (Tracking ID: 4072241)

SYMPTOM:
-bash-5.1# /usr/lib/vxvm/voladm.d/bin/dmpdr
Dynamic Reconfiguration Operations

WARN: Please Do not Run any Device Discovery Operations outside the Tool during Reconfiguration operations
INFO: The logs of current operation can be found at location /var/log/vx/dmpdr_20220420_1042.log
ERROR: Failed to open lock file for /usr/lib/vxvm/voladm.d/bin/dmpdr, No such file or directory. Exit.

Exiting the Current DMP-DR Run of the Tool

DESCRIPTION:
The VxVM log location for Linux changed, which impacted vxdiskadm functionality on Solaris.

RESOLUTION:
Required changes have been made so that the code works across platforms.

* 4090311 (Tracking ID: 4039690)

SYMPTOM:
The logger file size needs to be changed so that a larger amount of logs can be collected on the system.

DESCRIPTION:
The logger file size limit has been doubled, and the logger size footprint has been improved by compressing logger files with gzip.

RESOLUTION:
Completed required code changes to do this enhancement.

* 4090411 (Tracking ID: 4054685)

SYMPTOM:
RVG recovery hangs in reconfiguration scenarios in CVR environments, leading to vx commands hanging on the master node.

DESCRIPTION:
As part of RVG recovery, DCM and data volume recovery are performed. Data volume recovery takes a long time due to incorrect IOD handling on Linux platforms.

RESOLUTION:
The IOD handling mechanism has been fixed to resolve the RVG recovery hang.

* 4090415 (Tracking ID: 4071345)

SYMPTOM:
Replication is unresponsive after failed site is up.

DESCRIPTION:
Autosync and unplanned fallback synchronisation had issues with a mix of cloud and non-cloud volumes in an RVG.
After a cloud volume was found, the rest of the volumes were ignored for synchronisation.

RESOLUTION:
The condition has been fixed to make the code iterate over all volumes.

* 4090442 (Tracking ID: 4078537)

SYMPTOM:
When a connection to an s3-fips bucket is made, the below error messages are observed:
2022-05-31 03:53:26 VxVM ERROR V-5-1-19512 amz_request_perform: PUT request failed, url: https://s3-fips.us-east-2.amazonaws.com/fipstier334f3956297c8040078280000d91ab70a/2.txt_27ffff625eff0600442d000013ffff5b_999_7_1077212296_0_1024_38, errno 11
2022-05-31 03:53:26 VxVM ERROR V-5-1-19333 amz_upload_object: amz_request_perform failed for obj:2.txt_27ffff625eff0600442d000013ffff5b_999_7_1077212296_0_1024_38
2022-05-31 03:53:26 VxVM WARNING V-5-1-19752 Try upload_object(fipstier334f3956297c8040078280000d91ab70a/2.txt_27ffff625eff0600442d000013ffff5b_999_7_1077212296_0_1024_38) again, number of requests attempted: 3.
2022-05-31 03:53:26 VxVM ERROR V-5-1-19358 curl_send_request: curl_easy_perform() failed: Couldn't resolve host name
2022-05-31 03:53:26 VxVM ERROR V-5-1-0 curl_request_perform: Error in curl_request_perform 6
2022-05-31 03:53:26 VxVM ERROR V-5-1-19357 curl_request_perform: curl_send_request failed with error: 6

DESCRIPTION:
For s3-fips bucket endpoints, AWS has made it mandatory to use the virtual-hosted style method to connect to the s3-fips bucket, instead of the path-style method currently used by InfoScale.

RESOLUTION:
Code changes have been made to send cloud requests to the s3-fips bucket successfully.
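
For illustration, the difference between the two addressing styles (bucket and object names below are placeholders) is:

Path-style:           https://s3-fips.us-east-2.amazonaws.com/<bucket>/<object-key>
Virtual-hosted style: https://<bucket>.s3-fips.us-east-2.amazonaws.com/<object-key>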

* 4090541 (Tracking ID: 4058166)

SYMPTOM:
While setting up VVR/CVR on large size data volumes (size > 3TB) with filesystems mounted on them, initial autosync operation takes a lot of time to complete.

DESCRIPTION:
While performing autosync on a VVR/CVR setup for a volume with a file system mounted, if the smartmove feature is enabled, the operation does a smartsync by syncing only the regions dirtied by the file system instead of syncing the entire volume, which completes faster than the normal case. However, for large volumes (size > 3TB), the smartmove feature does not get enabled even with a file system mounted on them, so the autosync operation syncs the entire volume.

This behaviour is due to the smaller DCM plexes allocated for such large volumes; autosync ends up performing a complete volume sync, taking much more time to complete.

RESOLUTION:
The limit of the DCM plex size (loglen) has been increased beyond 2MB so that the smartmove feature can be utilised properly.
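
As a hedged illustration only (the exact attributes accepted depend on the installed release), a DCM log with an explicit log length can be added to a volume with vxassist, for example:

# vxassist -g <dg_name> addlog <vol_name> logtype=dcm loglen=4m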

* 4090599 (Tracking ID: 4080897)

SYMPTOM:
Observed Performance drop on raw VxVM volume in RHEL 8.x compared to RHEL7.X

DESCRIPTION:
There has been a change in the file_operations used for character devices between the RHEL 7.x and RHEL 8.x releases. In RHEL 7.x the aio_read and aio_write function pointers are implemented, whereas these changed to read_iter and write_iter respectively in the later release. In the RHEL 8.x changes, the VxVM code called generic_file_write_iter(). The problem is that this function takes an inode lock, and in multi-threaded write operations this semaphore causes effectively serial processing of I/O submission, leading to dropped performance.

RESOLUTION:
Use of __generic_file_write_iter() function helps to resolve the issue and vxvm_generic_write_sync() function is implemented which handles the SYNCing part of the write similar to functions like blkdev_write_iter() and generic_file_write_iter().

* 4090604 (Tracking ID: 4044529)

SYMPTOM:
DMP is unable to display PWWN details for some LUNs by "vxdmpadm getportids".

DESCRIPTION:
The udev rules file (/usr/lib/udev/rules.d/63-fc-wwpn-id.rules) from newer RHEL OS versions generates an additional hardware path for an FC device, so there are two hardware paths for the same device. However, the vxpath_links script only considers a single hardware path for an FC device. In the case of two hardware paths, vxpath_links may not treat the device as an FC device and thus fails to populate the PWWN-related information.

RESOLUTION:
Code changes have been made so that vxpath_links correctly detects an FC device even when there are multiple hardware paths.

* 4090932 (Tracking ID: 3996634)

SYMPTOM:
A system boot with a large number of LUNs managed by VxDMP takes a long time.

DESCRIPTION:
When a system has a large number of LUNs managed by VxDMP that are mounted on a primary partition or formatted with some type of file system, during boot the DMP device is removed and a udev event is triggered against the OS device, which is then read with the lsblk command. The lsblk command is slow, and if lsblk commands are issued against multiple devices in parallel they may get stuck, so the system boot takes a long time.

RESOLUTION:
The code has been changed to read the OS device with the blkid command rather than the lsblk command.
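
Both of the following report the file-system signature of a device; the boot-time query now relies on blkid (the device name is illustrative):

# lsblk -no FSTYPE /dev/sdX
# blkid -s TYPE -o value /dev/sdX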

* 4090946 (Tracking ID: 4023297)

SYMPTOM:
Smartmove functionality was not being used after VVR Rlink was paused and resumed during VVR initial sync or DCM resync operation. This was resulting in more data transfer to VVR secondary site than needed.

DESCRIPTION:
The transactions for the VVR pause and resume operations were being treated as phases after which smartmove no longer needs to be used. As a result, smartmove was not being used after the resume operation.

RESOLUTION:
Fixed the condition so that smartmove continues to work beyond pause/resume operations.

* 4090960 (Tracking ID: 4087770)

SYMPTOM:
Data corruption post mirror attach operation seen after complete storage fault for DCO volumes.

DESCRIPTION:
DCO (data change object) tracks delta changes for faulted mirrors. During a complete storage loss of the DCO volume mirrors, the DCO object is marked as BADLOG and becomes unusable for bitmap tracking.
After storage reconnect (such as a node rejoin in FSS environments), the DCO is repaired for subsequent tracking. During this, if VxVM finds any mirrors detached for data volumes, they are expected to be marked for full resync, as the bitmap in the DCO has no valid information. A bug in the repair DCO operation logic prevented marking the mirror for full resync in cases where the repair DCO operation is triggered before the data volume is started. This resulted in the mirror getting attached without any data being copied from the good mirrors, so reads serviced from such mirrors returned stale data, resulting in file system corruption and data loss.

RESOLUTION:
Code has been added to ensure repair DCO operation is performed only if volume object is enabled so as to ensure detached mirrors are marked for full-resync appropriately.

* 4090970 (Tracking ID: 4017036)

SYMPTOM:
After enabling DMP (Dynamic Multipathing) Native support, enable /boot to be
mounted on DMP device when Linux is booting with systemd.

DESCRIPTION:
Currently /boot is mounted on top of OS (Operating System) device. When DMP
Native support is enabled, only VG's (Volume Groups) are migrated from OS 
device to DMP device.This is the reason /boot is not migrated to DMP device.
With this if OS device path is not available then system becomes unbootable 
since /boot is not available. Thus it becomes necessary to mount /boot on DMP
device to provide multipathing and resiliency. 
The current fix can only work on configurations with single boot partition.

RESOLUTION:
Code changes have been done to migrate /boot on top of DMP device when DMP
Native support is enabled and when Linux is booting with systemd.
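
After enabling DMP native support and rebooting, whether /boot is on a DMP device can be checked with, for example:

# findmnt /boot
# grep /boot /etc/fstab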

* 4091248 (Tracking ID: 4040808)

SYMPTOM:
df command hung in clustered environment

DESCRIPTION:
The df command hung in a clustered environment because DRL updates were not completing, causing application I/Os to hang.

RESOLUTION:
A fix has been added to complete in-core DRL updates and drive the corresponding application I/Os.

* 4091588 (Tracking ID: 3966157)

SYMPTOM:
The SRL batching feature was broken and could not be enabled, as it might cause problems.

DESCRIPTION:
Batching of updates is needed to get the benefit of combining multiple updates and increasing performance.

RESOLUTION:
The design has been simplified: each small update within a batch is now aligned to a 4K size, so by default the whole batch is aligned and there is no need for bookkeeping around the last update, reducing the overhead of extra calculations.

Individual updates are padded to 4K to reduce the overhead of bookkeeping around the last update in a batch; by padding each update to 4K, the batch of updates is itself 4K aligned.

* 4091910 (Tracking ID: 4090321)

SYMPTOM:
vxvm-boot service startup failure

DESCRIPTION:
The vxvm-boot service takes a long time to start and times out. With a larger number of devices, device discovery takes more time to finish.

RESOLUTION:
The service timeout has been increased so that discovery gets more time to finish.
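
Independently of the shipped change, a systemd start timeout can also be extended locally with a drop-in; this is only a generic illustration (the value shown is arbitrary):

# systemctl edit vxvm-boot.service
  [Service]
  TimeoutStartSec=600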

* 4091911 (Tracking ID: 4090192)

SYMPTOM:
vxvm-boot service startup failure

DESCRIPTION:
The vxvm-boot service takes a long time to start and times out. With a larger number of devices, device discovery takes more time to finish.

RESOLUTION:
The number of device discovery threads has been increased to a range of 128 to 256, depending on the CPUs available on the system.

* 4091912 (Tracking ID: 4090234)

SYMPTOM:
The vxvm-boot service takes a long time to start and times out on setups with a large number of LUNs.

DESCRIPTION:
The device discovery layer and InfiniBand devices (default 120s delay) take a long time to discover
the devices, which causes the Volume Manager service timeout.
Messages logged:
Jul 28 19:52:52 nb-appliance vxvm-boot[17711]: VxVM general startup...
Jul 28 19:57:51 nb-appliance systemd[1]: vxvm-boot.service: start operation timed out. Terminating.
Jul 28 19:57:51 nb-appliance vxvm-boot[17711]: Terminated
Jul 28 19:57:51 nb-appliance systemd[1]: vxvm-boot.service: Control process exited, code=exited status=100
Jul 28 19:59:22 nb-appliance systemd[1]: vxvm-boot.service: State 'stop-final-sigterm' timed out. Killing.
Jul 28 19:59:23 nb-appliance systemd[1]: vxvm-boot.service: Killing process 209714 (vxconfigd) with signal SIGKILL.
Jul 28 20:00:30 nb-appliance systemd[1]: vxvm-boot.service: Failed with result 'timeout'.
Jul 28 20:00:30 nb-appliance systemd[1]: Failed to start VERITAS Volume Manager Boot service.

RESOLUTION:
The required changes have been completed to fix this issue.

The vxvm-boot service timeout is suspected to be due to multiple issues:
1. The OS takes a long time to discover devices.
2. vxvm-startup sleeps for 120 seconds when InfiniBand devices or controllers are present on the setup.

For issue 1, the main delay lies in the device discovery layer, which takes more time; the OS itself has also been seen taking a long time to discover devices.
For issue 2, vxvm-startup now sleeps for 120 seconds only if InfiniBand devices are claimed by an ASL, instead of whenever InfiniBand devices or controllers are present.

* 4091963 (Tracking ID: 4067191)

SYMPTOM:
In CVR environment after rebooting Slave node, Master node may panic with below stack:

Call Trace:
dump_stack+0x66/0x8b
panic+0xfe/0x2d7
volrv_free_mu+0xcf/0xd0 [vxio]
vol_ru_free_update+0x81/0x1c0 [vxio]
volilock_release_internal+0x86/0x440 [vxio]
vol_ru_free_updateq+0x35/0x70 [vxio]
vol_rv_write2_done+0x191/0x510 [vxio]
voliod_iohandle+0xca/0x3d0 [vxio]
wake_up_q+0xa0/0xa0
voliod_iohandle+0x3d0/0x3d0 [vxio]
voliod_loop+0xc3/0x330 [vxio]
kthread+0x10d/0x130
kthread_park+0xa0/0xa0
ret_from_fork+0x22/0x40

DESCRIPTION:
As part of a CVM master switch, an rvg_recovery is triggered. In this step, a race
condition can occur between the VVR objects, due to which an object value
is not updated properly, and this can cause a panic.

RESOLUTION:
Code changes are done to handle the race condition between VVR objects.

* 4091989 (Tracking ID: 4090930)

SYMPTOM:
Relocation of failed data disk of mirror volume leads to data corruption.

DESCRIPTION:
The existing volume had another faulted mirror, and the detached mirror was being tracked in the detach map of the data change object (DCO). At the same time, the VxVM relocation daemon decided to relocate another failed disk of the volume, which was expected to be a full copy of data. Due to a bug in the relocation code, the relocation operation was allowed even while the volume was in the DISABLED state. When the volume became ENABLED, the task copying data to the new mirror incorrectly used the detach map instead of a full sync, resulting in data loss on the new mirror.

RESOLUTION:
Code has been changed to block triggering relocation of disks when top-level volume is not in ENABLED state.



* 4092002 (Tracking ID: 4081740)

SYMPTOM:
The vxdg flush command is slow because too many LUNs needlessly access /proc/partitions.

DESCRIPTION:
Linux BLOCK_EXT_MAJOR (block major 259) is used as the extended devt for block devices. When the partition number of a device is more than 15, the partition device gets assigned under major 259 to work around the sd limitation of 16 minors per device, so that more partitions are allowed for one sd device. During "vxdg flush", for each LUN in the disk group, vxconfigd reads /proc/partitions line by line through fgets() to find all the partition devices with major number 259, which causes vxconfigd to respond sluggishly if there are a large number of LUNs in the disk group.

RESOLUTION:
The code has been changed to remove the needless access to /proc/partitions for LUNs that do not use the extended devt.

* 4092838 (Tracking ID: 4101128)

SYMPTOM:
Old VxVM rpm fails to load on RHEL8.7

DESCRIPTION:
RHEL8.7 is a new OS release with multiple kernel changes that made VxVM incompatible with OS kernel version 4.18.0-425.3.1.

RESOLUTION:
The required code changes have been made and the VxVM module has been compiled with the RHEL 8.7 kernel.

* 4099550 (Tracking ID: 4065145)

SYMPTOM:
During addsec, encrypted volume tags could not be processed for multiple volumes and vsets.
The error seen:

$ vradmin -g dg2 -encrypted addsec dg2_rvg1 10.210.182.74 10.210.182.75

Error: Duplicate tag name vxvm.attr.enckeytype provided in input.

DESCRIPTION:
The number of tags was not defined, and all tags were processed at once instead of processing the maximum number of tags for a volume.

RESOLUTION:
A number-of-tags variable that depends on the cipher method (CBC/GCM) has been introduced, and minor code issues have been fixed.

Patch ID: VRTSvxvm-7.4.2.3300

* 4083792 (Tracking ID: 4082799)

SYMPTOM:
A security vulnerability exists in the third-party component libcurl.

DESCRIPTION:
VxVM uses a third-party component named libcurl in which a security vulnerability exists.

RESOLUTION:
VxVM is updated to use a newer version of libcurl in which the security vulnerability has been addressed.

Patch ID: VRTSvxvm-7.4.2.3200

* 4011971 (Tracking ID: 3991668)

SYMPTOM:
In a VVR configuration with secondary logging enabled, data inconsistency is reported after the "No IBC message arrived" error is encountered.

DESCRIPTION:
It might happen that the VVR secondary node handles updates with larger sequence IDs before the In-Band Control (IBC) update arrives. In this case, VVR drops the IBC update. Due to the updates with the larger sequence IDs than the one for the IBC update, data writes cannot be started, and they get queued. Data loss may occur after the VVR secondary receives an atomic commit and frees the queue. If this situation occurs, the "vradmin verifydata" command reports data inconsistency.

RESOLUTION:
VVR is modified to trigger updates as they are received in order to start data volume writes.

* 4013169 (Tracking ID: 4011691)

SYMPTOM:
Observed high CPU consumption on the VVR secondary nodes because of high pending IO load.

DESCRIPTION:
High replication related IO load on the VVR secondary and the requirement of maintaining write order fidelity with limited memory pools created  contention. This resulted in multiple VxVM kernel threads contending for shared resources and there by increasing the CPU consumption.

RESOLUTION:
Limited the way in which VVR consumes its resources so that a high pending IO load would not result into high CPU consumption.

* 4037288 (Tracking ID: 4034857)

SYMPTOM:
Loading of the VxVM modules was failing on SLES15 SP2 (kernel 5.3.18-22.2-default).

DESCRIPTION:
With the new kernel (5.3.18-22.2-default), the below-mentioned interfaces were deprecated:
1. gettimeofday()
2. struct timeval
3. bio_segments()
4. iov_for_each()
5. the next_rq field in struct request
Also, data corruption was possible with large I/O (>1M) processed by Linux kernel I/O splitting.

RESOLUTION:
The code changes mainly support kernel 5.3.18 and provide replacements for the deprecated functions:
the dependency on the req->next_rq field in the blk-mq code has been removed,
and the Linux kernel I/O split functions, which are redundant for VxVM I/O processing, are now bypassed.

* 4054311 (Tracking ID: 4040701)

SYMPTOM:
Below warnings are observed while installing the VXVM package.
WARNING: libhbalinux/libhbaapi is not installed. vxesd will not capture SNIA HBA API library events.
mv: '/var/adm/vx/cmdlog' and '/var/log/vx/cmdlog' are the same file
mv: '/var/adm/vx/cmdlog.1' and '/var/log/vx/cmdlog.1' are the same file
mv: '/var/adm/vx/cmdlog.2' and '/var/log/vx/cmdlog.2' are the same file
mv: '/var/adm/vx/cmdlog.3' and '/var/log/vx/cmdlog.3' are the same file
mv: '/var/adm/vx/cmdlog.4' and '/var/log/vx/cmdlog.4' are the same file
mv: '/var/adm/vx/ddl.log' and '/var/log/vx/ddl.log' are the same file
mv: '/var/adm/vx/ddl.log.0' and '/var/log/vx/ddl.log.0' are the same file
mv: '/var/adm/vx/ddl.log.1' and '/var/log/vx/ddl.log.1' are the same file
mv: '/var/adm/vx/ddl.log.10' and '/var/log/vx/ddl.log.10' are the same file
mv: '/var/adm/vx/ddl.log.11' and '/var/log/vx/ddl.log.11' are the same file

DESCRIPTION:
Some warnings are observed while installing the VxVM package.

RESOLUTION:
Appropriate code changes are done to avoid the warnings.

* 4056919 (Tracking ID: 4056917)

SYMPTOM:
In Flexible Storage Sharing (FSS) environments, a disk group import operation with a few disks missing leads to data corruption.

DESCRIPTION:
In FSS environments, importing a disk group with missing disks is not allowed. If the disk with the most recently updated configuration information is not present during the import, the import operation incorrectly incremented the config TID on the remaining disks before failing. When the missing disk(s) with the latest configuration came back, the import succeeded, but because of the earlier failed transaction the import operation chose the wrong configuration to import the disk group, leading to data corruption.

RESOLUTION:
The disk group import logic is modified to ensure that the check for failed/missing disks happens early, before attempting to perform any on-disk update as part of the import.

* 4058873 (Tracking ID: 4057526)

SYMPTOM:
Whenever vxnm-vxnetd is loaded, it reports "Cannot touch '/var/lock/subsys/vxnm-vxnetd': No such file or directory" in /var/log/messages.

DESCRIPTION:
A new systemd update removed support for the "/var/lock/subsys/" directory. Thus, whenever vxnm-vxnetd is loaded on systems that use systemd, it
reports "cannot touch '/var/lock/subsys/vxnm-vxnetd': No such file or directory".

RESOLUTION:
Added a check in vxnm-vxnetd.sh to validate whether the /var/lock/subsys/ directory exists, as sketched below.
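The following is an illustrative sketch only (not the actual vxnm-vxnetd.sh code) of the kind of guard such a check amounts to, assuming the legacy lock directory may be absent on newer systemd versions:

    # Sketch: create the subsystem lock file only if the legacy directory exists
    if [ -d /var/lock/subsys ]; then
        touch /var/lock/subsys/vxnm-vxnetd
    fi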

* 4060839 (Tracking ID: 3975667)

SYMPTOM:
NMI watchdog: BUG: soft lockup

DESCRIPTION:
When flow control on the IO shipping channel is set, there is a window in the code where the vol_ioship_sender thread can go into a tight loop.
This causes the soft lockup.

RESOLUTION:
The CPU is relinquished to allow other processes to be scheduled. The vol_ioship_sender() thread restarts after a short delay.

* 4060962 (Tracking ID: 3915202)

SYMPTOM:
vxconfigd hangs during 'vxconfigd -k -r reset'.

DESCRIPTION:
The vxconfigd hang is observed because all the file descriptors of the process have been exhausted due to a file descriptor leak. The fix for this leak had not been integrated, hence the issue.

RESOLUTION:
Appropriate code changes have been made to handle the file descriptor leak.

* 4060966 (Tracking ID: 3959716)

SYMPTOM:
The system may panic in a VVR configuration with synchronous replication, when the VVR RVG is in DCM mode, with the following panic stack:
volsync_wait [vxio]
voliod_iohandle [vxio]
volted_getpinfo [vxio]
voliod_loop [vxio]
voliod_kiohandle [vxio]
kthread

DESCRIPTION:
With synchronous replication, if the ACK for a data message is delayed from the secondary site, the
primary site might incorrectly free the message from its waiting queue.
Due to this incorrect handling of the message, a system panic may happen.

RESOLUTION:
Required code changes are done to resolve the panic issue.

* 4061004 (Tracking ID: 3993242)

SYMPTOM:
vxsnap prepare on a vset might throw the error: "VxVM vxsnap ERROR V-5-1-19171 Cannot perform prepare operation on cloud
volume"

DESCRIPTION:
Some wrong volume-record entries were being fetched for the vset, due to which the required validations failed, triggering the issue.

RESOLUTION:
Code changes have been done to resolve the issue.

* 4061036 (Tracking ID: 4031064)

SYMPTOM:
During master switch with replication in progress, cluster wide hang is seen on VVR secondary.

DESCRIPTION:
With an application running on the primary and replication set up between the VVR primary and secondary, when a master switch operation is attempted on the secondary, it hangs permanently.

RESOLUTION:
Appropriate code changes have been made to handle the master switch operation while replication data is being processed on the secondary.

* 4061055 (Tracking ID: 3999073)

SYMPTOM:
Data corruption occurred when the fast mirror resync (FMR) was enabled and the failed plex of striped-mirror layout was attached.

DESCRIPTION:
A plex attach operation with FMR tracking uses the contents of the detach maps to determine and recover the regions of the volume that need to be re-synced.

When the DCO region size is larger than the stripe unit of the volume, the code logic in the plex attach code path incorrectly skipped bits in the detach maps. Thus, some of the regions (offset-len) of the volume were not synced with the attached plex, leading to inconsistent mirror contents.

RESOLUTION:
To resolve the data corruption issue, the plex attach code has been modified to consider all the bits for a given region (offset-len).

* 4061057 (Tracking ID: 3931583)

SYMPTOM:
Node may panic while uninstalling or upgrading the VxVM package or during reboot.

DESCRIPTION:
Due to a race condition in Volume Manager (VxVM), IO may be queued for processing while the vxio module is being unloaded. This results in VxVM acquiring and accessing a lock which is currently being freed and it may panic the system with the following backtrace:

 #0 [ffff88203da089f0] machine_kexec at ffffffff8105d87b
 #1 [ffff88203da08a50] __crash_kexec at ffffffff811086b2
 #2 [ffff88203da08b20] panic at ffffffff816a8665
 #3 [ffff88203da08ba0] nmi_panic at ffffffff8108ab2f
 #4 [ffff88203da08bb0] watchdog_overflow_callback at ffffffff81133885
 #5 [ffff88203da08bc8] __perf_event_overflow at ffffffff811727d7
 #6 [ffff88203da08c00] perf_event_overflow at ffffffff8117b424
 #7 [ffff88203da08c10] intel_pmu_handle_irq at ffffffff8100a078
 #8 [ffff88203da08e38] perf_event_nmi_handler at ffffffff816b7031
 #9 [ffff88203da08e58] nmi_handle at ffffffff816b88ec
#10 [ffff88203da08eb0] do_nmi at ffffffff816b8b1d
#11 [ffff88203da08ef0] end_repeat_nmi at ffffffff816b7d79
 [exception RIP: _raw_spin_unlock_irqrestore+21]
 RIP: ffffffff816b6575 RSP: ffff88203da03d98 RFLAGS: 00000283
 RAX: 0000000000000283 RBX: ffff882013f63000 RCX: 0000000000000080
 RDX: 0000000000000001 RSI: 0000000000000283 RDI: 0000000000000283
 RBP: ffff88203da03d98 R8: 00000000005d1cec R9: ffff8810e8ec0000
 R10: 0000000000000002 R11: ffff88203da03da8 R12: ffff88103af95560
 R13: ffff882013f630c8 R14: 0000000000000001 R15: 0000000000000ca5
 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
#12 [ffff88203da03d98] _raw_spin_unlock_irqrestore at ffffffff816b6575
#13 [ffff88203da03da0] voliod_qsio at ffffffffc0fd14c3 [vxio]
#14 [ffff88203da03dd0] vol_sample_timeout at ffffffffc101d8df [vxio]
#15 [ffff88203da03df0] __voluntimeout at ffffffffc0fd34be [vxio]
#16 [ffff88203da03e18] voltimercallback at ffffffffc0fd3568 [vxio]
...
...

RESOLUTION:
Code changes have been made to handle the race condition and prevent access to resources that are being freed.

* 4061298 (Tracking ID: 3982103)

SYMPTOM:
When the available system memory is low, an I/O hang is seen.

DESCRIPTION:
In low-memory situations, the memory allocated to some VVR IOs was not released properly, so new application IOs
could not be served once the VVR memory pool became fully utilized. This resulted in an IO hang.

RESOLUTION:
Code changes have been made to properly release the memory in low-memory situations.

* 4061317 (Tracking ID: 3925277)

SYMPTOM:
vxdisk resize corrupts the disk public region and causes the file system mount to fail.

DESCRIPTION:
While resizing a single-path disk with a GPT label, the policy data is not updated according to the changes made to the da/dm records during the two resize transactions. Hence, private region IOs are sent to the old private region device on partition 3. This may cause private region data to be written to the public region, resulting in corruption.

RESOLUTION:
Code changes have been made to fix the problem.

* 4061509 (Tracking ID: 4043337)

SYMPTOM:
The rp_rv.log file consumes space for logging.

DESCRIPTION:
The rp_rv log files need to be removed, and the logger should use 16 MB rotating log files.

RESOLUTION:
Code changes are implemented to disable logging to the rp_rv.log files.

* 4062461 (Tracking ID: 4066785)

SYMPTOM:
When the replicated disks are in SPLIT mode, importing their disk group failed with "Device is a hardware mirror".

DESCRIPTION:
When the replicated disks are in SPLIT mode, in which they are readable and writable, importing their disk group failed with "Device is a hardware mirror". The third-party array does not expose a disk attribute to indicate that a disk is in SPLIT mode. With this enhancement, the replicated disk group can be imported with the option `-o usereplicatedev=only`.

RESOLUTION:
The code is enhanced to import the replicated disk group with the option `-o usereplicatedev=only`, as shown below.
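For example, assuming a disk group named repdg residing on the replicated devices (the disk group name is illustrative), the import can be attempted as:

    # vxdg -o usereplicatedev=only import repdg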

* 4062577 (Tracking ID: 4062576)

SYMPTOM:
When 'hastop -local' is used to stop the cluster, the dg deport command hangs. The below stack trace is observed in the system logs:

#0 [ffffa53683bf7b30] __schedule at ffffffffa834a38d
 #1 [ffffa53683bf7bc0] schedule at ffffffffa834a868
 #2 [ffffa53683bf7bd0] blk_mq_freeze_queue_wait at ffffffffa7e4d4e6
 #3 [ffffa53683bf7c18] blk_cleanup_queue at ffffffffa7e433b8
 #4 [ffffa53683bf7c30] vxvm_put_gendisk at ffffffffc3450c6b [vxio]   
 #5 [ffffa53683bf7c50] volsys_unset_device at ffffffffc3450e9d [vxio]
 #6 [ffffa53683bf7c60] vol_rmgroup_devices at ffffffffc3491a6b [vxio]
 #7 [ffffa53683bf7c98] voldg_delete at ffffffffc34932fc [vxio]
 #8 [ffffa53683bf7cd8] vol_delete_group at ffffffffc3494d0d [vxio]
 #9 [ffffa53683bf7d18] volconfig_ioctl at ffffffffc3555b8e [vxio]
#10 [ffffa53683bf7d90] volsioctl_real at ffffffffc355fc8a [vxio]
#11 [ffffa53683bf7e60] vols_ioctl at ffffffffc124542d [vxspec]
#12 [ffffa53683bf7e78] vols_unlocked_ioctl at ffffffffc124547d [vxspec]
#13 [ffffa53683bf7e80] do_vfs_ioctl at ffffffffa7d2deb4
#14 [ffffa53683bf7ef8] ksys_ioctl at ffffffffa7d2e4f0
#15 [ffffa53683bf7f30] __x64_sys_ioctl at ffffffffa7d2e536

DESCRIPTION:
This issue is seen due to kernel-side changes in request queue handling. The existing VxVM code sets the request handling entry point (make_request_fn) to vxvm_gen_strategy, and this functionality is impacted by those changes.

RESOLUTION:
Code changes are added to handle the request queues using blk_mq_init_allocated_queue.

* 4062746 (Tracking ID: 3992053)

SYMPTOM:
Data corruption may happen with layered volumes because some data is not re-synced while attaching a plex, leaving
inconsistent data across the plexes.

DESCRIPTION:
When a plex is detached in a layered volume, the regions which are dirty/modified are tracked in DCO (Data change object) map.
When the plex is attached back, the data corresponding to these dirty regions is re-synced to the plex being attached.
There was a defect in the code due to which certain regions were NOT re-synced when a plex was attached.
This issue happens only when the offset of the sub-volume is not aligned with the region size of the DCO (Data Change Object) volume.

RESOLUTION:
The code defect is fixed to correctly copy the data for dirty regions when the sub-volume offset is NOT aligned with the DCO region size.

* 4062747 (Tracking ID: 3943707)

SYMPTOM:
vxconfigd reconfig hangs when joining a cluster, with the below stack:
volsync_wait [vxio]
_vol_syncwait [vxio]
voldco_await_shared_tocflush [vxio]
volcvm_ktrans_fmr_cleanup [vxio]
vol_ktrans_commit [vxio]
volconfig_ioctl [vxio]
volsioctl_real [vxio]
vols_ioctl [vxspec]
vols_unlocked_ioctl [vxspec]
vfs_ioctl  
do_vfs_ioctl

DESCRIPTION:
A race condition caused the current seqno on the tocsio not to be incremented on one of the nodes. While the master and the other slaves move to the next stage with a higher seqno, this slave drops the DISTRIBUTE message. The message is retried from the master and the slave keeps dropping it, leading to the hang.

RESOLUTION:
Code changes have been made to avoid the race condition.

* 4062751 (Tracking ID: 3989185)

SYMPTOM:
In a Veritas Volume Replicator (VVR) environment, the vxrecover command can hang.

DESCRIPTION:
When vxrecover is triggered after a storage failure, the vxrecover operation may hang.
This is because vxrecover performs the RVG recovery, and as part of this recovery dummy updates are written to the SRL.
Due to a bug in the code, these updates were written incorrectly to the SRL, which caused the flush operation from the SRL to the data volumes to hang.

RESOLUTION:
Code changes have been made so that the dummy updates are written correctly to the SRL.

* 4062755 (Tracking ID: 3978453)

SYMPTOM:
Reconfig hangs during master takeover, with the below stack:
volsync_wait+0xa7/0xf0 [vxio]
volsiowait+0xcb/0x110 [vxio]
vol_commit_iolock_objects+0xd3/0x270 [vxio]
vol_ktrans_commit+0x5d3/0x8f0 [vxio]
volconfig_ioctl+0x6ba/0x970 [vxio]
volsioctl_real+0x436/0x510 [vxio]
vols_ioctl+0x62/0xb0 [vxspec]
vols_unlocked_ioctl+0x21/0x30 [vxspec]
do_vfs_ioctl+0x3a0/0x5a0

DESCRIPTION:
There is a hang in the dcotoc protocol on a slave, which causes a couple of slave nodes to not respond with LEAVE_DONE to the master, hence the issue.

RESOLUTION:
Code changes have been made to handle a transaction overlapping with a shared TOC update. The TOC update SIO flag is passed from the old object to the new object during the transaction, so that recovery can resume if required.

* 4063374 (Tracking ID: 4005121)

SYMPTOM:
Application IOs appear hung or progress slowly until the SRL-to-DCM flush finishes.

DESCRIPTION:
When the VVR SRL gets full, DCM protection is triggered and application IOs appear hung until the SRL-to-DCM flush finishes.

RESOLUTION:
A fix was added to avoid duplicate DCM tracking through vol_rvdcm_log_update(), which comparatively reduces the IOPS drop.

* 4064523 (Tracking ID: 4049082)

SYMPTOM:
An I/O read error is displayed when a remote FSS node is rebooting.

DESCRIPTION:
When a remote FSS node is rebooting, I/O read requests to a mirror volume that are scheduled on the remote disk from the FSS node should be redirected to the remaining plex. However, current VxVM does not handle this correctly: the retried I/O requests can still be sent to the offline remote disk, which causes the final I/O read failure.

RESOLUTION:
Code changes have been done to schedule the retrying read request on the remaining plex.

* 4066930 (Tracking ID: 3951527)

SYMPTOM:
A data loss issue is seen because of incorrect version-check handling done as part of the SRL 4k update alignment changes in the 7.4 release.

DESCRIPTION:
On the primary, the rv_target_rlink field is always set to NULL, which internally skips the 4k version check in the VOL_RU_INIT_UPDATE macro. This causes SRL writes to be written in a 4k-aligned manner even though the remote RVG version is <= 7.3.1, resulting in data loss.

RESOLUTION:
Changes are made to use rv_replicas rather than rv_target_rlink so that the version is checked appropriately for all sites, and SRL IOs are not written in a 4k-aligned manner when it is not supported.
Also, the RVG version is no longer upgraded as part of a disk group upgrade if the rlinks are in the attached state. The RVG version can be upgraded using the 'vxrvg upgrade' command after detaching the rlinks and once all sites are upgraded, as outlined below.
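A hedged outline of that sequence, with illustrative object names (dg1, rlk_sec, rvg1), is:

    # vxrlink -g dg1 det rlk_sec     <- detach the rlink(s)
    # vxrvg -g dg1 upgrade rvg1      <- upgrade the RVG version once all sites are upgraded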

* 4067706 (Tracking ID: 4060462)

SYMPTOM:
System is unresponsive while adding new nodes.

DESCRIPTION:
After a node is removed and a node with a different node name is added, the system turns
unresponsive. When a node leaves the cluster, the in-memory information related to that node is not cleared, due to a race condition.

RESOLUTION:
Fixed the race condition so that the in-memory information of the node that leaves the cluster is cleared.

* 4067710 (Tracking ID: 4064208)

SYMPTOM:
A node is unresponsive while it is being added to the cluster.

DESCRIPTION:
While a node joins the cluster, if the bits on the node have been upgraded, the size
of the object is interpreted incorrectly. The issue is observed when the number of objects is high, on
InfoScale 7.3.1 and above.

RESOLUTION:
Correct sizes are calculated for the data received from the master node.

* 4067712 (Tracking ID: 3868140)

SYMPTOM:
A VVR primary site node might panic if the rlink disconnects while some data is being replicated to the secondary, with the below stack:

dump_stack()
panic()
vol_rv_service_message_start()
update_curr()
put_prev_entity()
voliod_iohandle()
voliod_loop()
voliod_iohandle()

DESCRIPTION:
If the rlink disconnects, VVR clears some handles to the in-progress updates in memory. But if some IOs are still being acknowledged from the secondary to the primary, accessing the updates for these IOs might result in a panic on the primary node.

RESOLUTION:
Code fix is implemented to correctly access the primary node updates in order to avoid the panic.

* 4067713 (Tracking ID: 3997531)

SYMPTOM:
VVR replication is not working, as vxnetd does not start properly.

DESCRIPTION:
If vxnetd restarts, a race condition blocks the completion of vxnetd start function after the shutdown process is completed.

RESOLUTION:
To avoid the race condition, the vxnetd start and stop functions are synchronized.

* 4067715 (Tracking ID: 4008740)

SYMPTOM:
System panic

DESCRIPTION:
Due to a race condition, code accessed a freed VVR update, which resulted in a system panic.

RESOLUTION:
Fixed the race condition to avoid the incorrect memory access.

* 4067717 (Tracking ID: 4009151)

SYMPTOM:
Auto-import of diskgroup on system reboot fails with error:
"Disk for diskgroup not found"

DESCRIPTION:
When a disk group is auto-imported, VxVM (Veritas Volume Manager) tries to find the disk with the latest configuration copy. During this, the DG import process searches through all the disks and also tries to determine whether the DG contains clone disks or standard disks. While doing this calculation, the DG import process incorrectly determines that the current DG contains cloned disks instead of standard disks because of a stale value left over from the previously selected DG. Since VxVM incorrectly decides to import cloned disks instead of standard disks, the import fails with the "Disk for diskgroup not found" error.

RESOLUTION:
Code has been modified to accurately determine whether the DG contains standard or cloned disks and accordingly use those disks for DG import.

* 4067914 (Tracking ID: 4037757)

SYMPTOM:
VVR services always get started on boot up even if VVR is not being used.

DESCRIPTION:
VVR services start automatically because they are integrated into system init.d or a similar framework.

RESOLUTION:
Added a tunable to prevent VVR services from starting at boot.

* 4067915 (Tracking ID: 4059134)

SYMPTOM:
The resync task takes too long on a large RAID-5 volume.

DESCRIPTION:
The resync of a RAID-5 volume is done in small regions, and a checkpoint is set up for each region. If the RAID-5 volume is large, it is divided into a large number of regions for resync, and checkpoint setup is issued against each region in a loop. In each cycle, the resync utility opens a connection to the vxconfigd daemon to do this, and a client is created in vxconfigd's context for each region. As the number of created clients grows large, vxconfigd takes a long time to traverse the client list, which introduces the performance issue for the resync.

RESOLUTION:
Code changes are made so that only one client is created during the whole resync process, so little time is spent traversing the client list.

* 4069522 (Tracking ID: 4043276)

SYMPTOM:
If the admin has offlined disks with "vxdisk offline <disk_name>", vxattachd may bring the disks back to the online state.

DESCRIPTION:
The "offline" state of VxVM disks is not stored persistently, it is recommended to use "vxdisk define" to persistently offline a disk.
For Veritas Netbackup Appliance there is a requirement that vxattachd shouldn't online the disks offlined with "vxdisk offline" operation.
To cater this request we have added tunable based enhancement to vxattachd for Netbackup Appliance use case.
The enhancement are done specifically so that Netback Appliance script can use it.
Following are the tunable details.
If skip_offline tunable is set then it will avoid offlined disk into online state. 
If handle_invalid_disk is set then it will offlined the "online invalid" SAN disks.
If remove_disable_dmpnode is set then it will cleanup stale entries from disk.info file and VxVM layer.
By default these tunables are off, we DONOT recommend InfoScale users to enable these vxattachd tunables.

RESOLUTION:
Code changes are done in vxattachd to cater to the NetBackup Appliance use cases.

* 4069523 (Tracking ID: 4056751)

SYMPTOM:
When importing a disk group containing all cloned disks with the cloneoff option (-c), and some of the disks are in read-only status, the import fails and some of the writable disks are removed from the disk group unexpectedly.

DESCRIPTION:
During a disk group import with the cloneoff option, the DA_VFLAG_ASSOC_DG flag gets set because a dgid update is necessary. When associating the da records of the read-only disks, updating the private region TOC fails because of a write failure, so the pending associations are aborted for all disks. During the abort, the disks carrying the DA_VFLAG_ASSOC_DG flag are removed from the disk group and their config copies are offlined. Hence, what appears to be private region corruption is seen on the writable disks; they were actually removed from the disk group.

RESOLUTION:
The issue has been fixed by failing the import at an early stage if some of the disks are read-only.

* 4069524 (Tracking ID: 4056954)

SYMPTOM:
When performing addsec using the VIPs with SSL enabled, a hang is observed.

DESCRIPTION:
The issue occurs when, on the primary side, vradmin tries to create a local socket with the local VIP and the interface IP as endpoints, ends up calling SSL_accept(), and gets stuck indefinitely.

RESOLUTION:
Appropriate code changes are done to use the vvr_islocalip() function to identify whether an IP is local to the node.
By using vvr_islocalip(), SSL_accept() is now called only if the IP is a remote IP.

* 4070099 (Tracking ID: 3159650)

SYMPTOM:
vxtune did not support vol_vvr_use_nat.

DESCRIPTION:
Platform-specific methods were required to set the vol_vvr_use_nat tunable, because support for it in the vxtune command was not present.

RESOLUTION:
Added vol_vvr_use_nat support to the vxtune command, as illustrated below.
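For example, the tunable can now be queried and set through vxtune; the value format shown here is an assumption, so consult the vxtune documentation for the exact syntax:

    # vxtune vol_vvr_use_nat        <- display the current value
    # vxtune vol_vvr_use_nat 1      <- enable use of NAT for VVR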

* 4070253 (Tracking ID: 3911930)

SYMPTOM:
Valid PGR operations sometimes fail on a DMP node.

DESCRIPTION:
As part of the PGR operations, if the inquiry command finds that PGR is not
supported on the DMP node, the PGR_FLAG_NOTSUPPORTED flag is set on the node. Further PGR operations check this value and issue PGR commands only if the flag is not set. PGR_FLAG_NOTSUPPORTED remains set even if the hardware is changed so as to support PGR.

RESOLUTION:
A new command, enablepr, is provided in the vxdmppr utility to clear this flag on the specified DMP node; an illustrative invocation follows.
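An illustrative invocation, with <dmpnode_name> as a placeholder for the DMP node on which PGR support should be re-evaluated:

    # vxdmppr enablepr <dmpnode_name>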

* 4071131 (Tracking ID: 4071605)

SYMPTOM:
A security vulnerability exists in the third-party component libxml2.

DESCRIPTION:
VxVM uses a third-party component named libxml2 in which a security vulnerability exists.

RESOLUTION:
VxVM is updated to use a newer version of libxml2 in which the security vulnerability has been addressed.

* 4072874 (Tracking ID: 4046786)

SYMPTOM:
During reboot, nodes go out of the cluster and the file system is not mounted.

DESCRIPTION:
The NVMe ASL can sometimes give a different UDID during discovery (the difference from the actual UDID being the absence of space characters in the UDID).

RESOLUTION:
The usage of the NVMe ioctl to fetch data has been removed; sysfs is used instead.

Patch ID: VRTSvxvm-7.4.2.2400

* 4018181 (Tracking ID: 3995474)

SYMPTOM:
The following IO errors reported on VxVM sub-disks result in the DRL log being detached on SLES12SP3, without any SCSI errors detected.

VxVM vxio V-5-0-1276 error on Subdisk [xxxx] while writing volume [yyyy][log] offset 0 length [zzzz]
VxVM vxio V-5-0-145 DRL volume yyyy[log] is detached

DESCRIPTION:
Following the Linux kernel changes since 4.4.68 (SLES12SP3), VxVM stores the bio flags, including B_ERROR, in the bi_flags_ext field instead of bi_flags, as bi_flags was reduced from long to short. Since bi_flags_ext is a hack that modifies the bio struct, it can introduce unknown problems; checking bi_error instead is the safer approach.

RESOLUTION:
Code changes have been made in the VxIO Disk IO done routine to fix the issue.

* 4051701 (Tracking ID: 4031597)

SYMPTOM:
vradmind generates a core dump in __strncpy_sse2_unaligned.

DESCRIPTION:
The following core dump is generated:
(gdb)bt
Thread 1 (Thread 0x7fcd140b2780 (LWP 90066)):
#0 0x00007fcd12b1d1a5 in __strncpy_sse2_unaligned () from /lib64/libc.so.6
#1 0x000000000059102e in IpmServer::accept (this=0xf21168, new_handlesp=0x0) at Ipm.C:3406
#2 0x0000000000589121 in IpmHandle::events (handlesp=0xf12088, new_eventspp=0x7ffc8e80a4e0, serversp=0xf120c8, new_handlespp=0x0, ms=100) at Ipm.C:613
#3 0x000000000058940b in IpmHandle::events (handlesp=0xfc8ab8, vlistsp=0xfc8938, ms=100) at Ipm.C:645
#4 0x000000000040ae2a in main (argc=1, argv=0x7ffc8e80e8e8) at srvmd.C:722

RESOLUTION:
vradmind is updated to properly handle getpeername(), which addresses this issue.

* 4051702 (Tracking ID: 4019182)

SYMPTOM:
In case of a VxDMP configuration, an InfoScale server panics when applying a patch. The following stack trace is generated:
unix:panicsys+0x40()
unix:vpanic_common+0x78()
unix:panic+0x1c()
unix:mutex_enter() - frame recycled
vxdmp(unloaded text):0x108b987c(jmpl?)()
vxdmp(unloaded text):0x108ab380(jmpl?)(0)
genunix:callout_list_expire+0x5c()
genunix:callout_expire+0x34()
genunix:callout_execute+0x10()
genunix:taskq_thread+0x42c()
unix:thread_start+4()

DESCRIPTION:
Some VxDMP functions create callouts. The VxDMP module may already be unloaded when a callout expires, which may cause the server to panic. VxDMP should cancel any previous timeout function calls before it unloads itself.

RESOLUTION:
VxDMP is updated to cancel any previous timeout function calls before unloading itself.

* 4051705 (Tracking ID: 4049371)

SYMPTOM:
DMP is unable to display all WWN details when running "vxdmpadm getctlr all".

DESCRIPTION:
When udev discovers an sd device, VxVM removes the old HBA info file and creates a new one. During this period, vxesd can fail to obtain the HBA information.

RESOLUTION:
Code changes have been done to avoid missing the HBA info.

* 4051706 (Tracking ID: 4046007)

SYMPTOM:
The disk private region gets corrupted after a cluster name change in an FSS environment.

DESCRIPTION:
Under some conditions, when vxconfigd tries to update TOC(table of contents) blocks of disk private region, the allocation maps may not be initialized in memory yet. This could make allocation maps incorrect and lead to corruption of disk private region.

RESOLUTION:
Code changes have been done to avoid corruption of disk private region.

* 4053228 (Tracking ID: 4053230)

SYMPTOM:
RHEL 8.5 support is to be provided with IS 7.4.1 and 7.4.2

DESCRIPTION:
RHEL 8.5 ZDS support is being provided with IS 7.4.1 and 7.4.2

RESOLUTION:
VxVM packages are available with RHEL 8.5 compatibility

* 4055211 (Tracking ID: 4052191)

SYMPTOM:
Any scripts or command files in the / directory may run unexpectedly at system startup,
and VxVM volumes will not be available until those scripts or commands complete.

DESCRIPTION:
If this issue occurs, /var/svc/log/system-vxvm-vxvm-configure:default.log indicates that a script or command
located in the / directory has been executed.
example-
ABC Script ran!!
/lib/svc/method/vxvm-configure[241] abc.sh not found
/lib/svc/method/vxvm-configure[242] abc.sh not found
/lib/svc/method/vxvm-configure[243] abc.sh not found
/lib/svc/method/vxvm-configure[244] app/ cannot execute
In this example, abc.sh is located in the / directory and just echoes "ABC script ran !!". vxvm-configure launched abc.sh.

RESOLUTION:
The issue is fixed by changing the comment format in the SunOS_5.11.vxvm-configure.sh script.

Patch ID: VRTSvxvm-7.4.2.2200

* 4018173 (Tracking ID: 3852146)

SYMPTOM:
In a CVM cluster, when importing a shared diskgroup specifying both -c and -o
noreonline options, the following error may be returned: 
VxVM vxdg ERROR V-5-1-10978 Disk group <dgname>: import failed: Disk for disk
group not found.

DESCRIPTION:
The -c option will update the disk ID and disk group ID on the private region
of the disks in the disk group being imported. Such updated information is not
yet seen by the slave because the disks have not been re-onlined (given that
noreonline option is specified). As a result, the slave cannot identify the
disk(s) based on the updated information sent from the master, causing the
import to fail with the error Disk for disk group not found.

RESOLUTION:
The code is modified to handle the working of the "-c" and "-o noreonline"
options together.

* 4018178 (Tracking ID: 3906534)

SYMPTOM:
After enabling DMP (Dynamic Multipathing) Native support, /boot should be
mounted on a DMP device.

DESCRIPTION:
Currently /boot is mounted on top of the OS (Operating System) device. When DMP
Native support is enabled, only VGs (Volume Groups) are migrated from the OS
device to the DMP device. This is the reason /boot is not migrated to the DMP device.
Consequently, if the OS device path is not available, the system becomes unbootable
since /boot is not available. Thus it becomes necessary to mount /boot on the DMP
device to provide multipathing and resiliency.

RESOLUTION:
Code changes have been done to migrate /boot on top of DMP device when DMP
Native support is enabled.
Note: The code changes are currently implemented for RHEL-6 only. For other
Linux platforms, /boot will still not be mounted on the DMP device.

* 4031342 (Tracking ID: 4031452)

SYMPTOM:
The add node operation fails with the error "Error found while invoking '' in the new node, and rollback done in both nodes".

DESCRIPTION:
The stack showed a valid address for the pointer ptmap2, but a core was still generated.
This suggested a double-free case; the issue lies in the pointer being freed more than once.

RESOLUTION:
Added handling for this case by assigning NULL to pointers wherever they are freed.

* 4037283 (Tracking ID: 4021301)

SYMPTOM:
Data corruption issue happened with the big size IO processed by Linux kernel IO split on RHEL8.

DESCRIPTION:
On RHEL8, or as of Linux kernel 3.13, changes were introduced in the Linux kernel block layer: a new field of the bio iterator structure is used to represent the start offset of the bio or bio vectors after the IO is processed by the Linux kernel IO split functions. Also, recent versions of VxFS can generate bios larger than the size limit defined in the Linux kernel block layer and VxVM, so IOs from VxFS can be split by the Linux kernel. For such split IOs, VxVM did not take the new field of the bio iterator into account while processing them, which caused data to be written to the wrong position on the volume/disk. Hence, data corruption.

RESOLUTION:
Code changes have been made to bypass the Linux kernel IO split functions, which are redundant for VxVM IO processing.

* 4042038 (Tracking ID: 4040897)

SYMPTOM:
HPE MSA 2060 is a new array, and support for claiming it needs to be added.

DESCRIPTION:
HPE MSA 2060 is a new array that the current ASL does not support, so it is not claimed with the current ASL. Support for this array has now been added to the ASL.

RESOLUTION:
Code changes to support HPE MSA 2060 array have been done.

* 4046906 (Tracking ID: 3956607)

SYMPTOM:
When removing a VxVM disk using vxdg-rmdisk operation, the following error occurs requesting a disk reclaim.
VxVM vxdg ERROR V-5-1-0 Disk <device_name> is used by one or more subdisks which are pending to be reclaimed.
Use "vxdisk reclaim <device_name>" to reclaim space used by these subdisks, and retry "vxdg rmdisk" command.
Note: reclamation is irreversible. But when issuing vxdisk-reclaim as advised, the command dumps core.

DESCRIPTION:
In the disk-reclaim code path, memory allocation can fail at realloc(), but the failure
is not detected, causing an invalid address to be referenced, and a core dump results.

RESOLUTION:
The disk-reclaim code path now handles failure of realloc properly.

* 4046907 (Tracking ID: 4041001)

SYMPTOM:
When some nodes in the system are rebooted, they cannot join back because disk attach transactions are not
happening.

DESCRIPTION:
In VxVM, when some nodes are rebooted, some plexes of volume are detached. It may happen that all plexes
of volume are disabled. In this case, if all plexes of some DCO volume become inaccessible, that DCO
volume state should be marked as BADLOG.

If state is not marked BADLOG, transactions fail with following error.
VxVM ERROR V-5-1-10128  DCO experienced IO errors during the operation. Re-run the operation after ensuring that DCO is accessible

As the transactions are failing, system goes in hang state and nodes cannot join.

RESOLUTION:
The code is fixed to mark DCO state as BADLOG when all the plexes of DCO becomes inaccessible during IO load.

* 4046908 (Tracking ID: 4038865)

SYMPTOM:
System panic in the vxdmp module with the following call trace in the IRQ stack.
native_queued_spin_lock_slowpath
queued_spin_lock_slowpath
_raw_spin_lock_irqsave7
dmp_get_shared_lock
gendmpiodone
dmpiodone
bio_endio
blk_update_request
scsi_end_request
scsi_io_completion
scsi_finish_command
scsi_softirq_done
blk_done_softirq
__do_softirq
call_softirq
do_softirq
irq_exit
do_IRQ
 <IRQ stack>

DESCRIPTION:
A deadlock can happen between inode_hash_lock and the DMP shared lock when one process holding inode_hash_lock acquires the DMP shared lock in IRQ context, while at the same time another process holding the DMP shared lock tries to acquire inode_hash_lock.

RESOLUTION:
Code changes have been done to avoid the deadlock issue.

* 4047588 (Tracking ID: 4044072)

SYMPTOM:
I/Os fail for NVMe disks with 4K block size on the RHEL 8.4 kernel.

DESCRIPTION:
This issue occurs only in the case of disks of the 4K block size. I/Os complete successfully when the disks of 512 block size are used. If disks of the 4K block size are used, the following error messages are logged:
[ 51.228908] VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x206) on dmpnode 201/0x10
[ 51.230070] blk_update_request: operation not supported error, dev nvme1n1, sector 240 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
[ 51.240861] blk_update_request: operation not supported error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0

RESOLUTION:
Updated the VxVM and the VxDMP modules to address this issue. The logical block size is now set to 4096 bytes, which is the same as the physical block size.

* 4047590 (Tracking ID: 4045501)

SYMPTOM:
The following errors occur during the installation of the VRTSvxvm and the VRTSaslapm packages on CentOS 8.4 systems:
~
Verifying packages...
Preparing packages...
This release of VxVM is for Red Hat Enterprise Linux 8
and CentOS Linux 8.
Please install the appropriate OS
and then restart this installation of VxVM.
error: %prein(VRTSvxvm-7.4.1.2500-RHEL8.x86_64) scriptlet failed, exit status 1
error: VRTSvxvm-7.4.1.2500-RHEL8.x86_64: install failed
cat: 9: No such file or directory
~

DESCRIPTION:
The product installer reads the /etc/centos-release file to identify the Linux distribution. This issue occurs because the file has changed for CentOS 8.4.

RESOLUTION:
The product installer is updated to identify the correct Linux distribution so that the VRTSvxvm and the VRTSaslapm packages get installed successfully.

* 4047592 (Tracking ID: 3992040)

SYMPTOM:
VxFS Testing CFS Stress hits a kernel panic, f:vx_dio_bio_done:2

DESCRIPTION:
In the RHEL 8.0/SLES15 kernel code, the value in bi_status is not a standard error code; it comes from a completely separate set of values that are all small positive integers (for example, BLK_STS_OK and BLK_STS_IOERROR), while the actual errors sent by VxVM are different. Hence VxVM should send the proper bi_status to the file system with the newer kernel. This fix avoids further kernel crashes.

RESOLUTION:
Code changes are done to map between bi_status and bi_error values, as is done in the Linux kernel code (blk-core.c).

* 4047695 (Tracking ID: 3911930)

SYMPTOM:
Valid PGR operations sometimes fail on a dmpnode.

DESCRIPTION:
As part of the PGR operations, if the inquiry command finds that PGR is not
supported on the dmpnode, the PGR_FLAG_NOTSUPPORTED flag is set on the dmpnode.
Further PGR operations check this flag and issue PGR commands only if the flag is NOT set.
This flag remains set even if the hardware is changed so as to support PGR.

RESOLUTION:
A new command (namely enablepr) is provided in the vxdmppr utility to clear this
flag on the specified dmpnode.

* 4047722 (Tracking ID: 4023390)

SYMPTOM:
vxconfigd crashes because a disk contains an invalid privoffset (160), which is smaller than the minimum required offset (VTOC 265, GPT 208).

DESCRIPTION:
There may be disk label corruption or stale information resident in the disk header, which caused an unexpected label to be written.

RESOLUTION:
An assert is added when updating the CDS label to ensure that a valid privoffset is written to the disk header.

* 4049268 (Tracking ID: 4044583)

SYMPTOM:
A system goes into the maintenance mode when DMP is enabled to manage native devices.

DESCRIPTION:
The "vxdmpadm gettune dmp_native_support=on" command is used to enable DMP to manage native devices. After you change the value of the dmp_native_support tunable, you need to reboot the system needs for the changes to take effect. However, the system goes into the maintenance mode after it reboots. The issue occurs due to the copying of the local liblicmgr72.so file instead of the original one while creating the vx_initrd image.

RESOLUTION:
Code changes have been made to copy the correct liblicmgr72.so file. The system now successfully reboots without going into maintenance mode; see the commands sketched below.
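For reference, the tunable is enabled and verified with the commands documented elsewhere in this README, followed by a reboot:

    # vxdmpadm settune dmp_native_support=on
    # vxdmpadm gettune dmp_native_support
    (reboot the system for the change to take effect)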

Patch ID: VRTSvxvm-7.4.2.1900

* 4020207 (Tracking ID: 4018086)

SYMPTOM:
vxiod with ID as 128 was stuck with below stack:

 #2 [] vx_svar_sleep_unlock at [vxfs]
 #3 [] vx_event_wait at [vxfs]
 #4 [] vx_async_waitmsg at [vxfs]
 #5 [] vx_msg_send at [vxfs]
 #6 [] vx_send_getemapmsg at [vxfs]
 #7 [] vx_cfs_getemap at [vxfs]
 #8 [] vx_get_freeexts_ioctl at [vxfs]
 #9 [] vxportalunlockedkioctl at [vxportal]
 #10 [] vxportalkioctl at [vxportal]
 #11 [] vxfs_free_region at [vxio]
 #12 [] vol_ru_start_replica at [vxio]
 #13 [] vol_ru_start at [vxio]
 #14 [] voliod_iohandle at [vxio]
 #15 [] voliod_loop at [vxio]

DESCRIPTION:
With the SmartMove feature ON, the vxiod with ID 128 can start replication while the RVG is in DCM mode; this vxiod then waits for the file system's response on whether a given region is in use by the file system. The file system triggers MDSHIP IO on the logowner. Due to a bug in the code, MDSHIP IO always gets queued in the vxiod with ID 128, resulting in a deadlock.

RESOLUTION:
Code changes have been made to avoid handling MDSHIP IO in vxiod whose ID is bigger than 127.

* 4039510 (Tracking ID: 4037915)

SYMPTOM:
Compilation errors occur due to RHEL source code changes.

DESCRIPTION:
While compiling against the RHEL 8.4 kernel (4.18.0-304), the build fails due to certain Red Hat source changes.

RESOLUTION:
The following changes have been made so that VxVM 7.4.1 works with the new kernel:
__bdevname - deprecated
Solution: Use a struct block_device and call bdevname.

blkg_tryget_closest - placed under EXPORT_SYMBOL_GPL
Solution: Locally defined the function where the compilation error was hit.

sync_core - implicit declaration
Solution: The implementation of sync_core() has been moved to the header file sync_core.h, so including this header file fixes the error.

* 4039511 (Tracking ID: 4037914)

SYMPTOM:
Crash while running VxVM cert.

DESCRIPTION:
While running the VM cert, a panic is reported.
RESOLUTION:
The bio is now set up and submitted to the IOD layer in our own vxvm_gen_strategy() function.

* 4039512 (Tracking ID: 4017334)

SYMPTOM:
VXIO call stack trace generated in /var/log/messages

DESCRIPTION:
This issue occurs due to a limitation in the way InfoScale interacts with the RHEL8.2 kernel.
 Call Trace:
 kmsg_sys_rcv+0x16b/0x1c0 [vxio]
 nmcom_get_next_mblk+0x8e/0xf0 [vxio]
 nmcom_get_hdr_msg+0x108/0x200 [vxio]
 nmcom_get_next_msg+0x7d/0x100 [vxio]
 nmcom_wait_msg_tcp+0x97/0x160 [vxio]
 nmcom_server_main_tcp+0x4c2/0x11e0 [vxio]

RESOLUTION:
Changes are made in the header files for function definitions when the RHEL version is >= 8.2.
This kernel warning can be safely ignored as it does not have any functionality impact.

* 4039517 (Tracking ID: 4012763)

SYMPTOM:
IO hang may happen in VVR (Veritas Volume Replicator) configuration when SRL overflows for one rlink while another one rlink is in AUTOSYNC mode.

DESCRIPTION:
In VVR, if the SRL overflow happens for one rlink (R1) while AUTOSYNC is ongoing for another rlink (R2), AUTOSYNC is aborted for R2, R2 gets detached, and DCM mode is activated on the R1 rlink.

However, due to a race condition in the code handling the AUTOSYNC abort and the DCM activation in parallel, the DCM could not be activated properly and the IO which caused the DCM activation gets queued incorrectly, which results in an IO hang.

RESOLUTION:
The code has been modified to fix the race issue in handling the AUTOSYNC abort and DCM activation at same time.

Patch ID: VRTSvxvm-7.4.2.1500

* 4018182 (Tracking ID: 4008664)

SYMPTOM:
System panic occurs with the following stack:

void genunix:psignal+4()
void vxio:vol_logger_signal_gen+0x40()
int vxio:vollog_logentry+0x84()
void vxio:vollog_logger+0xcc()
int vxio:voldco_update_rbufq_chunk+0x200()
int vxio:voldco_chunk_updatesio_start+0x364()
void vxio:voliod_iohandle+0x30()
void vxio:voliod_loop+0x26c((void *)0)
unix:thread_start+4()

DESCRIPTION:
vxio keeps the vxloggerd proc_t that is used to send a signal to vxloggerd. In case vxloggerd has ended for some reason, the signal may be sent to an unexpected process, which may cause a panic.

RESOLUTION:
Code changes have been made to correct the problem.

* 4020207 (Tracking ID: 4018086)

SYMPTOM:
vxiod with ID as 128 was stuck with below stack:

 #2 [] vx_svar_sleep_unlock at [vxfs]
 #3 [] vx_event_wait at [vxfs]
 #4 [] vx_async_waitmsg at [vxfs]
 #5 [] vx_msg_send at [vxfs]
 #6 [] vx_send_getemapmsg at [vxfs]
 #7 [] vx_cfs_getemap at [vxfs]
 #8 [] vx_get_freeexts_ioctl at [vxfs]
 #9 [] vxportalunlockedkioctl at [vxportal]
 #10 [] vxportalkioctl at [vxportal]
 #11 [] vxfs_free_region at [vxio]
 #12 [] vol_ru_start_replica at [vxio]
 #13 [] vol_ru_start at [vxio]
 #14 [] voliod_iohandle at [vxio]
 #15 [] voliod_loop at [vxio]

DESCRIPTION:
With SmartMove feature as ON, it can happen vxiod with ID as 128 starts replication where RVG was in DCM mode, this vxiod is waiting for filesystem's response if a given region is used by filesystem or not. Filesystem will trigger MDSHIP IO on logowner. Due to a bug in code, MDSHIP IO always gets queued in vxiod with ID as 128. Hence a dead lock situation.

RESOLUTION:
Code changes have been made to avoid handling MDSHIP IO in vxiod whose ID is bigger than 127.

* 4021238 (Tracking ID: 4008075)

SYMPTOM:
This issue was observed with the ASL changes for NVMe, in a reboot scenario. The machine hit a panic on every reboot, in a loop.

DESCRIPTION:
The panic was hit for such split bios. The root cause is that RHEL8 introduced a new field named __bi_remaining,
which maintains the count of chained bios; for every endio, __bi_remaining is atomically decremented in the bio_endio() function.
While decrementing __bi_remaining, the OS checks that it is not <= 0; in our case __bi_remaining was always 0, so the OS
BUG_ON was hit.

RESOLUTION:
>>> For SCSI devices, maxsize is 4194304:
[   26.919333] DMP_BIO_SIZE(orig_bio) : 16384, maxsize: 4194304
[   26.920063] DMP_BIO_SIZE(orig_bio) : 262144, maxsize: 4194304

>>> For NVMe devices, maxsize is 131072:
[  153.297387] DMP_BIO_SIZE(orig_bio) : 262144, maxsize: 131072
[  153.298057] DMP_BIO_SIZE(orig_bio) : 262144, maxsize: 131072

* 4021240 (Tracking ID: 4010612)

SYMPTOM:
$ vxddladm set namingscheme=ebn lowercase=no
This issue is observed for NVMe and SSD devices, where every disk has a separate enclosure such as nvme0, nvme1, and so on, meaning every NVMe/SSD disk name should be
hostprefix_enclosurname0_disk0, hostprefix_enclosurname1_disk0, and so on.

DESCRIPTION:
$ vxddladm set namingscheme=ebn lowercase=no
This issue is observed for NVMe and SSD devices, where every disk has a separate enclosure such as nvme0, nvme1, and so on,
meaning every NVMe/SSD disk name should be hostprefix_enclosurname0_disk0, hostprefix_enclosurname1_disk0, and so on.
e.g.
smicro125_nvme0_0 <--- disk1
smicro125_nvme1_0 <--- disk2

With lowercase=no, the current code suppresses the suffix digit of the enclosure name, so multiple disks get the same name. This shows up as udid_mismatch
because the UDID in the private region does not match the one in DDL; the DDL database shows wrong information because multiple disks get the same name.

smicro125_nvme_0 <--- disk1   <<<<<<<-----here the suffix digit of the nvme enclosure is suppressed
smicro125_nvme_0 <--- disk2

RESOLUTION:
The suffix integer is now appended while forming the da_name.

* 4021346 (Tracking ID: 4010207)

SYMPTOM:
System panic occurred with the below stack:

native_queued_spin_lock_slowpath()
queued_spin_lock_slowpath()
_raw_spin_lock_irqsave()
volget_rwspinlock()
volkiodone()
volfpdiskiodone()
voldiskiodone_intr()
voldmp_iodone()
bio_endio()
gendmpiodone()
dmpiodone()
bio_endio()
blk_update_request()
scsi_end_request()
scsi_io_completion()
scsi_finish_command()
scsi_softirq_done()
blk_done_softirq()
__do_softirq()
call_softirq()

DESCRIPTION:
As part of collecting the IO statistics collection, the vxstat thread acquires a spinlock and tries to copy data to the user space. During the data copy, if some page fault happens, then the thread would relinquish the CPU and provide the same to some other thread. If the thread which gets scheduled on the CPU requests the same spinlock which vxstat thread had acquired, then this results in a hard lockup situation.

RESOLUTION:
Code has been changed to properly release the spinlock before copying out the data to the user space during vxstat collection.

* 4021359 (Tracking ID: 4010040)

SYMPTOM:
A security issue occurs during Volume Manager configuration.

DESCRIPTION:
This issue occurs during the configuration of the VRTSvxvm package.

RESOLUTION:
VVR daemon is updated so that this security issue no longer occurs.

* 4021366 (Tracking ID: 4008741)

SYMPTOM:
VxVM device files appears to have device_t SELinux label.

DESCRIPTION:
If an unauthorized or modified device is allowed to exist on the system, there is the possibility the system may perform unintended or unauthorized operations.
eg: ls -LZ
...
...
/dev/vx/dsk/testdg/vol1   system_u:object_r:device_t:s0
/dev/vx/dmpconfig         system_u:object_r:device_t:s0
/dev/vx/vxcloud           system_u:object_r:device_t:s0

RESOLUTION:
Code changes made to change the device labels to misc_device_t, fixed_disk_device_t.

* 4021428 (Tracking ID: 4020166)

SYMPTOM:
Build issue because of "struct request":

error: struct request has no member named next_rq
Linux has deprecated the member next_rq.

DESCRIPTION:
The issue was observed due to changes in the OS structure.

RESOLUTION:
Code changes are done in the required files.

* 4021748 (Tracking ID: 4020260)

SYMPTOM:
While enabling the DMP native support tunable (dmp_native_support) on CentOS 8, the below error was observed:

[root@dl360g9-4-vm2 ~]# vxdmpadm settune dmp_native_support=on
VxVM vxdmpadm ERROR V-5-1-15690 Operation failed for one or more volume groups

VxVM vxdmpadm ERROR V-5-1-15686 The following vgs could not be migrated as error in bootloader configuration file 

 cl
[root@dl360g9-4-vm2 ~]#

DESCRIPTION:
The issue was observed due to missing code check-ins for CentOS 8 in the required files.

RESOLUTION:
Changes are done in required files for dmp native support in CentOS 8

Patch ID: VRTSvxvm-7.4.2.1400

* 4018182 (Tracking ID: 4008664)

SYMPTOM:
System panic occurs with the following stack:

void genunix:psignal+4()
void vxio:vol_logger_signal_gen+0x40()
int vxio:vollog_logentry+0x84()
void vxio:vollog_logger+0xcc()
int vxio:voldco_update_rbufq_chunk+0x200()
int vxio:voldco_chunk_updatesio_start+0x364()
void vxio:voliod_iohandle+0x30()
void vxio:voliod_loop+0x26c((void *)0)
unix:thread_start+4()

DESCRIPTION:
vxio keeps the vxloggerd proc_t that is used to send a signal to vxloggerd. In case vxloggerd has ended for some reason, the signal may be sent to an unexpected process, which may cause a panic.

RESOLUTION:
Code changes have been made to correct the problem.

* 4020207 (Tracking ID: 4018086)

SYMPTOM:
vxiod with ID as 128 was stuck with below stack:

 #2 [] vx_svar_sleep_unlock at [vxfs]
 #3 [] vx_event_wait at [vxfs]
 #4 [] vx_async_waitmsg at [vxfs]
 #5 [] vx_msg_send at [vxfs]
 #6 [] vx_send_getemapmsg at [vxfs]
 #7 [] vx_cfs_getemap at [vxfs]
 #8 [] vx_get_freeexts_ioctl at [vxfs]
 #9 [] vxportalunlockedkioctl at [vxportal]
 #10 [] vxportalkioctl at [vxportal]
 #11 [] vxfs_free_region at [vxio]
 #12 [] vol_ru_start_replica at [vxio]
 #13 [] vol_ru_start at [vxio]
 #14 [] voliod_iohandle at [vxio]
 #15 [] voliod_loop at [vxio]

DESCRIPTION:
With the SmartMove feature ON, the vxiod with ID 128 can start replication while the RVG is in DCM mode; this vxiod then waits for the file system's response on whether a given region is in use by the file system. The file system triggers MDSHIP IO on the logowner. Due to a bug in the code, MDSHIP IO always gets queued in the vxiod with ID 128, resulting in a deadlock.

RESOLUTION:
Code changes have been made to avoid handling MDSHIP IO in vxiod whose ID is bigger than 127.

* 4021346 (Tracking ID: 4010207)

SYMPTOM:
System panic occurred with the below stack:

native_queued_spin_lock_slowpath()
queued_spin_lock_slowpath()
_raw_spin_lock_irqsave()
volget_rwspinlock()
volkiodone()
volfpdiskiodone()
voldiskiodone_intr()
voldmp_iodone()
bio_endio()
gendmpiodone()
dmpiodone()
bio_endio()
blk_update_request()
scsi_end_request()
scsi_io_completion()
scsi_finish_command()
scsi_softirq_done()
blk_done_softirq()
__do_softirq()
call_softirq()

DESCRIPTION:
As part of collecting the IO statistics collection, the vxstat thread acquires a spinlock and tries to copy data to the user space. During the data copy, if some page fault happens, then the thread would relinquish the CPU and provide the same to some other thread. If the thread which gets scheduled on the CPU requests the same spinlock which vxstat thread had acquired, then this results in a hard lockup situation.

RESOLUTION:
Code has been changed to properly release the spinlock before copying out the data to the user space during vxstat collection.

* 4021428 (Tracking ID: 4020166)

SYMPTOM:
Build issue because of "struct request":

error: struct request has no member named next_rq
Linux has deprecated the member next_rq.

DESCRIPTION:
The issue was observed due to changes in the OS structure.

RESOLUTION:
Code changes are done in the required files.

* 4021748 (Tracking ID: 4020260)

SYMPTOM:
While enabling the DMP native support tunable (dmp_native_support) on CentOS 8, the below error was observed:

[root@dl360g9-4-vm2 ~]# vxdmpadm settune dmp_native_support=on
VxVM vxdmpadm ERROR V-5-1-15690 Operation failed for one or more volume groups

VxVM vxdmpadm ERROR V-5-1-15686 The following vgs could not be migrated as error in bootloader configuration file 

 cl
[root@dl360g9-4-vm2 ~]#

DESCRIPTION:
The issue was observed due to missing code check-ins for CentOS 8 in the required files.

RESOLUTION:
Changes are done in required files for dmp native support in CentOS 8

Patch ID: VRTSvxvm-7.4.2.1300

* 4008606 (Tracking ID: 4004455)

SYMPTOM:
Snapshot restore failed on an instant snapshot created on an older-version DG.

DESCRIPTION:
Create a DG with an older version, create an instant snapshot,
do some IOs on the source volume,
and then try to restore the snapshot.
The snapshot restore fails in this scenario.

RESOLUTION:
The root cause of this issue is that flag values were conflicting.
The issue is fixed and the code has been checked in.

* 4010892 (Tracking ID: 4009107)

SYMPTOM:
CA chain certificate verification fails in VVR when the number of intermediate certificates is greater than the depth. So, we get error in SSL initialization.

DESCRIPTION:
CA chain certificate verification fails in VVR when the number of intermediate certificates is greater than the depth. The SSL_CTX_set_verify_depth() API decides the depth of certificates (in the /etc/vx/vvr/cacert file) to be verified, which is limited to a count of 1 in the code. Thus the intermediate CA certificate present first in /etc/vx/vvr/cacert (the depth-1 CA/issuer certificate for the server certificate) could be obtained and verified during the connection, but the root CA certificate (the depth-2, higher CA certificate) could not be verified while connecting, hence the error.

RESOLUTION:
Removed the call of SSL_CTX_set_verify_depth() API so as to handle the depth automatically.

* 4011866 (Tracking ID: 3976678)

SYMPTOM:
vxvm-recover:  cat: write error: Broken pipe error encountered in syslog multiple times.

DESCRIPTION:
Due to a bug in the vxconfigbackup script, which is started by vxvm-recover, "cat: write error: Broken pipe" is encountered in syslog
and reported under vxvm-recover. In the vxconfigbackup code, multiple subshells are created in a function call, and the first subshell is for the cat command. When a particular if condition is satisfied, return is called, exiting the later subshells even when there is data to be read in the cat subshell, which results in the broken pipe error.

RESOLUTION:
Changes are done in VxVM code to handle the broken pipe error.

* 4011971 (Tracking ID: 3991668)

SYMPTOM:
With secondary logging configured, VVR reports data inconsistency when the "No IBC message arrived" error is hit.

DESCRIPTION:
It might happen that the secondary node serves updates with larger sequence IDs when the In-Band Control (IBC) update arrives. In this case, VVR drops the IBC update. Updates whose sequence IDs are larger cannot start data volume writes and get queued. Data loss occurs when the secondary receives the atomic commit and clears the queue. Hence 'vradmin verifydata' reports data inconsistency.

RESOLUTION:
Code changes have been made to trigger updates in order to start data volume writes.

* 4012485 (Tracking ID: 4000387)

SYMPTOM:
The existing VxVM module fails to load on RHEL 8.2.

DESCRIPTION:
RHEL 8.2 is a new release and has a few KABI changes on which VxVM compilation breaks.

RESOLUTION:
Compiled VxVM code against 8.2 kernel and made changes to make it compatible.

* 4012848 (Tracking ID: 4011394)

SYMPTOM:
As part of verifying the performance of CFS cloud tiering versus scale-out file system tiering in Access, it was found that CFS cloud tiering performance was degraded.

DESCRIPTION:
On verifying the performance of CFS cloud tiering versus scale-out file system tiering in Access, it was found that CFS cloud tiering performance was degraded because the design was single-threaded, which was causing a bottleneck and performance issues.

RESOLUTION:
Code changes add multiple IO queues in the kernel and a multithreaded request loop to fetch IOs from the kernel queues into a userland global queue, and allow curl threads to work in parallel.

* 4013155 (Tracking ID: 4010458)

SYMPTOM:
In VVR (Veritas Volume Replicator), the rlink might intermittently disconnect due to unexpected transactions, with the below messages:
VxVM VVR vxio V-5-0-114 Disconnecting rlink <rlink_name> to permit transaction to proceed

DESCRIPTION:
In VVR (Veritas Volume replicator), a transaction is triggered when a change in the VxVM/VVR objects needs 
to be persisted on disk. 

In some scenario, few unnecessary transactions were getting triggered in loop. This was causing multiple rlink
disconnects with below message logged frequently:
VxVM VVR vxio V-5-0-114 Disconnecting rlink <rlink_name> to permit transaction to proceed

One such unexpected transaction was happening due to open/close on volume as part of SmartIO caching.
Additionally, vradmind daemon was also issuing some open/close on volumes as part of IO statistics collection,
which was causing unnecessary transactions. 

Additionally some unexpected transactions were happening due to incorrect checks in code related
to some temporary flags on volume.

RESOLUTION:
The code is fixed to disable the SmartIO caching on the volumes if the SmartIO caching is not configured on the system.
Additionally code is fixed to avoid the unexpected transactions due to incorrect checking on the temporary flags
on volume.

* 4013169 (Tracking ID: 4011691)

SYMPTOM:
Observed high CPU consumption on the VVR secondary nodes because of high pending IO load.

DESCRIPTION:
High replication-related IO load on the VVR secondary, combined with the requirement of maintaining write-order fidelity with limited memory pools, created contention. This resulted in multiple VxVM kernel threads contending for shared resources, thereby increasing the CPU consumption.

RESOLUTION:
Limited the way in which VVR consumes its resources so that a high pending IO load would not result into high CPU consumption.

* 4013718 (Tracking ID: 4008942)

SYMPTOM:
The file system gets disabled when the cache object gets full, and hence unmount fails.

DESCRIPTION:
When the cache object gets full, IO errors occur on the volume.
Because IOs are not served while the cache object is full, IO inconsistency arises.
Because of the IO inconsistency, VxFS gets disabled and unmount fails.

RESOLUTION:
The issue is fixed and the code has been checked in.

Patch ID: VRTSspt-7.4.2.1600

* 4189440 (Tracking ID: 4189526)

SYMPTOM:
FirstLook previously did not collect VVR-specific data.

DESCRIPTION:
With the current design of FirstLook, output from the following commands is collected: vrstat, vxmemstat, and vxrlink. These commands collect data specifically for VVR-enabled environments.

RESOLUTION:
Code changes have been made to include collection of the above commands.

* 4189700 (Tracking ID: 4189723)

SYMPTOM:
FirstLook collects threadlist every minute.

DESCRIPTION:
In most performance-related stat collection, a per-minute threadlist is not required.

RESOLUTION:
Added an argument to the "-a" option to provide the interval at which to collect the threadlist.



INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
Please note that the installation of this P-Patch causes downtime.

To install the patch perform the following steps on at least one node in the cluster:
1. Copy the patch infoscale-rhel8_x86_64-Patch-7.4.2.5800.tar.gz to /tmp
2. Untar infoscale-rhel8_x86_64-Patch-7.4.2.5800.tar.gz to /tmp/patch
    # mkdir /tmp/patch
    # cd /tmp/patch
    # gunzip /tmp/infoscale-rhel8_x86_64-Patch-7.4.2.5800.tar.gz
    # tar xf /tmp/infoscale-rhel8_x86_64-Patch-7.4.2.5800.tar
3. Install the patch. (Please note that the installation of this P-Patch causes downtime.)
    # pwd /tmp/patch
    # ./installVRTSinfoscale742P5800 [<host1> <host2>...]

You can also install this patch together with the 7.4.2 base release using Install Bundles:
1. Download this patch and extract it to a directory
2. Change to the Veritas InfoScale 7.4.2 directory and invoke the installer script
   with -patch_path option where -patch_path should point to the patch directory
    # ./installer -patch_path [<path to this patch>] [<host1> <host2>...]

Install the patch manually:
--------------------------
Manual installation is not recommended.


REMOVING THE PATCH
------------------
Manual uninstallation is not recommended.


SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE


This update requires: InfoScale 7.4.2 Update 8 Cumulative Patch on RHEL8 Platform

