Veritas NetBackup™ Flex Scale Release Notes
- Getting help
- Features, enhancements, and changes
- Limitations
- Known issues
  - Cluster configuration issues
    - Cluster configuration fails if there is a conflict between the cluster private network and any other network
    - Cluster configuration process may hang due to an ssh connection failure
    - DNS servers that are added after initial configuration are not present in the /etc/resolv.conf file
    - Empty log directories are created in the downloaded log file
    - For the private network, if you use the default IPv4 IP address but specify an IPv6 IP other than the default, the specified IPv6 IP address is ignored
    - Node discovery fails during initial configuration if the default password is changed
    - When NetBackup Flex Scale is configured, the size of NetBackup logs might exceed the /log partition size
    - Error message is not displayed when NTP server is added as FQDN during initial configuration in a non-DNS environment
 
  - Disaster recovery issues
    - Backup data present on the primary site before the time Storage Lifecycle Policies (SLP) was applied is not replicated to the secondary site
    - When disaster recovery gets configured on the secondary site, the catalog storage usage may be displayed as zero
    - Catalog backup policy may fail or use the remote media server for backup
    - Takeover to a secondary cluster fails even after the primary cluster is completely powered off
    - Catalog replication may fail to resume automatically after recovering from a node fault that exceeds the fault tolerance limit
    - If the replication link is down on a node, the replication IP does not fail over to another node
    - If both the primary and secondary clusters are down and are brought online again, the replication may be in an error state
    - Disaster recovery configuration fails if the lockdown mode on the secondary cluster is enterprise or compliance
    - Unable to perform a takeover operation from the new site acting as the secondary
    - Enabling compliance mode for the first time on the secondary cluster may fail if disaster recovery is configured
    - If disaster recovery is configured and an upgrade from NetBackup Flex Scale 2.1 to 3.0 is performed, the upgrade operation hangs
    - On a NetBackup Flex Scale cluster with a disaster recovery configuration, the replication state shows Primary-Primary on the faulted primary cluster after takeover
    - Unable to create universal shares on a cluster on which disaster recovery is configured or only media servers are deployed
 
  - Miscellaneous issues
    - Red Hat Virtualization (RHV) VM discovery and backup and restore jobs fail if the media server node that is selected as the discovery host, backup host, or recovery host is replaced
    - The file systems offline operation gets stuck for more than 2 hours after a reboot all operation
    - cvmvoldg agent causes resource faults because the database is not updated
    - SQLite, MySQL, MariaDB, and PostgreSQL database backups fail in a pure IPv6 network configuration
    - Exchange GRT browse of Exchange-aware VMware policy backups may fail with a database error
    - Call Home test fails if a proxy server is configured without specifying a user
    - In a non-DNS NetBackup Flex Scale setup, performing a backup from a snapshot operation fails for the NAS-Data-Protection policy
    - In a non-DNS environment, the CRL check does not work if the CDP URL is not accessible
    - Unable to add multiple host entries against the same IP address and vice versa in a non-DNS IPv4 environment
 
  - NetBackup issues
    - The NetBackup web GUI does not list media or storage hosts in the Security > Hosts page
    - Media hosts do not appear in the search icon for Recovery host/target host during Nutanix AHV agentless files and folders restore
    - On the NetBackup media server, the ECA health check shows the warning, 'hostname missing'
    - If NetBackup Flex Scale is configured, the storage paths are not displayed under MSDP storage
    - Failure may be observed on the STU if the Only use the following media servers option is selected for Media server under Storage > Storage unit
    - NetBackup primary server services fail if an NFS share is mounted at the /mnt mount path inside the primary server container
    - NetBackup primary container goes into an unhealthy state
    - NetBackup fails to discover VMware workloads in an IPv6 environment
 
  - Networking issues
  - Node and disk management issues
    - Storage-related logs are not written to the designated log files
    - Arrival or recovery of the volume does not bring the file system back into the online state, making the file system unusable
    - Unable to replace a stopped node
    - An NVMe disk is wrongly selected as a target disk while replacing a SAS SSD
    - Disk replacement might fail in certain situations
    - Replacing an NVMe disk fails with a data movement from source disk to destination disk error
    - Unable to detect a faulted disk that is brought online after some time
    - Nodes may go into an irrecoverable state if shut down and reboot operations are performed using IPMI-based commands
    - Add node fails because of memory fragmentation
    - Replace node may fail if the new node is not reachable
    - Node is displayed as unhealthy if the node on which the management console is running is stopped
    - Unable to collect logs from the node if the node where the management console is running is stopped
    - Log rotation does not work for files and directories in /log/VRTSnas/log
    - After replacing a node, the AutoSupport settings are not synchronized to the replacement node
    - Unable to start or stop a cluster node
    - The Add nodes to the cluster button remains disabled even after providing all the inputs
    - Unable to add more than seven nodes simultaneously to the cluster
    - Backup jobs of the workload that uses an SSL certificate fail during or after the Add node operation
 
  - Security and authentication issues
    - The NetBackup certificates tab and the External certificates tab in the Certificate management page on the NetBackup UI show different host lists
    - Replicated images do not have retention lock after the lockdown mode is changed from normal to any other mode
    - Unable to switch the lockdown mode from normal to enterprise or compliance for a cluster that is deployed with only media servers and with lockdown mode set to normal
    - CRL mode does not get updated on the secondary site after ECA is renewed on a cluster on which disaster recovery is configured
    - Setting lockdown mode to enterprise or compliance fails on the secondary cluster of a NetBackup Flex Scale cluster on which disaster recovery is configured
    - User account gets locked on a management or non-management console node
    - The changed password is not synchronized across the cluster
 
  - Upgrade issues
    - After an upgrade, if a checkpoint is restored, backup and restore jobs may stop working
    - Upgrade fails during pre-flight VCS service group checks even if the failover service group is ONLINE on one node but FAULTED on another node
    - Upgrade from version 2.1 to 3.0 fails if the cluster is configured with an external certificate
    - During EEB installation, a hang is observed during the installation of the fourth EEB and the proxy log reports "Internal Server Error"
    - EEB installation may fail if some of the NetBackup services are busy
    - During an upgrade, the NetBackup Flex Scale UI shows incorrect status for some of the components
    - After an upgrade, Call Home does not work
    - After an upgrade, the proxy server configured for Call Home is disabled but is displayed as enabled in the UI
    - Unable to view the login banner after an upgrade
    - After an upgrade to NetBackup Flex Scale 2.1, the metadata format in cloud storage of the MSDP cloud volume is changed
    - Rollback fails after a failed upgrade
    - Add node operation hangs on the secondary site after an upgrade
    - Alerts about inconsistent login banner and password policy appear after an upgrade from NetBackup Flex Scale version 2.1 to NetBackup Flex Scale 3.0
    - Alerts about a node being down are generated during an upgrade
    - GUI takes a long time to update the status of the upgrade task
    - In a disaster recovery environment, upgrade gets stuck during the node evacuation stage as VVRInfra_Grp cannot be brought down
    - Upgrade may fail after node evacuation if a VCS parallel service group is OFFLINE on a partial set of nodes at the beginning of the upgrade
    - Upgrade may fail if operations such as OS reboot, cluster restart, and node stop and shutdown are used during the upgrade
 
  - UI issues
    - In-progress user creation tasks disappear from the infrastructure UI if the management console node restarts abruptly
    - During the replace node operation, the UI wrongly shows that the replace operation failed because the data rebuild operation failed
    - Changes in the local user operations are not reflected correctly in the NetBackup GUI when the failover of the management console and the NetBackup primary occurs at the same time
    - Mozilla Firefox browser may display a security issue while accessing the infrastructure UI
    - Recent operations that were completed successfully are not reflected in the UI if the NetBackup Flex Scale management console fails over to another cluster node
    - Previously generated log packages are not displayed if the infrastructure management console fails over to another node
 
  - User management issues
    - AD server test connection fails due to an incorrect username on the IPv6 media-only cluster
    - AD/LDAP domain unreachable alerts do not get cleared after the AD/LDAP server is deleted
    - GUI login fails with an LDAP user if the domain is configured with SSL
    - Assigning a role to the correct AD/LDAP user/group with the wrong domain causes the user listing to fail
    - After a cluster reboot all/shutdown all operation, AD/LDAP domains become unreachable from one or more nodes on a NetBackup Flex Scale cluster on which only media servers are deployed
 
 
- Fixed issues
 
Upgrade may fail after node evacuation if a VCS parallel service group is OFFLINE on a partial set of nodes at the beginning of the upgrade
If a VCS parallel service group is in the OFFLINE state on a partial set of nodes and ONLINE on the other nodes at the start of the upgrade, the upgrade may fail after the node evacuation step (hastop -evacuate). This happens because the VCS service group check that runs after the hastop -evacuate command expects the parallel service group to be ONLINE on all the nodes. If such a group was OFFLINE on a partial set of nodes at the beginning of the upgrade, it remains OFFLINE on those nodes and the VCS service group check fails after waiting for 3-4 hours. (IA-40597)
Workaround:
Perform the following steps and then restart the upgrade.
Identify the parallel service groups.
# hagrp -display | grep -i parallel
Service groups with 1 in the 4th column are parallel service groups.
For example:
# hagrp -display | grep -i parallel
CanHostNLM           Parallel  global  1
GLOBAL_API_SERVER    Parallel  global  0
ManagementConsole    Parallel  global  0
NBUMasterBrain       Parallel  global  0
NBUMasterWorker      Parallel  global  0
NFSShareOfflineGrp   Parallel  global  0
NLMGroup             Parallel  global  0
NicMonitorGrp        Parallel  global  1
Phantomgroup_pubeth0 Parallel  global  1
Phantomgroup_pubeth1 Parallel  global  1
Phantomgroup_pubeth2 Parallel  global  1
VVRInfra_Grp         Parallel  global  1
Before you start the upgrade, check whether any of the parallel service groups is OFFLINE on a partial set of nodes and ONLINE on the other nodes.
# hagrp -state <SG name>
For example:
# hagrp -state VVRInfra_Grp
#Group        Attribute  System        Value
VVRInfra_Grp  State      dellsite1-01  |ONLINE|
VVRInfra_Grp  State      dellsite1-02  |OFFLINE|
VVRInfra_Grp  State      dellsite1-03  |OFFLINE|
VVRInfra_Grp  State      dellsite1-04  |ONLINE|
Try to bring the parallel service groups online on all the nodes of the cluster before the upgrade.
# hagrp -online <SG Name> -sys <nodename>
For example:
# hagrp -online VVRInfra_Grp -sys dellsite1-02
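The preceding checks and online operations can also be scripted. The following is a minimal, hypothetical sketch (not shipped with the product) that assumes the standard VCS hagrp command is available in the PATH on a cluster node and that its output matches the example formats shown above; review and adapt it before running it in your environment.
#!/bin/bash
# Hypothetical pre-upgrade helper (illustrative only): finds VCS parallel
# service groups that are ONLINE on some nodes but OFFLINE on others, and
# attempts to bring them online on the nodes where they are OFFLINE.
# Assumes the standard VCS hagrp command is in the PATH.

# Parallel service groups have the value 1 in the 4th column of
# "hagrp -display" (Group, Attribute, System, Value).
parallel_groups=$(hagrp -display | grep -i parallel | awk '$4 == 1 {print $1}')

for sg in $parallel_groups; do
    # "hagrp -state <SG>" prints one line per node, for example:
    #   VVRInfra_Grp  State  dellsite1-02  |OFFLINE|
    states=$(hagrp -state "$sg")
    if echo "$states" | grep -q "|ONLINE|" && echo "$states" | grep -q "|OFFLINE|"; then
        echo "$sg is ONLINE on some nodes and OFFLINE on others; bringing it online."
        # Column 3 is the node name; bring the group online on each OFFLINE node.
        echo "$states" | awk '$4 == "|OFFLINE|" {print $3}' | while read -r node; do
            echo "Running: hagrp -online $sg -sys $node"
            hagrp -online "$sg" -sys "$node"
        done
    fi
done
Run such a script as the root user on any cluster node before you start the upgrade, in the same way as the individual commands shown in the steps above.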