NetBackup 8.2 / 3.2 Hotfix - CloudCatalyst EEB Bundle (Etrack 3981837)

HotFix Critical

Abstract

This NetBackup 8.2 CloudCatalyst Hotfix EEB Bundle resolves CC Media Server issues.

Description

This Hotfix resolves the following issues:

  • 5240 CloudCatalyst Appliance hangs every 2 weeks due to memory exhaustion
  • NVE-386 CC needs to check for revoked certificates via OCSP support in go SDK code
  • CloudCatalyst starts to experience slower writes and then eventually halts
  • Make esfs use multi-object delete request when supported
  • ESFS legacy log rotation out of principle
  • Restore from CloudCatalyst-AWS-Snowball fails with error "Image warming failed 501"
  • Backups and duplications to CloudCatalyst are not progressing; vxesfsd and rocksdb core dumps
  • Restore of a VMware image from a CC media server is failing.
  • [SWST] NB_D2C_AMZ_AIR_Imageshare -> NB_Cloud_DR_API -> Failed to Initialize DR In Cloud
  • Amazon backup failed with media close error 87 [SWST] NB_CC_AMZ_SS_Encrypt -> Verify_Encrypt_Enabled -> Failed to verify if server side encryption has been enabled
  • Datastore will not initialize on cloud catalyst server
  • CloudCatalyst vxesfsd process corrupted on nbu 8.2.
  • [CC][Glacier] Restore of an image failed with error "Image warming failed 409"
  • Upgrade to the latest fuse library (version 3.6.2)
  • Fix vxesfsd crashes by removing boost library.
  • Fix vxesfsd crashes caused by invalid value of sys_nlink.
  • Remove unnecessary/obsolete entries from the fsdb database.
  • Fix vxesfsd crashes caused by a null-pointer exception in esfs_opendir.
  • Add an OCSP check to ocsd using VerifyPeerCertificate. The OCSP check happens for every TLS connection made. OCSP information is retrieved from the server certificate.
  • Currently esfs only uses the multi-object delete feature for Amazon.
  • Adding a check for the BulkDelete attribute in CloudProvider.xml so that esfs will use the feature for all providers known to support it.
  • Honor max log size in esfs.json for ocsd
  • Fix restore failure from AWS Snowball device
  • Restore fails with error 'Image warming failed 501'.
  • esfs_storage logs an error 'NotImplemented: This operation is not supported yet. status code: 501'
  • Improve performance and reduce memory consumption for ocsd process.
  • OCSD log tool.
  • Fix import fail error in cloud NBU(for DR).
  • Add more detailed logging for cache eviction process
  • Remove orphaned entries in file list directories
  • Switch from char arrays to std::string
  • Change log write to avoid race condition
  • Refactor requestWorker to not reuse connection after getting region. Add socket log.
  • Do not create an empty zero-byte log file on startup.
  • Add OCSP caching to remove the response-time overhead of querying the OCSP server.
    The default cache time is set to 60 minutes.

  • ET3982970: Cannot remove certain directories. Change the rmdir logic that checks whether a directory is empty.
    This change makes sure that a directory containing garbage data can be deleted.
    vxesfsd crashing. Set max open files to limit RocksDB memory allocation. The allocated
    RSS memory will never be larger than 1GB.

  • ET3985755: Retry when there is an HTTP conflict with the AWS 'operation aborted' error.

  • ET 3990062: Cache sys_ino for /data and /databases for performance.
    Remove unnecessary lock for esfs_opendir for MSDP performance.
    Start ocsd even if vxesfsd is already running.
    Check disk usage no more than once every 10 seconds.

  • ET 3990062: Cache eviction improvements.
    Skip bhd files and recently modified files during cache eviction.
    If unable to reclaim enough space, consider them for eviction the next time.

  • Cache the metadata instead of releasing it when the reference count is 0. Change ocsd to download multiple objects for one file.

  • Free cached memory in the destruction method and fix an incorrect memory free.

  • ET 3989115: Round robin between upload and delete requests to avoid starving delete requests in very busy environments.

  • ET 3990062: Fix performance issue of image sharing when data locality is bad.

  • Fix image sharing issue over AIR.

  • Improve performance of opendir/readdir (remove support for optional d_type on readdir since MSDP does not use it).

  • Correct name of temp download file for Azure.

  • Prevent inode reuse and change list result for Azure.

  • Change log mechanism. There is now a dedicated ocsd log routine.
    Problems addressed:
    1. The small log file
    2. Log file is unexpectedly closed
    The log configuration will be more consistent.
  • Remove nbu_wrapper dependency from ocsd. It can get cloud configuration using web service.
  • Get cloud instance configuration file directly, if NB web service does not return the configuration.
  • ET 3997365: Allow esfs to keep running on a non-fatal error in fsdb. Avoid a crash when vxesfs cannot continue at startup.
  • ET 3994287: Ignore unrecognizable lines in bp.conf.
  • ET 3993119: Support ECA and remove '.dl' from azure download method.
  • ET 3993574: For delete requests change the ino to ext_rscn if it's not null (case of duplicate ino) for DR from cloud.
  • Storage manager uses ext_rscn as the real inode for download because the inode might be reused. The DR-from-cloud utility stores the inode in cloud into ext_rscn.
  • Comprehensive fsdb check at start. The allocated inode checking time is the same as metadata checking time.
  • Implement fsdb check and integrate it into vxesfsd. vxesfsd will stop when fsdb has problems. fsdb check can remove garbage entries.
  • Flush FSDB WAL at some important points. Add more info into fill_emptyfile for better analysis in future.
  • ET 4006406: When the proxy server is not enabled, proxy-related errors should not appear in the logs. Also handle the NONE auth type.
  • ET 3995775: Remove eof error message printed in ocsd logs.
  • ET 4002975: Add ReadAt function for OCSReader because AWS SDK has special logic to reduce memory allocation when ReadAt is implemented.
  • ET 3998016: Upgrades to go aws-sdk-go that include fixes for memory usage and other improvements
  • Remove libnbsqlite.so dependency from fsdb_check.
  • Change checking condition for socket ready.
  • Search for short name in certmapinfo.json if exact match is not found. Ignore case when comparing server names.
  • Add needWarm interface so that MSDP can tell whether the bucket supports warming.
  • Handle warm request for Azure blob.
  • Update the warm stat file without warming when low latency storage type is selected.
  • Match the objects for MSDPCC restores.
  • Fix UseCRL log to identify if CRL is enabled or not.
  • Fix some error in calling newOCSHTTPClient and change 0,1,2 file descriptor. /var/log/ocsd.log will have info when ocsd crashes.
  • ET 4007911: Correct OS command paths from /usr/bin to /bin in pre and post-install scripts so they work on older versions of RedHat Linux.
  • ET 4005838: Handle partial read case for Azure. When a file buffer is read partially it should update the buffer and size.
  • ET 4010076: Set skipVerify to true when UseCRL is empty
  • ET 4010614: Change return value of select() for socket is ready. select() returns a value that is larger than expected.
  • ET 4010651: Skip verification of certificate when UseCRL is empty for S3 compatible provider.
  • ET 4010965: vxesfsd is unable to convert ocsd's pid to integer due to a larger than expected value. It returns an error: 'Unable to convert pid to a integer'.
  • ET 4012960: Avoid ocsd crash in ocspVerify(). Check size of vChain before referencing vChain[0] and vChain[1]. Do not reference nil err variable when issuer cert cannot be found.
  • ET 4013386: Enable multipart upload/download for NetApp StorageGRID and VERITAS Access. This avoids the misleading timeout error on large upload requests that take longer than 15 minutes, because the upload requests are now broken up into multiple separate part requests.
  • ET 4012807: Rebuild of rocksdb with portable=1 and remove USE_SSE=0
  • ET 4012349: On appliances when daemons are started from the CLISH, some process is sending SIGHUP or SIGINT, killing ocsd. Ignore SIGHUP and SIGINT.
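
The last item above makes ocsd ignore SIGHUP and SIGINT. As a rough illustration of that technique only (this is not ocsd's code, and the function name ignore_terminal_signals is hypothetical), a POSIX process can install the "ignore" disposition for both signals at startup:

```python
import os
import signal


def ignore_terminal_signals():
    """Ignore SIGHUP and SIGINT so a stray signal does not end the process.

    Must be called from the main thread (a Python restriction on
    signal.signal). SIGHUP exists on POSIX systems only.
    """
    for sig in (signal.SIGHUP, signal.SIGINT):
        signal.signal(sig, signal.SIG_IGN)


if __name__ == "__main__":
    ignore_terminal_signals()
    # Send ourselves a SIGHUP; with the default disposition this would
    # terminate the process, but it is now discarded.
    os.kill(os.getpid(), signal.SIGHUP)
    print("still running after SIGHUP")
```

A daemon that ignores these signals can still be stopped with SIGTERM, which is left at its default disposition in this sketch.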

 

Version 23 includes:

  • ET 4027975: Use 100MB partSize for vtas-access multipart. This effectively disables multipart upload and download for vtas-access except for large DO files (over 100MB), and addresses a performance problem with vtas-access.
  • Add NetBackup Release Static Version String to all binaries so that the NetBackup version or EEB the binaries come from can be identified.
  • Fix uninstall script issue where environments without CloudCatalyst configured complain about a missing esfstab file.
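
To see why a 100MB partSize effectively disables multipart transfers for most files, the sketch below splits an object into part ranges. It is an illustration only: plan_parts is a hypothetical helper, not part of the EEB, and it assumes that "100MB" means 100 MiB and that an object no larger than one part is sent as a single request.

```python
PART_SIZE = 100 * 1024 * 1024  # assumption: "100MB" read as 100 MiB


def plan_parts(object_size, part_size=PART_SIZE):
    """Return (offset, length) ranges for transferring one object.

    Objects up to part_size bytes yield a single range, that is, an
    ordinary non-multipart request. Larger objects are split into
    part_size chunks, with a shorter final chunk if needed.
    """
    if object_size <= part_size:
        return [(0, object_size)]
    return [
        (offset, min(part_size, object_size - offset))
        for offset in range(0, object_size, part_size)
    ]


if __name__ == "__main__":
    mib = 1024 * 1024
    print(len(plan_parts(64 * mib)))   # a 64 MiB object: one request
    print(len(plan_parts(250 * mib)))  # a 250 MiB object: three parts
```

With a part size this large, only objects above the threshold pay the per-part request overhead.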
 

 

Read me

Versions Affected

  • NetBackup 8.2

 

This EEB should be installed on: Cloud Catalyst Media Servers

 

README Notes:
This EEB introduces a comprehensive fsdb check.
If vxesfsd detects records with inode conflicts, ESFS stops and fsdb must be rebuilt.

The vxesfsd process will start if it detects that the following conditions are true:

  1. The used inode is smaller than the system inode.
  2. No entries are using the same inode.
  3. The information for an entry is complete. When an entry under a directory exists, the inode record and metadata records must exist.

If these check conditions are met, fsdb is considered to be in a consistent state and then ESFS will proceed to start.

If an inode record violates any of these check conditions, fsdb is considered to be in an inconsistent state and will require rebuilding.
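
The three check conditions can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual fsdb_check implementation: the Entry type, the set-based record stores, and the reading of "system inode" as an upper bound that every used inode must stay below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """A directory entry and the inode it refers to (hypothetical model)."""
    path: str
    inode: int


def check_fsdb(entries, inode_records, metadata_records, system_inode):
    """Return a list of violations; an empty list means 'consistent'."""
    problems = []
    owner_of = {}  # inode -> first path seen using it
    for entry in entries:
        # Condition 1: every used inode is smaller than the system inode.
        if entry.inode >= system_inode:
            problems.append(f"{entry.path}: inode {entry.inode} >= system inode")
        # Condition 2: no two entries use the same inode.
        if entry.inode in owner_of:
            problems.append(
                f"{entry.path}: inode {entry.inode} already used by {owner_of[entry.inode]}"
            )
        else:
            owner_of[entry.inode] = entry.path
        # Condition 3: the inode record and metadata record must both exist.
        if entry.inode not in inode_records or entry.inode not in metadata_records:
            problems.append(f"{entry.path}: incomplete records for inode {entry.inode}")
    return problems


if __name__ == "__main__":
    entries = [Entry("/data/a", 10), Entry("/data/b", 11)]
    print(check_fsdb(entries, {10, 11}, {10, 11}, system_inode=100))
```

In this model, any non-empty result corresponds to the inconsistent state described above, in which fsdb requires rebuilding.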

  • Prior to installing this EEB, make sure you have a current drcontrol policy backup of your CloudCatalyst server/appliance.
  • CloudCatalyst servers/appliances should always be protected by an active drcontrol policy.
  • This allows for recovery of the CloudCatalyst environment in order to access the data uploaded to the cloud.
  • Without this, it is possible that data loss could occur.
  • More information about the drcontrol policy is available in the NetBackup Deduplication Guide.

 

Downloads:

NB_8.2_ET3981837_23.zip

 

Appliance:
NBAPP_EEB_ET3981837-3.2.0.0-23.x86_64.rpm

VRTSflex-nb_EEB_ET3981837-8.2-23.x86_64.rpm


 

Installation Instructions:

1. Stop NBU services.

2. Uninstall any previous version of this EEB (3981837 versions 1 to 19) before installing version 20.

3. If not an NBU appliance, please run the EEB installer with the -create option.

4. Start NBU services.

 

Using the NetBackup Emergency Engineering Binary (EEB) installer

https://www.veritas.com/docs/100019405

 

Installing EEBs on a NetBackup 52x0 / 5330 Appliance

https://www.veritas.com/docs/100023444

 

How to install add-ons or an EEB on NetBackup instances running on Flex 1.3 version
https://www.veritas.com/content/support/en_US/doc/130821112-136840843-0/v137506948-136840843
 

Checksums for installed files:

 

File                                    Checksum    Byte count

linuxR_x86/cc_touch                     280189831   189624
linuxR_x86/cred_ioctl                   697233685   57672
linuxR_x86/dbdump                       925364650   7610200
linuxR_x86/esfs_check                   2944113558  8465600
linuxR_x86/esfs_cleanup                 2398003567  2901371
linuxR_x86/esfs_init.sh                 979561273   8296
linuxR_x86/esfs_reconfig                3138041570  473008
linuxR_x86/esfs_recover.sh              3001937396  2175
linuxR_x86/esfs_upgrade.sh              1946210879  7338
linuxR_x86/esfs_version.txt             2123593552  180
linuxR_x86/fsdb_check                   395990259   13780808
linuxR_x86/fsdbbackup                   3835221081  17368
linuxR_x86/install-3981837              263100556   3258
linuxR_x86/libfuse3.so.3.6.2            2532643060  833440
linuxR_x86/librocksdb.so.6.0.2          1177832479  5731584
linuxR_x86/mkesfs                       917406739   8056856
linuxR_x86/nbu_wrapper                  2848546672  1582320
linuxR_x86/ocsd                         1667249659  18208536
linuxR_x86/ocsd_log_view                885179505   128992
linuxR_x86/post_uninstall-3981837       832498491   4694
linuxR_x86/pre_proc_uninstall_3981837   1996522080  2631
linuxR_x86/recoverdb                    633645045   615208
linuxR_x86/setlsu_ioctl                 2802370043  17400
linuxR_x86/vxesfs                       3237992321  3271537
linuxR_x86/vxesfsd                      3902631439  7563608

 

Recommended service state:

Stop all NetBackup services before applying this hotfix.

 
