Veritas NetBackup™ Deduplication Guide
About the Cloud Catalyst cache
The administrator configures a local cache directory as part of configuring a Cloud Catalyst storage server. The primary function of the local cache directory (the Cloud Catalyst cache) is to allow the Cloud Catalyst to continue to deduplicate data even if the ingest rate from targeted backup and duplication jobs temporarily exceeds the available upload throughput to the destination cloud storage.
For example, if backup and duplication jobs transfer 10 TB of data per hour to the Cloud Catalyst storage server, and the Cloud Catalyst deduplicates the data at a ratio of 10:1, the resulting 1 TB of deduplicated data per hour may exceed an upload capacity of 0.7 TB per hour of writes to cloud storage. The cache allows the jobs to continue to send and process the data, assuming that at some point the incoming data rate slows. The Cloud Catalyst cache stores only the deduplicated data. Jobs are not marked as complete until all data is uploaded to the cloud.
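The arithmetic in this example can be sketched in a few lines of Python. This is a simplified illustration, not a NetBackup sizing tool: the function name and parameters are hypothetical, and the model assumes constant rates, an empty cache at the start, and that the whole cache is usable (it ignores the watermark settings).

```python
def cache_fill_hours(ingest_tb_per_hr, dedup_ratio, upload_tb_per_hr, cache_tb):
    """Estimate hours until the cache fills, or None if uploads keep pace.

    Hypothetical helper for illustration only. Assumes constant rates and
    an initially empty cache; real behavior also depends on watermarks.
    """
    # Only deduplicated data is written to the cache.
    dedup_tb_per_hr = ingest_tb_per_hr / dedup_ratio
    # Net growth is what arrives minus what is uploaded to the cloud.
    net_growth = dedup_tb_per_hr - upload_tb_per_hr
    if net_growth <= 0:
        return None  # The upload rate matches or exceeds the deduplicated ingest.
    return cache_tb / net_growth


# Values from the example: 10 TB/hr ingest, 10:1 ratio, 0.7 TB/hr upload, 4-TB cache.
hours = cache_fill_hours(10, 10, 0.7, 4)
print(f"Cache absorbs the burst for roughly {hours:.1f} hours")
```

Under these assumptions the cache grows by about 0.3 TB per hour, so a 4-TB cache can absorb the excess for roughly 13 hours before the incoming rate must slow.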
While a Cloud Catalyst cache of 4 TB is recommended, a larger cache has the following benefits:
- For restores: If the data exists in the Cloud Catalyst cache, it is restored from the cache instead of the cloud. The larger the cache, the more deduplicated objects can reside in the cache.
- For data with poor deduplication rates: A larger cache may be required since the poor deduplication ratios require that larger amounts of data be uploaded to the cloud.
- For job windows that experience bursts of activity: A larger cache can be helpful if frequent jobs are targeted to the Cloud Catalyst storage server within a narrow window of time.
While a larger cache can be beneficial, jobs are not marked as complete until all data is uploaded to the cloud. Data is uploaded from the cache to the cloud when an MSDP container file is full. This occurs soon after the backup or duplication job begins, but not immediately. Deduplication makes it possible for second and subsequent backup jobs to transfer substantially less data to the cloud, depending on the deduplication rate.
For example, 4 TB of cache is expected to manage 1 PB of data in the cloud without issue.
Note:
If you initiate a restore from Glacier or Glacier Deep Archive, NetBackup initiates a warming step. NetBackup does not proceed with the restore until all the data is available in S3 storage to be read.
The warming step is always performed when you use Amazon, even if the data is in the Cloud Catalyst cache. For storage classes other than Glacier and Glacier Deep Archive, the warming step is almost immediate with no meaningful delay. For Glacier and Glacier Deep Archive, the warming step may be immediate if files were previously warmed and are still in S3 Standard storage. However, it may take several minutes, hours, or days, depending on the settings in use.
The Cloud Catalyst manages the cache based on the configuration settings in the esfs.json file. Once the high watermark is reached, data is purged when the used space reaches the midpoint between HighWatermark and LowWatermark, (HighWatermark + LowWatermark)/2, and purging continues until LowWatermark is reached. If the rate of incoming data exceeds the rate at which the watermark can be maintained, jobs begin to fail. Administrators should not manually delete or purge the managed data in the cache storage unless directed to do so by NetBackup Technical Support.
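The midpoint rule can be illustrated with a short Python sketch. It reflects one reading of the behavior described above and is not the Cloud Catalyst implementation: the function names are hypothetical, and the watermark values used here (90 and 50) are placeholders for illustration, not documented defaults from esfs.json.

```python
def purge_midpoint(high_watermark, low_watermark):
    """Midpoint between the two watermarks: (high + low) / 2."""
    return (high_watermark + low_watermark) / 2


def next_purging_state(used, high_watermark, low_watermark, purging):
    """Return True if purging should be active for the given used space.

    Hypothetical model: purging starts once used space reaches the
    midpoint and continues until used space drops to the low watermark.
    """
    if purging:
        # Keep purging until the low watermark is reached.
        return used > low_watermark
    return used >= purge_midpoint(high_watermark, low_watermark)


# Placeholder values, expressed in the same unit as the used space.
HIGH, LOW = 90, 50
print(purge_midpoint(HIGH, LOW))
print(next_purging_state(75, HIGH, LOW, purging=False))
print(next_purging_state(50, HIGH, LOW, purging=True))
```

With these placeholder values the midpoint is 70, so in this model purging begins at 70 and stops once used space falls to 50.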