NetBackup™ for Hadoop Administrator's Guide
- Introduction
- Prerequisites and best practices for the Hadoop plug-in for NetBackup
- Configuring NetBackup for Hadoop
- About configuring NetBackup for Hadoop
- Managing backup hosts
- Adding Hadoop credentials in NetBackup
- Configuring the Hadoop plug-in using the Hadoop configuration file
- Configuring NetBackup for a highly-available Hadoop cluster
- Configuring a custom port for the Hadoop cluster
- Configuring number of threads for backup hosts
- Configuring number of streams for backup hosts
- Configuring distribution algorithm and golden ratio for backup hosts
- Configuring communication between NetBackup and Hadoop clusters that are SSL-enabled (HTTPS)
- Configuration for a Hadoop cluster that uses Kerberos
- Hadoop.conf configuration for parallel restore
- Create a BigData policy for Hadoop clusters
- Disaster recovery of a Hadoop cluster
- Performing backups and restores of Hadoop
- Troubleshooting
- About troubleshooting NetBackup for Hadoop issues
- About NetBackup for Hadoop debug logging
- Troubleshooting backup issues for Hadoop data
- Backup operation fails with error 6609
- Backup operation fails with error 6618
- Backup operation fails with error 6647
- Extended attributes (xattrs) and Access Control Lists (ACLs) are not backed up or restored for Hadoop
- Backup operation fails with error 6654
- Backup operation fails with bpbrm error 8857
- Backup operation fails with error 6617
- Backup operation fails with error 6616
- Backup operation fails with error 84
- NetBackup configuration and certificate files do not persist after the container-based NetBackup appliance restarts
- Unable to see incremental backup images during restore even though the images are seen in the backup image selection
- One of the child backup jobs goes in a queued state
- Troubleshooting restore issues for Hadoop data
- Restore fails with error code 2850
- NetBackup restore job for Hadoop completes partially
- Extended attributes (xattrs) and Access Control Lists (ACLs) are not backed up or restored for Hadoop
- Restore operation fails when Hadoop plug-in files are missing on the backup host
- Restore fails with bpbrm error 54932
- Restore operation fails with bpbrm error 21296
- Hadoop with Kerberos restore job fails with error 2850
- Configuration file is not recovered after a disaster recovery
- Index
Best practice for improving performance during backup and restore
Performance issues, such as slow throughput and high CPU usage, can occur during the backup and recovery of Hadoop in an SSL-enabled (HTTPS) environment. The issue is caused when the internal communications in Hadoop are not encrypted. Tune the HDFS configuration correctly on the HDFS cluster to improve Hadoop's internal communication and performance, which in turn improves backup and recovery performance.
For better backup and restore performance, NetBackup recommends that you follow the Hadoop configuration recommendations from Apache or from the Hadoop distribution in use.
If Hadoop encryption is turned on within the cluster, follow the recommendations from Apache or from the Hadoop distribution in use to select the right cipher and key length for data transfer within the Hadoop cluster.
NetBackup performs better during backup and recovery when AES 128 is used for data encryption during block data transfer.
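As a sketch of such tuning, assuming an Apache-style hdfs-site.xml, the following standard HDFS properties select AES with a 128-bit key for DataNode block data transfer. Verify the property names and values against the documentation for your Hadoop distribution before applying them:

```xml
<!-- hdfs-site.xml (illustrative): encrypt DataNode block data transfer with AES 128 -->
<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>
<property>
  <name>dfs.encrypt.data.transfer.cipher.suites</name>
  <value>AES/CTR/NoPadding</value>
</property>
<property>
  <name>dfs.encrypt.data.transfer.cipher.key.bitlength</name>
  <value>128</value>
</property>
```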
You can also increase the number of backup hosts to improve backup performance when more than one folder in the Hadoop cluster is backed up. For maximum benefit, use up to one backup host per folder in the Hadoop cluster.
You can also increase the number of threads per backup host that NetBackup uses to fetch data from the Hadoop cluster during a backup operation. If your files are tens of GB in size, increasing the number of threads can improve performance. The default number of threads is 4.
You can also increase the number of streams per backup host that are used for parallel streaming.
You can choose the data distribution algorithm that is best suited for your deployment:
- For a small number of large files in your data set, use distribution algorithm 1.
- For a large number of small files in your data set, use distribution algorithm 2.
- For a mix of a small number of very large files and a large number of small files in your data set, use an appropriate combination of distribution algorithm and golden ratio. See the following example:
Table: Example for a large number of small files and a small number of large files

| Data size | Number of backup hosts | Number of threads | Number of streams | Distribution algorithm | Golden ratio |
|---|---|---|---|---|---|
| Up to 1 TB | 4 | 16 | 5 | 4 | 80 |
| Up to 50 TB | 5 | 32 | 5 | 4 | 80 |
| > 50 TB | 6 | 32 | 5 | 4 | 80 |
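As an illustration, the first row of the table might correspond to a fragment of the hadoop.conf file on the backup host. The key names shown here (`number_of_threads`, `number_of_streams`, `distribution_algorithm`, `golden_ratio`) and the host name are assumptions for illustration only; confirm the exact key names in the configuration topics of this guide:

```json
{
  "application_servers": {
    "namenode.example.com": { "port": 8020 }
  },
  "number_of_threads": 16,
  "number_of_streams": 5,
  "distribution_algorithm": 4,
  "golden_ratio": 80
}
```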
For more details, refer to the Apache Hadoop documentation on secure mode.
Additionally, for optimal performance, ensure the following:
- The primary server is not used as a backup host.
- When multiple policies are scheduled to be triggered in parallel:
  - Avoid using the same discovery host in all the policies.
  - Ensure that the last Backup_Host entry is different for these policies.
Note:
The discovery host is the last entry in the Backup_Host list.
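For example, two policies that run in parallel might list their backup hosts as follows (host names are illustrative). Because each policy's discovery host is the last entry in its Backup_Host list, keeping those last entries different avoids running discovery for both policies on the same host:

```
Policy-A Backup_Host list:
  backuphost01
  backuphost02   <- discovery host for Policy-A

Policy-B Backup_Host list:
  backuphost03
  backuphost01   <- discovery host for Policy-B
```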