Veritas NetBackup™ for HBase Administrator's Guide
Backing up HBase data
HBase data is backed up in parallel streams: HBase Region servers stream data blocks simultaneously to multiple backup hosts.
The following diagram provides an overview of the backup flow:
As illustrated in the diagram, the backup proceeds as follows:
A scheduled backup job is triggered from the master server.
A backup job for HBase data is a compound job. When the backup job is triggered, a discovery job runs first.
During discovery, the first backup host connects to the HMaster and retrieves details of the data that needs to be backed up.
A workload discovery file is created on the backup host. The workload discovery file contains the details of the data that needs to be backed up from the different Region servers.
The backup host uses the workload discovery file to decide how the workload is distributed among the backup hosts. Workload distribution files are created for each backup host.
Individual child jobs are executed for each backup host. Data is backed up as specified in the workload distribution files.
Data blocks are streamed simultaneously from different Region servers to multiple backup hosts.
The compound backup job is not completed until all the child jobs are completed. After the child jobs are completed, NetBackup cleans all the snapshots from the HMaster. The compound backup job is completed only after this cleanup activity finishes.
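The ordering of the steps above can be sketched as a small Python simulation. This is only an illustration, not NetBackup code: the guide does not describe how the workload is actually split, so the largest-first greedy assignment in `distribute` is an assumption, and the host names, table names, sizes, and function names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical discovery result: (region server, item, size in MB).
# In the real flow these details come from the HMaster; the values are made up.
DISCOVERED = [
    ("rs1", "table_a", 400),
    ("rs2", "table_b", 300),
    ("rs1", "table_c", 200),
    ("rs3", "table_d", 100),
]


def distribute(workload, backup_hosts):
    """Assumed policy: assign each item, largest first, to the least-loaded host."""
    plan = {host: [] for host in backup_hosts}
    load = {host: 0 for host in backup_hosts}
    for item in sorted(workload, key=lambda w: w[2], reverse=True):
        target = min(backup_hosts, key=lambda h: load[h])
        plan[target].append(item)
        load[target] += item[2]
    return plan


def child_job(host, items):
    """Stand-in for one child job; returns the total MB assigned to this host."""
    return host, sum(size for _, _, size in items)


def compound_backup(backup_hosts):
    """Simulate the compound job: discovery, distribution, child jobs, cleanup."""
    events = ["discovery"]
    plan = distribute(DISCOVERED, backup_hosts)
    events.append("distribution")
    # One child job per backup host, all running in parallel.
    with ThreadPoolExecutor(max_workers=len(backup_hosts)) as pool:
        futures = [pool.submit(child_job, h, plan[h]) for h in backup_hosts]
        results = dict(f.result() for f in futures)  # waits for every child job
    events.append("child jobs done")
    # Cleanup runs only after all child jobs have finished.
    events.append("snapshot cleanup")
    events.append("compound job complete")
    return plan, results, events
```

Calling `compound_backup(["bh1", "bh2"])` with the sample data assigns 500 MB to each of the two hypothetical backup hosts, and the event list ends with the snapshot cleanup followed by completion of the compound job, mirroring the order described above.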