Veritas NetBackup™ for MongoDB Administrator's Guide
Backing up MongoDB data
NetBackup backs up MongoDB data in parallel streams: the MongoDB data nodes stream data blocks simultaneously to multiple backup hosts.
The following diagram provides an overview of the backup flow:
As illustrated in the above diagram:
A scheduled backup job is triggered from the master server.
The backup job for MongoDB data is a compound job. When the backup job is triggered, a discovery job runs first.
During discovery, the backup host deploys a transient thin client (mdbserver) on the configuration server and obtains the details of the shards in the MongoDB cluster. The thin client also stops the balancing across the nodes in a replica set.
After receiving the information about the cluster, the backup host deploys a thin client on the secondary node of a replica set in the MongoDB cluster.
The thin client discovers the database paths dynamically, quiesces the secondary nodes, takes snapshots for full backups, and captures the oplog for incremental backups.
Individual child jobs run for each backup stream and data is backed up.
Data blocks are streamed simultaneously from different secondary nodes to multiple backup hosts.
Once the backup operation is completed, the thin client is removed from the servers.
The compound backup job does not complete until all of its child jobs have completed. After the child jobs complete, NetBackup removes all the snapshots from the secondary nodes. The compound backup job completes only after this cleanup activity finishes.
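The ordering described above (discovery first, then one child job per backup stream running in parallel, then snapshot cleanup, and only then completion of the parent job) can be illustrated with a small toy model. This is a sketch for orientation only, not NetBackup code: the class `CompoundBackupJob`, its methods, the node and host names, and the round-robin assignment of streams to backup hosts are all hypothetical and are not taken from the product.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class CompoundBackupJob:
    """Toy model of a compound backup job. All names are hypothetical."""
    secondaries: list      # one secondary node per backup stream
    backup_hosts: list     # hosts that receive the streamed data
    events: list = field(default_factory=list)  # ordered log of phases
    state: str = "QUEUED"

    def discover(self):
        # Stand-in for the discovery job, which in the real flow learns
        # the cluster layout before any data is moved.
        self.events.append("discovery")
        return list(self.secondaries)

    def child_job(self, stream_id, node):
        # Stand-in for one backup stream. Round-robin host selection is
        # an assumption made for this sketch only.
        host = self.backup_hosts[stream_id % len(self.backup_hosts)]
        return (node, host)

    def cleanup(self, nodes):
        # Stand-in for removing the snapshots from the secondary nodes.
        self.events.append("cleanup")

    def run(self):
        self.state = "ACTIVE"
        nodes = self.discover()
        # Child jobs run concurrently; leaving the "with" block waits
        # for every one of them to finish.
        with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
            futures = [pool.submit(self.child_job, i, n)
                       for i, n in enumerate(nodes)]
            results = [f.result() for f in futures]
        self.events.append("children_done")
        # Cleanup starts only after all child jobs are done, and the
        # parent job is marked done only after cleanup returns.
        self.cleanup(nodes)
        self.state = "DONE"
        return results


if __name__ == "__main__":
    job = CompoundBackupJob(["rs0-secondary", "rs1-secondary"], ["host-a", "host-b"])
    print(job.run())
    print(job.events, job.state)
```

The point of the sketch is the ordering guarantee: the parent job's state changes to done only after the cleanup step has returned, which mirrors the statement that the compound job is not complete until the child jobs and the snapshot cleanup have finished.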