NetBackup™ for MongoDB Administrator's Guide

Product(s): NetBackup & Alta Data Protection (10.5)

Known limitations for MongoDB protection using NetBackup

The following table lists the known limitations for MongoDB protection using NetBackup:

Table: Known limitations

Limitation

Workaround

Consider a configuration with a sharded MongoDB cluster with high availability that contains multiple mongos processes. Before you start a restore and recover operation, only the mongos process on the restore destination for the Config Server Replica Set (CSRS) image should be running.

Manually stop any other mongos processes in the cluster before you start a restore and recover operation.

After recovery reconfigure the mongos services to point to the recovered cluster.

If the mongos process is not shut down on all nodes except one, the additional mongos processes might conflict with the restore operation. This situation causes the data that is restored to be inaccessible with a connection to mongos.

If you do not shut down the mongos processes before you start the restore and recovery, then after recovery you must manually shut down the stale mongos processes. Then restart all the recovered mongod and mongos processes under the cluster.

You must start the MongoDB processes with an absolute path to the configuration files. You must also specify absolute paths for the certificate files: the CA file, the PEM file, and the key files.

N/A

If the authentication type that was present during backup changes and you run a recovery job that requires a different authentication, the recovery process might fail.

Ensure that the authentication type during recovery remains the same as the type that was used during the backup.

If after you run a backup you then rename the volume group or the logical volume, the subsequent backup may fail.

N/A

During recovery, ensure that you select only one full backup image and the subsequent incremental images that are relevant. If you select more than one image, the recovery may fail as the restored data could be corrupted.

N/A

After you recover the MongoDB cluster, the cluster information for only the restored node is available.

After the recovery process is complete, manually add the secondary nodes to the cluster.

For more information, refer to the following article: add-members-to-the-replica-set

During the backup process, if the MongoDB import operation is running, it can become unresponsive. Avoid the MongoDB import operation during the backup or restore process.

N/A

During the restore process, the message "The restore was successfully initiated" is displayed, but the restore job does not start.

This issue occurs when you enter the Application server for both the Source client and the Destination client in the web UI.

Ensure that Source client and Destination client are entered correctly. The Source client must be the Application server and the Destination client must be the backup host.

If your environment has DNAT, ensure that the backup host or the restore host and all the MongoDB nodes are in the same private network.

N/A

The NetBackup for MongoDB plug-in does not support the command line bprestore options -w and -print_jobid.

N/A

MongoDB restores are not supported from the backup hosts. All the restore operations for MongoDB must be initiated from the NetBackup primary server.

N/A

If the restore job does not appear after you submit it, verify that the MongoDB plug-in is installed on your destination node.

N/A

If you restore the MongoDB database to a non-LVM location and then try to take a backup from this non-LVM location, the backup fails.

Restore the data to an LVM location and then try to take a backup of the restored data.

The NetBackup for MongoDB plug-in does not support hard or soft links in the data path folders. Do not add any hard or any soft links that point to locations in a different logical volume or a non-logical volume.

NetBackup cannot ensure that the data is consistent at the time of backups if you have hard or soft links in the data path folder. During the restore process, the hard or the soft links are created as folders and not links.

N/A

When you cancel a child restore job during the MongoDB restore and recovery process, the thin client (mdbserver) is not removed immediately. The thin client is removed after the next restore operation.

N/A

MongoDB restore fails and displays error 2850.

Consider the following solutions:

  • Ensure that the destination host and port are valid and that the credentials were configured with the tpconfig command and the credentials file. For more information, refer to the tar logs.

  • The target database path does not exist and there are insufficient permissions for the non-root user.

    Workaround:

    Ensure that the target database path exists and there are sufficient permissions for the non-root user.

  • Ensure that there are no special characters in the rename and filelist file. Also, if the primary server is a Windows computer then make sure that the EOL conversion of the file is Unix Style (LF).
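The line-ending requirement in the last item can be checked before you submit the restore. The following is a minimal sketch, assuming that the content of the rename or filelist file has already been read into a string. The function names hasNonUnixLineEndings and toUnixLineEndings are hypothetical and are not part of NetBackup, and the sample content is a placeholder. The sketch does not check for special characters, because this guide does not list them.

```javascript
// Hypothetical helpers (not part of NetBackup) for the rename and filelist
// files that are prepared on a Windows primary server.

// Returns true if the text contains any carriage return, which indicates
// Windows (CRLF) or classic Mac (CR) line endings.
function hasNonUnixLineEndings(text) {
  return text.includes("\r");
}

// Converts CRLF and lone CR line endings to Unix style (LF).
function toUnixLineEndings(text) {
  return text.replace(/\r\n?/g, "\n");
}

// Placeholder content with Windows line endings.
const sampleFileContent = "first entry\r\nsecond entry\r\n";
if (hasNonUnixLineEndings(sampleFileContent)) {
  const converted = toUnixLineEndings(sampleFileContent);
  console.log(JSON.stringify(converted));
}
```

Write the converted text back to the file before you run the restore.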

After recovery, the MongoDB shard node fails to restart manually and the following error is seen in the MongoDB logs:

NoSuchKey: Missing expected field "configsvrConnectionString"

On the MongoDB shard where the problem occurs, start MongoDB in the maintenance mode and run the following method on the system.version collection in the admin database:

use admin
db.system.version.deleteOne( { _id: "minOpTimeRecovery" } )

In a restore operation that contains one or more replica sets, replica set members are restored to the replica set that uses the default "cfg.members[#].host" value that rs.config() provides.

If this value was previously changed from the default, you might need to update it after the restore and recovery complete so that it matches the original configuration (for example, from short name to FQDN).

Workaround:

  1. Log on to the replica set MongoDB cluster.

  2. Use the following command to verify the configuration:

    rs.conf()

  3. Use the following command to update the configuration for the replica set:

    Update configuration for replica set member 0:

    cfg = rs.conf();
    cfg.members[0].host = '<hostname.domain.com>:<port-number>';
    rs.reconfig(cfg)

  4. Verify the changes using the following command:

    rs.conf()

  5. Repeat the steps for the other replica sets and the members, or only the replica set members.
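
When several members need new host values, the edit in step 3 can be applied to all of them in one pass. The following is a minimal sketch, assuming a hypothetical helper named withMemberHosts that is not part of MongoDB or NetBackup. It takes an object shaped like the output of rs.conf() and a map from member _id to the desired host:port string, and returns a modified copy. The host names and ports in the sample are placeholders.

```javascript
// Hypothetical helper (not part of MongoDB or NetBackup): returns a copy of a
// replica set configuration with selected member hosts replaced.
// hostsById maps a member _id to the desired "<hostname>:<port>" string.
function withMemberHosts(cfg, hostsById) {
  const members = cfg.members.map(function (member) {
    const newHost = hostsById[member._id];
    // Members that are not listed in hostsById are left unchanged.
    return newHost === undefined
      ? member
      : Object.assign({}, member, { host: newHost });
  });
  return Object.assign({}, cfg, { members: members });
}

// Sample configuration shaped like the output of rs.conf().
const sampleCfg = {
  _id: "rs0",
  version: 1,
  members: [
    { _id: 0, host: "node1:27017" },
    { _id: 1, host: "node2:27017" },
  ],
};
const updatedCfg = withMemberHosts(sampleCfg, { 0: "node1.domain.com:27017" });
console.log(updatedCfg.members.map(function (m) { return m.host; }));
```

In the MongoDB shell, the result can then be passed to rs.reconfig(), for example rs.reconfig(withMemberHosts(rs.conf(), { 0: '<hostname.domain.com>:<port-number>' })). Verify the result with rs.conf() as described in step 4.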

Backup jobs fail and the following error codes are displayed:

  • (50) client process aborted

  • (1) The requested operation was partially successful.

  • (112) no files specified in the file list

Ensure that the backup windows for incremental backups are different for the same MongoDB cluster. The backup windows must not overlap each other for incremental backups for the same MongoDB cluster.

Ensure that permissions are in place for the mdbserver location, oplog location, and snapshot mount location. For more information, see Host user requirements.

In a sharded MongoDB cluster environment, a 112 error can indicate that the mongos process is not running on the client that is defined in the backup policy.

An error 112 can also indicate that the same host name is added to the BigData policy for multiple backup hosts. Use unique host names for multiple backup hosts that are running the backup operations.
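
The requirement that incremental backup windows must not overlap can be checked before you schedule them. The following is a minimal sketch, assuming that each window is described by a hypothetical record with a name, a start, and an end expressed in minutes from the start of the week. The function findOverlappingWindows and the sample schedule names are illustrations only and do not correspond to any NetBackup API.

```javascript
// Hypothetical check (not part of NetBackup): returns the pairs of backup
// windows that overlap. Each window is { name, start, end }, where start and
// end are minutes from the start of the week and start < end.
function findOverlappingWindows(windows) {
  // Sort a copy by start time so that overlapping windows are adjacent in
  // scan order.
  const sorted = windows.slice().sort(function (a, b) {
    return a.start - b.start;
  });
  const overlaps = [];
  for (let i = 0; i < sorted.length; i++) {
    // A later window overlaps window i only if it starts before window i ends.
    for (let j = i + 1; j < sorted.length && sorted[j].start < sorted[i].end; j++) {
      overlaps.push([sorted[i].name, sorted[j].name]);
    }
  }
  return overlaps;
}

// Placeholder schedules for one MongoDB cluster.
const sampleWindows = [
  { name: "incr-a", start: 60, end: 120 },
  { name: "incr-b", start: 90, end: 150 },
  { name: "incr-c", start: 150, end: 200 },
];
console.log(findOverlappingWindows(sampleWindows));
```

Windows that only touch (one ends exactly when the next starts) are not reported as overlapping in this sketch.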

After a restore operation, if you try to stop and restart the mongod or mongos services (service mongod stop or service mongod restart), the commands fail.

This error occurs when the mongod or mongos processes are launched as a service using the service or systemctl commands and not using a direct command.

Workaround:

Stop the mongod or mongos services using alternative methods. For example, mongod -f /etc/mongod.conf --shutdown or kill <PID>. After stopping the services, you can use the service or systemctl commands again.

Note:

When you shut down the mongod or mongos processes after restore and recovery, the .pid or .sock files remain. If the mongod or mongos services do not start after the shutdown, you must delete these files.

The default location of the .sock files is /tmp

The default location of the .pid files is /var/run/mongodb/

Backup operation fails if a command that generates output in .bashrc is added.

Backup fails with error 6646 and displays the following error:

Error: Unable to communicate with the server.

Ensure that .bashrc does not generate any output (from echo or any other output-generating command). When the shell is non-interactive, .bashrc must not write anything to STDOUT or STDERR.

When you select two full backup images and try to restore to a point-in-time image that is between the two full backup images, the latest full backup image is restored.

Workaround:

During the restore operation, do not select more than one full backup image.

For an effective point-in-time recovery, ensure that you run differential incremental backups at shorter intervals.
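
The selection rule (one full backup image plus the incremental images that follow it) can be validated before you start the restore. The following is a minimal sketch, assuming that each selected image is described by a hypothetical record with a type ("FULL" or "INCR") and a backupTime in epoch seconds. The function validateImageSelection is an illustration only and is not part of NetBackup.

```javascript
// Hypothetical check (not part of NetBackup): validates a set of selected
// backup images. Each image is { type: "FULL" | "INCR", backupTime: number }.
function validateImageSelection(images) {
  const fulls = images.filter(function (image) {
    return image.type === "FULL";
  });
  if (fulls.length !== 1) {
    return {
      ok: false,
      reason: "Select exactly one full backup image; found " + fulls.length + ".",
    };
  }
  const fullTime = fulls[0].backupTime;
  // Incremental images are only relevant if they were taken after the full image.
  const early = images.filter(function (image) {
    return image.type !== "FULL" && image.backupTime <= fullTime;
  });
  if (early.length > 0) {
    return {
      ok: false,
      reason: "Found " + early.length + " incremental image(s) not newer than the full backup image.",
    };
  }
  return { ok: true, reason: "" };
}

// Placeholder selection: one full image followed by two incremental images.
const sampleSelection = [
  { type: "FULL", backupTime: 1000 },
  { type: "INCR", backupTime: 2000 },
  { type: "INCR", backupTime: 3000 },
];
console.log(validateImageSelection(sampleSelection));
```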

Unable to see the restore job progress in the Recover node.

Workaround:

For a compound restore job that uses a non-primary server as the restore host, look for the job record and status in the Task progress section in the Recover node. Click Refresh to refresh the task list.

Backup fails with the following error:

(6625) The backup host is either unauthorized to complete the operation or it is unable to establish a connection with the application server.

Workaround:

On the server where MongoDB is installed, ensure that PasswordAuthentication is not disabled in the /etc/ssh/sshd_config file.

Run the sudo service sshd restart command.

Backup fails with the following error:

(6646) Unable to communicate with the server.

Workaround:

Ensure that the backup host can access the port that is defined in the mongodb.conf file or the default mdbserver_port (11000).

There can be an error when you copy the thin client files on the MongoDB server because of the following issues:

  • Connectivity issues with the MongoDB server

  • User does not have permissions to the location for copying the thin-client files.

The following error is displayed in the mdbserver logs:

error-sudo: sorry, you must have a tty to run sudo

Workaround:

  • To disable the requiretty option globally in the sudoers file, replace Defaults requiretty with Defaults !requiretty. This action changes the global sudo configuration.

  • Alternatively, change the sudo configuration for only a user, group, or command. On the server where MongoDB is installed, add the host user, group, or command to the sudoers file.

    Add Defaults!/path/to/my/bin !requiretty

    Add Defaults:<host_user> !requiretty

The nbaapireq_handler log folder is not created on a Flex Container, even if you run the mklogdir command.

Workaround:

When a Flex Appliance is upgraded from version 8.1.2 to 8.2 and the Flex media server is used as backup host, the MongoDB plug-in creates the following log directory:

/usr/openv/netbackup/logs/nbaapireq_handler

The snapshot size as described by the free_space_percentage_snapshot parameter must be set according to the MongoDB cluster size and must be large enough. If these criteria are not met, the backup fails and displays the following error:

invalid command parameter (20)

Validate the free_space_percentage_snapshot value with the MongoDB cluster.

Backup fails with the following error:

(13) file read failed for Media

Ensure that the:

  • NetBackup version on the primary server is the latest.

  • NetBackup version on the media server is the same as the primary server but newer than the NetBackup client version on the backup host.

  • NetBackup client version on the backup host is the same as or older than the media server.
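
The version relationships in this list can be expressed as a small check. The following is a minimal sketch, assuming that versions are plain dotted numeric strings such as 10.5 or 8.1.2. The functions compareVersions and checkVersionAlignment are hypothetical and are not part of NetBackup. The sketch cannot verify that the primary server runs the latest release; it only compares the three versions with each other.

```javascript
// Hypothetical helpers (not part of NetBackup).

// Compares two dotted numeric version strings.
// Returns -1 if a < b, 0 if they are equal, and 1 if a > b.
function compareVersions(a, b) {
  const partsA = a.split(".").map(Number);
  const partsB = b.split(".").map(Number);
  const length = Math.max(partsA.length, partsB.length);
  for (let i = 0; i < length; i++) {
    // Missing components count as 0, so "10.5" equals "10.5.0".
    const diff = (partsA[i] || 0) - (partsB[i] || 0);
    if (diff !== 0) {
      return diff < 0 ? -1 : 1;
    }
  }
  return 0;
}

// Returns a list of problems; an empty list means the versions are aligned.
function checkVersionAlignment(versions) {
  const problems = [];
  if (compareVersions(versions.media, versions.primary) !== 0) {
    problems.push("Media server version differs from the primary server version.");
  }
  if (compareVersions(versions.backupHost, versions.media) > 0) {
    problems.push("Backup host client version is newer than the media server version.");
  }
  return problems;
}

// Placeholder versions.
console.log(checkVersionAlignment({ primary: "10.5", media: "10.5", backupHost: "10.4" }));
```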

The mdb_progress_loglevel parameter is missing from the MongoDB configuration tool.

To modify the mdb_progress_loglevel parameter, update the mongodb.conf file after the MongoDB configuration tool creates it.

For more information, refer to the MongoDB Administrator's Guide.

Snapshots are not deleted and stale mdbserver instances are seen. This scenario might cause Cannot lstat errors during backup and partially successful backups.

Change the configuration settings for the following parameters in the mongodb.conf file:

  • cleanup_time_in_min

  • mdbserver_timeout_min

Set the values such that the stale snapshots and stale instances of mdbserver are cleared before the next full or incremental backup schedule.

If the backup host runs a NetBackup version earlier than 8.3 and the primary and media servers run the latest version of NetBackup, the following invalid error codes can be seen for various scenarios:

13302, 13303, 13304, 13305, 13306, 13307, 13308, 13309, 13310, 13311, 13312, 13313, 13314, 13315

Workaround:

If you see an invalid error code, use the following list to find the corresponding actual error code and message:

  • Invalid error code: 13302

    Actual error: 6724

    Message: Restore node count is invalid.

  • Invalid error code: 13303

    Actual error: 6725

    Message: Unable to find information about the MongoDB replica set.

  • Invalid error code: 13304

    Actual error: 6704

    Message: Restoring multiple MongoDB nodes on one replica set is invalid.

  • Invalid error code: 13305

    Actual error: 6705

    Message: Restoring MongoDB data on an arbiter node is invalid.

  • Invalid error code: 13306

    Actual error: 6706

    Message: A discovered shard was found in a drain state, cannot proceed with backup.

  • Invalid error code: 13307

    Actual error: 6707

    Message: An unsupported MongoDB storage engine is detected.

  • Invalid error code: 13308

    Actual error: 6708

    Message: Unable to parse command output

  • Invalid error code: 13309

    Actual error: 6709

    Message: Unable to run the command.

  • Invalid error code: 13310

    Actual error: 6710

    Message: Pre-check for recovery has failed as WiredTiger log files are present at the database path.

  • Invalid error code: 13311

    Actual error: 6711

    Message: Unable to backup MongoDB configuration file.

  • Invalid error code: 13312

    Actual error: 6712

    Message: Unable to find operation log for previous backup.

  • Invalid error code: 13313

    Actual error: 6713

    Message: Operations log roll-over detected.

  • Invalid error code: 13314

    Actual error: 6714

    Message: Error while collection was iterated.

  • Invalid error code: 13315

    Actual error: 6715

    Message: Operation log verification error.

For detailed information and recommended actions, refer to the NetBackup Status Codes Reference Guide.
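
If you process job logs with a script, the list above can be kept as a lookup table. The following is a minimal sketch that copies the codes and messages from the list. The names ACTUAL_ERROR_BY_INVALID_CODE and translateErrorCode are hypothetical and are not part of NetBackup.

```javascript
// Lookup table built from the list above: invalid error code -> actual error.
const ACTUAL_ERROR_BY_INVALID_CODE = {
  13302: { code: 6724, message: "Restore node count is invalid." },
  13303: { code: 6725, message: "Unable to find information about the MongoDB replica set." },
  13304: { code: 6704, message: "Restoring multiple MongoDB nodes on one replica set is invalid." },
  13305: { code: 6705, message: "Restoring MongoDB data on an arbiter node is invalid." },
  13306: { code: 6706, message: "A discovered shard was found in a drain state, cannot proceed with backup." },
  13307: { code: 6707, message: "An unsupported MongoDB storage engine is detected." },
  13308: { code: 6708, message: "Unable to parse command output" },
  13309: { code: 6709, message: "Unable to run the command." },
  13310: { code: 6710, message: "Pre-check for recovery has failed as WiredTiger log files are present at the database path." },
  13311: { code: 6711, message: "Unable to backup MongoDB configuration file." },
  13312: { code: 6712, message: "Unable to find operation log for previous backup." },
  13313: { code: 6713, message: "Operations log roll-over detected." },
  13314: { code: 6714, message: "Error while collection was iterated." },
  13315: { code: 6715, message: "Operation log verification error." },
};

// Hypothetical helper: returns the actual error for an invalid code, or null
// if the code is not one of the invalid codes listed in this section.
function translateErrorCode(invalidCode) {
  return ACTUAL_ERROR_BY_INVALID_CODE[invalidCode] || null;
}

console.log(translateErrorCode(13304));
```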

The Restore button in the NetBackup web UI is disabled for the imported MongoDB backup images.

Workaround:

If you import the images to the same NetBackup primary server that was originally used to back them up, use either of the following methods:

  • Perform the restore operation using the bprestore command.

  • Restore the catalog backup that enables the Restore button in the web UI and then restore the images.

If you import the images to a different NetBackup primary server than the one that was originally used to back them up, use the bprestore command to run the restore operation.

Recovery operation fails on an alternate, sharded MongoDB cluster. The following error is displayed:

Unable to find the configuration parameter. (6661)

This issue occurs during an alternate cluster recovery because the pre-recovery check is unable to find the mongos port for the alternate cluster in the mongodb.conf file. This issue occurs because of the way the MongoDB configuration tool creates the mongodb.conf file when you add the alternate MongoDB cluster details using the Update option from the tool.

Workaround:

Before you start the recovery process, update the mongodb.conf file to separate the alternate cluster from the original cluster.

For example:

Existing mongodb.conf file:

"application_servers":
  {
    "original.mongodb.cluster.com:26050":
      {
        "alternate_config_server":
          [
            {
              "hostname:port": "alt.mongodb.cluster.com:26000",
              "mongos_port": "26001"
            }
          ],
        "mongos_port": "26051"
      }
  }

Suggested update to the mongodb.conf file:

"application_servers":
  {
    "original.mongodb.cluster.com:26050":
      {
        "mongos_port": "26051"
      },
    "alt.mongodb.cluster.com:26000":
      {
        "mongos_port": "26001"
      }
  }
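
The same restructuring can be scripted when several clusters are affected. The following is a minimal sketch for Node.js, assuming that the application_servers section has already been parsed into an object. The function separateAlternateClusters is hypothetical and is not part of NetBackup or the MongoDB configuration tool. It moves each alternate_config_server entry to a top-level entry and keeps only the mongos_port for it, as in the example above.

```javascript
// Hypothetical helper (not part of NetBackup): returns a copy of the
// application_servers object in which every alternate_config_server entry
// becomes its own top-level application server entry.
function separateAlternateClusters(applicationServers) {
  const result = {};
  for (const [server, settings] of Object.entries(applicationServers)) {
    // Split off the alternate list; all other settings stay with the server.
    const { alternate_config_server: alternates = [], ...rest } = settings;
    result[server] = rest;
    for (const alternate of alternates) {
      result[alternate["hostname:port"]] = { mongos_port: alternate.mongos_port };
    }
  }
  return result;
}

// The "Existing mongodb.conf file" example from this section.
const existingServers = {
  "original.mongodb.cluster.com:26050": {
    alternate_config_server: [
      { "hostname:port": "alt.mongodb.cluster.com:26000", mongos_port: "26001" },
    ],
    mongos_port: "26051",
  },
};
console.log(JSON.stringify(separateAlternateClusters(existingServers), null, 2));
```

Back up the mongodb.conf file before you change it, and compare the generated section with the suggested update before you start the recovery.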

The MUI tool displays the following error:

Unable to delete configuration.

Recommended action:

  • Verify that the <hostname-port>.conf file still exists in the /usr/openv/var/global directory.

  • Refer to the tpconfig logs and check for error:

    Translate EMM_ERROR_MachineNotExist(2000000) to 88 in the Device Config context.

Workaround:

Delete the <hostname-port>.conf file manually from /usr/openv/var/global.

When certificate-based authentication is enabled on MongoDB, a differential incremental backup fails with error 6709: Unable to run the command.

Workaround:

Refer to the mdbserver logs to find the error code and command details. Then perform one of the following actions:

  • If the mdbserver logs indicate a mongodump command failure, run the mongodump command manually on the MongoDB host and check the error.

  • If the mongodump command fails with X509 certificate-related connection errors, fix these errors by updating the MongoDB server certificates with the subjectAltName property as described in the MongoDB documentation. Then rerun the differential incremental backup.