NetBackup Auto Image Replication (AIR) bandwidth tuning tips

Article: 100031167
Last Published: 2017-11-29
Product(s): NetBackup & Alta Data Protection

Problem

Replication of images between two NetBackup Master Server Domains is managed through Storage Lifecycle Policies (SLPs). The operation is similar to a duplication, except that there is no target storage unit. When data is replicated over a high-latency connection, the communication overhead between the two storage devices may prevent the storage layer from using all of the available bandwidth. As a result, the "backlog" of images waiting to be replicated grows over time (some backlog always exists as long as new images are being created). One way to list the images that still need to be replicated is from the command line:

nbstlutil list -copy_type replica -copy_incomplete

There are two known causes of a growing backlog: either too many replication jobs are running concurrently on the same storage server, or too few are.

In the first case, the NetBackup activity monitor shows many replication jobs as active and a very large number of them (or all) are replicating from the same source storage server.  The file list for these jobs does not progress as expected, though, and in some cases, these jobs actually appear to be "hung."

In the second case, relatively few replication jobs are running (often only one or two per disk volume) and network bandwidth consumption is low. This is typically seen when replicating over networks with latency above 100 ms. To the user, it appears as though the realized bandwidth for replication operations is far less than the available bandwidth of the network.
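
The effect of latency on a small number of streams can be illustrated with a rough back-of-the-envelope model. The sketch below assumes that each replication stream keeps a fixed amount of data in flight per network round trip. The 1 MB in-flight figure, the 1 Gbit/s link, and all function names are illustrative assumptions, not NetBackup internals or measured values.

```python
import math


def per_stream_mbit_s(in_flight_bytes: int, rtt_ms: float) -> float:
    """Upper bound on one stream's throughput in Mbit/s.

    Assumes the stream can have at most `in_flight_bytes` outstanding
    per round trip of `rtt_ms` milliseconds (a simplifying assumption).
    """
    bits_per_second = in_flight_bytes * 8 * 1000 / rtt_ms
    return bits_per_second / 1_000_000


def aggregate_mbit_s(streams: int, in_flight_bytes: int,
                     rtt_ms: float, link_mbit_s: float) -> float:
    """Combined throughput of several streams, capped by the link speed."""
    return min(link_mbit_s, streams * per_stream_mbit_s(in_flight_bytes, rtt_ms))


def streams_to_fill_link(in_flight_bytes: int, rtt_ms: float,
                         link_mbit_s: float) -> int:
    """Smallest number of streams whose combined bound reaches the link speed."""
    return math.ceil(link_mbit_s / per_stream_mbit_s(in_flight_bytes, rtt_ms))


if __name__ == "__main__":
    # Hypothetical values: 1 MB in flight per stream, 100 ms latency, 1 Gbit/s link.
    print(per_stream_mbit_s(1_000_000, 100))
    print(aggregate_mbit_s(2, 1_000_000, 100, 1000))
    print(streams_to_fill_link(1_000_000, 100, 1000))
```

Under these assumed numbers, one stream is bounded at 80 Mbit/s, two streams together reach only 160 Mbit/s of the 1000 Mbit/s link, and about 13 concurrent streams would be needed to fill it. Real behavior depends on the storage layer and the network, so treat the model only as an explanation of why low job counts and high latency combine into low utilization.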

Solution

In the first case, where many replication jobs are running at once but few appear to be making progress, the system may have reached a state of "gridlock." The solution is to decrease the Maximum I/O streams setting on the disk pool that is performing the replications. By default, this value is -1, which means that an unlimited number of replication jobs can run on this storage server. In the NetBackup Administration Console, the Disk Pool properties dialog has a check box to enable limiting the I/O streams.
[Figure: Disk Pool properties dialog showing the check box that limits I/O streams]

In the second case, where too few jobs are running, it may be beneficial to increase the number of replication jobs running concurrently in order to improve bandwidth utilization. This is done by tuning the SLP parameters. In NetBackup 7.5, the tuning parameters for SLP replication jobs are the same as those used for duplication jobs. To increase the number of concurrent jobs, the replication batches must be made smaller, which forces NetBackup to run more jobs simultaneously.

Note: the disk pool's Maximum I/O streams setting must be high enough to allow the additional concurrent jobs. To reduce the maximum size of a replication job, use the following parameter in the LIFECYCLE_PARAMETERS file:

MAX_GB_SIZE_PER_DUPLICATION_JOB = <gigabyte value>

The default value for this setting is 25 gigabytes, so the new value must be substantially smaller than 25 before it is possible to judge whether the change provides any benefit. The result also depends on the nature of the replication batches themselves (for example, whether typical batches consist of a few large images or thousands of small images).

Again, note that this setting will also impact your SLP-managed duplication jobs.
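
As a rough illustration of why a smaller maximum job size yields more jobs, the following sketch divides a backlog into batches and applies the disk pool's stream limit. It is a simplified model under stated assumptions: the function names and the 200 GB backlog are hypothetical, each job is treated as one I/O stream, and the sizes of individual images and other SLP parameters are ignored.

```python
import math


def estimated_jobs(backlog_gb: float, max_gb_per_job: float) -> int:
    """Estimate how many jobs a backlog splits into for a given maximum job size."""
    if max_gb_per_job <= 0:
        raise ValueError("max_gb_per_job must be positive")
    return math.ceil(backlog_gb / max_gb_per_job)


def concurrent_jobs(backlog_gb: float, max_gb_per_job: float,
                    max_io_streams: int) -> int:
    """Jobs that could run at once, given the Maximum I/O streams limit.

    A limit of -1 is treated as unlimited, matching the default
    described in this article.
    """
    jobs = estimated_jobs(backlog_gb, max_gb_per_job)
    if max_io_streams == -1:
        return jobs
    return min(jobs, max_io_streams)


if __name__ == "__main__":
    backlog_gb = 200  # hypothetical backlog size
    for max_gb in (25, 5):
        print(max_gb, estimated_jobs(backlog_gb, max_gb),
              concurrent_jobs(backlog_gb, max_gb, max_io_streams=16))
```

With the assumed 200 GB backlog, the default of 25 GB gives about 8 jobs, while a value of 5 GB gives about 40, of which a Maximum I/O streams limit of 16 would allow 16 to run at once. This is why the stream limit and the job size need to be considered together.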

If your storage device supports it (such as a NetBackup 5020 Appliance), you can also accomplish this at the storage level.  By default, the NetBackup 5020 will create up to four data streams concurrently within a single replication job.  This setting is controlled by the agent.cfg file settings on the NetBackup 5020 appliance itself.  Currently, no equivalent mechanism exists for the NetBackup Media Server Deduplication Pool (MSDP) storage server.

Reference:
See the chapter Configuring storage lifecycle policies in the Veritas NetBackup Administrator's Guide, Volume I (linked below) for more information on LIFECYCLE_PARAMETERS tuning options.

 

Applies To

These instructions are specific to NetBackup 7.5.x versions.  Similar options may exist in NetBackup 7.1.
