NetBackup™ Backup Planning and Performance Tuning Guide

Product(s): NetBackup (10.3.0.1, 10.2, 10.1.1, 10.1, 10.0.0.1, 10.0, 9.1.0.1, 9.1, 9.0.0.1, 9.0, 8.3.0.2, 8.3.0.1, 8.3)
  1. NetBackup capacity planning
    1. Purpose of this guide
    2. Changes in Veritas terminology
    3. Disclaimer
    4. How to analyze your backup requirements
    5. How to calculate the size of your NetBackup image database
    6. Sizing for capacity with MSDP
      1. Key sizing parameters
        1. Data types and deduplication
        2. Determining FETB for workloads
        3. Retention periods
        4. Change rate
        5. Replication and duplication of backups
        6. Sizing calculations for MSDP clients
    7. About how to design your OpsCenter server
  2. Primary server configuration guidelines
    1. Note for users of NetBackup 10.2 or later
    2. Size guidance for the NetBackup primary server and domain
    3. Factors that limit job scheduling
    4. More than one backup job per second
    5. Stagger the submission of jobs for better load distribution
    6. NetBackup job delays
    7. Selection of storage units: performance considerations
    8. About file system capacity and NetBackup performance
    9. About the NetBackup catalog
    10. Guidelines for managing the catalog
    11. Adjusting the batch size for sending metadata to the NetBackup catalog
    12. Methods for managing the catalog size
    13. Performance guidelines for NetBackup policies
    14. Legacy error log fields
  3. Media server configuration guidelines
    1. NetBackup hardware design and tuning considerations
      1. PCI architecture
      2. Central processing unit (CPU) trends
      3. Storage trends
      4. Conclusions
    2. About NetBackup Media Server Deduplication (MSDP)
      1. Data segmentation
      2. Fingerprint lookup for deduplication
      3. Predictive and sampling cache scheme
      4. Data store
      5. Space reclamation
      6. System resource usage and tuning considerations
      7. Memory considerations
      8. I/O considerations
      9. Network considerations
      10. CPU considerations
      11. OS tuning considerations
      12. MSDP tuning considerations
        1. Sample steps to change MSDP contentrouter.cfg
      13. MSDP sizing considerations
        1. Data gathering
        2. Leveraging requirements and best practices
    3. Cloud tier sizing and performance
    4. Accelerator performance considerations
      1. Accelerator for file-based backups
      2. Controlling disk space for Accelerator track logs
      3. Accelerator for virtual machine backups
      4. Forced rescan schedules
      5. Reporting the amount of Accelerator data transferred over the network
      6. Accelerator backups and the NetBackup catalog
  4. Media configuration guidelines
    1. About dedicated versus shared backup environments
    2. Suggestions for NetBackup media pools
    3. Disk versus tape: performance considerations
    4. NetBackup media not available
    5. About the threshold for media errors
    6. Adjusting the media_error_threshold
    7. About tape I/O error handling
    8. About NetBackup media manager tape drive selection
  5. How to identify performance bottlenecks
    1. Introduction
    2. Proper mind set for performance issue RCA
    3. The 6 steps of performance issue RCA and resolution
    4. Flowchart of performance data analysis
      1. How to create a workload profile
  6. Best practices
    1. Best practices: NetBackup SAN Client
    2. Best practices: NetBackup AdvancedDisk
      1. AdvancedDisk performance considerations
      2. Exclusive use of disk volumes with AdvancedDisk
      3. Disk volumes with different characteristics
      4. Disk pools and volume managers with AdvancedDisk
      5. Network file system considerations
      6. State changes in AdvancedDisk
    3. Best practices: Disk pool configuration - setting concurrent jobs and maximum I/O streams
    4. Best practices: About disk staging and NetBackup performance
    5. Best practices: Supported tape drive technologies for NetBackup
    6. Best practices: NetBackup tape drive cleaning
      1. How NetBackup TapeAlert works
      2. Disabling TapeAlert
    7. Best practices: NetBackup data recovery methods
    8. Best practices: Suggestions for disaster recovery planning
    9. Best practices: NetBackup naming conventions
    10. Best practices: NetBackup duplication
    11. Best practices: NetBackup deduplication
    12. Best practices: Universal shares
      1. Benefits of universal shares
      2. Configuring universal shares
      3. Tuning universal shares
    13. NetBackup for VMware sizing and best practices
      1. Configuring and controlling NetBackup for VMware
      2. Discovery
      3. Backup and restore operations
    14. Best practices: Storage lifecycle policies (SLPs)
    15. Best practices: NetBackup for Nutanix AHV
    16. Best practices: NetBackup Sybase database
    17. Best practices: Avoiding media server resource bottlenecks with Oracle VLDB backups
    18. Best practices: Avoiding media server resource bottlenecks with MSDPLB+ prefix policy
    19. Best practices: Cloud deployment considerations
  7. Measuring Performance
    1. Measuring NetBackup performance: overview
    2. How to control system variables for consistent testing conditions
    3. Running a performance test without interference from other jobs
    4. About evaluating NetBackup performance
    5. Evaluating NetBackup performance through the Activity Monitor
    6. Evaluating NetBackup performance through the All Log Entries report
    7. Table of NetBackup All Log Entries report
      1. Additional information on the NetBackup All Log Entries report
    8. Evaluating system components
      1. About measuring performance independent of tape or disk output
      2. Measuring performance with bpbkar
      3. Bypassing disk performance with the SKIP_DISK_WRITES touch file
      4. Measuring performance with the GEN_DATA directive (Linux/UNIX)
      5. Monitoring Linux/UNIX CPU load
      6. Monitoring Linux/UNIX memory use
      7. Monitoring Linux/UNIX disk load
      8. Monitoring Linux/UNIX network traffic
      9. Monitoring Linux/Unix system resource usage with dstat
      10. About the Windows Performance Monitor
      11. Monitoring Windows CPU load
      12. Monitoring Windows memory use
      13. Monitoring Windows disk load
    9. Increasing disk performance
  8. Tuning the NetBackup data transfer path
    1. About the NetBackup data transfer path
    2. About tuning the data transfer path
    3. Tuning suggestions for the NetBackup data transfer path
    4. NetBackup client performance in the data transfer path
    5. NetBackup network performance in the data transfer path
      1. Network interface settings
      2. Network load
      3. Setting the network buffer size for the NetBackup media server
        1. Network buffer size in relation to other parameters
      4. Setting the NetBackup client communications buffer size
      5. About the NOSHM file
      6. Using socket communications (the NOSHM file)
    6. NetBackup server performance in the data transfer path
      1. About shared memory (number and size of data buffers)
        1. Default number of shared data buffers
        2. Default size of shared data buffers
        3. Amount of shared memory required by NetBackup
        4. How to change the number of shared data buffers
        5. Notes on number data buffers files
        6. How to change the size of shared data buffers
        7. Notes on size data buffer files
        8. Size values for shared data buffers
        9. Note on shared memory and NetBackup for NDMP
        10. Recommended shared memory settings
        11. Recommended number of data buffers for SAN Client and FT media server
        12. Testing changes made to shared memory
      2. About NetBackup wait and delay counters
      3. Changing parent and child delay values for NetBackup
      4. About the communication between NetBackup client and media server
        1. Processes used in NetBackup client-server communication
        2. Roles of processes during backup and restore
        3. Finding wait and delay counter values
        4. Note on log file creation
        5. About tunable parameters reported in the bptm log
        6. Example of using wait and delay counter values
        7. Issues uncovered by wait and delay counter values
      5. Estimating the effect of multiple copies on backup performance
      6. Effect of fragment size on NetBackup restores
        1. How fragment size affects restore of a non-multiplexed image
        2. How fragment size affects restore of a multiplexed image on tape
        3. Fragmentation and checkpoint restart
      7. Other NetBackup restore performance issues
        1. Example of restore from multiplexed database backup (Oracle)
    7. NetBackup storage device performance in the data transfer path
  9. Tuning other NetBackup components
    1. When to use multiplexing and multiple data streams
    2. Effects of multiplexing and multistreaming on backup and restore
    3. How to improve NetBackup resource allocation
      1. Improving the assignment of resources to NetBackup queued jobs
      2. Sharing reservations in NetBackup
      3. Disabling the sharing of NetBackup reservations
      4. Disabling on-demand unloads
    4. Encryption and NetBackup performance
    5. Compression and NetBackup performance
    6. How to enable NetBackup compression
    7. Effect of encryption plus compression on NetBackup performance
    8. Information on NetBackup Java performance improvements
    9. Information on NetBackup Vault
    10. Fast recovery with Bare Metal Restore
    11. How to improve performance when backing up many small files
    12. How to improve FlashBackup performance
      1. Adjusting the read buffer for FlashBackup and FlashBackup-Windows
    13. Veritas NetBackup OpsCenter
  10. Tuning disk I/O performance
    1. About NetBackup performance and the hardware hierarchy
      1. About performance hierarchy level 1
      2. About performance hierarchy level 2
      3. About performance hierarchy level 3
      4. About performance hierarchy level 4
      5. Summary of performance hierarchies
      6. Notes on performance hierarchies
    2. Hardware examples for better NetBackup performance

Cloud tier sizing and performance

Sizing and performance of data in the cloud depend on each customer's needs and vary from customer to customer, which makes it difficult to provide exact sizing and performance figures.

To get data to the cloud, customers can use a simple internet connection if the volume of data to be transmitted fits within the available bandwidth.

Cloud providers may offer a direct connection service, which gives customers a dedicated link to their cloud provider with performance similar to LAN bandwidth inside a data center. Customers can compress data at the data center before it is sent across the network to the cloud provider, or use MSDP Cloud (MSDP-C) to optimize the data before it is stored in the cloud. They can also throttle bandwidth, if desired, to prevent over-saturation of the network pipe.

Cloud instance model

Public cloud providers use a regional model in which they configure various regions across the globe. Each region can have zones, which are similar to data centers within the region that communicate with each other over high-bandwidth connections. This setup is similar to a customer having multiple physical data centers in a geographical region that are close enough for low-latency connectivity, yet far enough apart not to be affected by the same natural or artificial disaster.

Data within a region typically stays within that region. Data can be replicated between zones to provide high availability for the customer's data within the cloud; the loss of a single zone does not affect the operation of the others. Customers typically operate in the region closest to them for optimized bandwidth when moving data in and out of the cloud, and they can also select a geographically dispersed region to provide regional disaster recovery (DR).

Cloud storage options

One of the many benefits of the cloud storage model is the ability to quickly add storage to environments. Customers do not pay for the storage until it is provisioned. This model differs from a traditional data center, where racks of disks may sit idle until needed, increasing total cost of ownership (TCO). A spinning disk generates heat and may require additional cooling and power even if it is not currently in use. Although next-generation SSD arrays require less cooling and power, idle disks still increase TCO.

Once data is in the cloud, cloud providers offer various types of storage, including block and object storage; other options include Network-Attached Storage (NAS) services. Sizing of the environment is based on the needs of the customer and the workloads placed in the cloud. Pricing depends on the type of storage chosen and is typically per GB. For example, standard object storage typically runs approximately $0.023/GB per month, whereas archive storage runs about $0.004/GB per month, and deeper archive tiers cost even less. Cost also depends on the region where the data is stored. (See the cloud provider's price list for current costs.) Archive tiers are typically used as long-term archive storage targets, with data moved there automatically by a variety of methods; a restore from archive storage can take hours, versus seconds for a restore from standard object storage.
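The per-GB rates above make tier comparisons a matter of simple arithmetic. The sketch below uses the example figures quoted in this section ($0.023/GB standard object, $0.004/GB archive); actual rates vary by provider and region, so treat the numbers as placeholders.

```python
# Rough monthly cost comparison for cloud storage tiers.
# Rates are the illustrative figures from the text above, not
# current prices; check your provider's price list.

def monthly_storage_cost(size_gb, rate_per_gb):
    """Approximate monthly cost for size_gb stored at rate_per_gb."""
    return size_gb * rate_per_gb

data_gb = 100 * 1024  # 100 TB expressed in GB

object_cost = monthly_storage_cost(data_gb, 0.023)   # standard object tier
archive_cost = monthly_storage_cost(data_gb, 0.004)  # archive tier

print(f"Object storage:  ${object_cost:,.2f}/month")
print(f"Archive storage: ${archive_cost:,.2f}/month")
```

The gap widens with capacity, which is why archive tiers are attractive for long-term retention despite their slow (hours-long) restore times.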

Environment description and assumptions for sizing

The following sample sizing guidelines are based on the assumptions listed and were created using the standard NetBackup Appliance Calculator to determine the storage required for each type of workload. These guidelines apply to in-cloud backup workloads only.

The following assumptions were used to size this environment:

  • Data assumptions:

    • Data split - 80% FS / 20% DB [no hypervisor level in the cloud]

    • Retention: daily - 2 weeks / weekly - 4 weeks / monthly - 3 months

    • Daily change rate 2%, and year-over-year (YoY) growth 10% [sizing for 1 year only]

  • Instance Type workload descriptions:

    • Small - FETB <= 100 TB, <= 100 concurrent jobs

    • Medium - FETB <= 500 TB, <= 500 concurrent jobs

    • Large - FETB <= 1,000 TB, <= 1,000 concurrent jobs

    • Extra-Large - FETB = 1 PB, > 1,000 concurrent jobs. For mixed object and block storage, the total maximum capacity is 1.2 PB.
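As a rough illustration of how the assumptions above combine, the sketch below estimates the logical (front-end) data a schedule must retain. It is a deliberately simplified model of my own construction: it ignores deduplication, compression, and YoY growth, all of which the NetBackup Appliance Calculator accounts for.

```python
# Back-of-the-envelope estimate of retained logical data using the
# assumptions above: 2% daily change rate, retention of
# 2 weeks daily / 4 weeks weekly / 3 months monthly.
# Illustrative only; it ignores deduplication and compression.

def estimate_protected_tb(fetb, daily_change=0.02,
                          daily_copies=14, weekly_copies=4, monthly_copies=3):
    """Approximate total logical data retained (TB) for one FETB value."""
    incrementals = fetb * daily_change * daily_copies   # daily incrementals kept
    fulls = fetb * (weekly_copies + monthly_copies)     # weekly + monthly fulls kept
    return incrementals + fulls

# "Small" instance type from the list above: FETB <= 100 TB
print(round(estimate_protected_tb(100), 1))  # logical TB before deduplication
```

Even this crude estimate shows why deduplication matters: the retained logical data is several multiples of the front-end terabytes.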

For more information, see About the Media Server Deduplication (MSDP) node cloud tier in the NetBackup Deduplication Guide.

NetBackup cloud instance sizing

The architecture is based on a single NetBackup domain consisting of a NetBackup primary server and multiple MSDP media servers in the cloud.

Typically, backups are written directly to local MSDP block storage for an immediate copy, then optimized-duplicated ("opt-duped") to a cloud tier, which sends deduplicated data to object storage. However, there is no requirement that backups go to standard MSDP before being sent to the cloud. If the solution does not require MSDP data to be "local" on block storage, MSDP-C enables backup data to be sent directly to a cloud tier.

With MSDP-C or MSDP in general, data is deduplicated in the plug-in (source side) layer before it is written to the storage server. The storage server assembles 64 MB containers in memory and writes them to object storage. If the number of streams x 128 MB exceeds the memory limits, disk is used as cache to assemble the containers.

Requirements consist of the following:

  • NetBackup primary server

    A single NetBackup primary server can run on any supported operating system.

  • NetBackup MSDP media server's block storage

    MSDP media servers receive the initial backups from clients and perform deduplication.

  • NetBackup MSDP media server's cloud tier

    MSDP can have one or more targets on the same storage server that take the deduplicated backup images from the MSDP media server's block storage and store them in object storage.

    The MSDP cloud tier is dedicated to performing NetBackup deduplication writes to object storage. It is a dedicated high-end Red Hat server that meets the minimum requirements for the MSDP cloud tier.

  • Backup workloads (clients/agents)

    These are the systems or applications that are being protected.

NetBackup primary server

The NetBackup primary server should be sized according to the standard Veritas guidelines depending on the load placed on the complete NetBackup domain. Plan accordingly for the initial needs of the environment. Cloud providers offer the added benefit of being able to scale up the systems as workloads grow. The solution can scale out by adding additional media server nodes.

Primary Server Memory and CPU Requirements

For details about the recommended memory and CPU processor requirements for the various environment sizes:

See Table: Primary server recommendations.

These estimates are based on the number of media servers and the number of jobs the primary server must support. You may need to increase the amount of RAM and number of processors based on other site-specific factors.

Primary Server Recommendations

Table: Primary server recommendations

  Size         Memory / vCPU       Storage
  Small        32 GiB / 8 vCPU     Install 500 GB, Catalog 5 GB
  Medium       64 GiB / 8 vCPU     Install 500 GB, Catalog 5 GB
  Large        64 GiB / 16 vCPU    Install 500 GB, Catalog 10 GB
  Extra Large  128 GiB / 16 vCPU   Install 500 GB, Catalog 10 GB

NetBackup MSDP storage

NetBackup MSDP storage can reside on a NetBackup Appliance, a Virtual Appliance, or a build-your-own (BYO) virtual or physical host, including a cloud-based virtual instance. This section outlines MSDP in a public cloud Infrastructure-as-a-Service (IaaS) deployment.

The host computer's CPU and memory constrain how many jobs can run concurrently. The storage server requires sufficient capability for deduplication and for storage management. Processors for deduplication should have a high clock rate and high floating-point performance. Furthermore, high throughput per core is desirable. Each backup stream uses a separate core.

Table: Recommended specifications for MSDP media servers

Hardware component

MSDP media server specification

CPU

  • Veritas recommends at least a 2.2-GHz clock rate. A 64-bit processor is required.

  • At least 4 cores are required. Veritas recommends 8 cores.

  • For 64 TB of storage, the Intel x86-64 architecture requires 8 cores.

RAM

  • From 8 TB to 32 TB of storage, Veritas recommends 1 GB of dedicated RAM per 1 TB of block storage consumed.

  • Beyond 32 TB of storage, Veritas recommends more than 32 GB of RAM for better performance.

  • MSDP-C uses a dynamic spooler cache based on previous and currently running backups and does not leverage the traditional persistent fingerprint pool. The size is set with the MaxCloudCacheSize parameter in contentrouter.cfg; the default setting is 20%.

  • MSDP-C also tries to use memory as an upload/download cache before falling back to disk. Usage is relative to the number of concurrent jobs: each job uses 128 MB of upload cache. The default maximum value for CloudUploadCacheSize is 12 GB, which allows for roughly 90 concurrent jobs.

Storage

  • MSDP block storage will perform best with storage throughput of 250 MB/s or faster. Because many volumes/VMs have a 250 MB/s max, it's recommended to use a RAID0/1 stripe.

  • Start out with the expected needed storage based on deduplication rates. Storage can easily be expanded by adding additional volumes to MSDP.

  • MSDP-C does not use a dedicated cache volume. Rather, it makes non-persistent use of free storage on the MSDP server when needed. By default, MSDP-C requires at least 1 TB of free space on the MSDP server per cloud tier; this is configurable in contentrouter.cfg.

Operating system

  • The operating system must be a supported 64-bit operating system. MSDP-C requires an RHEL or CentOS 7.3 or later server.

  • See the operating system compatibility list at http://www.netbackup.com/compatibility.
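The RAM figures above for MSDP-C follow directly from the quoted defaults (128 MB of upload cache per job, a 12 GB default CloudUploadCacheSize). A quick sanity check of that relationship, as a sketch:

```python
# Each MSDP-C job uses ~128 MB of upload cache; the default maximum
# CloudUploadCacheSize is 12 GB (both figures quoted above). Jobs
# beyond what fits in the cache fall back to disk.

UPLOAD_CACHE_PER_JOB_MB = 128
DEFAULT_CLOUD_UPLOAD_CACHE_MB = 12 * 1024  # 12 GB

def max_in_memory_jobs(cache_mb=DEFAULT_CLOUD_UPLOAD_CACHE_MB,
                       per_job_mb=UPLOAD_CACHE_PER_JOB_MB):
    """Concurrent jobs whose upload cache fits entirely in memory."""
    return cache_mb // per_job_mb

print(max_in_memory_jobs())  # 96, consistent with the "roughly 90" figure above
```

The same arithmetic can be run in reverse when planning: multiply the expected concurrent job count by 128 MB to choose a CloudUploadCacheSize that avoids the disk fallback.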

Growing the media server

As the amount of data protected by a server increases, so do the load requirements on that host. The solution is simple: expand the virtual instance or add volumes. Refer to your cloud provider's documentation for information about resizing instances and adding storage to them.

Media Server Deduplication Pool (MSDP) recommendations

Running traditional MSDP in a cloud environment requires specific resources, such as a 10 Gb network and volumes with provisioned IOPS. The recommendations below were formulated using cloud provider offerings that address MSDP pools of different sizes. They are only recommendations; specific customer environments may have different needs. Depending on your footprint, any of the configurations below may be appropriate for the corresponding pool size.

MSDP Considerations

An example MSDP storage pool size is up to 96 TB on Linux:

  • It can be a direct backup target, use fingerprinting media servers, or serve as a client-side deduplication target.

  • MSDP stores all data on managed disks.

  • The pool can use optimized deduplication or Automated Image Replication (A.I.R.) to send images to any Veritas deduplication-compatible target, including MSDP-C.

Storage considerations

Although multiple deduplication storage servers can exist in a NetBackup domain, storage servers do not share their storage; each storage server manages its own storage. Deduplication within a storage server is supported, including between different block and cloud disk pools. Optimized deduplication is supported between different storage servers in the same primary domain, and Automated Image Replication (A.I.R.) is supported for replicating data and image metadata between primary domains.

For a small 32 TB MSDP storage pool performing a single-stream read or write operation, storage media with 250 MB/sec throughput is recommended for enterprise-level performance. Scaling the disk capacity to 250 TB calls for a 500 MB/sec transfer rate. Multiple volumes may be used to provision storage; however, each volume should be able to sustain 250 MB/sec of I/O. Greater individual data stream capability or aggregate capability may be required to satisfy your objectives for simultaneous writing to and reading from disk. The suggested layout in a cloud infrastructure is a striped RAID0 or RAID1 configuration; more complex RAID configurations are not cost-efficient.
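Under the assumption that each provisioned volume sustains about 250 MB/sec, as recommended above, the number of volumes to place in the RAID0/RAID1 stripe for a target aggregate throughput can be sketched as:

```python
import math

# Each volume in the stripe is assumed to sustain 250 MB/sec, per the
# recommendation above. Estimate how many volumes are needed to reach
# a target aggregate throughput.

VOLUME_THROUGHPUT_MBS = 250

def volumes_needed(target_mbs, per_volume_mbs=VOLUME_THROUGHPUT_MBS):
    """Minimum volume count whose combined throughput meets target_mbs."""
    return math.ceil(target_mbs / per_volume_mbs)

print(volumes_needed(250))  # small 32 TB pool target (250 MB/sec)
print(volumes_needed(500))  # 250 TB pool target (500 MB/sec)
```

This is a lower bound: simultaneous backup and restore streams, or higher per-stream objectives, may justify more volumes than the aggregate-throughput math alone suggests.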

Table: Recommended media server sizing guidelines shows NetBackup media server sizing recommendations based on the size of the intended deduplication pool.

Table: Recommended media server sizing guidelines

  Deduplication Pool   Storage                     Cores  RAM (GB)  Network (Gbps)  IOPS
  10 TB (Small)        1x160 SSD, 1x16 TB SSD      8      64        -               -
  1-20 TB (Small)      1x80 SSD, 1x16 TB SSD       8      32        10              -
  32 TB (Medium)       1x80 SSD, 2x16 TB SSD       16     32        10              -
  32 TB (Medium)       1x160 SSD, 2x16 TB SSD      36     64        -               12,000
  32-64 TB (Large)     1x80 SSD, 2-4x16 TB SSD     16     64        10              -
  32-64 TB (Large)     1x80 SSD, 2-4x16 TB SSD     36     64        10              -
  32-64 TB (Large)     1x160 SSD, 2x16 TB SSD      40     160       -               12,000
  32-64 TB (Large)     1x80 SSD, 2-4x16 TB SSD     32     144       10              -
  32-64 TB (Large)     2x320 SSD, 2-6x16 TB SSD    40     160       10              12,000

Table: Recommended initial NetBackup MSDP-C sizing guidelines

  Role          Storage             CPUs  RAM
  MSDP-C Small  250 GB SSD, 1+ TB   4     16 GB
  MSDP-C Large  500 GB SSD, 1+ TB   8     32 GB