NetBackup™ Backup Planning and Performance Tuning Guide

Product(s): NetBackup (10.3.0.1, 10.2, 10.1.1, 10.1, 10.0.0.1, 10.0, 9.1.0.1, 9.1, 9.0.0.1, 9.0, 8.3.0.2, 8.3.0.1, 8.3)

PCI architecture

Peripheral Component Interconnect (PCI) and its extension PCI-X were the first steps toward sending signals quickly to peripheral cards such as Ethernet NICs, Fibre Channel and parallel SCSI host bus adapters, and RAID controllers, all of which enabled RAID storage and advanced connectivity for many servers in a network.

PCI-X, introduced in 1998, was a very good start. It was a parallel interface that used an expander to derive multiple "slots" from the signals sent by the CPUs. With a parallel architecture, signal timing had to be rigidly enforced because all signals had to arrive or be sent concurrently. This restriction limited the overall speed and latency of the system to the frequency of the timing circuitry in the hardware. As market demands for speed kept increasing, maintaining that concurrent timing became more and more difficult.

PCIe came into being in 2002 and changed the Peripheral Component Interconnect in two ways: serial communication, and direct communication from the processor to the PCIe-enabled card (NIC, HBA, RAID controller, and so on). This allowed a significant increase in bandwidth, as multiple PCIe lanes could be allocated to each card. As an example, Fibre Channel host bus adapters in 1998 had speeds of 1Gb with PCI-X; today, 22 years later, the standard is 16Gb, and 32Gb is expected to surpass 16Gb in the next two years.

PCI-X at 133MHz was the last widely supported speed. PCIe supplanted PCI-X at a data transfer speed of 800MB/s. PCIe 3 can today achieve up to 15.754GB/s with 16-lane cards. PCIe 4, which is available today on AMD processor systems and will be available on Intel-based systems in 2021, can reach 15.754GB/s with 8-lane cards, as PCIe 4 doubles the transfer rates of the current PCIe 3 architecture. The following table notes the speed capability of past and future versions. By 2026, the supported PCIe throughput is expected to increase 8-fold.

It is expected that the number of PCIe lanes per processor will increase rapidly in the future. A review of currently available processors shows that the race to increase lane counts is already on: the current Intel processor family has 40 PCIe lanes, and AMD has countered with 128 lanes per processor.

Table: PCI Express Link performance (throughput by lane count)

| Version | Introduced             | Line code                | Transfer rate | 1 lane     | 2 lanes     | 4 lanes     | 8 lanes     | 16 lanes     |
|---------|------------------------|--------------------------|---------------|------------|-------------|-------------|-------------|--------------|
| 1.0     | 2003                   | 8b/10b                   | 2.5 GT/s      | 0.250GB/s  | 0.500GB/s   | 1.00GB/s    | 2.00GB/s    | 4.00GB/s     |
| 2.0     | 2007                   | 8b/10b                   | 5.0 GT/s      | 0.500GB/s  | 1.00GB/s    | 2.00GB/s    | 4.00GB/s    | 8.00GB/s     |
| 3.0     | 2010                   | 128b/130b                | 8.0 GT/s      | 0.985GB/s  | 1.969GB/s   | 3.938GB/s   | 7.877GB/s   | 15.754GB/s   |
| 4.0     | 2017 (now on AMD)      | 128b/130b                | 16.0 GT/s     | 1.969GB/s  | 3.938GB/s   | 7.877GB/s   | 15.754GB/s  | 31.508GB/s   |
| 5.0     | 2019 (projected 2022)  | 128b/130b                | 32.0 GT/s     | 3.938GB/s  | 7.877GB/s   | 15.754GB/s  | 31.508GB/s  | 63.015GB/s   |
| 6.0     | 2021 (projected 2024)  | 128b/130b + PAM-4 + ECC  | 64.0 GT/s     | 7.877GB/s  | 15.754GB/s  | 31.508GB/s  | 63.015GB/s  | 126.031GB/s  |
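
The throughput columns follow directly from the transfer rate and the line-code efficiency: each transfer moves one bit per lane, so usable bytes per second per lane equal GT/s x (payload bits / total bits) / 8. The following minimal Python sketch, using only the version data from the table above, reproduces the table's throughput values:

    # Per-lane PCIe throughput: each transfer carries one bit per lane, so
    # usable GB/s per lane = GT/s * line-code efficiency / 8 bits per byte.
    versions = {
        # version: (transfer rate in GT/s, payload bits, total bits)
        "1.0": (2.5, 8, 10),
        "2.0": (5.0, 8, 10),
        "3.0": (8.0, 128, 130),
        "4.0": (16.0, 128, 130),
        "5.0": (32.0, 128, 130),
        "6.0": (64.0, 128, 130),
    }
    for ver, (gt_s, payload, total) in versions.items():
        per_lane = gt_s * (payload / total) / 8  # GB/s per lane
        row = ", ".join(f"x{lanes}: {per_lane * lanes:.3f}GB/s"
                        for lanes in (1, 2, 4, 8, 16))
        print(f"PCIe {ver} -> {row}")

This also makes the efficiency jump at PCIe 3.0 visible: 8b/10b line code wastes 20% of the raw transfer rate, while 128b/130b wastes only about 1.5%.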

With the advance in speed of CPU-to-peripheral communication, both latency (decreased lag) and data transfer rate (more data per second) have improved dramatically. Intel CPUs will increase from the present Skylake and Cascade Lake families with 40 PCIe 3 lanes to Ice Lake with 64 PCIe 4 lanes. As noted earlier, AMD has built its processors with 128 PCIe 4 lanes. The reason for this upward trend is that peripherals other than Ethernet, Fibre Channel, and RAID are quickly earning a place on the bus.

NVMe SSDs (Non-Volatile Memory express Solid State Drives) have quickly carved out a significant niche in the market. Their primary advantage is the use of a PCIe connection to the processor. These SSDs do not require a SAS or SATA interface to communicate, which yields significant speed and latency advantages because no intervening media conversion is needed. With the aforementioned PCIe 4 coming into being and the expansion of the number of PCIe lanes, the speed of NVMe SSDs will double, increasing throughput and decreasing (slightly but measurably) access times.

The latest designs of Intel and AMD motherboards accommodate the NVMe architecture as the primary storage for the future. It is expected that in 2021, systems with the new architecture will be available with densities up to 12.8TB, speeds of 8,000MB/s reads and 3,800MB/s writes, and up to 24 SSDs. These new systems will be dramatically faster than earlier disk-based solutions, which can struggle to reach 10 - 12GB/s reads or writes. The new architecture will also increase network reads and writes; a 200GB/s read is not difficult to reach, nor is a 100GB/s write. For a sense of scale, a 30TB backup at 0% deduplication would take about 8 seconds with the proper transport: 304 connections of 100Gb Ethernet NIC ports. That is not the kind of network bandwidth we can expect in practice, but it is illustrative of the coming speeds.
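
That figure is simple arithmetic, as the following Python sketch shows (assuming, for illustration, that each 100Gb Ethernet port delivers its full line rate of 12.5GB/s):

    # Back-of-the-envelope check of the 30TB-in-8-seconds example above.
    ports = 304                          # 100Gb Ethernet NIC ports
    port_gb_s = 100 / 8                  # 12.5 GB/s per port at line rate
    aggregate_gb_s = ports * port_gb_s   # 3,800 GB/s aggregate
    backup_gb = 30 * 1000                # 30TB backup, 0% deduplication
    print(f"{aggregate_gb_s:.0f} GB/s -> {backup_gb / aggregate_gb_s:.1f} s")
    # prints: 3800 GB/s -> 7.9 s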

The future of PCIe technology is to move to PCIe 5.0 starting in 2022 and to 6.0 in 2024. This cadence would appear rather optimistic given the history of previous PCIe revisions, as shown below.

Figure: PCI Bandwidth over time

It should be noted, however, that the specifications for revisions 5.0 and 6.0 are well defined. Judging from the delays incurred on the 4.0 release, the significant challenge appears to be signal routing on motherboards. It stands to reason that PCIe 5 and 6 will initially be relegated to very high-end systems, such as 4- and 8-socket systems that can more adequately use the additional bandwidth and lane counts.