Veritas NetBackup for Hadoop Administrator's Guide

Last Published:
Product(s): NetBackup (8.1)
  1. Introduction
    1. Protecting Hadoop data using NetBackup
    2. Backing up Hadoop data
    3. Restoring Hadoop data
    4. Deploying the Hadoop plug-in
    5. NetBackup for Hadoop terminologies
    6. Limitations
  2. Installing and deploying Hadoop plug-in for NetBackup
    1. About installing and deploying the Hadoop plug-in
    2. Pre-requisites for installing the Hadoop plug-in
      1. Operating system and platform compatibility
      2. License for Hadoop plug-in for NetBackup
    3. Best practices for deploying the Hadoop plug-in
    4. Preparing the Hadoop cluster
    5. Downloading the Hadoop plug-in
    6. Installing the Hadoop plug-in
    7. Verifying the installation of the Hadoop plug-in
  3. Configuring NetBackup for Hadoop
    1. About configuring NetBackup for Hadoop
    2. Managing backup hosts
      1. Whitelisting a NetBackup client on NetBackup master server
      2. Configure a NetBackup Appliance as a backup host
    3. Adding Hadoop credentials in NetBackup
    4. Configuring the Hadoop plug-in using the Hadoop configuration file
      1. Configuring NetBackup for a highly-available Hadoop cluster
      2. Configuring a custom port for the Hadoop cluster
      3. Configuring number of threads for backup hosts
    5. Configuration for a Hadoop cluster that uses Kerberos
    6. Configuring NetBackup policies for Hadoop plug-in
      1. Creating a BigData backup policy
        1. Creating BigData policy using the NetBackup Administration Console
          1. Using the Policy Configuration Wizard to create a BigData policy for Hadoop clusters
          2. Using the NetBackup Policies utility to create a BigData policy for Hadoop clusters
        2. Using NetBackup Command Line Interface (CLI) to create a BigData policy for Hadoop clusters
    7. Disaster recovery of a Hadoop cluster
  4. Performing backups and restores of Hadoop
    1. About backing up a Hadoop cluster
      1. Pre-requisite for running backup and restore operations for a Hadoop cluster with Kerberos authentication
      2. Backing up a Hadoop cluster
      3. Best practices for backing up a Hadoop cluster
    2. About restoring a Hadoop cluster
      1. Restoring Hadoop data on the same Hadoop cluster
        1. Using the Restore Wizard to restore Hadoop data on the same Hadoop cluster
        2. Using the bprestore command to restore Hadoop data on the same Hadoop cluster
      2. Restoring Hadoop data on an alternate Hadoop cluster
      3. Best practices for restoring a Hadoop cluster
  5. Troubleshooting
    1. About troubleshooting NetBackup for Hadoop issues
    2. About NetBackup for Hadoop debug logging
    3. Troubleshooting backup issues for Hadoop data
      1. Backup operation for Hadoop fails with error code 6599
      2. Backup operation fails with error 6609
      3. Backup operation failed with error 6618
      4. Backup operation fails with error 6647
      5. Extended attributes (xattrs) and Access Control Lists (ACLs) are not backed up or restored for Hadoop
      6. Backup operation fails with error 6654
      7. Backup operation fails with bpbrm error 8857
      8. Backup operation fails with error 6617
      9. Backup operation fails with error 6616
    4. Troubleshooting restore issues for Hadoop data
      1. Restore fails with error code 2850
      2. NetBackup restore job for Hadoop completes partially
      3. Extended attributes (xattrs) and Access Control Lists (ACLs) are not backed up or restored for Hadoop
      4. Restore operation fails when Hadoop plug-in files are missing on the backup host
      5. Restore fails with bpbrm error 54932
      6. Restore operation fails with bpbrm error 21296

Using NetBackup Command Line Interface (CLI) to create a BigData policy for Hadoop clusters

You can also use the CLI to create a BigData policy for Hadoop.

To create a BigData policy using NetBackup CLI method

  1. Log on as an Administrator.
  2. Navigate to /usr/openv/netbackup/bin/admincmd.
  3. Create a new BigData policy using the default settings.

    bppolicynew policyname

  4. View the details about the new policy using the -L option.

    bpplinfo policyname -L

  5. Modify the policy to set the policy type to BigData.

    bpplinfo PolicyName -modify -v -M MasterServerName -pt BigData

  6. Specify the Application_Type as hadoop.

    For Windows:

    bpplinclude PolicyName -add "Application_Type=hadoop"

    For UNIX:

    bpplinclude PolicyName -add 'Application_Type=hadoop'

    Note:

    The parameter values for Application_Type=hadoop are case-sensitive.

  7. Specify the backup host on which you want the Hadoop backup operations to be performed.

    For Windows:

    bpplinclude PolicyName -add "Backup_Host=IP_address or hostname"

    For UNIX:

    bpplinclude PolicyName -add 'Backup_Host=IP_address or hostname'

    Note:

    The backup host must be a Linux computer. The backup host can be a NetBackup client, a media server, or the master server.

  8. Specify the Hadoop directory or folder name that you want to back up.

    For Windows:

    bpplinclude PolicyName -add "/hdfsfoldername"

    For UNIX:

    bpplinclude PolicyName -add '/hdfsfoldername'

    Note:

    The directory and folder names used for backup selection in a BigData policy with Application_Type=hadoop must not contain spaces or commas.

  9. Modify the policy to specify the storage unit for the BigData policy.

    bpplinfo PolicyName -residence STUName -modify

  10. Specify the IP address or the host name of the NameNode to add the client details.

    For Windows:

    bpplclients PolicyName -M "MasterServerName" -add "HadoopServerNameNode" "Linux" "RedHat"

    For UNIX:

    bpplclients PolicyName -M 'MasterServerName' -add 'HadoopServerNameNode' 'Linux' 'RedHat'

  11. Assign a schedule for the created BigData policy as per your requirements.

    bpplsched PolicyName -add Schedule_Name -cal 0 -rl 0 -st sched_type -window 0 0

    Here, the sched_type value can be one of the following:

    Schedule type  Description
    FULL           Full backup
    INCR           Differential incremental backup
    CINC           Cumulative incremental backup
    TLOG           Transaction log
    UBAK           User backup
    UARC           User archive

    The default value for sched_type is FULL.

    Once you set the schedule, Hadoop data is backed up automatically as per the set schedule without any further user intervention.

  12. Alternatively, you can perform a manual backup of the Hadoop data.

    To perform a manual backup operation, first complete Step 1 through Step 11.

  13. For a manual backup operation, navigate to /usr/openv/netbackup/bin.

    Initiate a manual backup operation for an existing BigData policy using the following command:

    bpbackup -i -p PolicyName -s Schedule_Name -S MasterServerName -t 44

    Here, -p refers to policy, -s refers to schedule, -S refers to master server, and -t 44 refers to BigData policy type.
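The steps above can be sketched as a single script. This is an illustrative sketch only, not part of the product documentation: the policy name, master server, NameNode, backup host, storage unit, and HDFS path below are all hypothetical placeholders, and the script merely prints each NetBackup command so you can review the sequence. Remove the echo inside run() to execute the commands on a host where the NetBackup CLI is installed.

```shell
#!/bin/sh
# Sketch of the CLI procedure above. All names are hypothetical
# placeholders -- substitute your own values before use.

POLICY=demo_hadoop_policy          # hypothetical policy name
MASTER=nb-master.example.com       # hypothetical master server
NAMENODE=hadoop-nn.example.com     # hypothetical Hadoop NameNode
BACKUP_HOST=nb-client.example.com  # hypothetical Linux backup host
STU=demo_storage_unit              # hypothetical storage unit name

# Print each command instead of running it; drop "echo" to execute.
run() { echo "$@"; }

# Step 3: create a new BigData policy with the default settings
run bppolicynew "$POLICY"
# Step 5: set the policy type to BigData
run bpplinfo "$POLICY" -modify -v -M "$MASTER" -pt BigData
# Step 6: set the application type to hadoop (value is case-sensitive)
run bpplinclude "$POLICY" -add "Application_Type=hadoop"
# Step 7: name the Linux backup host
run bpplinclude "$POLICY" -add "Backup_Host=$BACKUP_HOST"
# Step 8: select the HDFS directory to back up (no spaces or commas)
run bpplinclude "$POLICY" -add "/data"
# Step 9: assign the storage unit
run bpplinfo "$POLICY" -residence "$STU" -modify
# Step 10: add the NameNode as the policy client
run bpplclients "$POLICY" -M "$MASTER" -add "$NAMENODE" Linux RedHat
# Step 11: attach a full-backup schedule
run bpplsched "$POLICY" -add full_sched -cal 0 -rl 0 -st FULL -window 0 0
# Steps 12-13: optionally start a manual backup (-t 44 = BigData)
run bpbackup -i -p "$POLICY" -s full_sched -S "$MASTER" -t 44
```

Because run() only echoes, the script is safe to execute anywhere as a dry run; the printed lines are the commands you would issue on the master server.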