Description
1. Identification of the type of snapshot required.
Snapshot type can be identified based on customer’s requirements and source environment. The following points are considered while deciding on the type of snapshot.
All snapshots fall into two broad categories:
- Copy-on-Write (C-O-W) (space optimized)
- Mirror Based
A Copy-On-Write based snapshot is suitable if:
- Storage space is limited. (as only changed blocks are copied)
- Frequency of backups is high.
- Retention period for snapshot is low (for instant recovery).
- The amount of data changed per snapshot cycle is very small.
- Local client snapshot backup is desired.
A mirror-based snapshot is suitable if:
- Storage space is not a concern.
- Frequency of backups is low (as each backup will require a dedicated snapshot volume)
- Retention period for snapshot is high (for instant recovery)
- Large amount of source data is changed frequently per snapshot cycle.
- Off-host alternate client snapshot backup is desired.
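The two lists above can be read as a simple tally of criteria. The following is a minimal sketch of that idea and is not part of NetBackup: the class `Requirements`, its field names, and the function `suggest_snapshot_type` are hypothetical names introduced only for illustration, and giving each criterion equal weight is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical description of a customer environment (illustrative names)."""
    storage_limited: bool
    backup_frequency_high: bool
    retention_long: bool
    change_rate_high: bool
    off_host_backup: bool


def suggest_snapshot_type(req: Requirements) -> str:
    """Count how many of the five listed criteria point to copy-on-write."""
    cow_votes = sum([
        req.storage_limited,          # limited space favors copy-on-write
        req.backup_frequency_high,    # frequent backups favor copy-on-write
        not req.retention_long,       # short retention favors copy-on-write
        not req.change_rate_high,     # few changes per cycle favor copy-on-write
        not req.off_host_backup,      # local client backup favors copy-on-write
    ])
    mirror_votes = 5 - cow_votes      # every criterion votes for exactly one side
    return "copy-on-write" if cow_votes > mirror_votes else "mirror"
```

In practice a single criterion, such as the need for an off-host backup, may outweigh the others, so the result is best treated as a starting point for discussion rather than a decision.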
Both types of snapshot can be either hardware based or software based.
If the source volume belongs to specific hardware array (listed below) that supports snapshots, hardware snapshots can be used. Various options available for hardware snapshot are as follows:
- EMC CLARiiON arrays provide EMC_CLARiiON_SnapView (_clone and _snapshot)
- EMC Symmetrix arrays provide EMC_TimeFinder (_mirror, _clone and _snap)
- HP EVA arrays provide Vsnaps, snapshot and snapclone type of snapshots
- IBM DS6000 and DS8000 arrays provide IBM_DiskStorage_FlashCopy
- IBM DS4000 series provide IBM_StorageManager_FlashCopy
Below are the legacy (before VxFI) snapshots for hardware/arrays:
- BusinessCopy for mirror snapshots on HP XP series arrays
- ShadowImage for Hitachi data systems disk arrays
- Timefinder for EMC Symmetrix/DMX array series with SYMCLI
If software based snapshots are needed irrespective of array type, the following options are available:
- VxFS_checkpoint
- VxFS_snapshot
- VxVM
- Flashsnap (Veritas VM based snapshot for off host backups)
- NAS_Snapshot
- VSS(Windows)
- nbu_snap
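The options listed above can be kept as a small lookup structure. This is a sketch only: the method names are transcribed from the lists in this section (legacy methods are left out), the array family keys and the function `candidate_methods` are hypothetical, and the exact method names offered in a policy should be confirmed against the Administrator's Guide.

```python
from typing import Dict, List

# Hardware snapshot options per array family, as listed in this section.
HARDWARE_SNAPSHOTS: Dict[str, List[str]] = {
    "EMC CLARiiON": ["EMC_CLARiiON_SnapView"],   # _clone and _snapshot variants
    "EMC Symmetrix": ["EMC_TimeFinder"],         # _mirror, _clone and _snap variants
    "HP EVA": ["Vsnaps", "snapshot", "snapclone"],
    "IBM DS6000": ["IBM_DiskStorage_FlashCopy"],
    "IBM DS8000": ["IBM_DiskStorage_FlashCopy"],
    "IBM DS4000": ["IBM_StorageManager_FlashCopy"],
}

# Software snapshot options, available irrespective of array type.
SOFTWARE_SNAPSHOTS: List[str] = [
    "VxFS_checkpoint", "VxFS_snapshot", "VxVM", "Flashsnap",
    "NAS_Snapshot", "VSS", "nbu_snap",
]


def candidate_methods(array_family: str) -> Dict[str, List[str]]:
    """Return the hardware options for a known array plus all software options."""
    return {
        "hardware": list(HARDWARE_SNAPSHOTS.get(array_family, [])),
        "software": list(SOFTWARE_SNAPSHOTS),
    }
```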
Once the type of snapshot is identified, continue with respective configuration/validation steps as mentioned in the NetBackup 7.1 Administrator’s Guide for Snapshot Clients (linked below).
Software snapshot providers offer snapshot capabilities regardless of the underlying array type. Configuration steps for each software snapshot provider are explained in the NetBackup 7.1 Administrator’s Guide for Snapshot Clients (chapter 8, page 145).
2. Disk/volume setup and configuration at the OS level.
- Allocate disk LUNs from the desired storage array to the systems as required. (Refer to the respective storage array or hardware manuals for details.)
- Format the allocated LUNs.
- Initialize the LUNs into the respective Volume Manager stack (LVM or VxVM).
Example:
For HP LVM: use commands such as vgcreate and pvcreate. (Refer to the operating system's manual pages for the respective commands.)
For VxVM: use commands such as vxdg, vxassist, vxvol, and mkfs. (Refer to the VxVM Administrator’s Guide.)
- Have at least 3 volumes/file-systems allocated:
- Source volume containing data files
- Snapshot volume (for mirror / COW based snapshots)
- Separate volume for database executables, redo logs, and other files.
Once the volumes are configured, install the database files. Note that only data files must be located on the source volume.
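The layout rule above (only data files on the source volume, everything else elsewhere) is easy to check mechanically. The following is a minimal sketch under stated assumptions: the function `check_layout` is hypothetical, and the caller is assumed to already know which volume each file lives on; nothing here queries the operating system or the volume manager.

```python
from typing import Dict, List, Set


def check_layout(file_to_volume: Dict[str, str],
                 data_files: Set[str],
                 source_volume: str) -> List[str]:
    """Return a list of layout problems; an empty list means the layout is fine."""
    problems = []
    for path, volume in sorted(file_to_volume.items()):
        is_data = path in data_files
        if is_data and volume != source_volume:
            # Data files must live on the volume that is being snapshotted.
            problems.append(f"{path}: data file is not on the source volume")
        elif not is_data and volume == source_volume:
            # Executables, redo logs, and so on must stay off the source volume.
            problems.append(f"{path}: non-data file is on the source volume")
    return problems
```

For example, with hypothetical paths, a redo log placed on the same volume as the data files would be reported as a problem, while a layout with the redo log on its own volume returns an empty list.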
3. Database specific configurations:
All the database management systems mentioned below provide some sort of Application Programming Interfaces (APIs) which NetBackup uses to perform the backup and restore operations. These APIs are used in the respective database agent module like Oracle Agent, SAP Agent, etc.
SAP, Oracle and DB2 provide two ways to back up files: stream-based backup and file-based backup.
In a stream-based backup, the DBMS is responsible for moving the backup data as streams. When taking snapshots, however, NetBackup must control the movement of data. For this reason, these database systems support a special method called the “proxy method”, which allows NetBackup to control the movement of data.
Each database agent has its own log directory where it logs all the relevant operations that have been performed.
Below are a few key points to consider when taking snapshot-based backups of various databases.
- DB2:
- The “proxy copy” method must be used for taking snapshots.
- Use the 'bpdb2proxy' command to perform a snapshot based backup of DB2 databases. (Example : "bpdb2proxy -backup -d sample -s 3 -n 0")
- Symbolic links (if any) must point to files on the same volume / file system.
- Snapshot backups do not back up all database objects. Your backup configuration must include policies to perform file-based and stream-based backups. DB2 does not support proxy backups of transaction logs.
- Snapshot backups must be initiated from a backup script. A template cannot be used to initiate a snapshot backup.
- SAP:
- The "util_file_online" option of brbackup must be used to perform non-RMAN snapshot-based backups.
- For RMAN-based SAP backups, enable proxy-based backup by setting the environment variable:
- rman_proxy = yes
- Most of the points listed in the Oracle section also apply to SAP when it is used with RMAN.
- To perform a rollback restore, set the environment variable: SAP_RESTORE=rollback
- Snapshots are not supported for a MaxDB back end.
- Oracle:
- The RMAN BACKUP command is used to initiate stream-based backups for datafiles, archive logs, and control files. These backups do not use snapshots.
- If the PROXY keyword is added to the RMAN BACKUP command, then the database, tablespaces, or data files can be backed up using a snapshot if the policy is configured appropriately.
- For control files and archived redo logs, Oracle RMAN performs conventional stream-based backups only. NetBackup for Oracle must use stream-based backups for control files and archived redo logs even when you use Snapshot Client methods for the other database objects. However, Oracle 10g extends RMAN functionality to allow the PROXY keyword to be used on the RMAN BACKUP ARCHIVELOGS command.
- Snapshot backups must be initiated from an RMAN script. A template cannot be used to initiate a snapshot backup.
- MS-Sharepoint:
- Only VSS snapshots are supported.
- Snapshots are only used for GRT backups.
- Lotus:
- Snapshots are not supported for standard Lotus databases (plain NSF files).
- However, Lotus databases can reside within a DB2 database at the back end. The snapshot capabilities of the NetBackup DB2 agent can be used in such an environment.
- Informix:
- Snapshots are not supported for Informix database backups.
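The per-database points above can be condensed into a lookup, which is convenient when planning policies for several database types at once. This is a sketch that only restates the notes in this section: the names `SnapshotSupport`, `SNAPSHOT_SUPPORT`, and `snapshot_support`, as well as the lower-case keys, are hypothetical and are not NetBackup identifiers.

```python
from typing import NamedTuple


class SnapshotSupport(NamedTuple):
    supported: bool
    note: str


# Summary of the key points listed above, keyed by a lower-case database name.
SNAPSHOT_SUPPORT = {
    "db2": SnapshotSupport(True, "proxy copy method; run bpdb2proxy from a backup script"),
    "sap": SnapshotSupport(True, "brbackup with util_file_online, or RMAN with rman_proxy = yes"),
    "sap maxdb": SnapshotSupport(False, "snapshots are not supported for a MaxDB back end"),
    "oracle": SnapshotSupport(True, "PROXY keyword on RMAN BACKUP; start from an RMAN script"),
    "ms-sharepoint": SnapshotSupport(True, "VSS snapshots only, and only for GRT backups"),
    "lotus": SnapshotSupport(False, "plain NSF databases; use the DB2 agent if the back end is DB2"),
    "informix": SnapshotSupport(False, "snapshots are not supported"),
}


def snapshot_support(database: str) -> SnapshotSupport:
    """Look up the summary for a database; raise KeyError if it is not listed."""
    key = database.strip().lower()
    if key not in SNAPSHOT_SUPPORT:
        raise KeyError(f"{database!r} is not covered by the notes above")
    return SNAPSHOT_SUPPORT[key]
```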
4. Environment validation and best practices:
- Confirm that data files are located on a separate dedicated volume and that other files, such as Oracle executables, redo logs, the parameter file, and control files, are located separately.
- Most database agents support snapshot of only data files.
- Since snapshots are always taken at the volume level, data files should not be co-located on the same file system as other files such as control files, redo logs, parameter files, and database executables. This is because the source file system is frozen during a snapshot, which makes the other files unavailable.
- Hence, while installing a database, data files should always be located on a separate volume which is the source volume for the snapshot.
- For other Veritas products such as the Veritas File System and Volume Manager or Storage Foundation, install the latest patches and updates for those products.
- A snapshot may not be removed if there is a system failure, such as a system crash or abnormal backup termination. In that case, remove the snapshot manually. (See the NetBackup Snapshot Client Administrator’s Guide for more information on removing a snapshot.)
- During snapshot rollback, if the data file you want to restore has not changed since it was backed up, the rollback may fail. Initiate the restore from a script and use the FORCE option. (See the NetBackup Snapshot Client Administrator’s Guide.)
- For the off-host alternate client method, the snapshot mirror must be visible to (exposed on) the alternate client, and it must be possible to import it on the alternate client successfully.
- For hardware-based snapshots, the respective CLI/API libraries from the array vendor must be compatible and properly installed.
5. Troubleshooting and typical issues:
Various failures can occur during backups of databases with snapshot configurations. The most important logs to examine are the database agent logs and the bpfis/bppfi logs. The following table shows the primary logs to examine after a failure, for each database agent:
| Database agent | BACKUP | RESTORE |
|---|---|---|
| SAP | bphdb, backint, bpfis, bpbkar, bpbrm, user_ops, progress log | backint, bpfis, bppfi, tar, bpbrm, user_ops, progress log |
| Oracle | bphdb, dbclient, bpfis, bpbkar, bpbrm, user_ops, progress log | dbclient, bpfis, bppfi, tar, bpbrm, user_ops, progress log |
| MS-Exchange | bpbkar, bpfis, bpbrm, BEDS, bpresolver (Exchange 2010) | tar (for streamed GRT), ncfgre (for non-stream GRT), bpfis, bppfi, bpbrm, BEDS |
| MS-SQL Server | dbclient, bpbkar, bpfis, bpbrm, user_ops, progress log | dbclient, tar, bpfis, bppfi, bpbrm, user_ops, progress log |
| DB2 | bphdb, dbclient, bpdbsdb2, bpbkar, bpfis, bpbrm, bpdb2, user_ops | dbclient, tar, bpfis, bppfi, bpbrm, bpdb2, bpubsdb2, user_ops |
| MS-Sharepoint | nbfsd, bpbkar, bpfis, bpbrm, BEDS, bpresolver, event viewer logs | ncf (6.5.x), ncfgre (7.x), nbfsd, bpbrm, BEDS |
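For scripted log collection, the table above can be held as a dictionary. The sketch below transcribes three of the rows as they appear in the table and leaves the others to be added in the same way; the names `PRIMARY_LOGS` and `logs_to_check` are hypothetical, and the entries are log names only, not file system paths.

```python
from typing import Dict, List

# Primary logs per database agent and operation, transcribed from the table above.
PRIMARY_LOGS: Dict[str, Dict[str, List[str]]] = {
    "SAP": {
        "backup": ["bphdb", "backint", "bpfis", "bpbkar", "bpbrm", "user_ops", "progress log"],
        "restore": ["backint", "bpfis", "bppfi", "tar", "bpbrm", "user_ops", "progress log"],
    },
    "Oracle": {
        "backup": ["bphdb", "dbclient", "bpfis", "bpbkar", "bpbrm", "user_ops", "progress log"],
        "restore": ["dbclient", "bpfis", "bppfi", "tar", "bpbrm", "user_ops", "progress log"],
    },
    "MS-SQL Server": {
        "backup": ["dbclient", "bpbkar", "bpfis", "bpbrm", "user_ops", "progress log"],
        "restore": ["dbclient", "tar", "bpfis", "bppfi", "bpbrm", "user_ops", "progress log"],
    },
}


def logs_to_check(agent: str, operation: str) -> List[str]:
    """Return the primary logs for an agent and an operation ('backup' or 'restore')."""
    if operation not in ("backup", "restore"):
        raise ValueError("operation must be 'backup' or 'restore'")
    return list(PRIMARY_LOGS[agent][operation])
```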
Problems with backups/restores usually occur in the following three components:
1. Database Agent
2. Snapshot mechanism
3. Other NetBackup areas
- Examine the database agent log to check whether the error occurs before the database is quiesced. If it does, the problem is usually with the database agent configuration or the policy configuration. Recheck these configurations.
- If the database agent log indicates that the database was successfully quiesced, check the bpbrm log to see whether the bpfis process was started on the client. If the bpfis process started successfully, check the bpfis logs on the client. Typically, bpfis fails with status code 156, which can have several different causes.
If the first job in the Activity Monitor is successful, the snapshot was taken successfully. It also indicates that the database was unquiesced successfully and can be back online. The database agent log helps confirm this.
- Check the bpbrm log to determine if the bpfis and bpbkar processes were launched on the client.
- Check the bpfis log to determine whether the snapshot was mounted correctly.
- Check the bpbkar log to determine if the backup was done correctly.
- Check the bptm log to determine if the data was written correctly to tape.
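The checks above follow a fixed order, which can be written down as a small triage function. This is a sketch of the sequence described in this section, not a NetBackup tool: the function `logs_to_examine` and its three boolean inputs are hypothetical, and the operator is assumed to have established their values by reading the logs.

```python
from typing import List


def logs_to_examine(db_quiesced: bool,
                    bpfis_started: bool,
                    snapshot_job_ok: bool) -> List[str]:
    """Return, in order, what to examine next for a failed snapshot backup."""
    if not db_quiesced:
        # Failure before quiesce: usually agent or policy configuration.
        return ["database agent log", "database agent configuration", "policy configuration"]
    if not bpfis_started:
        # Quiesce worked, but bpfis was never launched on the client.
        return ["bpbrm log"]
    if not snapshot_job_ok:
        # bpfis ran but the snapshot job failed (typically status code 156).
        return ["bpfis log"]
    # Snapshot job succeeded: follow the data path to tape.
    return ["bpbrm log", "bpfis log", "bpbkar log", "bptm log"]
```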
If successful, bppfi then invokes the “bpfis restore” command to perform a volume-level rollback.
If a copy-back restore from a snapshot is chosen, 'bppficorr' invokes 'bppfi' to validate the snapshot fragment. On success, 'bppfi' mounts the snapshot volume, constructs the list of files to be restored (copied back from the snapshot), and invokes 'bpbkar', passing it that file list.
If a tape image restore is chosen, the restore happens via the 'tar' process, as in the standard process flow.
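The three restore paths described above differ mainly in which processes are involved, so knowing the chain tells you which logs to read. The sketch below records those chains as described in this section; the keys `rollback`, `copy_back`, and `tape`, and the names `RESTORE_FLOWS` and `process_chain`, are hypothetical labels chosen for illustration.

```python
from typing import Dict, List

# Process chain per restore type, in invocation order, as described above.
RESTORE_FLOWS: Dict[str, List[str]] = {
    "rollback": ["bppfi", "bpfis restore"],         # volume-level rollback
    "copy_back": ["bppficorr", "bppfi", "bpbkar"],  # validate, mount, copy files back
    "tape": ["tar"],                                # standard restore flow
}


def process_chain(restore_type: str) -> List[str]:
    """Return the processes involved in a restore type, in invocation order."""
    if restore_type not in RESTORE_FLOWS:
        raise ValueError(f"unknown restore type: {restore_type!r}")
    return list(RESTORE_FLOWS[restore_type])
```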