This article provides additional information and clarification to the section on Partial Catalog Replication in the Highly Available Environments Administrators Guide.
In partial catalog replication, only the image database, policies, and client configuration are replicated; the relational database components are not. This allows the media servers and devices to be preconfigured in the disaster recovery domain, so they do not need to be rediscovered in the event of a failover to the secondary master server. Because partial catalog replication does not replicate the relational database components of the NetBackup catalog, additional steps are required following a failover to the disaster recovery master server before backups can be restored.
Note that the tapes from the production domain are not assigned in the disaster recovery domain. They must be manually added to the database and placed in a pool where they cannot be accidentally overwritten. This can also be done using a combination of barcode rules and the robot inventory command.
Preparing an environment for partial catalog replication
The catalog image metadata, which is required to run restore operations, is stored in the relational database, so a backup of the relational database must be taken at regular intervals and replicated along with the flat file information.
1. Change the configuration on the source (production) master server so that the staging area for the relational database is located on the replicated storage. This can be achieved as follows:
· Create a suitable directory on the replicated storage.
· Use the command nbdb_admin -vxdbms_nb_staging <directory> to make this directory the staging area.
2. Back up the relational database to the staging area several times per day (ideally hourly) by running the following command in a scheduled script:
nbdb_backup -online <directory> -truncate_tlog
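The scheduled script mentioned in step 2 might look like the following minimal sketch. The function name backup_nbdb and the NBDB_BACKUP override variable are illustrative assumptions, not NetBackup defaults; only the nbdb_backup options are taken from the step above.

```shell
#!/bin/sh
# Sketch of a wrapper for the regular relational database backup.
# NBDB_BACKUP can be overridden (for example, with the full path to the
# binary); the default assumes nbdb_backup is on the PATH.
NBDB_BACKUP="${NBDB_BACKUP:-nbdb_backup}"

# backup_nbdb <staging directory>
# Runs an online backup into the staging directory and truncates the
# transaction log. Returns non-zero if the directory is missing or the
# backup command fails.
backup_nbdb() {
    staging_dir="$1"
    if [ -z "$staging_dir" ] || [ ! -d "$staging_dir" ]; then
        echo "staging directory not found: $staging_dir" >&2
        return 1
    fi
    "$NBDB_BACKUP" -online "$staging_dir" -truncate_tlog
}
```

A script built on this sketch would end with a call such as backup_nbdb <directory>, where <directory> is the staging area on the replicated storage, and would be run hourly by the operating system's scheduler.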
Recovering the environment with partial catalog replication
In the event of a loss of the source master server (or during a disaster recovery test), follow these steps:
1. Ensure that replication between the primary and the secondary sites is stopped.
Replication stops if the primary master server is unavailable or if the replication link is disabled.
2. Mount the replicated volume to the appropriate mount point on the secondary master server.
3. Use the command nbdb_admin -vxdbms_nb_staging <directory> on the target (disaster recovery) master server to point the staging area for the relational database to the location on the replicated storage.
4. Run the command cat_export -all -staging to export the metadata from the replicated relational database backup.
5. Run the command cat_import -all to import the exported metadata into the active relational database.
6. Start NetBackup on the secondary master server.
7. If the backup policies are replicated, deactivate all of them to prevent backups from starting automatically. Use the NetBackup Administration Console or the bpplinfo <policy> -set -inactive command.
8. Ensure that the appropriate FAILOVER_RESTORE_MEDIA_SERVER settings are defined to direct restore operations through the media servers at the secondary site.
9. To restore backups from tape, the tapes must be added to the disaster recovery master server’s catalog by placing them in a tape library and running an inventory of the library. To prevent the tapes from being accidentally overwritten, the disaster recovery master server should have a barcode rule that adds the tapes to a volume pool that is not the global scratch pool and is not used by any backup policies. Ideally, the tapes should also be physically write-locked.
10. For disk-based backups, the storage servers and disk pools must be added to the disaster recovery master server by running the disk storage server wizard. Once the disk storage is present, the following command must be run to reconcile the disk media IDs:
nbcatsync -backupid <catalog backup ID> -prune_catalog
The value <catalog backup ID> is the backup ID of the most recent catalog backup and can be found in the catalog backup’s disaster recovery file.
Once the tapes have been added and the disk media IDs have been reconciled, restore operations can be started.
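Steps 3 to 5 and step 7 lend themselves to scripting. The following is a minimal sketch under stated assumptions, not a supported procedure: the function names and the DRY_RUN switch are hypothetical, and the NetBackup commands are assumed to be on the PATH. The policy step uses bpplinfo <policy> -set -inactive and expects the policy names on standard input, one per line.

```shell
#!/bin/sh
# DRY_RUN=1 prints each command instead of running it, so the sequence
# can be reviewed before it touches the catalog.
DRY_RUN="${DRY_RUN:-0}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# import_replicated_metadata <staging directory>   (steps 3 to 5)
# Points the staging area at the replicated storage, exports the metadata
# from the replicated database backup, and imports it. Stops at the first
# command that fails.
import_replicated_metadata() {
    staging_dir="$1"
    run nbdb_admin -vxdbms_nb_staging "$staging_dir" &&
    run cat_export -all -staging &&
    run cat_import -all
}

# deactivate_policies   (step 7)
# Reads policy names from standard input, one per line, and sets each
# policy inactive. Blank lines are skipped.
deactivate_policies() {
    while IFS= read -r policy; do
        [ -n "$policy" ] || continue
        # </dev/null keeps the command from consuming the list of names.
        run bpplinfo "$policy" -set -inactive </dev/null || return 1
    done
}
```

Setting DRY_RUN=1 first shows the exact commands that would run. The list of policy names can come from any source that prints one name per line; producing that list is left to the administrator.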
Making the disaster recovery environment consistent with partial catalog replication
In the event of a major incident at the production site, you may need to operate from the disaster recovery site for some time after the recovery is completed. Once the disaster recovery environment is operational, the following additional tasks can optionally be carried out to make it consistent.
To make the disaster recovery environment consistent with partial catalog replication
1. Modify and enable the catalog backup policy and any other backup policies that are required in the disaster recovery domain.
2. Delete the policies that are no longer required.
3. Because the tapes are not assigned on the disaster recovery master server, they are not released to the global scratch pool when backups expire and must be recycled manually. Take care to move tapes to the global scratch pool only when no valid backups remain on them. The simplest way to check this is to create two lists by running the commands bpimagelist -d "01/01/1970 00:00:00" -media -l and vmquery -pn <private pool name> -b, and then compare them. Tapes that appear in the second list but not in the first have no valid images on them and can be moved to the scratch pool by running the command vmchange -p <scratch pool number> -m <media id>.
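The list comparison in step 3 can be sketched as follows. This is an illustration only: the function name recyclable_media is hypothetical, and it assumes that each command's output has already been reduced to a file holding one media ID per line. The raw output of bpimagelist and vmquery contains additional columns, so that extraction step is left out here.

```shell
#!/bin/sh
# recyclable_media <media IDs with images> <media IDs in private pool>
# Both arguments are files with one media ID per line. Prints the IDs that
# appear in the pool list but not in the image list: tapes that hold no
# valid images.
recyclable_media() {
    with_images="$1"
    in_pool="$2"
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }
    # comm needs sorted input; sort -u also removes duplicate IDs.
    sort -u "$with_images" > "$tmp_a"
    sort -u "$in_pool" > "$tmp_b"
    # -13 suppresses lines unique to the first file and lines common to
    # both, leaving only lines unique to the second file.
    comm -13 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

Each media ID printed by the function is a candidate for vmchange -p <scratch pool number> -m <media id>. Reviewing the list by hand before moving any tape is a sensible precaution.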