NetBackup™ Deployment Guide for Azure Kubernetes Services (AKS) Cluster
Procedure to rollback when upgrade fails
Note:
The rollback procedure in this section applies only if a catalog backup was taken before the upgrade.
Perform the following steps to roll back from an upgrade failure and reinstall the NetBackup version that was in place before the upgrade:
- Delete the environment CR object using the following command and wait until all the underlying resources are cleaned up:
kubectl delete environment.netbackup.veritas.com <environment name> -n <namespace>
The underlying resources include, for example, the primary server CR, media server CR, MSDP CR, and the resources they own.
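The "wait until all the underlying resources are cleaned up" part of this step can be scripted with a polling loop. The following is a minimal sketch, not part of the product: the function name `wait_until_environment_gone`, the retry count, and the delay are illustrative assumptions. It only checks that the environment object itself is gone, so it is still worth inspecting the namespace with `kubectl get pods -n <namespace>` afterwards.

```shell
#!/usr/bin/env bash
# Sketch only: the function name, retry count, and delay are assumptions.

# Poll until the environment CR can no longer be fetched.
# Note: any failure of `kubectl get` (including connectivity problems)
# is treated as "gone", so verify the result manually.
wait_until_environment_gone() {
  local env_name="$1" namespace="$2" tries="${3:-60}" delay="${4:-10}"
  local i=0
  while [ "$i" -lt "$tries" ]; do
    if ! kubectl get environment.netbackup.veritas.com "$env_name" \
         -n "$namespace" >/dev/null 2>&1; then
      echo "environment $env_name deleted"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out waiting for environment $env_name"
  return 1
}

# Example (placeholders as in the step above):
#   wait_until_environment_gone <environment name> <namespace>
```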
- Delete the new operator which is deployed during upgrade using the following command:
kubectl delete -k <new-operator-directory>
This will delete the new operator and new CRDs.
- Apply the NetBackup operator directory which was preserved (the directory which was used to install operator before upgrade) using the following command:
kubectl apply -k <operator_directory>
- Get the names of the PVs bound to the primary server PVCs (data, catalog and log) using the following command:
kubectl get pvc -n <namespace> -o wide
- Delete the primary server PVC (data, catalog and log) using the following command:
kubectl delete pvc <pvc-name> -n <namespace>
- Delete the PV linked to primary server PVC using the following command:
kubectl delete pv <pv-name>
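The three PVC/PV steps above depend on order: the PV name has to be recorded before the PVC is deleted. A minimal sketch of that ordering follows. The helper names `pv_for_pvc` and `delete_pvc_and_pv` are hypothetical, and the PVC names must be taken from the `kubectl get pvc` output. `--ignore-not-found` is added on the assumption that the PV may already have been removed automatically, for example when its reclaim policy is Delete.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are assumptions, not part of NetBackup.

# Print the name of the PV bound to a PVC (empty if none is bound).
pv_for_pvc() {
  kubectl get pvc "$1" -n "$2" -o jsonpath='{.spec.volumeName}'
}

# Look up the bound PV first, then delete the PVC and that PV.
delete_pvc_and_pv() {
  local pvc="$1" namespace="$2" pv
  pv="$(pv_for_pvc "$pvc" "$namespace")" || return 1
  if [ -z "$pv" ]; then
    echo "no PV bound to PVC $pvc" >&2
    return 1
  fi
  kubectl delete pvc "$pvc" -n "$namespace" || return 1
  # The PV may already be gone if its reclaim policy is Delete.
  kubectl delete pv "$pv" --ignore-not-found
}

# Example, repeated for the data, catalog and log PVCs:
#   delete_pvc_and_pv <pvc-name> <namespace>
```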
- Edit the preserved environment.yaml file (from the older version of the NetBackup package directory) and remove the keySecret section from the MSDP Scaleout section. Also change the CR spec paused: false to paused: true for every object in the MSDP Scaleout and media servers sections.
- Apply the edited environment.yaml file using the following command:
kubectl apply -f <environment.yaml>
- After the primary server pod is in ready state (1/1), change the CR spec from paused: false to paused: true in the primary server section of the environment.yaml file and reapply environment.yaml using the following command:
kubectl apply -f environment.yaml -n <namespace>
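Rather than watching for the 1/1 ready state by hand, `kubectl wait` can block until the primary server pod reports the Ready condition. This is a small sketch: the wrapper name `wait_for_pod_ready` and the default timeout are assumptions. `kubectl wait` fails immediately if the pod does not exist yet, so it should be run once the pod is listed by `kubectl get pods`.

```shell
#!/usr/bin/env bash
# Sketch only: the wrapper name and the 1800s default are assumptions.

# Block until the given pod reports condition Ready, or time out.
wait_for_pod_ready() {
  local pod="$1" namespace="$2" timeout="${3:-1800s}"
  kubectl wait --for=condition=Ready "pod/$pod" \
    -n "$namespace" --timeout="$timeout"
}

# Example:
#   wait_for_pod_ready <primary-pod-name> <namespace>
```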
- Exec into the primary server pod using the following command:
kubectl exec -it -n <PrimaryServer/MediaServer-CR-namespace> <primary-pod-name> -- /bin/bash
Increase the debug log level on the primary server.
Create a DRPackages directory at the persisted location using the following command:
mkdir /mnt/nblogs/DRPackages
Change ownership of the DRPackages folder to the service user using the following command:
chown nbsvcusr:nbsvcusr /mnt/nblogs/DRPackages
- Copy the previously saved DR files to the primary pod at /mnt/nblogs/DRPackages using the following command:
kubectl cp <Path_of_DRPackages_on_host_machine> <primary-pod-namespace>/<primary-pod-name>:/mnt/nblogs/DRPackages
- Execute the following steps in the primary server pod:
Change ownership of the files in /mnt/nblogs/DRPackages using the following command:
chown nbsvcusr:nbsvcusr <filename>
Deactivate NetBackup health probes using the following command:
/opt/veritas/vxapp-manage/nbu-health deactivate
Stop the NetBackup services using the following command:
/usr/openv/netbackup/bin/bp.kill_all
Execute the following command:
nbhostidentity -import -infile /mnt/nblogs/DRPackages/<filename>.drpkg
Restart all the NetBackup services using the following command:
/usr/openv/netbackup/bin/bp.start_all
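The in-pod sequence above can be collected into one script so that no step is skipped and a failure stops the sequence. The following is a sketch under stated assumptions: `run`, `DRY_RUN`, and `restore_host_identity` are hypothetical names, `chown -R` on the directory stands in for the per-file `chown`, and the services are restarted with `bp.start_all`, the script used in the later steps of this procedure. Setting `DRY_RUN=1` prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Sketch only: run, DRY_RUN and restore_host_identity are assumptions.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run inside the primary server pod. Argument: path to the .drpkg file.
restore_host_identity() {
  local drpkg="$1" dr_dir
  dr_dir="$(dirname "$drpkg")"
  # Hand the DR files to the NetBackup service user.
  run chown -R nbsvcusr:nbsvcusr "$dr_dir" || return 1
  # Deactivate health probes so the pod is not restarted mid-recovery.
  run /opt/veritas/vxapp-manage/nbu-health deactivate || return 1
  run /usr/openv/netbackup/bin/bp.kill_all || return 1
  run nbhostidentity -import -infile "$drpkg" || return 1
  run /usr/openv/netbackup/bin/bp.start_all
}

# Example (preview only):
#   DRY_RUN=1 restore_host_identity /mnt/nblogs/DRPackages/<filename>.drpkg
```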
- Verify that the security settings are enabled.
- Add the respective media server entry in host properties using the NetBackup Administration Console as follows:
Navigate to NetBackup Management > Host properties > Master Server > Add Additional server and add the media server.
- Restart the NetBackup services in primary server pod and external media server as follows:
Exec into the primary server pod using command:
kubectl exec -it -n <PrimaryServer/MediaServer-CR-namespace> <primary-pod-name> -- /bin/bash
Run the following command to stop all the services:
/usr/openv/netbackup/bin/bp.kill_all
After stopping all the services, restart the services using the following command:
/usr/openv/netbackup/bin/bp.start_all
On the external media server, run the following command to stop all the NetBackup services:
/usr/openv/netbackup/bin/bp.kill_all
After stopping all the services, restart the NetBackup services on the external media server using the following command:
/usr/openv/netbackup/bin/bp.start_all
- Configure a storage unit on external media server that is used during catalog backup.
- Perform catalog recovery from NetBackup Administration Console.
For more information, refer to the Veritas™ NetBackup Troubleshooting Guide.
- Exec into the primary server pod using the following command:
kubectl exec -it -n <PrimaryServer/MediaServer-CR-namespace> <primary-pod-name> -- /bin/bash
Stop the NetBackup services using the following command:
/usr/openv/netbackup/bin/bp.kill_all
Start the NetBackup services using the following command:
/usr/openv/netbackup/bin/bp.start_all
Activate NetBackup health probes using the following command:
/opt/veritas/vxapp-manage/nbu-health activate
- Restart the NetBackup operator pod by deleting it with the following command:
kubectl delete pod <operator-pod-name> -n <namespace>
Kubernetes starts a new pod after the deletion.
- Resume the reconciler for the primary server, MSDP Scaleout, and media servers in the following sequence:
Change the CR spec from paused: true to paused: false in the primary section of the environment.yaml file and re-apply the environment.yaml file using the following command:
kubectl apply -f environment.yaml -n <namespace>
Wait until the primary server is in ready state.
Change the CR spec from paused: true to paused: false in the msdp scaleouts section of the environment.yaml file and re-apply the environment.yaml file using the following command:
kubectl apply -f environment.yaml -n <namespace>
Wait until the primary server is in ready state.
Change the CR spec from paused: true to paused: false in the media servers section of the environment.yaml file and re-apply the environment.yaml file using the following command:
kubectl apply -f environment.yaml -n <namespace>
Wait until the primary server is in ready state.
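Because the paused flags are edited by hand in three passes, a quick check before each re-apply can catch a section that was missed. The sketch below counts the lines that still read paused: true. The function name `count_paused_true` is hypothetical, and the pattern assumes each paused field sits on its own line in environment.yaml.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and the one-field-per-line layout
# of environment.yaml are assumptions.

# Print how many lines in the file still read "paused: true".
count_paused_true() {
  # grep -c exits non-zero when the count is 0, hence the "|| true".
  grep -c -E '^[[:space:]]*paused:[[:space:]]*true[[:space:]]*$' "$1" || true
}

# Example: after the last pass, the count should be 0.
#   count_paused_true environment.yaml
```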
- Verify that the rollback succeeded by running backup and recovery jobs.