NetBackup™ Deployment Guide for Kubernetes Clusters
- Introduction
- Section I. Deployment
- Prerequisites for Kubernetes cluster configuration
- Deployment with environment operators
- Deploying NetBackup
- Preparing the environment for NetBackup installation on Kubernetes cluster
- Recommendations of NetBackup deployment on Kubernetes cluster
- Limitations of NetBackup deployment on Kubernetes cluster
- Primary and media server CR
- Configuring NetBackup IT Analytics for NetBackup deployment
- Managing NetBackup deployment using VxUpdate
- Migrating the cloud node for primary or media servers
- Deploying NetBackup using Helm charts
- Deploying MSDP Scaleout
- Deploying MSDP Scaleout
- Prerequisites for AKS
- Prerequisites for EKS
- Installing the docker images and binaries
- Initializing the MSDP operator
- Configuring MSDP Scaleout
- Using MSDP Scaleout as a single storage pool in NetBackup
- Configuring the MSDP cloud in MSDP Scaleout
- Using S3 service in MSDP Scaleout for AKS
- Enabling MSDP S3 service after MSDP Scaleout is deployed for AKS
- Deploying Snapshot Manager
- Verifying Cloud Scale deployment
- Section II. Monitoring and Management
- Monitoring NetBackup
- Monitoring MSDP Scaleout
- Monitoring Snapshot Manager
- Managing the Load Balancer service
- Managing MSDP Scaleout
- Managing PostgreSQL DBaaS
- Performing catalog backup and recovery
- Setting key parameters in Cloud Scale deployments
- Section III. Maintenance
- MSDP Scaleout Maintenance
- PostgreSQL DBaaS Maintenance
- Upgrading
- Uninstalling
- Troubleshooting
- Troubleshooting AKS and EKS issues
- View the list of operator resources
- View the list of product resources
- View operator logs
- View primary logs
- Socket connection failure
- Resolving an issue where external IP address is not assigned to a NetBackup server's load balancer services
- Resolving the issue where the NetBackup server pod is not scheduled for a long time
- Resolving an issue where the Storage class does not exist
- Resolving an issue where the primary server or media server deployment does not proceed
- Resolving an issue of failed probes
- Resolving token issues
- Resolving an issue related to insufficient storage
- Resolving an issue related to invalid nodepool
- Resolving a token expiry issue
- Resolve an issue related to KMS database
- Resolve an issue related to pulling an image from the container registry
- Resolving an issue related to recovery of data
- Check primary server status
- Pod status field shows as pending
- Ensure that the container is running the patched image
- Getting EEB information from an image, a running container, or persistent data
- Resolving the certificate error issue in NetBackup operator pod logs
- Pod restart failure due to liveness probe time-out
- NetBackup messaging queue broker takes more time to start
- Host mapping conflict in NetBackup
- Issue with capacity licensing reporting which takes a long time
- Local connection is getting treated as insecure connection
- Primary pod is in pending state for a long duration
- Backing up data from Primary server's /mnt/nbdata/ directory fails with primary server as a client
- Storage server not supporting Instant Access capability on Web UI after upgrading NetBackup
- Taint, Toleration, and Node affinity related issues in cpServer
- Operations performed on cpServer in environment.yaml file are not reflected
- Elastic media server related issues
- Failed to register Snapshot Manager with NetBackup
- Pods unable to connect to flexsnap-rabbitmq post Kubernetes cluster restart
- Troubleshooting AKS-specific issues
- Troubleshooting EKS-specific issues
- Appendix A. CR template
Applying security patches
This section describes how to apply security patches for operator and application images.
In the instructions below, we assume that the operators were deployed to the netbackup-operator-system namespace (the default namespace suggested by the deployment script), and that an environment resource named nb-env was deployed to a namespace named nb-example.
Although it is not necessary to manually shut down NetBackup primary server or media servers, it's still a good idea to quiesce scheduling so that no jobs get interrupted while pods are taken down and restarted.
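One way to quiesce scheduling is to deactivate all backup policies before patching and reactivate them afterward. The following is a minimal sketch, assuming the standard NetBackup admin commands (bppllist, bpplinfo) are available inside the primary server pod and are invoked through kubectl exec; the pod and namespace names follow the example deployment:

```shell
# Sketch: deactivate every policy before patching, reactivate afterwards.
# Assumes the NetBackup admin commands (bppllist, bpplinfo) are available
# inside the primary server pod; run them via "kubectl exec" in practice.

# Build one bpplinfo command per policy name read from stdin.
# action is "-inactive" (quiesce) or "-active" (resume).
policy_cmds() {
  action="$1"
  while IFS= read -r policy; do
    [ -n "$policy" ] && printf 'bpplinfo %s -modify %s\n' "$policy" "$action"
  done
}

# Typical use (not run here):
#   kubectl exec -n nb-example nb-primary-0 -- bppllist | policy_cmds -inactive
# and after patching:
#   kubectl exec -n nb-example nb-primary-0 -- bppllist | policy_cmds -active
```

Piping the generated commands back through kubectl exec, rather than running them directly, keeps the sketch usable from the same workstation that runs the rest of these steps.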
To prepare the images to apply patches
- Unpack the tar file on a system where docker is able to push to the container registry, and kubectl can access the cluster.
- Decide on a unique tag value to use for the MSDP Scaleout images. The tag should be in version-postfix format, for example, 18.0-update1. Set the DD_TAG environment variable accordingly and run deploy.sh:
DD_TAG=18.0-update1 ./deploy.sh
- In the menu that appears, select option 1 to install the operators.
- Enter the fully qualified domain name of the container registry.
For example:
(AKS-specific) exampleacr.azurecr.io
(EKS-specific) example.dkr.ecr.us-east-2.amazonaws.com/
When the script prompts to load images, answer yes.
- When the script prompts to tag and push images, wait. Open another terminal window and re-tag the MSDP Scaleout images as:
docker tag msdp-operator:18.0 msdp-operator:18.0-update1
docker tag uss-controller:18.0 uss-controller:18.0-update1
docker tag uss-engine:18.0 uss-engine:18.0-update1
docker tag uss-mds:18.0 uss-mds:18.0-update1
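The four re-tag commands above can also be generated in a loop. This is a sketch, assuming the 18.0 images are present in the local docker image cache and DD_TAG is set as in the earlier step:

```shell
# Sketch: re-tag all MSDP Scaleout images in one loop.
# Assumes the 18.0 images are in the local docker cache and DD_TAG is
# set (for example, DD_TAG=18.0-update1).
BASE_TAG=18.0
DD_TAG="${DD_TAG:-18.0-update1}"

# Emit one "docker tag" command per image.
retag_cmds() {
  for img in msdp-operator uss-controller uss-engine uss-mds; do
    printf 'docker tag %s:%s %s:%s\n' "$img" "$BASE_TAG" "$img" "$DD_TAG"
  done
}

# Typical use (not run here): review the commands, then execute them.
#   retag_cmds
#   retag_cmds | sh -x
```

Printing the commands before piping them to sh makes it easy to confirm the tag values before anything is changed.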
- Return to the deploy script and, when prompted, enter yes to tag and push the images. Wait for the images to be pushed; the script then pauses to ask another question. The remaining questions are not required, so press Ctrl+C to exit the deploy script.
- Get the image ID of the existing NetBackup operator container and record it for later. Run:
kubectl get pod -n netbackup-operator-system -l nb-control-plane=nb-controller-manager -o jsonpath --template "{.items[*].status.containerStatuses[?(@.name=='netbackup-operator')].imageID}{'\n'}"
The command prints the name of the image and includes the SHA-256 hash identifying the image. For example:
(AKS-specific) exampleacr.azurecr.io/netbackup/operator@sha256:59d4d46d82024a1ab635333774c8e19eb5691f3fe988d86ae16a0c5fb636e30c
(EKS-specific) example.dkr.ecr.us-east-2.amazonaws.com/
- To restart the NetBackup operator, run:
pod=$(kubectl get pod -n netbackup-operator-system -l nb-control-plane=nb-controller-manager -o jsonpath --template '{.items[*].metadata.name}')
kubectl delete pod -n netbackup-operator-system $pod
- Re-run the kubectl command from earlier to get the image ID of the NetBackup operator. Confirm that it's different from what it was before the update.
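The before/after comparison used throughout these steps can be made explicit with a small helper. This sketch assumes the imageID strings were captured with the kubectl commands shown above:

```shell
# Sketch: confirm that a pod's image ID changed after a restart.
# before/after are the imageID strings printed by the kubectl
# commands in the surrounding steps.
image_changed() {
  before="$1"; after="$2"
  if [ -n "$before" ] && [ -n "$after" ] && [ "$before" != "$after" ]; then
    echo "image updated"
    return 0
  fi
  echo "image NOT updated"
  return 1
}

# Typical use (not run here):
#   before=$(kubectl get pod ... -o jsonpath='{...imageID...}')
#   kubectl delete pod ...
#   after=$(kubectl get pod ... -o jsonpath='{...imageID...}')
#   image_changed "$before" "$after"
```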
- Get the image ID of the existing MSDP Scaleout operator container and save it for later use. Run:
kubectl get pods -n netbackup-operator-system -l control-plane=controller-manager -o jsonpath --template "{.items[*].status.containerStatuses[?(@.name=='manager')].imageID}{'\n'}"
- Re-initialize the MSDP Scaleout operator using the new image.
(AKS-specific) kubectl msdp init -n netbackup-operator-system --image exampleacr.azurecr.io/msdp-operator:<msdpCluster_version>-update1 --storageclass managed-csi-hdd
(EKS-specific) kubectl msdp init -n netbackup-operator-system --image <accountid>.dkr.ecr.<region>.amazonaws.com/msdp-operator:<msdpCluster_version>-update1 --storageclass managed-csi-hdd
- Re-run the kubectl command from earlier to get the image ID of the MSDP Scaleout operator. Confirm that it's different from what it was before the update.
- Look at the list of pods in the application namespace and identify the pod or pods to update. The primary-server pod's name typically ends with "primary-0" and media-server pod names end with "media-0", "media-1", and so on. Hereafter, the pod will be referred to as $pod. Run:
kubectl get pods -n nb-example
- Get the image ID of the existing NetBackup container and record it for later. Run:
kubectl get pods -n nb-example $pod -o jsonpath --template "{.status.containerStatuses[*].imageID}{'\n'}"
- Look at the list of StatefulSets in the application namespace and identify the one that corresponds to the pod or pods to be updated. The name is typically the same as the pod, but without the number at the end. For example, a pod named nb-primary-0 is associated with statefulset nb-primary. Hereafter the statefulset will be referred to as $set. Run:
kubectl get statefulsets -n nb-example
- Restart the statefulset. Run:
kubectl rollout restart -n nb-example statefulset $set
The pod or pods associated with the statefulset are terminated and re-created. It may take several minutes for them to reach the "Running" state.
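Rather than polling kubectl get pods, the restart can be followed with kubectl rollout status, which blocks until the rollout finishes. A sketch, using the namespace and a hypothetical statefulset name from the example deployment:

```shell
# Sketch: restart a statefulset and wait for the rollout to complete,
# instead of polling "kubectl get pods" manually.
# Emits the commands for the given namespace and statefulset name.
rollout_cmds() {
  ns="$1"; sts="$2"
  printf 'kubectl rollout restart -n %s statefulset %s\n' "$ns" "$sts"
  # "rollout status" blocks until all pods are re-created and Ready,
  # failing if this takes longer than the timeout.
  printf 'kubectl rollout status -n %s statefulset %s --timeout=15m\n' "$ns" "$sts"
}

# Typical use (not run here):
#   rollout_cmds nb-example nb-primary | sh -x
```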
- Once the pods are running, re-run the kubectl command from earlier to get the image ID of the new NetBackup container. Confirm that it's different from what it was before the update.
- Look at the list of pods in the application namespace and identify the pods to update. The controller pod has "uss-controller" in its name, the MDS pods have "uss-mds" in their names, and the engine pods are named after their fully qualified domain names. Run:
kubectl get pods -n nb-example
- Get the image IDs of the existing MSDP Scaleout containers and record them for later. All the MDS pods use the same image, and all the engine pods use the same image, so it's only necessary to get three image IDs, one for each type of pod.
kubectl get pods -n nb-example $engine $controller $mds -o jsonpath --template "{range .items[*]}{.status.containerStatuses[*].imageID}{'\n'}{end}"
- Edit the Environment resource and change the spec.msdpScaleouts[*].tag values to the new tag used earlier in these instructions.
kubectl edit environment -n nb-example nb-env
  ...
  spec:
    ...
    msdpScaleouts:
    - ...
      tag: "18.0-update1"
- Save the file and close the editor. The MSDP Scaleout pods are terminated and re-created. It may take several minutes for all the pods to reach the "Running" state.
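If an interactive editor is inconvenient (for example, in automation), the same tag change could be applied non-interactively with kubectl patch. This is a sketch, assuming the entry to change is the first element of spec.msdpScaleouts in the nb-env resource:

```shell
# Sketch: update the MSDP Scaleout tag without an interactive editor.
# Assumes the environment resource nb-env in namespace nb-example and
# that the entry to change is index 0 of spec.msdpScaleouts.
new_tag=18.0-update1

# Build a JSON patch for the given msdpScaleouts index and tag value.
tag_patch() {
  printf '[{"op":"replace","path":"/spec/msdpScaleouts/%s/tag","value":"%s"}]' "$1" "$2"
}

# Typical use (not run here):
#   kubectl patch environment nb-env -n nb-example --type=json \
#     -p "$(tag_patch 0 "$new_tag")"
```

As with kubectl edit, applying the patch causes the MSDP Scaleout pods to be terminated and re-created.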
- Run kubectl get pods to check the list of pods and note the new name of the uss-controller pod. Then, once the pods are all ready, re-run the kubectl command above to get the image IDs of the new MSDP Scaleout containers. Confirm that they are different from what they were before the update.