NetBackup™ Deployment Guide for Kubernetes Clusters
- Introduction
- Section I. Configurations
- Prerequisites
- Recommendations and Limitations
- Configurations
- Configuration of key parameters in Cloud Scale deployments
- Tuning touch files
- Setting maximum jobs per client
- Setting maximum jobs per media server
- Enabling intelligent catalog archiving
- Enabling security settings
- Configuring email server
- Reducing catalog storage management
- Configuring zone redundancy
- Enabling client-side deduplication capabilities
- Parameters for logging (fluentbit)
- Section II. Deployment
- Section III. Monitoring and Management
- Monitoring NetBackup
- Monitoring Snapshot Manager
- Monitoring fluentbit
- Monitoring MSDP Scaleout
- Managing NetBackup
- Managing the Load Balancer service
- Managing PostgreSQL DBaaS
- Managing fluentbit
- Performing catalog backup and recovery
- Section IV. Maintenance
- PostgreSQL DBaaS Maintenance
- Patching mechanism for primary, media servers, fluentbit pods, and postgres pods
- Upgrading
- Cloud Scale Disaster Recovery
- Uninstalling
- Troubleshooting
- Troubleshooting AKS and EKS issues
- View the list of operator resources
- View the list of product resources
- View operator logs
- View primary logs
- Socket connection failure
- Resolving an issue where external IP address is not assigned to a NetBackup server's load balancer services
- Resolving the issue where the NetBackup server pod is not scheduled for a long time
- Resolving an issue where the Storage class does not exist
- Resolving an issue where the primary server or media server deployment does not proceed
- Resolving an issue of failed probes
- Resolving token issues
- Resolving an issue related to insufficient storage
- Resolving an issue related to invalid nodepool
- Resolving a token expiry issue
- Resolving an issue related to the KMS database
- Resolving an issue related to pulling an image from the container registry
- Resolving an issue related to recovery of data
- Check primary server status
- Pod status field shows as pending
- Ensure that the container is running the patched image
- Getting EEB information from an image, a running container, or persistent data
- Resolving the certificate error issue in NetBackup operator pod logs
- Pod restart failure due to liveness probe time-out
- NetBackup messaging queue broker takes more time to start
- Host mapping conflict in NetBackup
- Issue where capacity licensing reporting takes a long time
- Local connection is treated as an insecure connection
- Primary pod is in pending state for a long duration
- Backing up data from Primary server's /mnt/nbdata/ directory fails with primary server as a client
- Storage server not supporting Instant Access capability on Web UI after upgrading NetBackup
- Taint, Toleration, and Node affinity related issues in cpServer
- Operations performed on cpServer in environment.yaml file are not reflected
- Elastic media server related issues
- Failed to register Snapshot Manager with NetBackup
- Post Kubernetes cluster restart, flexsnap-listener pod went into CrashLoopBackOff state or pods were unable to connect to flexsnap-rabbitmq
- Post Kubernetes cluster restart, issues observed in case of containerized Postgres deployment
- Request router logs
- Issues with NBPEM/NBJM
- Issues with logging feature for Cloud Scale
- The flexsnap-listener pod is unable to communicate with RabbitMQ
- Troubleshooting AKS-specific issues
- Troubleshooting EKS-specific issues
- Troubleshooting issue for bootstrapper pod
- Appendix A. CR template
- Appendix B. MSDP Scaleout
- About MSDP Scaleout
- Prerequisites for MSDP Scaleout (AKS/EKS)
- Limitations in MSDP Scaleout
- MSDP Scaleout configuration
- Installing the docker images and binaries for MSDP Scaleout (without environment operators or Helm charts)
- Deploying MSDP Scaleout
- Managing MSDP Scaleout
- MSDP Scaleout maintenance
Expanding log volumes for primary pods
To expand the log volumes of the PVs of the decoupled services, perform the following steps:
To expand the log volume of the PV
- Stop the primary pods:
Untar the Kubernetes package .tar file and navigate to the scripts folder inside the extracted build folder (for example: VRTSk8s-netbackup-<version>/scripts).
Run the cloudscale_restart.sh script with stop as the action and the NetBackup namespace as the namespace parameter. For example: ./cloudscale_restart.sh stop <namespace>
This script pauses the primary server CR and stops all the decoupled services.
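Before proceeding, you can optionally confirm that the decoupled pods have stopped; a quick check (assuming the default pod name prefixes shown in the next step) is:
kubectl get pods -n <netbackup_namespace> | grep -e "nbatd" -e "nbmqbroker" -e "nbwsapp" -e "policyjob" -e "primary"
Once the stop action has completed, this command should return no running pods.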
- Manually delete the statefulsets:
After all the pods are scaled down, get the required list of statefulsets using the command:
kubectl get sts -n <netbackup_namespace> | grep -e "nbatd" -e "nbmqbroker" -e "nbwsapp" -e "policyjob" -e "policyjobmgr" -e "primary"
Delete the statefulsets using the command:
kubectl delete statefulset <statefulset_names_obtained_above> -n <netbackup_namespace>
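If there are several statefulsets, a one-line sketch such as the following (illustrative; review the grep output first, because xargs deletes whatever names it receives) lists and deletes them in one pass:
kubectl get sts -n <netbackup_namespace> --no-headers | grep -e "nbatd" -e "nbmqbroker" -e "nbwsapp" -e "policyjob" -e "policyjobmgr" -e "primary" | awk '{print $1}' | xargs -r kubectl delete statefulset -n <netbackup_namespace>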
- List the log PVCs for primary pods:
Execute the command:
kubectl get pvc -n <netbackup_namespace> | grep log | grep -ve "catalog" -ve "uss" -ve "media"
For example: kubectl get pvc -n netbackup | grep log | grep -ve "catalog" -ve "uss" -ve "media"
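To avoid retyping the names in the next step, the PVC names can be captured in a shell variable, for example (assuming the netbackup namespace used above):
LOG_PVCS=$(kubectl get pvc -n netbackup --no-headers | grep log | grep -ve "catalog" -ve "uss" -ve "media" | awk '{print $1}')
echo $LOG_PVCS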
- Expand the log PVCs:
Expand the capacities of the log PVCs obtained above using the command:
kubectl patch pvc <pvc_names_obtained_above> -n <netbackup_namespace> -p '{"spec":{"resources":{"requests":{"storage":"<Expanded_capacity>Gi"}}}}'
For example: kubectl patch pvc logs-nbu-primary-0 -n netbackup -p '{"spec":{"resources":{"requests":{"storage":"35Gi"}}}}'
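Note that online PVC expansion succeeds only if the storage class sets allowVolumeExpansion: true. If several log PVCs need the same new size, a loop over the LOG_PVCS variable captured in the previous step (a sketch; the 35Gi value and netbackup namespace are examples) patches them in one pass:
for pvc in $LOG_PVCS; do kubectl patch pvc "$pvc" -n netbackup -p '{"spec":{"resources":{"requests":{"storage":"35Gi"}}}}'; done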
- Expand the log volume size in the primary server CR:
Execute the command:
kubectl patch environment <environment_name> -n <netbackup_namespace> --type=json --patch '[{"op": "replace", "path": "/spec/primary/storage/log/capacity", "value": "<Expanded_capacity>Gi"}]'
For example: kubectl patch environment nbu -n netbackup --type=json --patch '[{"op": "replace", "path": "/spec/primary/storage/log/capacity", "value": "35Gi"}]'
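To verify that the patch was applied, you can read the field back with jsonpath (assuming the example environment name and namespace above):
kubectl get environment nbu -n netbackup -o jsonpath='{.spec.primary.storage.log.capacity}'
The output should show the expanded capacity, for example 35Gi.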
- Start the decoupled services using the cloudscale_restart.sh script:
Navigate to the scripts folder inside the build folder (for example: VRTSk8s-netbackup-10.5-0053/scripts).
Run the cloudscale_restart.sh script with start as the action and the NetBackup namespace as the namespace parameter. For example: ./cloudscale_restart.sh start <namespace>
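After the start action completes, you can watch the pods until they return to the Running state:
kubectl get pods -n <namespace> --watch
You can also confirm that a log PVC reports the expanded capacity in its status (assuming the example PVC name and namespace used earlier):
kubectl get pvc logs-nbu-primary-0 -n netbackup -o jsonpath='{.status.capacity.storage}'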