NetBackup™ Deployment Guide for Kubernetes Clusters
- Introduction
- Section I. Configurations
- Prerequisites
- Preparing the environment for NetBackup installation on Kubernetes cluster
- Prerequisites for Snapshot Manager (AKS/EKS)
- Prerequisites for Kubernetes cluster configuration
- Prerequisites for Cloud Scale configuration
- Prerequisites for deploying environment operators
- Prerequisites for using private registry
- Recommendations and Limitations
- Configurations
- Configuration of key parameters in Cloud Scale deployments
- Tuning touch files
- Setting maximum jobs per client
- Setting maximum jobs per media server
- Enabling intelligent catalog archiving
- Enabling security settings
- Configuring email server
- Reducing catalog storage management
- Configuring zone redundancy
- Enabling client-side deduplication capabilities
- Parameters for logging (fluentbit)
- Managing media server configurations in Web UI
- Prerequisites
- Section II. Deployment
- Section III. Monitoring and Management
- Monitoring NetBackup
- Monitoring Snapshot Manager
- Monitoring fluentbit
- Monitoring MSDP Scaleout
- Managing NetBackup
- Managing the Load Balancer service
- Managing PostgreSQL DBaaS
- Managing logging
- Performing catalog backup and recovery
- Section IV. Maintenance
- PostgreSQL DBaaS Maintenance
- Patching mechanism for primary, media servers, fluentbit pods, and postgres pods
- Upgrading
- Cloud Scale Disaster Recovery
- Uninstalling
- Troubleshooting
- Troubleshooting AKS and EKS issues
- View the list of operator resources
- View the list of product resources
- View operator logs
- View primary logs
- Socket connection failure
- Resolving an issue where external IP address is not assigned to a NetBackup server's load balancer services
- Resolving the issue where the NetBackup server pod is not scheduled for a long time
- Resolving an issue where the Storage class does not exist
- Resolving an issue where the primary server or media server deployment does not proceed
- Resolving an issue of failed probes
- Resolving issues when media server PVs are deleted
- Resolving an issue related to insufficient storage
- Resolving an issue related to invalid nodepool
- Resolve an issue related to KMS database
- Resolve an issue related to pulling an image from the container registry
- Resolving an issue related to recovery of data
- Check primary server status
- Pod status field shows as pending
- Ensure that the container is running the patched image
- Getting EEB information from an image, a running container, or persistent data
- Resolving the certificate error issue in NetBackup operator pod logs
- Pod restart failure due to liveness probe time-out
- NetBackup messaging queue broker takes more time to start
- Host mapping conflict in NetBackup
- Issue with capacity licensing reporting which takes longer time
- Local connection is getting treated as insecure connection
- Backing up data from Primary server's /mnt/nbdata/ directory fails with primary server as a client
- Storage server not supporting Instant Access capability on Web UI after upgrading NetBackup
- Taint, Toleration, and Node affinity related issues in cpServer
- Operations performed on cpServer in environment.yaml file are not reflected
- Elastic media server related issues
- Failed to register Snapshot Manager with NetBackup
- Post Kubernetes cluster restart, flexsnap-listener pod went into CrashLoopBackoff state or pods were unable to connect to flexsnap-rabbitmq
- Post Kubernetes cluster restart, issues observed in case of containerized Postgres deployment
- Request router logs
- Issues with NBPEM/NBJM
- Issues with logging feature for Cloud Scale
- The flexsnap-listener pod is unable to communicate with RabbitMQ
- Job remains in queue for a long time
- Extracting logs if the nbwsapp or log-viewer pods are down
- Troubleshooting AKS-specific issues
- Troubleshooting EKS-specific issues
- Troubleshooting issue for bootstrapper pod
- Appendix A. CR template
- Appendix B. MSDP Scaleout
- About MSDP Scaleout
- Prerequisites for MSDP Scaleout (AKS/EKS)
- Limitations in MSDP Scaleout
- MSDP Scaleout configuration
- Installing the docker images and binaries for MSDP Scaleout (without environment operators or Helm charts)
- Deploying MSDP Scaleout
- Managing MSDP Scaleout
- MSDP Scaleout maintenance
Upgrade PostgreSQL database
Depending on the scenario, perform the appropriate procedure to upgrade the PostgreSQL database:
- Upgrade only
- Upgrade and modify additional parameters
If you are upgrading from 10.5 or later and do not need to modify parameters other than the image tags and logDestination, use the following command:
helm upgrade postgresql postgresql-<version>.tgz -n netbackup --reuse-values \
  --set postgresql.image.tag=21.0.x.x-xxxx \
  --set postgresql.logDestination=stderr \
  --set postgresqlUpgrade.image.tag=21.0.x.x-xxxx
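After the upgrade completes, you can optionally confirm that the new values were applied to the release. The following is a quick check (the release name postgresql and the netbackup namespace follow the example above):

helm get values postgresql -n netbackup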
Perform the following steps to upgrade the PostgreSQL database when modifying parameters in addition to the tags:
Use the following command to save the PostgreSQL chart values to a file:
helm show values postgresql-<version>.tgz > postgres-values.yaml
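If the chart is hosted in an OCI container registry rather than provided as a local .tgz file, the chart values can be saved the same way. The following sketch reuses the illustrative registry URL shown later in this section:

helm show values oci://abcd.veritas.com:5000/helm-charts/netbackup-postgresql --version <version> > postgres-values.yaml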
Use the following command to edit the chart values:
vi postgres-values.yaml
For example, set logDestination: stderr so that the PostgreSQL database logs are written to stderr and can be collected by the fluentbit daemonset.
Following is an example of the postgres-values.yaml file:

# Default values for postgresql.
global:
  environmentNamespace: "netbackup"
  containerRegistry: "364956537575.dkr.ecr.us-east-1.amazonaws.com"
  timezone: null

postgresql:
  replicas: 1
  # The values in the image (name, tag) are placeholders. These will be set
  # when the deploy_nb_cloudscale.sh runs.
  image:
    name: "netbackup/postgresql"
    tag: "21.0.x.x-xxxx"
    pullPolicy: Always
  service:
    serviceName: nb-postgresql
  volume:
    volumeClaimName: nb-psql-pvc
    volumeDefaultMode: 0640
    pvcStorage: 30Gi
    # configMapName: nbpsqlconf
    storageClassName: nb-disk-premium
    mountPathData: /netbackup/postgresqldb
  secretMountPath: /netbackup/postgresql/keys/server
  # mountConf: /netbackup
  securityContext:
    runAsUser: 0
  createCerts: true
  # pgbouncerIniPath: /netbackup/pgbouncer.ini
  nodeSelector:
    key: agentpool
    value: nbupool
  # Resource requests (minima) and limits (maxima). Requests are used to fit
  # the database pod onto a node that has sufficient room. Limits are used to
  # throttle (for CPU) or terminate (for memory) containers that exceed the
  # limit. For details, refer to Kubernetes documentation:
  # https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#resource-units-in-kubernetes
  # Other types of resources are documented, but only `memory` and `cpu` are
  # recognized by NetBackup.
  #
  # resources:
  #   requests:
  #     memory: 2Gi
  #     cpu: 500m
  #   limits:
  #     memory: 3Gi
  #     cpu: 3
  # Example tolerations. Check taints on the desired nodes and update keys and
  # values.
  #
  tolerations:
  - key: agentpool
    value: nbupool
  - key: agentpool
    value: mediapool
  - key: agentpool
    value: primarypool
  - key: storage-pool
    value: storagepool
  - key: data-plane-pool
    value: dataplanepool
  serverSecretName: postgresql-server-crt
  clientSecretName: postgresql-client-crt
  dbSecretName: dbsecret
  dbPort: 13785
  pgbouncerPort: 13787
  dbAdminName: postgres
  initialDbAdminPassword: postgres
  dataDir: /netbackup/postgresqldb
  # postgresqlConfFilePath: /netbackup/postgresql.conf
  # pgHbaConfFilePath: /netbackup/pg_hba.conf
  defaultPostgresqlHostName: nb-postgresql
  # file => log postgresdb in file (the default)
  # stderr => log postgresdb in stderr so that the fluentbit daemonset collects the logs.
  logDestination: file

postgresqlUpgrade:
  replicas: 1
  image:
    name: "netbackup/postgresql-upgrade"
    tag: "21.0.x.x-xxxx"
    pullPolicy: Always
  volume:
    volumeClaimName: nb-psql-pvc
    mountPathData: /netbackup/postgresqldb
  timezone: null
  securityContext:
    runAsUser: 0
  env:
    dataDir: /netbackup/postgresqldb

Execute the following command to upgrade the PostgreSQL database:
helm upgrade --install postgresql postgresql-<version>.tgz -f postgres-values.yaml -n netbackup
Or
If using the OCI container registry, use the following command:
helm upgrade --install postgresql oci://abcd.veritas.com:5000/helm-charts/netbackup-postgresql --version <version> -f postgres-values.yaml -n netbackup
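Optionally, after running either upgrade command, you can confirm that a new release revision was deployed. The following is a quick check using the release name and namespace from the commands above:

helm history postgresql -n netbackup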
Use the following command to verify that the postgresql StatefulSet is in the desired state:
kubectl get statefulset -n <environment namespace> | grep "postgresql"
nb-postgresql 1/1
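Optionally, you can wait for the StatefulSet rollout to finish before proceeding. The following sketch assumes the StatefulSet name nb-postgresql shown in the output above:

kubectl rollout status statefulset/nb-postgresql -n <environment namespace>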
If the primary node pool has taints applied and they are not added to the postgres-values.yaml file, manually add tolerations to the PostgreSQL StatefulSet as follows:
To verify that node pools use taints, run the following command:
kubectl get nodes -o=custom-columns=NodeName:.metadata.name,TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,TaintEffect:.spec.taints[*].effect
NodeName                         TaintKey   TaintValue   TaintEffect
ip-10-248-231-149.ec2.internal   <none>     <none>       <none>
ip-10-248-231-245.ec2.internal   <none>     <none>       <none>
ip-10-248-91-105.ec2.internal    nbupool    agentpool    NoSchedule
To view StatefulSets, run the following command:
kubectl get statefulsets -n netbackup
NAME            READY   AGE
nb-postgresql   1/1     76m
nb-primary      0/1     51m
Edit the PostgreSQL StatefulSet and add tolerations as follows:
kubectl edit statefulset nb-postgresql -n netbackup
Following is an example of the modified PostgreSQL StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  annotations:
    meta.helm.sh/release-name: postgresql
    meta.helm.sh/release-namespace: netbackup
  creationTimestamp: "2024-03-25T15:11:59Z"
  generation: 1
  labels:
    app: nb-postgresql
    app.kubernetes.io/managed-by: Helm
  name: nb-postgresql
  ...
spec:
  template:
    spec:
      containers:
      ...
      nodeSelector:
        nbupool: agentpool
      tolerations:
      - effect: NoSchedule
        key: nbupool
        operator: Equal
        value: agentpool
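Because the StatefulSet is managed by Helm, changes made directly with kubectl edit may be overwritten by a subsequent helm upgrade, so it is preferable to also add the tolerations to the postgres-values.yaml file. As a non-interactive alternative to editing the StatefulSet, the same tolerations can be applied with a merge patch. The following is a sketch that assumes the taint key nbupool and value agentpool from the example above:

kubectl patch statefulset nb-postgresql -n netbackup --type merge -p \
  '{"spec":{"template":{"spec":{"tolerations":[{"key":"nbupool","operator":"Equal","value":"agentpool","effect":"NoSchedule"}]}}}}'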