NetBackup™ Deployment Guide for Amazon Elastic Kubernetes Services (EKS) Cluster
- Introduction to NetBackup on EKS
- Deployment with environment operators
- Assessing cluster configuration before deployment
- Deploying NetBackup
- Preparing the environment for NetBackup installation on EKS
- Recommendations of NetBackup deployment on EKS
- Limitations of NetBackup deployment on EKS
- About primary server CR and media server CR
- Monitoring the status of the CRs
- Updating the CRs
- Deleting the CRs
- Configuring NetBackup IT Analytics for NetBackup deployment
- Managing NetBackup deployment using VxUpdate
- Migrating the node group for primary or media servers
- Upgrading NetBackup
- Deploying Snapshot Manager
- Migration and upgrade of Snapshot Manager
- Deploying MSDP Scaleout
- Upgrading MSDP Scaleout
- Monitoring NetBackup
- Monitoring MSDP Scaleout
- Monitoring Snapshot Manager deployment
- Managing the Load Balancer service
- Performing catalog backup and recovery
- Managing MSDP Scaleout
- About MSDP Scaleout maintenance
- Uninstalling MSDP Scaleout from EKS
- Uninstalling Snapshot Manager
- Troubleshooting
- View the list of operator resources
- View the list of product resources
- View operator logs
- View primary logs
- Pod restart failure due to liveness probe time-out
- Socket connection failure
- Resolving an invalid license key issue
- Resolving an issue where external IP address is not assigned to a NetBackup server's load balancer services
- Resolving the issue where the NetBackup server pod is not scheduled for long time
- Resolving an issue where the Storage class does not exist
- Resolving an issue where the primary server or media server deployment does not proceed
- Resolving an issue of failed probes
- Resolving token issues
- Resolving an issue related to insufficient storage
- Resolving an issue related to invalid nodepool
- Resolving a token expiry issue
- Resolve an issue related to KMS database
- Resolve an issue related to pulling an image from the container registry
- Resolving an issue related to recovery of data
- Check primary server status
- Pod status field shows as pending
- Ensure that the container is running the patched image
- Getting EEB information from an image, a running container, or persistent data
- Resolving the certificate error issue in NetBackup operator pod logs
- Resolving the primary server connection issue
- Primary pod is in pending state for a long duration
- Host mapping conflict in NetBackup
- NetBackup messaging queue broker take more time to start
- Local connection is getting treated as insecure connection
- Issue with capacity licensing reporting which takes longer time
- Backing up data from Primary server's /mnt/nbdata/ directory fails with primary server as a client
- Wrong EFS ID is provided in environment.yaml file
- Primary pod is in ContainerCreating state
- Webhook displays an error for PV not found
- Appendix A. CR template
Upgrade NetBackup from previous versions
Ensure that all the steps mentioned for data migration in the following section are performed before upgrading to or installing the latest NetBackup:
See Preparing the environment for NetBackup installation on EKS.
You must have deployed NetBackup on AWS with EBS as its storage class. While upgrading to the latest NetBackup, the existing catalog data of the primary server is migrated (copied) from EBS to Amazon Elastic File System (EFS).
Fresh NetBackup deployment: If you are deploying NetBackup for the first time, Amazon EFS is used for the primary server's catalog volume for all backup and restore operations.
Perform the following steps to create EFS when upgrading NetBackup from version 10.0.0.1
- To create EFS for primary server, see Create your Amazon EFS file system.
The EFS configuration can be as follows; you can update the Throughput mode as required:
Performance mode: General Purpose
Throughput mode: Provisioned (256 MiB/s)
Availability zone: Regional
Note:
The Throughput mode can be increased at runtime depending on the size of the workloads. If you observe performance issues, you can increase the provisioned throughput up to 1024 MiB/s.
- Install the efs-csi-controller driver on the EKS cluster. For more information on installing the driver, see Amazon EFS CSI driver.
- Note down the EFS ID for further use.
- Mount EFS on any EC2 instance and create two directories on EFS to store NetBackup data.
For more information, see Mount on EC2 instance.
For example,
[root@sych09b03v30 ~]# mkdir /efs
[root@sych09b03v30 ~]# mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport <fs-0bde325bc5b8d6969>.efs.us-east-2.amazonaws.com:/ /efs # change EFS ID
After changing the existing storage class from EBS to EFS for data migration, manually create PVC and PV with EFS volume handle and update the yaml file as described in the following procedure:
Create new PVC and PV with EFS volume handle.
catalogPVC.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: catalog
  namespace: ns-155
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 100Gi
  volumeName: environment-pv-primary-catalog

catalogPV.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: environment-pv-primary-catalog
  labels:
    # Give the region as configured in your cluster
    topology.kubernetes.io/region: us-east-2
    # Give the zone of your node instance; you can also check the subnet
    # zone in which your node instance is located
    topology.kubernetes.io/zone: us-east-2c
spec:
  capacity:
    storage: 100Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    - iam
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-07a82a46b4a7d87f8:/nbdata # change the EFS ID to your created EFS ID
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: catalog # catalog PVC name to which data is to be copied
    namespace: ns-155
PVC for data (EBS):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-<Primary name>-primary-0
  namespace: ns-155
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: <Storageclass name>
  resources:
    requests:
      storage: 30Gi

Edit the environment.yaml file, change the value of paused to true in the primary section, and apply the yaml. Scale down the primary server using the following commands:
To get the statefulset name: kubectl get sts -n <namespace in environment CR (ns-155)>
To scale down the STS: kubectl scale sts --replicas=0 <STS name> -n <Namespace>
Copy the data using the migration yaml file as follows:
catalogMigration.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: rsync-data
  namespace: ns-155
spec:
  template:
    spec:
      volumes:
        - name: source-pvc
          persistentVolumeClaim: # SOURCE PVC
            # old PVC (EBS) from which data is to be copied,
            # for example catalog-environment-migrate1-primary-0
            claimName: <EBS PVC name of catalog>
        - name: destination-pvc
          persistentVolumeClaim: # DESTINATION PVC
            claimName: catalog # new PVC (EFS) to which data will be copied
      securityContext:
        runAsUser: 0
        runAsGroup: 0
      containers:
        - name: netbackup-migration
          image: OPERATOR_IMAGE:TAG # image name with tag
          command: ["/migration", '{"VolumesList":[{"Src":"srcPvc","Dest":"destPvc","Verify":true,"StorageType":"catalog","OnlyCatalog":true}]}']
          volumeMounts:
            - name: source-pvc
              mountPath: /srcPvc
            - name: destination-pvc
              mountPath: /destPvc
      restartPolicy: Never

dataMigration.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: rsync-data2
  namespace: ns-155
spec:
  template:
    spec:
      volumes:
        - name: source-pvc
          persistentVolumeClaim: # SOURCE PVC
            claimName: <EBS PVC name of catalog> # old PVC (EBS) from which data is to be copied
        - name: destination-pvc
          persistentVolumeClaim: # DESTINATION PVC
            claimName: <data (EBS) PVC name> # new data PVC to which data will be copied
      securityContext:
        runAsUser: 0
        runAsGroup: 0
      containers:
        - name: netbackup-migration
          image: OPERATOR_IMAGE:TAG # image name with tag
          command: ["/migration", '{"VolumesList":[{"Src":"srcPvc","Dest":"destPvc","Verify":true,"StorageType":"data","OnlyCatalog":false}]}']
          volumeMounts:
            - name: source-pvc
              mountPath: /srcPvc
            - name: destination-pvc
              mountPath: /destPvc
      restartPolicy: Never

Delete the migration jobs once the pods are in Completed state.
For the primary server, delete the old PVC (EBS) of the catalog volume, for example catalog-<Name_of_primary>-primary-0, and create a new PVC with the same name as the deleted PVC that was attached to the primary server.
Follow the naming conventions of the static PV and PVC to be consumed by the primary server deployment.
catalog-<Name_of_primary>-primary-0
data-<Name_of_primary>-primary-0

Example:
catalog-test-env-primary-0
data-test-env-primary-0

environment.yaml:
apiVersion: netbackup.veritas.com/v2
kind: Environment
metadata:
  name: test-env
  namespace: ns-155
spec:
  ...
  primary:
    # Set name to control the name of the primary server. The default
    # value is the same as the Environment's metadata.name.
    name: test-env

YAML to create the new catalog PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: catalog-test-env-primary-0
  namespace: ns-155
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 100Gi
  volumeName: environment-pv-primary-catalog
Edit the PV (mounted on EFS) and replace the claimRef name, resourceVersion, and uid with those of the newly created PVC to meet the naming convention.
Get the PVs and PVCs using the following commands:
To get the PVC details: kubectl get pvc -n <Namespace>
To view or edit the new PVC (with the old name): kubectl edit pvc <PVC name> -n <Namespace>
To view or edit the PV (to which data is copied): kubectl edit pv <PV name>
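For illustration, after the edit the claimRef of the EFS-backed PV might look like the following sketch; the uid and resourceVersion are placeholders that you copy from the newly created PVC (for example from kubectl get pvc catalog-test-env-primary-0 -n ns-155 -o yaml):

```yaml
# Hypothetical excerpt of the edited PV (environment-pv-primary-catalog).
# Replace the placeholder values with those of the new PVC.
claimRef:
  apiVersion: v1
  kind: PersistentVolumeClaim
  name: catalog-test-env-primary-0   # name of the newly created PVC
  namespace: ns-155
  uid: <uid of the new PVC>
  resourceVersion: "<resourceVersion of the new PVC>"
```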
Upgrade MSDP with the new build and image tag by applying the following command:
./kubectl-msdp init --image <Image name:Tag> --storageclass <Storage Class Name> --namespace <Namespace>
Apply the operator from the new build using the following command:
kubectl apply -k operator/
Edit the environment.yaml file from the new build and perform the following changes:
Add the tag: <new_tag_of_upgrade_image> tag separately under the primary section.
Provide the EFS ID for the storageClassName of the catalog volume under the primary section, and set paused: false under the primary section. The EFS ID must be the same as the one used in the steps in the above section.
Provide the storageClassName for the data and logs volumes, and then apply the environment.yaml file using the following command and ensure that the primary server is upgraded successfully:
kubectl apply -f environment.yaml
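As a sketch, the primary section after these edits might look like the following; this is a minimal illustration assuming the test-env example used elsewhere in this guide, and the exact nesting of the catalog volume's storageClassName depends on your Environment CR schema:

```yaml
spec:
  primary:
    name: test-env
    tag: <new_tag_of_upgrade_image>  # new image tag for the upgrade
    paused: false                    # resume the reconciler
    # Also set the catalog volume's storageClassName to the EFS ID that
    # was used to create the PV and PVC (for example fs-07a82a46b4a7d87f8).
```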
Upgrade MSDP Scaleout by updating the new image tag in the msdpscaleout section of the environment.yaml file. Apply the environment.yaml file using the following command and ensure that MSDP is deployed successfully:
kubectl apply -f environment.yaml
Edit the environment.yaml file and update the image tag for the media server in the mediaServer section. Apply the environment.yaml file using the following command and ensure that the media server is deployed successfully:
kubectl apply -f environment.yaml
Perform the following steps when upgrading NetBackup from version 10.1
- Set paused to true for the primary environment controller as follows:
Edit the environment custom resource using the kubectl edit Environment <environmentCR_name> -n <namespace> command.
To pause the reconciler of the particular custom resource, change the paused: false value to paused: true in the primaryServer or mediaServer section and save the changes.
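For illustration, the relevant fragment of the Environment CR after the edit might look like this (resource names assumed from the examples in this guide):

```yaml
spec:
  primary:
    name: test-env
    paused: true   # pause the reconciler before scaling down the primary server
```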
Scale down the primary server using the following commands:
To get statefulset name: kubectl get sts -n <namespace>
To scale down the STS: kubectl scale sts --replicas=0 < STS name of primary server> -n <Namespace>
- Upgrade the MSDP with new build and image tag. Apply the following command to MSDP:
./kubectl-msdp init --image <Image name:Tag> --storageclass <Storage Class Name> --namespace <Namespace>
- Edit the sample/environment.yaml file from the new build and perform the following changes:
Add the tag: <new_tag_of_upgrade_image> tag separately under the primary section.
Provide the EFS ID for the storageClassName of the catalog volume in the primary section.
Note:
The EFS ID provided for the storageClassName of the catalog volume must be the same as the EFS ID previously used to create the PV and PVC.
Use the following command to retrieve the previously used EFS ID from PV and PVC:
kubectl get pvc -n <namespace>
From the output, copy the name of catalog PVC which is of the following format:
catalog-<resource name prefix>-primary-0
Describe catalog PVC using the following command:
kubectl describe pvc <pvc name> -n <namespace>
Note down the value of Volume field from the output.
Describe PV using the following command:
kubectl describe pv <value of Volume obtained from above step>
Note down the value of VolumeHandle field from the output which is the previously used EFS ID.
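The lookup above can be collapsed into a short shell sketch; the kubectl queries use standard jsonpath fields for CSI-backed PVs, and the PVC name is a hypothetical example:

```shell
# Extract the EFS ID from a CSI volumeHandle of the form "fs-xxxx:/path".
efs_id_from_handle() {
  printf '%s\n' "${1%%:*}"
}

# On a live cluster you would obtain the handle like this (hypothetical names):
#   PV=$(kubectl get pvc catalog-test-env-primary-0 -n ns-155 -o jsonpath='{.spec.volumeName}')
#   HANDLE=$(kubectl get pv "$PV" -o jsonpath='{.spec.csi.volumeHandle}')
HANDLE="fs-07a82a46b4a7d87f8:/nbdata"   # example volumeHandle from this guide
efs_id_from_handle "$HANDLE"            # prints fs-07a82a46b4a7d87f8
```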
For the data and logs volumes, provide the storageClassName and then apply the environment.yaml file using the following command and ensure that the primary server is upgraded successfully:
kubectl apply -f environment.yaml
Upgrade MSDP Scaleout by updating the new image tag in the msdpscaleout section of the environment.yaml file. Apply the environment.yaml file using the following command and ensure that MSDP is deployed successfully:
kubectl apply -f environment.yaml
Edit the environment.yaml file and update the image tag for the media server in the mediaServer section. Apply the environment.yaml file using the following command and ensure that the media server is deployed successfully:
kubectl apply -f environment.yaml