NetBackup™ Deployment Guide for Kubernetes Clusters
- Introduction
- Section I. Configurations
- Prerequisites
- Recommendations and Limitations
- Configurations
- Configuration of key parameters in Cloud Scale deployments
- Tuning touch files
- Setting maximum jobs per client
- Setting maximum jobs per media server
- Enabling intelligent catalog archiving
- Enabling security settings
- Configuring email server
- Reducing catalog storage management
- Configuring zone redundancy
- Enabling client-side deduplication capabilities
- Parameters for logging (fluentbit)
- Section II. Deployment
- Section III. Monitoring and Management
- Monitoring NetBackup
- Monitoring Snapshot Manager
- Monitoring fluentbit
- Monitoring MSDP Scaleout
- Managing NetBackup
- Managing the Load Balancer service
- Managing PostgreSQL DBaaS
- Managing fluentbit
- Performing catalog backup and recovery
- Section IV. Maintenance
- PostgreSQL DBaaS Maintenance
- Patching mechanism for primary, media servers, fluentbit pods, and postgres pods
- Upgrading
- Cloud Scale Disaster Recovery
- Uninstalling
- Troubleshooting
- Troubleshooting AKS and EKS issues
- View the list of operator resources
- View the list of product resources
- View operator logs
- View primary logs
- Socket connection failure
- Resolving an issue where external IP address is not assigned to a NetBackup server's load balancer services
- Resolving the issue where the NetBackup server pod is not scheduled for a long time
- Resolving an issue where the Storage class does not exist
- Resolving an issue where the primary server or media server deployment does not proceed
- Resolving an issue of failed probes
- Resolving token issues
- Resolving an issue related to insufficient storage
- Resolving an issue related to invalid nodepool
- Resolving a token expiry issue
- Resolving an issue related to the KMS database
- Resolving an issue related to pulling an image from the container registry
- Resolving an issue related to recovery of data
- Check primary server status
- Pod status field shows as pending
- Ensure that the container is running the patched image
- Getting EEB information from an image, a running container, or persistent data
- Resolving the certificate error issue in NetBackup operator pod logs
- Pod restart failure due to liveness probe time-out
- NetBackup messaging queue broker takes more time to start
- Host mapping conflict in NetBackup
- Issue with capacity licensing reporting which takes a longer time
- Local connection is getting treated as insecure connection
- Primary pod is in pending state for a long duration
- Backing up data from Primary server's /mnt/nbdata/ directory fails with primary server as a client
- Storage server not supporting Instant Access capability on Web UI after upgrading NetBackup
- Taint, Toleration, and Node affinity related issues in cpServer
- Operations performed on cpServer in environment.yaml file are not reflected
- Elastic media server related issues
- Failed to register Snapshot Manager with NetBackup
- Post Kubernetes cluster restart, flexsnap-listener pod went into CrashLoopBackoff state or pods were unable to connect to flexsnap-rabbitmq
- Post Kubernetes cluster restart, issues observed in case of containerized Postgres deployment
- Request router logs
- Issues with NBPEM/NBJM
- Issues with logging feature for Cloud Scale
- The flexsnap-listener pod is unable to communicate with RabbitMQ
- Troubleshooting AKS-specific issues
- Troubleshooting EKS-specific issues
- Troubleshooting issue for bootstrapper pod
- Appendix A. CR template
- Appendix B. MSDP Scaleout
- About MSDP Scaleout
- Prerequisites for MSDP Scaleout (AKS/EKS)
- Limitations in MSDP Scaleout
- MSDP Scaleout configuration
- Installing the docker images and binaries for MSDP Scaleout (without environment operators or Helm charts)
- Deploying MSDP Scaleout
- Managing MSDP Scaleout
- MSDP Scaleout maintenance
DBaaS Disaster Recovery
Run the following commands after providing the required values:
export SERVER_NAME=<Postgres Server Name can be found from azure UI> #Change IT
export NAMESPACE=netbackup #Change IT
export AKS_SUBNET_NAME=<vnet_name from applied TF-Var file> #Change IT
export KV_NAME=<Key Vault Name can be found from azure UI> #Change IT
export AKS_NAME=<aks_name from applied TF-Var file> #Change IT
export GROUP_NAME=<new_rg_name from applied TF-Var file>
export PG_SUBNET_NAME=<db_subnet_name from applied TF-Var file>
export LOCATION="<location can be found from azure UI>"
export VNET_RESOURCE_GROUP=<vnet_rg_name from applied TF-Var file>
export VNET_NAME=<vnet_name from applied TF-Var file>
export TAGS=""
export PSQL_DNS_ZONE_NAME=<can be found from azure UI: go to the Postgres server, then Networking, and use the name of the private DNS zone being used>
export PRIVATE_DNS_LINK_NAME=<dns_to_vnet_link_name from applied TF-Var file>
export DB_LOGIN_NAME="dbadminlogin"
export DB_SECRET_NAME="dbadminpassword"
export DB_SERVER_NAME="dbserver"
export SECRET_PROVIDER_CLASS_NAME="dbsecret-spc"
export DB_PG_BOUNCER_PORT_NAME="pgbouncerport"
export DB_PORT_NAME="dbport"
export DB_CERT_NAME="dbcertpem"
export CLIENT_ID=$(az aks show -g "${GROUP_NAME}" -n "${AKS_NAME}" --query addonProfiles.azureKeyvaultSecretsProvider.identity.clientId -o tsv 2>/dev/null)
export TENANT_ID=$(az account show --query 'tenantId' -o tsv)
export DB_CERT_URL="https://cacerts.digicert.com/DigiCertGlobalRootCA.crt.pem"
export TLS_FILE_NAME='/tmp/tls.crt'
Run the following commands:
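Before running the commands that follow, it can help to confirm that none of the required variables were left unset. The `check_vars` helper below is a sketch of ours, not part of the product; extend the variable list to match the exports above:

```shell
# check_vars: report any listed environment variables that are empty or unset
# (hypothetical helper for a local sanity check only)
check_vars() {
  missing=0
  for v in "$@"; do
    # indirect lookup: resolve the value of the variable whose name is in $v
    if [ -z "$(eval echo "\$$v")" ]; then
      echo "MISSING: $v"
      missing=1
    fi
  done
  return $missing
}

check_vars SERVER_NAME NAMESPACE KV_NAME AKS_NAME GROUP_NAME LOCATION \
  && echo "all required variables are set" \
  || echo "some required variables are missing"
```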
export KEYVAULT_ID=$(az keyvault show --name "${KV_NAME}" --resource-group "${GROUP_NAME}" --query id --output tsv)
az postgres flexible-server parameter set --resource-group "${GROUP_NAME}" --server "${SERVER_NAME}" --name require_secure_transport --value off
Create SecretProviderClass using the following command:
cat <<END_SECRETS_STORE_YAML | kubectl apply -f -
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: ${SECRET_PROVIDER_CLASS_NAME}
  namespace: ${NAMESPACE}
spec:
  provider: azure
  parameters:
    usePodIdentity: "false"
    useVMManagedIdentity: "true"
    userAssignedIdentityID: ${CLIENT_ID}
    keyvaultName: ${KV_NAME}
    cloudName: ""
    objects: |
      array:
        - |
          objectName: ${DB_LOGIN_NAME}
          objectType: secret
          objectVersion: ""
        - |
          objectName: ${DB_SECRET_NAME}
          objectType: secret
          objectVersion: ""
        - |
          objectName: ${DB_SERVER_NAME}
          objectType: secret
          objectVersion: ""
        - |
          objectName: ${DB_PG_BOUNCER_PORT_NAME}
          objectType: secret
          objectVersion: ""
        - |
          objectName: ${DB_PORT_NAME}
          objectType: secret
          objectVersion: ""
    tenantId: ${TENANT_ID}
END_SECRETS_STORE_YAML
Run the following commands:
DIGICERT_ROOT_CA_URL="https://cacerts.digicert.com/DigiCertGlobalRootCA.crt.pem"
curl ${DIGICERT_ROOT_CA_URL} --output "${TLS_FILE_NAME}"
DIGICERT_ROOT_G2_URL="https://cacerts.digicert.com/DigiCertGlobalRootG2.crt.pem"
curl ${DIGICERT_ROOT_G2_URL} >> "${TLS_FILE_NAME}"
MICROSOFT_RSA_CERT="http://www.microsoft.com/pkiops/certs/Microsoft%20RSA%20Root%20Certificate%20Authority%202017.crt"
curl "${MICROSOFT_RSA_CERT}" | openssl x509 -inform DER -outform PEM >> "${TLS_FILE_NAME}"
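After the three downloads above, the file at ${TLS_FILE_NAME} should contain three PEM certificates. A quick sanity check is to count the PEM headers; `count_certs` below is an illustration helper of ours (the demo runs against a stand-in file, since a real run would point at /tmp/tls.crt):

```shell
# count_certs: count PEM certificate blocks in a bundle file
# (hypothetical helper; on a real run, pass "${TLS_FILE_NAME}")
count_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# demo on a stand-in bundle containing two empty certificate blocks
demo=/tmp/demo-bundle.pem
printf -- '-----BEGIN CERTIFICATE-----\n-----END CERTIFICATE-----\n' > "$demo"
printf -- '-----BEGIN CERTIFICATE-----\n-----END CERTIFICATE-----\n' >> "$demo"
count_certs "$demo"   # prints 2; the real bundle should show 3
```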
Create bundle using the following command:
cat <<EOF | kubectl apply -f -
apiVersion: trust.cert-manager.io/v1alpha1
kind: Bundle
metadata:
  name: db-cert
  namespace: netbackup
spec:
  sources:
    - secret:
        name: "postgresql-netbackup-ca"
        key: "tls.crt"
  target:
    namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: "netbackup"
    configMap:
      key: "dbcertpem"
EOF
Reset the password and use the same one used at the time of backup.
For more information on resetting the password, refer to the Azure-specific procedure in the following section:
Create Service Account for service access:
# Create the secret access policy
cat <<EOF > /tmp/db-secret-access-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "secretsmanager:GetSecretValue",
        "secretsmanager:DescribeSecret"
      ],
      "Resource": [
        "<admin-secret-arn>",
        "<cert-secret-arn>"
      ]
    }
  ]
}
EOF
aws iam create-policy \
  --policy-name db-secret-access-policy \
  --policy-document file:///tmp/db-secret-access-policy.json

# Create the service account and link it with the IAM policy
eksctl create iamserviceaccount \
  --override-existing-serviceaccounts \
  --approve \
  --config-file - <<EOF
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: $EKS_CLUSTER_NAME
  region: $REGION
  tags:
    OWNER: $OWNER
iam:
  withOIDC: true
  serviceAccounts:
    - metadata:
        name: db-access
        namespace: netbackup
      attachPolicyARNs:
        - $SECRET_ACCESS_POLICY_ARN
      permissionsBoundary: $PERMISSIONS_BOUNDARY
EOF
Create SecretProviderClass as follows:
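Before attaching the policy, it may be worth checking locally that the generated file names both required Secrets Manager actions. This sketch is a local text check only (it does not validate the document against AWS), and the demo writes stand-in content, since a real run would reuse the file created above:

```shell
# Sketch: confirm the policy file grants both required secretsmanager actions
policy=/tmp/db-secret-access-policy.json

# stand-in content for the demo; a real run would use the file created earlier
cat <<'EOF' > "$policy"
{"Version":"2012-10-17","Statement":[{"Effect":"Allow",
"Action":["secretsmanager:GetSecretValue","secretsmanager:DescribeSecret"],
"Resource":["<admin-secret-arn>"]}]}
EOF

for action in GetSecretValue DescribeSecret; do
  grep -q "secretsmanager:$action" "$policy" || echo "missing action: $action"
done
echo "policy check done"
```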
DB_SECRETS_ARN=<secret_arn> # enter the admin secret ARN, available in the AWS UI
SECRET_PROVIDER_CLASS_NAME=dbsecret-spc
NAMESPACE=netbackup
cat <<EOF | kubectl apply -f -
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: ${SECRET_PROVIDER_CLASS_NAME}
  namespace: ${NAMESPACE}
spec:
  provider: aws
  parameters:
    objects: |
      - objectName: ${DB_SECRETS_ARN}
        jmesPath:
          - path: "username"
            objectAlias: "dbadminlogin"
          - path: "host"
            objectAlias: "dbserver"
          - path: "password"
            objectAlias: "dbadminpassword"
          - path: to_string("port")
            objectAlias: "dbport"
          - path: "rdsproxy_endpoint"
            objectAlias: "dbproxyhost"
EOF
Run the following command:
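To make the jmesPath mapping above concrete: the CSI driver reads the admin secret's JSON payload, extracts each `path`, and mounts the value under the corresponding `objectAlias` file name. The sketch below only imitates that extraction for illustration; `extract_field` and the sample payload are ours, not part of the driver:

```shell
# Stand-in for the admin secret's JSON payload (illustrative values only)
secret='{"username":"admin","host":"db.example.internal","password":"s3cret","port":5432}'

# extract_field: crude JSON field lookup, for the demo only
extract_field() {
  echo "$1" | sed -n "s/.*\"$2\":\"\{0,1\}\([^\",}]*\)\"\{0,1\}.*/\1/p"
}

extract_field "$secret" username   # -> admin  (mounted as dbadminlogin)
extract_field "$secret" port       # -> 5432   (mounted as dbport)
```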
TLS_FILE_NAME='/tmp/tls.crt'
PROXY_FILE_NAME='/tmp/proxy.pem'
rm -f ${TLS_FILE_NAME} ${PROXY_FILE_NAME}
DB_CERT_URL="https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem"
DB_PROXY_CERT_URL="https://www.amazontrust.com/repository/AmazonRootCA1.pem"
curl ${DB_CERT_URL} --output ${TLS_FILE_NAME}
curl ${DB_PROXY_CERT_URL} --output ${PROXY_FILE_NAME}
cat ${PROXY_FILE_NAME} >> ${TLS_FILE_NAME}
kubectl -n netbackup create secret generic postgresql-netbackup-ca --from-file ${TLS_FILE_NAME}
Create the bundle using the following command:
cat <<EOF | kubectl apply -f -
apiVersion: trust.cert-manager.io/v1alpha1
kind: Bundle
metadata:
  name: db-cert
  namespace: netbackup
spec:
  sources:
    - secret:
        name: "postgresql-netbackup-ca"
        key: "tls.crt"
  target:
    namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: "netbackup"
    configMap:
      key: "dbcertpem"
EOF
Perform the steps listed in the AWS-specific procedure in the following section to change the password and replace it with the password saved during the backup phase: