NetBackup™ Deployment Guide for Kubernetes Clusters
Last Published: 2025-02-26
Product(s): NetBackup (10.5.0.1)

MSDP Scaleout CR template for AKS
# The MSDPScaleout CR YAML
apiVersion: msdp.veritas.com/v1
kind: MSDPScaleout
metadata:
  # The CR name should not be longer than 40 characters.
  name: sample-app
  # The namespace needs to be present for the CR to be created in.
  # The CR must not be deployed in the same namespace as the MSDP operator.
  namespace: sample-namespace
spec:
  # The ACR URL from which the AKS cluster pulls the docker images on demand.
  # The allowed length is in range 1-255
  # It's optional for BYO. The code does not check its presence or validity.
  # The user needs to specify it correctly if it's needed.
  containerRegistry: sample.azurecr.io
  #
  # The MSDP version string. It's the tag of the MSDP docker images.
  # The allowed length is in range 1-64
  version: "sample-version-string"
  #
  # Size defines the number of Engine instances in MSDP Scaleout.
  # The allowed size is between 1-16
  size: 4
  #
  # The IP and FQDN pairs are used by the Engine Pods to expose the MSDP
  # services.
  # The IP and FQDN in one pair should match each other correctly.
  # They must be pre-allocated.
  # The item number should match the number of Engine instances.
  # They're not allowed to be changed or re-ordered. New items can be
  # appended for scaling out.
  # The first FQDN is used to configure the storage server in NetBackup,
  # automatically if autoRegisterOST is enabled,
  # or manually by the user if not.
  serviceIPFQDNs:
    # The pattern is IPv4 or IPv6 format
  - ipAddr: "sample-ip1"
    # The pattern is FQDN format. `^[a-z][a-z0-9-.]{1,251}[a-z0-9]$`
    fqdn: "sample-fqdn1"
  - ipAddr: "sample-ip2"
    fqdn: "sample-fqdn2"
  - ipAddr: "sample-ip3"
    fqdn: "sample-fqdn3"
  - ipAddr: "sample-ip4"
    fqdn: "sample-fqdn4"
  #
  # # s3ServiceIPFQDN is the IP and FQDN pair to expose the S3 service from the MSDP instance.
  # # The IP and FQDN in one pair should match each other correctly.
  # # It must be pre-allocated.
  # # It is not allowed to be changed after deployment.
  # s3ServiceIPFQDN:
  #   # The pattern is IPv4 or IPv6 format
  #   ipAddr: "sample-s3-ip"
  #   # The pattern is FQDN format.
  #   fqdn: "sample-s3-fqdn"
  #
  # Optional annotations to be added in the LoadBalancer services for the
  # Engine IPs.
  # In case we run the Engines on private IPs, we need to add some
  # customized annotations to the LoadBalancer services.
  # See https://docs.microsoft.com/en-us/azure/aks/internal-lb
  # It's optional and not needed in most cases when the Engines use
  # public IPs.
  # loadBalancerAnnotations:
  #   service.beta.kubernetes.io/azure-load-balancer-internal: "true"
  #
  # SecretName is the name of the secret which stores the MSDP credential.
  # AutoDelete, when true, will automatically delete the secret specified
by SecretName after the
  # initial configuration. If unspecified, AutoDelete defaults to true.
  # When true, SkipPrecheck will skip webhook validation of the MSDP
credential. It is only used in data re-use
  # scenario (delete CR and re-apply with pre-existing data) as the
secret will not take effect in this scenario. It
  # can't be used in other scenarios. If unspecified, SkipPrecheck
defaults to false.
  credential:
    # The secret that stores the MSDP credential should be pre-created in
    # the same namespace.
    # The secret should have "username" and "password" keys with the
    # corresponding username and password values.
    # Please follow the MSDP guide for the rules of the credential.
    #   https://www.veritas.com/content/support/en_US/article.100048511
    # A secret can be created directly via kubectl command or with the
    # equivalent YAML file:
    #   kubectl create secret generic sample-secret --namespace sample-namespace \
    #   --from-literal=username=<username> --from-literal=password=<password>
    secretName: sample-secret
    # Optional
    # Default is true
    autoDelete: true
    # Optional
    # Default is false.
    # Should be specified only in data re-use scenario (aka delete and
    # re-apply CR with pre-existing data)
    skipPrecheck: false
  #
  # s3Credential:
  #   # Use this option in conjunction with KMS option enabled.
  #   # The secret should be pre-created in the same namespace that the
  #   # MSDP cluster is deployed.
  #   # The secret should have an "accessKey" and a "secretKey" key-pairs
  #   # with the corresponding accessKey and secretKey values.
  #   # A secret can be created directly via kubectl-msdp command:
  #   #   kubectl-msdp generate-s3-secret --namespace <namespace> --s3secret <s3SecretName>
  #   secretName: s3-secret
  #   # Optional
  #   # Default is true
  #   autoDelete: true
  #   # Optional
  #   # Default is false.
  #   # Should be specified only in data re-use scenario (aka delete and
  #   # re-apply CR with pre-existing data)
  #   skipPrecheck: false
  # Paused is used for maintenance only. In most cases you don't need
  # to specify it.
  # When it's specified, MSDP operator stops reconciling the corresponding
  # MSDP-X (aka the CR).
  # Optional.
  # Default is false
  # paused: false
  #
  # The storage classes for logVolume, catalogVolume and dataVolumes should
  # be:
  #   - Backed with Azure disk CSI driver "disk.csi.azure.com" with the
  #     managed disks, and allow volume expansion.
  #   - The Azure in-tree storage driver "kubernetes.io/azure-disk" is not
  #     supported. You need to explicitly enable the Azure disk CSI driver
  #     when configuring your AKS cluster, or use k8s version v1.21.x which
  #     has the Azure disk CSI driver built-in.
  #   - In LRS category.
  #   - At least Standard SSD for dev/test, and Premium SSD or Ultra Disk
  #     for production.
  #   - The same storage class can be used for all the volumes.
  #
  # LogVolume is the volume specification which is used to provision a
  # volume of an MDS or Controller Pod to store the log files and core
  # dump files.
  # It's not allowed to be changed.
  # In most cases, 5-10 GiB capacity should be big enough for one MDS or
  # Controller Pod to use.
  logVolume:
    storageClassName: sample-azure-disk-sc1
    resources:
      requests:
        storage: 5Gi
  #
  # CatalogVolume is the volume specification which is used to provision a
  # volume of an MDS or Engine Pod to store the catalog and metadata.
  # It's not allowed to be changed unless for capacity expansion.
  # Expanding the existing catalog volumes expects short downtime of the
  # Engines.
  # Please note the MDS Pods don't respect the storage request in
  # CatalogVolume, instead they provision the volumes with the minimal
  # capacity request of 500MiB.
  catalogVolume:
    storageClassName: sample-azure-disk-sc2
    resources:
      requests:
        storage: 600Gi
  #
  # DataVolumes is a list of volume specifications which are used to
  # provision the volumes of an Engine Pod to store the MSDP data.
  # The items are not allowed to be changed or re-ordered unless for
  # capacity expansion.
  # New items can be appended for adding more data volumes to each
  # Engine Pod.
  # Appending new data volumes or expanding the existing data volumes
  # expects short downtime of the Engines.
  # The allowed item number is in range 1-16. To allow the other MSDP-X
  # Pods (e.g. Controller, MDS) running on the same node, the item number
  # should be no more than "<the maximum allowed volumes on the node> - 5".
  # The additional 5 data disks are for the potential one MDS Pod, one
  # Controller Pod or one MSDP operator Pod to run on the same node with
  # one MSDP Engine.
  dataVolumes:
    - storageClassName: sample-azure-disk-sc3
      resources:
        requests:
          storage: 8Ti
    - storageClassName: sample-azure-disk-sc3
      resources:
        requests:
          storage: 8Ti
  #
  # NodeSelector is used to schedule the MSDPScaleout Pods on the specified
  # nodes.
  # Optional.
  # Default is empty (aka all available nodes)
  nodeSelector:
    # e.g.
    # agentpool: nodepool2
    sample-node-label1: sample-label-value1
    sample-node-label2: sample-label-value2
  #
  # NBCA is the specification for MSDP-X to enable NBCA SecComm
  # for the Engines.
  # Optional.
  nbca:
    # The master server name
    # The allowed length is in range 1-255
    masterServer: sample-master-server-name
    # The CA SHA256 fingerprint
    # The allowed length is 95
    cafp: sample-ca-fp
    # The NBCA authentication/reissue token
    # The allowed length is 16
    # For security consideration, a token with maximum 1 user allowed and
    # valid for 1 day should be sufficient.
    token: sample-auth-token
    # # S3TokenSecret is the secret name that holds NBCA authentication/reissue token for MSDP S3 service.
    # # It is used to request NBCA certificate for S3 service.
    # # It must be set if MSDP S3 service is enabled.
    # # The allowed length is in range 1-255
    # # For security consideration, a token with maximum 1 user allowed and valid for 1 day should be sufficient.
    # s3TokenSecret: sample-auth-token-secret-for-s3
  #
  # KMS includes the parameters to enable KMS for the Engines.
  # KMS can be enabled at initial configuration or post configuration.
  # The parameters cannot be changed once they have been set.
  # Optional.
  kms:
    # Because either the NetBackup KMS or an external KMS (EKMS) is
    # configured or registered on the NetBackup master server and then used
    # by MSDP through the NetBackup API, kmsServer is the NetBackup master
    # server name.
    kmsServer: sample-master-server-name
    keyGroup: sample-key-group-name
  #
  # autoRegisterOST includes the parameter to enable or disable the
  # automatic registration of the storage server, the default disk pool
  # and storage unit when MSDP-X configuration finishes.
  autoRegisterOST:
    # If it is true, and NBCA is enabled, the operator would register the
    # storage server, disk pool and storage unit on the NetBackup primary
    # server, when the MSDP CR is deployed.
    # The first Engine FQDN is the storage server name.
    # The default disk pool is in format "default_dp_<firstEngineFQDN>".
    # The default storage unit is in format "default_stu_<firstEngineFQDN>".
    # The default maximum concurrent jobs for the STU is 240.
    # In the CR status, field "ostAutoRegisterStatus.registered" with value
    # True, False or Unknown indicates the registration state.
    # It's false by default.
    # Note: Please don't enable it unless with NB_9.1.2_0126+.
    enabled: true
  #
  # CorePattern is the core pattern of the nodes where the MSDPScaleout
  # Pods are running.
  # It's path-based. A default core path "/core/core.%e.%p.%t" will be
  # used if not specified.
  # In most cases, you don't need to specify it.
  # It's not allowed to be changed.
  # Optional.
  # corePattern: /sample/core/pattern/path
  #
  # tcpKeepAliveTime sets the namespaced sysctl parameter
  # net.ipv4.tcp_keepalive_time in Engine Pods.
  # It's in seconds.
  # The minimal allowed value is 60 and the maximum allowed value is 1800.
  # A default value 120 is used if not specified. Set it to 0 to disable
  # the option.
  # It's not allowed to change unless in maintenance mode (Paused=true),
  # and the change will not apply until the Engine Pods get restarted.
  # For AKS deployment in P release, please leave it unspecified or specify
  # it with a value smaller than 240.
  # tcpKeepAliveTime: 120
  #
  # TCPIdleTimeout is used to change the default value for Azure Load
  # Balancer rules and Inbound NAT rules.
  # It's in minutes.
  # The minimal allowed value is 4 and the maximum allowed value is 30.
  # A default value 30 minutes is used if not specified. Set it to 0 to
  # disable the option.
  # It's not allowed to change unless in maintenance mode (Paused=true),
  # and the change will not apply until the Engine Pods and the
  # LoadBalancer services get recreated.
  # For AKS deployment in P release, please leave it unspecified or specify
  # it with a value larger than 4.
  # tcpIdleTimeout: 30
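
The template comments state several limits (name length, Engine count, FQDN pattern, timer ranges) that are easy to get wrong when filling in real values. The following is a minimal sketch of a pre-flight check in Python, assuming the CR has already been loaded into a plain dictionary. The function name check_msdp_spec and the message wording are hypothetical and are not part of NetBackup or the MSDP operator; the operator's own webhook validation may apply additional or different rules.

```python
import re

# FQDN pattern quoted in the template comments. The hyphen is moved to the
# end of the character class, which matches the same set of characters.
FQDN_RE = re.compile(r"^[a-z][a-z0-9.-]{1,251}[a-z0-9]$")


def check_msdp_spec(name, spec):
    """Return a list of violations of the limits documented in the template.

    Hypothetical helper for illustration only; `spec` is the parsed
    `spec:` section of the CR as a dict.
    """
    problems = []

    if len(name) > 40:
        problems.append("metadata.name is longer than 40 characters")

    if not 1 <= len(spec.get("version", "")) <= 64:
        problems.append("version length must be in range 1-64")

    size = spec.get("size", 0)
    if not 1 <= size <= 16:
        problems.append("size must be between 1 and 16")

    pairs = spec.get("serviceIPFQDNs", [])
    if len(pairs) != size:
        problems.append("serviceIPFQDNs must have one item per Engine instance")
    for pair in pairs:
        fqdn = pair.get("fqdn", "")
        if not FQDN_RE.match(fqdn):
            problems.append(f"fqdn {fqdn!r} does not match the FQDN pattern")

    if not 1 <= len(spec.get("dataVolumes", [])) <= 16:
        problems.append("dataVolumes must have 1-16 items")

    # Optional timers: 0 disables the option, otherwise a range applies.
    keepalive = spec.get("tcpKeepAliveTime")
    if keepalive not in (None, 0) and not 60 <= keepalive <= 1800:
        problems.append("tcpKeepAliveTime must be 0 or in range 60-1800 seconds")

    idle = spec.get("tcpIdleTimeout")
    if idle not in (None, 0) and not 4 <= idle <= 30:
        problems.append("tcpIdleTimeout must be 0 or in range 4-30 minutes")

    return problems


# Example with made-up values that satisfy every documented limit.
example_spec = {
    "version": "sample-version-string",
    "size": 2,
    "serviceIPFQDNs": [
        {"ipAddr": "10.0.0.1", "fqdn": "msdp1.example.com"},
        {"ipAddr": "10.0.0.2", "fqdn": "msdp2.example.com"},
    ],
    "dataVolumes": [{"storageClassName": "sample-azure-disk-sc3"}],
}
print(check_msdp_spec("sample-app", example_spec))  # prints []
```

The check covers only the limits written in the comments above. It does not verify that the IP and FQDN pairs resolve to each other, that the storage classes exist, or that the secret is present in the namespace.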