InfoScale™ 9.0 Cluster Server Administrator's Guide - Linux
Last Published:
2025-08-11
Product(s):
InfoScale & Storage Foundation (9.0)
Platform: Linux
- Section I. Clustering concepts and terminology
- Introducing Cluster Server
- About Cluster Server
- About cluster control guidelines
- About the physical components of VCS
- Logical components of VCS
- About resources and resource dependencies
- Categories of resources
- About resource types
- About service groups
- Types of service groups
- About the ClusterService group
- About the cluster UUID
- About agents in VCS
- About agent functions
- About resource monitoring
- Agent classifications
- VCS agent framework
- About cluster control, communications, and membership
- About security services
- Components for administering VCS
- Putting the pieces together
- About cluster topologies
- VCS configuration concepts
- Section II. Administration - Putting VCS to work
- About the VCS user privilege model
- Administering the cluster from the command line
- About administering VCS from the command line
- About installing a VCS license
- Administering LLT
- Displaying the cluster details and LLT version for LLT links
- Adding and removing LLT links
- Configuring aggregated interfaces under LLT
- Configuring destination-based load balancing for LLT
- Configuring heartbeat threads to improve cluster resiliency
- Configuring IPsec for encrypted communication over LLT
- Unconfiguring the IPsec communication channel for LLT data
- Administering the AMF kernel driver
- Starting VCS
- Stopping VCS
- Stopping VCS without evacuating service groups
- Stopping the VCS engine and related processes
- Logging on to VCS
- About managing VCS configuration files
- About managing VCS users from the command line
- About querying VCS
- About administering service groups
- Adding and deleting service groups
- Modifying service group attributes
- Bringing service groups online
- Taking service groups offline
- Switching service groups
- Migrating service groups
- Freezing and unfreezing service groups
- Enabling and disabling service groups
- Enabling and disabling priority based failover for a service group
- Clearing faulted resources in a service group
- Flushing service groups
- Linking and unlinking service groups
- Administering agents
- About administering resources
- About adding resources
- Adding resources
- Deleting resources
- Adding, deleting, and modifying resource attributes
- Defining attributes as local
- Defining attributes as global
- Enabling and disabling intelligent resource monitoring for agents manually
- Enabling and disabling IMF for agents by using script
- Linking and unlinking resources
- Bringing resources online
- Taking resources offline
- Probing a resource
- Clearing a resource
- About administering resource types
- Administering systems
- About administering clusters
- Configuring and unconfiguring the cluster UUID value
- Retrieving version information
- Adding and removing systems
- Changing ports for VCS
- Setting cluster attributes from the command line
- About initializing cluster attributes in the configuration file
- Enabling and disabling secure mode for the cluster
- Migrating from secure mode to secure mode with FIPS
- About cluster formation with different versions of the VCS engine
- Using the -wait option in scripts that use VCS commands
- Running HA fire drills
- Configuring applications and resources in VCS
- Configuring resources and applications
- VCS bundled agents for UNIX
- About application monitoring on single-node clusters
- Configuring NFS service groups
- About NFS
- Configuring NFS service groups
- Sample configurations
- Sample configuration for a single NFS environment without lock recovery
- Sample configuration for a single NFS environment with lock recovery
- Sample configuration for a single NFSv4 environment
- Sample configuration for a multiple NFSv4 environment
- Sample configuration for a multiple NFS environment without lock recovery
- Sample configuration for a multiple NFS environment with lock recovery
- Sample configuration for configuring NFS with separate storage
- Sample configuration when configuring all NFS services in a parallel service group
- About configuring the RemoteGroup agent
- About configuring Samba service groups
- Configuring the Coordination Point agent
- About testing resource failover by using HA fire drills
- Section III. VCS communication and operations
- About communications, membership, and data protection in the cluster
- About cluster communications
- About cluster membership
- About membership arbitration
- About membership arbitration components
- About server-based I/O fencing
- About majority-based fencing
- About making CP server highly available
- About the CP server database
- Recommended CP server configurations
- About the CP server service group
- About the CP server user types and privileges
- About secure communication between the VCS cluster and CP server
- About data protection
- About I/O fencing configuration files
- Examples of VCS operation with I/O fencing
- About cluster membership and data protection without I/O fencing
- Examples of VCS operation without I/O fencing
- Summary of best practices for cluster communications
- Administering I/O fencing
- About administering I/O fencing
- About the vxfentsthdw utility
- General guidelines for using the vxfentsthdw utility
- About the vxfentsthdw command options
- Testing the coordinator disk group using the -c option of vxfentsthdw
- Performing non-destructive testing on the disks using the -r option
- Testing the shared disks using the vxfentsthdw -m option
- Testing the shared disks listed in a file using the vxfentsthdw -f option
- Testing all the disks in a disk group using the vxfentsthdw -g option
- Testing a disk with existing keys
- Testing disks with the vxfentsthdw -o option
- About the vxfenadm utility
- About the vxfenclearpre utility
- About the vxfenswap utility
- About administering the coordination point server
- CP server operations (cpsadm)
- Cloning a CP server
- Adding and removing VCS cluster entries from the CP server database
- Adding and removing a VCS cluster node from the CP server database
- Adding or removing CP server users
- Listing the CP server users
- Listing the nodes in all the VCS clusters
- Listing the membership of nodes in the VCS cluster
- Preempting a node
- Registering and unregistering a node
- Enable and disable access for a user to a VCS cluster
- Starting and stopping CP server outside VCS control
- Checking the connectivity of CP servers
- Adding and removing virtual IP addresses and ports for CP servers at run-time
- Taking a CP server database snapshot
- Replacing coordination points for server-based fencing in an online cluster
- Refreshing registration keys on the coordination points for server-based fencing
- About configuring a CP server to support IPv6 or dual stack
- Deployment and migration scenarios for CP server
- About migrating between disk-based and server-based fencing configurations
- Migrating from disk-based to server-based fencing in an online cluster
- Migrating from server-based to disk-based fencing in an online cluster
- Migrating between fencing configurations using response files
- Sample response file to migrate from disk-based to server-based fencing
- Sample response file to migrate from server-based fencing to disk-based fencing
- Sample response file to migrate from single CP server-based fencing to server-based fencing
- Response file variables to migrate between fencing configurations
- Enabling or disabling the preferred fencing policy
- About I/O fencing log files
- Controlling VCS behavior
- VCS behavior on resource faults
- About controlling VCS behavior at the service group level
- About the AutoRestart attribute
- About controlling failover on service group or system faults
- About defining failover policies
- About AdaptiveHA
- About system zones
- About sites
- Load-based autostart
- About freezing service groups
- About controlling Clean behavior on resource faults
- Clearing resources in the ADMIN_WAIT state
- About controlling fault propagation
- Customized behavior diagrams
- About preventing concurrency violation
- VCS behavior for resources that support the intentional offline functionality
- VCS behavior when a service group is restarted
- About controlling VCS behavior at the resource level
- Changing agent file paths and binaries
- VCS behavior on loss of storage connectivity
- Service group workload management
- Sample configurations depicting workload management
- The role of service group dependencies
- Section IV. Administration - Beyond the basics
- VCS event notification
- VCS event triggers
- About VCS event triggers
- Using event triggers
- List of event triggers
- About the dumptunables trigger
- About the globalcounter_not_updated trigger
- About the injeopardy event trigger
- About the loadwarning event trigger
- About the nofailover event trigger
- About the postoffline event trigger
- About the postonline event trigger
- About the preonline event trigger
- About the resadminwait event trigger
- About the resfault event trigger
- About the resnotoff event trigger
- About the resrestart event trigger
- About the resstatechange event trigger
- About the sysoffline event trigger
- About the sysup trigger
- About the sysjoin trigger
- About the unable_to_restart_agent event trigger
- About the unable_to_restart_had event trigger
- About the violation event trigger
- Virtual Business Services
- Section V. Cluster configurations for disaster recovery
- Connecting clusters–Creating global clusters
- How VCS global clusters work
- VCS global clusters: The building blocks
- Visualization of remote cluster objects
- About global service groups
- About global cluster management
- About serialization - The Authority attribute
- About resiliency and "Right of way"
- VCS agents to manage wide-area failover
- About the Steward process: Split-brain in two-cluster global clusters
- Secure communication in global clusters
- Prerequisites for global clusters
- About planning to set up global clusters
- Setting up a global cluster
- Configuring application and replication for global cluster setup
- Configuring clusters for global cluster setup
- Configuring global cluster components at the primary site
- Installing and configuring VCS at the secondary site
- Securing communication between the wide-area connectors
- Gcoconfig utility support
- Configuring remote cluster objects
- Configuring additional heartbeat links (optional)
- Configuring the Steward process (optional)
- Configuring service groups for global cluster setup
- Configuring a service group as a global service group
- About IPv6 support with global clusters
- About cluster faults
- About setting up a disaster recovery fire drill
- Multi-tiered application support using the RemoteGroup agent in a global environment
- Test scenario for a multi-tiered environment
- Administering global clusters from the command line
- About administering global clusters from the command line
- About global querying in a global cluster setup
- Administering global service groups in a global cluster setup
- Administering resources in a global cluster setup
- Administering clusters in global cluster setup
- Administering heartbeats in a global cluster setup
- Setting up replicated data clusters
- Setting up campus clusters
- Section VI. Troubleshooting and performance
- VCS performance considerations
- How cluster components affect performance
- How cluster operations affect performance
- VCS performance consideration when booting a cluster system
- VCS performance consideration when a resource comes online
- VCS performance consideration when a resource goes offline
- VCS performance consideration when a service group comes online
- VCS performance consideration when a service group goes offline
- VCS performance consideration when a resource fails
- VCS performance consideration when a system fails
- VCS performance consideration when a network link fails
- VCS performance consideration when a system panics
- VCS performance consideration when a service group switches over
- VCS performance consideration when a service group fails over
- About scheduling class and priority configuration
- VCS agent statistics
- About VCS tunable parameters
- Troubleshooting and recovery for VCS
- VCS message logging
- Log unification of VCS agent's entry points
- Enhancing First Failure Data Capture (FFDC) to troubleshoot VCS resource's unexpected behavior
- GAB message logging
- Enabling debug logs for agents
- Enabling debug logs for IMF
- Enabling debug logs for the VCS engine
- Enabling debug logs for VxAT
- About debug log tags usage
- Gathering VCS information for support analysis
- Gathering LLT and GAB information for support analysis
- Gathering IMF information for support analysis
- Message catalogs
- Troubleshooting the VCS engine
- Troubleshooting Low Latency Transport (LLT)
- Troubleshooting Group Membership Services/Atomic Broadcast (GAB)
- Troubleshooting VCS startup
- Troubleshooting issues with systemd unit service files
- If a unit service has failed and the corresponding module is still loaded, systemd cannot unload it and so its package cannot be removed
- If a unit service is active and the corresponding process is stopped outside of systemd, the service cannot be started again using 'systemctl start'
- If a unit service takes longer than the default timeout to stop or start the corresponding service, it goes into the Failed state
- Troubleshooting Intelligent Monitoring Framework (IMF)
- Troubleshooting service groups
- VCS does not automatically start service group
- System is not in RUNNING state
- Service group not configured to run on the system
- Service group not configured to autostart
- Service group is frozen
- Failover service group is online on another system
- A critical resource faulted
- Service group autodisabled
- Service group is waiting for the resource to be brought online/taken offline
- Service group is waiting for a dependency to be met
- Service group not fully probed
- Service group does not fail over to the forecasted system
- Service group does not fail over to the BiggestAvailable system even if FailOverPolicy is set to BiggestAvailable
- Restoring metering database from backup taken by VCS
- Initialization of metering database fails
- Error message appears during service group failover or switch
- Troubleshooting resources
- Troubleshooting sites
- Troubleshooting I/O fencing
- Node is unable to join cluster while another node is being ejected
- The vxfentsthdw utility fails when SCSI TEST UNIT READY command fails
- Manually removing existing keys from SCSI-3 disks
- System panics to prevent potential data corruption
- Cluster ID on the I/O fencing key of coordinator disk does not match the local cluster's ID
- Fencing startup reports preexisting split-brain
- Registered keys are lost on the coordinator disks
- Replacing defective disks when the cluster is offline
- The vxfenswap utility exits if rcp or scp commands are not functional
- Troubleshooting CP server
- Troubleshooting server-based fencing on the VCS cluster nodes
- Issues during online migration of coordination points
- Troubleshooting notification
- Troubleshooting and recovery for global clusters
- Troubleshooting the steward process
- Troubleshooting licensing
- Validating license keys
- Licensing error messages
- [Licensing] Insufficient memory to perform operation
- [Licensing] No valid VCS license keys were found
- [Licensing] Unable to find a valid base VCS license key
- [Licensing] License key cannot be used on this OS platform
- [Licensing] VCS evaluation period has expired
- [Licensing] License key can not be used on this system
- [Licensing] Unable to initialize the licensing framework
- [Licensing] QuickStart is not supported in this release
- [Licensing] Your evaluation period for the feature has expired. This feature will not be enabled the next time VCS starts
- Troubleshooting secure configurations
- Section VII. Appendixes
Configuring IPsec using strongSwan on SLES systems
This process involves installing strongSwan and then using it to create a secure VPN configuration. All the commands in this procedure must be run with root privileges.
To install strongSwan
- Install strongSwan and the related RPMs.
# zypper install strongswan*
- Verify that strongSwan is installed.
# rpm -qa | grep strongswan
To create a secure host-to-host VPN configuration
- Create the /etc/ipsec.conf file.
Sample entries in the configuration file:
# ipsec.conf - strongSwan IPsec configuration file
# basic configuration
config setup
        charon.start_all=true
conn %default                      <-------- Default section; this need not be changed.
        keyexchange=ikev2
        ikelifetime=60m
        keylife=20m
        rekeymargin=3m
        keyingtries=1
        authby=secret
conn myvpn                         <-------- Connection 1, created for link1
        left=10.10.10.1            <-------- Local node IP address of first private LLT link
        leftsubnet=10.10.255.255   <-------- Subnet of first private LLT link of local node
        right=10.10.10.2           <-------- Remote node IP address of first private LLT link
        rightsubnet=10.10.255.255  <-------- Subnet of first private LLT link of remote node
        auto=start
conn myvpn1                        <-------- Connection 2, created for link2
        left=20.20.20.1            <-------- Local node IP address of second private LLT link
        leftsubnet=20.20.255.255   <-------- Subnet of second private LLT link of local node
        right=20.20.20.2
        rightsubnet=20.20.255.255
        auto=start
- Create the
/etc/ipsec.secrets file, which contains information about the private keys and the mechanism used for the IPsec configuration.
Sample entries in the secrets file:
# ipsec.secrets
# This file holds the RSA private keys or the PSK preshared secrets for
# the IKE/IPsec authentication. See the ipsec.secrets(5) manual page.
10.10.10.1 10.10.10.2 : PSK "vcs1234"
        ---------- 10.10.10.1 is the local/left node IP and 10.10.10.2 is the remote/right node IP.
        ---------- A pre-shared key (PSK) is one supported authentication mechanism; RSA keys and EAP are alternatives.
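Because the secrets file stores the pre-shared key in clear text, it is a common hardening step (not specific to this procedure) to make it readable by root only. A minimal sketch, demonstrated on a scratch copy so it can run anywhere; on a real system the target is /etc/ipsec.secrets:

```shell
#!/bin/sh
# Restrict a secrets file to owner read/write only.
# On a real node, run as root against the actual file:
#   chmod 600 /etc/ipsec.secrets
f=/tmp/ipsec.secrets.demo                               # scratch copy for the demo
printf '10.10.10.1 10.10.10.2 : PSK "vcs1234"\n' > "$f" # sample PSK entry from above
chmod 600 "$f"                                          # owner rw, no group/other access
mode=$(stat -c '%a' "$f")                               # numeric mode bits
echo "mode=$mode"
rm -f "$f"
```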
To enable strongSwan and start the service
- Enable the strongSwan service.
# systemctl enable strongswan-starter.service
- Restart the service.
# systemctl restart strongswan-starter.service
This service internally starts the IPsec service as well.
To verify the host-to-host VPN
- View the IPsec VPN configuration and the status of the connections.
# ipsec status
Security Associations (1 up, 0 connecting):
     myvpn[2]: ESTABLISHED 17 minutes ago, 10.10.10.1[10.10.10.1]...10.10.10.2[10.10.10.2]
     myvpn{3}:  INSTALLED, TUNNEL, reqid 1, ESP SPIs: cf633b82_i cba64c78_o
     myvpn{3}:   10.10.255.255/32 === 10.10.255.255/32
- To view further details about the IPsec configuration.
# ipsec statusall
Status of IKE charon daemon (strongSwan 5.9.11, Linux 5.14.21-150500.53-default, x86_64):
  uptime: 17 minutes, since May 22 14:11:46 2025
  malloc: sbrk 3170304, mmap 0, used 1174208, free 1996096
  worker threads: 10 of 16 idle, 6/0/0/0 working, job queue: 0/0/0/0, scheduled: 4
  loaded plugins: charon ldap pkcs11 mgf1 nonce x509 revocation constraints pubkey pkcs1 pkcs7 pkcs12 pgp dnskey sshkey pem openssl pkcs8 af-alg fips-prf gmp curve25519 agent hmac kdf gcm drbg curl soup attr kernel-netlink resolve socket-default farp stroke vici smp updown eap-identity eap-sim eap-sim-pcsc eap-aka eap-aka-3gpp2 eap-simaka-pseudonym eap-simaka-reauth eap-md5 eap-gtc eap-mschapv2 eap-dynamic eap-radius eap-tls eap-ttls eap-peap eap-tnc xauth-generic xauth-eap xauth-pam tnc-imc tnc-imv tnc-tnccs tnccs-20 tnccs-11 tnccs-dynamic dhcp certexpire led duplicheck radattr addrblock unity counters
Listening IP addresses:
  30.30.30.30
  10.10.10.1
  20.20.20.1
Connections:
       myvpn:  10.10.10.1...10.10.10.2  IKEv2
       myvpn:   local:  [10.10.10.1] uses pre-shared key authentication
       myvpn:   remote: [10.10.10.2] uses pre-shared key authentication
       myvpn:   child:  10.10.255.255/32 === 10.10.255.255/32 TUNNEL
Security Associations (1 up, 0 connecting):
     myvpn[2]: ESTABLISHED 16 minutes ago, 10.10.10.1[10.10.10.1]...10.10.10.2[10.10.10.2]
     myvpn[2]: IKEv2 SPIs: b28588f4fd24c13b_i 1287cf3c3b31e804_r*, pre-shared key reauthentication in 38 minutes
     myvpn[2]: IKE proposal: AES_CBC_128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/ECP_256
     myvpn{3}:  INSTALLED, TUNNEL, reqid 1, ESP SPIs: cf633b82_i cba64c78_o
     myvpn{3}:  AES_CBC_128/HMAC_SHA2_256_128, 0 bytes_i, 0 bytes_o, rekeying in 14 minutes
     myvpn{3}:   10.10.255.255/32 === 10.10.255.255/32
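For scripted monitoring, the check above can be reduced to a small helper that fails unless the `ipsec status` output reports at least one ESTABLISHED security association. A minimal sketch; the helper name is illustrative, not part of strongSwan, and the sample text is the output shown above:

```shell
#!/bin/sh
# check_sa: succeed (exit 0) if the given `ipsec status` output shows
# at least one ESTABLISHED security association.
check_sa() {
    printf '%s\n' "$1" | grep -q 'ESTABLISHED'
}

# On a live node you would capture real output:
#   status="$(ipsec status)"
# Here we reuse the sample output shown above:
status='Security Associations (1 up, 0 connecting):
 myvpn[2]: ESTABLISHED 17 minutes ago, 10.10.10.1[10.10.10.1]...10.10.10.2[10.10.10.2]'

if check_sa "$status"; then
    result="tunnel up"
else
    result="tunnel down"
fi
echo "$result"
```

A check like this could be wired into a cron job or monitoring agent so that a dropped tunnel on an LLT link is noticed promptly.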