Veritas NetBackup™ Flex Scale Administrator's Guide
- Product overview
- Viewing information about the NetBackup Flex Scale cluster environment
- NetBackup Flex Scale infrastructure management
- User management
- About Universal Shares
- Node and disk management
- License management
- NetBackup Flex Scale network management
- Bonding operations
- Data network configurations
- NetBackup Flex Scale infrastructure monitoring
- Resiliency in NetBackup Flex Scale
- EMS server configuration
- Site-based disaster recovery in NetBackup Flex Scale
- Performing disaster recovery using RESTful APIs
- NetBackup Flex Scale security
- Troubleshooting
- Collecting logs for cluster nodes
- Troubleshooting NetBackup Flex Scale issues
- Appendix A. Configuring NetBackup optimized duplication
- Appendix B. Disaster recovery terminologies
- Appendix C. Configuring Auto Image Replication
Handling split-brain scenario in NetBackup Flex Scale
A split-brain occurs when the view of the cluster membership differs among the cluster nodes, which increases the chance of data corruption. Majority-based I/O fencing eliminates the potential for data corruption by providing a reliable arbitration mechanism that does not require any extra hardware. In a split-brain scenario, arbitration is based on which sub-cluster holds a majority of the cluster nodes. The node with the lowest node ID in the cluster is called the leader node, and it plays a role in case of a tie.
Deciding cluster majority for the majority-based I/O fencing mechanism:
If N is defined as the total number of nodes in the cluster, then the majority is equal to N/2 + 1, where the division is integer division. For example, a 5-node cluster needs 3 nodes for a majority, and a 6-node cluster needs 4.
If the cluster has an even number of nodes and both sub-clusters have exactly N/2 nodes, the partition that contains the leader node is treated as the majority, and that partition survives. This rule is illustrated in the sketch below.
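The following Python sketch captures this arbitration rule. It is a minimal illustration, not NetBackup Flex Scale code: the function names, the representation of nodes as sets of node IDs, and the example clusters are assumptions made here for clarity.

```python
def majority_threshold(n: int) -> int:
    """Majority for an N-node cluster: N/2 + 1, using integer division."""
    return n // 2 + 1

def partition_survives(partition_ids: set, cluster_ids: set) -> bool:
    """Return True if this sub-cluster wins arbitration after a partition."""
    if len(partition_ids) >= majority_threshold(len(cluster_ids)):
        return True
    # Exact N/2 vs. N/2 split of an even cluster: the half that holds the
    # leader node (the lowest node ID in the cluster membership) survives.
    if len(cluster_ids) % 2 == 0 and len(partition_ids) == len(cluster_ids) // 2:
        return min(cluster_ids) in partition_ids
    return False

# A 6-node cluster (node IDs 0-5) splits 3/3; only the leader's half survives.
assert partition_survives({0, 1, 2}, {0, 1, 2, 3, 4, 5})
assert not partition_survives({3, 4, 5}, {0, 1, 2, 3, 4, 5})
# A 5-node cluster splits 3/2; the 3-node partition meets the majority of 3.
assert partition_survives({1, 3, 4}, {0, 1, 2, 3, 4})
```

Note that for an odd N the tie-break never applies, because the two sub-clusters can never be the same size.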
How majority-based I/O fencing works
An algorithm is used to decide the winning sub-cluster in the following way (a sketch of the full race follows these steps):
- The node with the lowest node ID in the current cluster membership is designated as the leader node in the fencing race.
- When a network partition occurs, the racer in each sub-cluster computes the number of nodes in its partition and compares it with the majority value.
- If the racer finds that its partition does not have the majority, it sends a LOST_RACE message to all the nodes in its partition, including itself, and all of those nodes panic.
- If the racer finds that its partition does have the majority, it sends a WON_RACE message to all the nodes in its partition. Thus, the partition with the majority of nodes survives.
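Putting these steps together, the following sketch traces the race for a 6-node cluster that splits into two equal halves. The WON_RACE and LOST_RACE names come from the steps above; the broadcast mechanism, function and variable names, and node IDs are hypothetical stand-ins for the actual fencing driver.

```python
def partition_survives(partition_ids: set, cluster_ids: set) -> bool:
    # Arbitration rule from the previous sketch: a strict majority wins, and
    # an exact N/2 vs. N/2 tie goes to the half that holds the leader node.
    n = len(cluster_ids)
    return (len(partition_ids) >= n // 2 + 1 or
            (n % 2 == 0 and len(partition_ids) == n // 2
             and min(cluster_ids) in partition_ids))

def run_fencing_race(partition_ids: set, cluster_ids: set) -> None:
    racer = min(partition_ids)  # the lowest node ID in this partition races
    verdict = ("WON_RACE" if partition_survives(partition_ids, cluster_ids)
               else "LOST_RACE")
    for node in sorted(partition_ids):  # racer messages every node, itself included
        print(f"racer {racer} -> node {node}: {verdict}")
    # On LOST_RACE, every node in the partition panics; on WON_RACE, the
    # partition continues as the surviving cluster.

cluster = {0, 1, 2, 3, 4, 5}          # a 6-node cluster partitions 3/3
run_fencing_race({0, 1, 2}, cluster)  # holds leader node 0 -> WON_RACE
run_fencing_race({3, 4, 5}, cluster)  # loses the tie-break -> LOST_RACE
```

Because at most one partition can satisfy the majority rule (or hold the leader in a tie), exactly one sub-cluster receives WON_RACE, which is what prevents both halves from continuing to write to shared storage.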