How to prevent split brain scenario in Storage Foundation for Windows High Availability (SFW HA)

Article: 100019784
Last Published: 2013-09-06
Product(s): InfoScale & Storage Foundation

Problem

How to prevent split brain scenario in Storage Foundation for Windows High Availability (SFW HA)

Solution

What is split brain?

A split brain occurs when two independent systems configured in a cluster each assume they have exclusive access to shared resources. In SFW HA (VERITAS Cluster Server) this scenario can occur when all cluster heartbeat links are lost simultaneously. Each cluster node then marks the other cluster node(s) as FAULTED. This is known as a "network partition".

This is represented in the figure below:


This scenario is possible when both LLT (Low Latency Transport) cluster communication links are connected to Node 3 via the same IP network, for example through the same network switch. Such a shared network switch needs careful consideration in a Replicated Data Cluster (RDC), where Node 3 may be located in another data centre, so that the LLT links run over separate network infrastructure.

What happens in a split brain?

Under cluster logic, VCS will bring online any service groups that it now considers faulted. However, those service groups are still online on the cluster node(s) on the other side of the partition, which have formed a new cluster of their own. This can lead to disk resources and volumes being taken offline as each cluster attempts to bring the "failed" service groups online.
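To illustrate why the two sides end up in conflict, here is a toy model (not VCS internals; node IDs and the failover rule are assumptions for illustration): once a partition can no longer see a node's heartbeats, it treats that node as FAULTED and tries to bring its service groups online locally.

```python
def online_targets(partition, all_nodes, group_owner):
    """Node this partition will run a service group on, in the toy model.

    partition:   set of node IDs that can still heartbeat with each other
    all_nodes:   every node ID in the original cluster
    group_owner: node ID where the service group was online before the split
    """
    faulted = all_nodes - partition          # unreachable nodes look FAULTED
    if group_owner in faulted:
        return min(partition)                # "fail over" the group locally
    return group_owner                       # owner still visible; no action

all_nodes = {0, 1, 2, 3}
# After the split in the diagram: Partition A = {0, 1, 2}, Partition B = {3}.
# Consider a group that was online on Node 3 before the split:
print(online_targets({0, 1, 2}, all_nodes, group_owner=3))  # 0
print(online_targets({3}, all_nodes, group_owner=3))        # 3
# Both partitions now have the same group online: the split brain conflict.
```

The point of the sketch is that each side's decision is locally correct; only the loss of all heartbeats makes the two decisions collide.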

How to tell if you have been a victim of a split brain?

The symptom of a split brain is that a service group is brought online on a cluster node on the other "side" of the network partition while it is still online elsewhere. Initial errors involve the original node recording disk access errors and losing its reservation of the disk group.

Using the above diagram as an example, after a simultaneous LLT link failure creates two network partitions:
- Partition A, containing Nodes 0, 1 and 2
- Partition B, containing Node 3

a) In the system event log, LLT will log Event ID 10033 for expired links to nodes in the other partition, so Node 3 will log messages such as:
ERROR   10033(0xc0072731) LLT <server> Link expired (tag=Adapter1, link=1, node=1)
ERROR   10033(0xc0072731) LLT <server> Link expired (tag=Adapter0, link=0, node=1)

for node=0, node=1 and node=2, and the cluster nodes in Partition A will log LLT link expired messages for node=3.

b) In the application event log, the High Availability Daemon (HAD) will log that cluster nodes in the other partition have changed state to FAULTED, so Node 3 will log:
ERROR   10322(0x05dd2852) Had    <server>                 VCS ERROR V-16-1-10322 System <server> (Node '0') changed state from RUNNING to FAULTED

for Node '0', Node '1' and Node '2', and the cluster nodes in Partition A will log these messages against Node '3'.
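As an illustration only (not a Veritas tool), the nodes on the far side of the partition can be picked out of exported event-log text by matching the Event ID 10033 lines shown above; the function name below is hypothetical, and the pattern assumes the exact message format from the examples:

```python
import re

# Matches the LLT Event ID 10033 lines shown above, e.g.:
# ERROR   10033(0xc0072731) LLT <server> Link expired (tag=Adapter1, link=1, node=1)
LINK_EXPIRED = re.compile(r"10033\(0x[0-9a-fA-F]+\)\s+LLT\b.*\bnode=(\d+)")

def expired_nodes(log_lines):
    """Return the set of node IDs this system has logged expired LLT links for."""
    nodes = set()
    for line in log_lines:
        match = LINK_EXPIRED.search(line)
        if match:
            nodes.add(int(match.group(1)))
    return nodes

log = [
    "ERROR   10033(0xc0072731) LLT <server> Link expired (tag=Adapter1, link=1, node=1)",
    "ERROR   10033(0xc0072731) LLT <server> Link expired (tag=Adapter0, link=0, node=1)",
]
print(sorted(expired_nodes(log)))  # [1]
```

On Node 3 in the example above, this would report nodes 0, 1 and 2, i.e. the membership of the other partition.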

How to minimize chances of split brain?

VCS uses heartbeats to determine the "health" of its peers. These can be private network heartbeats and/or public (low-priority) heartbeats. Regardless of the heartbeat configuration, VCS determines that a system has faulted only when all heartbeats fail simultaneously. To prevent a split brain, the following measures should be considered:

- Private Heartbeat - Ensure at least two private heartbeat links are configured and that they are completely isolated from each other, so the failure of one heartbeat link cannot affect the other. Configurations such as running two heartbeats through the same hub or switch, or using a single virtual local area network (VLAN) to trunk between two switches, introduce a single point of failure into the heartbeat architecture and should therefore be avoided.

Refer to the reference article in the Related Document section for additional recommendations on the private heartbeat configurations for SFW HA.

- Low-Priority Heartbeat - A heartbeat over the public network carries minimal traffic until only one normal (private) heartbeat link remains; it then becomes a fully functional heartbeat.
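The fault-declaration rule described above can be sketched as follows (a minimal model with assumed link names, not VCS code): a peer is declared FAULTED only when every configured heartbeat link to it is down simultaneously, which is why redundant, isolated links matter.

```python
def peer_state(link_status):
    """link_status maps a heartbeat link name to True (up) or False (down).

    A peer is declared FAULTED only when ALL links are down simultaneously;
    while any single link is still up, the peer is considered RUNNING.
    """
    if not link_status:
        raise ValueError("no heartbeat links configured")
    return "RUNNING" if any(link_status.values()) else "FAULTED"

# Two isolated private links plus a low-priority public link (names assumed):
print(peer_state({"Adapter0": True, "Adapter1": False, "PublicLowPri": True}))  # RUNNING
print(peer_state({"Adapter0": False, "Adapter1": False}))                       # FAULTED
```

If the two private links share a switch, a single switch failure turns the first case into the second, which is exactly the single point of failure the private heartbeat guidance warns against.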
