InfoScale™ 9.0 Cluster Server Administrator's Guide - Linux

Last Published: 2025-08-11

Product(s): InfoScale & Storage Foundation (9.0)

Platform: Linux

Auto-revival of service group dependency trees upon recovery of faulted resources

Consider a configuration with the following service group and resource dependencies:

Service group SG3 contains three resources, R1, R2, and R3. R1 is dependent on R2, which in turn is dependent on R3 (R1 > R2 > R3).
SG2 is dependent on SG3 (SG2 > SG3).
SG1 is dependent on SG2 (SG1 > SG2).

In such a configuration, if R3 faults, VCS triggers a failover. It takes SG1, SG2, and SG3 offline and attempts to find a target node on which to bring the service groups online. If a suitable node is not found, SG3 remains in the OFFLINE|FAULTED state and SG1 and SG2 remain in the OFFLINE state.

Figure: Effect of a resource fault on the entire service group dependency tree
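
The fault propagation described above can be sketched as a small toy model. This is only an illustration of the sample configuration, not how VCS is implemented: the `GROUP_PARENTS` mapping and the `states_after_fault` function are hypothetical names, and the sketch assumes that no suitable failover target node is found.

```python
# Toy model of the sample configuration; hypothetical names, not VCS internals.
# Maps each service group to the parent groups that depend on it.
GROUP_PARENTS = {"SG3": ["SG2"], "SG2": ["SG1"], "SG1": []}


def states_after_fault(faulted_group, group_parents):
    """Return group states after a fault when no failover target is found."""
    # The group that contains the faulted resource is marked as faulted.
    states = {faulted_group: "OFFLINE|FAULTED"}
    # Every group above it in the dependency tree is taken offline.
    pending = list(group_parents[faulted_group])
    while pending:
        parent = pending.pop()
        if parent not in states:
            states[parent] = "OFFLINE"
            pending.extend(group_parents[parent])
    return states


# R3 belongs to SG3, so a fault in R3 is modeled as a fault of SG3.
states = states_after_fault("SG3", GROUP_PARENTS)
for group in ("SG1", "SG2", "SG3"):
    print(group, states[group])
```

In this model, SG3 ends up in the OFFLINE|FAULTED state and SG1 and SG2 end up in the OFFLINE state, which matches the outcome described for the sample configuration.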

Typically, leaf-level resources like R3 are infrastructure components, often persistent (OnOnly) resources (for example, NIC). A fault in such a resource triggers a chain reaction that brings down the entire dependency tree. When that resource recovers automatically, without any VCS user intervention, only the state of the recovered resource changes to ONLINE. The dependent resources and their service group (R2, R1, and SG3) remain in the OFFLINE|FAULTED state, and the parent service groups (SG2 and SG1) remain in the OFFLINE state.

Figure: Behaviour without auto-revival when a faulted resource recovers

From release 9.0.2 onwards, InfoScale provides the auto-revival feature for service group dependency trees. When this feature is enabled and a faulted resource comes back online, VCS automatically brings online the dependent resources and service groups whose states were directly affected by that fault. Resources and service groups that are offline because of other conditions or events are not revived. For example, in this sample configuration (SG1 > SG2 > SG3 (R1 > R2 > R3)), VCS revives only those dependent resources and service groups that went into the OFFLINE|FAULTED or OFFLINE state specifically because of the R3 fault.

Figure: Auto-revival of service group dependency tree
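
The selection rule described above (revive only what the recovered resource's fault took down) can be illustrated with a minimal sketch. It is a toy model, not the VCS implementation: the `Entity` class, its `cause` field, and the `revive` function are hypothetical, and the scenario additionally assumes that SG1 was taken offline by an operator rather than by the R3 fault, to show an entity that is left alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    """A resource or service group in the toy model (hypothetical type)."""
    name: str
    state: str
    cause: Optional[str] = None  # what took the entity down; None if unknown


def revive(entities, recovered):
    """Bring online only the entities that went down because of `recovered`."""
    revived = []
    for entity in entities:
        if entity.name == recovered:
            # The recovered resource itself comes back online.
            entity.state, entity.cause = "ONLINE", None
        elif entity.state != "ONLINE" and entity.cause == recovered:
            entity.state, entity.cause = "ONLINE", None
            revived.append(entity.name)
    return revived


# Listed bottom-up, so dependents are revived after what they depend on.
tree = [
    Entity("R3", "FAULTED"),
    Entity("R2", "OFFLINE|FAULTED", cause="R3"),
    Entity("R1", "OFFLINE|FAULTED", cause="R3"),
    Entity("SG3", "OFFLINE|FAULTED", cause="R3"),
    Entity("SG2", "OFFLINE", cause="R3"),
    Entity("SG1", "OFFLINE", cause="operator"),  # assumed: not due to R3
]

revived = revive(tree, "R3")
print(revived)
print({entity.name: entity.state for entity in tree})
```

In this sketch, R2, R1, SG3, and SG2 are revived when R3 recovers, while SG1 stays OFFLINE because its state is attributed to a different event.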

The auto-revival of a service group dependency tree works only when the following prerequisites are met:

The cluster protocol version is 11100 or higher.
The cluster version is 9.0.2 or higher.
The dependencies between the service groups are set to any of the following:
- online local hard
- online local firm
- online remote firm
- online global firm
- online site firm

Note:

The auto-revival feature is not supported with global service groups.
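
The prerequisites and the note above can be expressed as a single check. The following is a minimal sketch for illustration only: `auto_revival_supported` and its parameters are hypothetical names, not part of any InfoScale API or CLI, and the caller is assumed to already know the cluster protocol version, the cluster version, the dependency type, and whether the service group is global.

```python
# Dependency types listed in the prerequisites.
SUPPORTED_DEPENDENCIES = {
    "online local hard",
    "online local firm",
    "online remote firm",
    "online global firm",
    "online site firm",
}

MIN_PROTOCOL_VERSION = 11100
MIN_CLUSTER_VERSION = (9, 0, 2)


def parse_version(version):
    """Turn a dotted version string such as '9.0.2' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def auto_revival_supported(protocol_version, cluster_version,
                           dependency, is_global_group=False):
    """Return True only if every documented prerequisite is met."""
    return (
        protocol_version >= MIN_PROTOCOL_VERSION
        and parse_version(cluster_version) >= MIN_CLUSTER_VERSION
        and dependency in SUPPORTED_DEPENDENCIES
        and not is_global_group
    )


# Hypothetical inputs for the SG2 > SG3 link in the sample configuration.
print(auto_revival_supported(11100, "9.0.2", "online local firm"))
print(auto_revival_supported(11100, "9.0.2", "online local soft"))
print(auto_revival_supported(11100, "9.0.2", "online local firm",
                             is_global_group=True))
```

The check treats the conditions as a conjunction: if any one of them fails, including the restriction on global service groups, the sketch reports that auto-revival does not apply.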