How to modify the Global Atomic Broadcast (GAB) timeout in Storage Foundation for Windows with High Availability 4.x and 5.x versions.

Article: 100022869
Last Published: 2015-09-14
Product(s): InfoScale & Storage Foundation

Problem

How to modify the Global Atomic Broadcast (GAB) timeout in Storage Foundation for Windows with High Availability 4.x and 5.x versions.

Error Message

Event Type: Error
Event Source: Service Control Manager
Event ID: 7034
The VERITAS High availability engine service terminated unexpectedly. It has done this 1 time(s).

Solution

The default GAB timeout should suit almost all client environments. In rare instances, it may become necessary to modify the default value set during installation, for example in environments where the systems are heavily taxed by excessively high CPU and system resource usage. In such environments, VERITAS Cluster Server (VCS) is not allocated the resources it needs to operate effectively. By default, the GAB timeout is set to 15 seconds (15000 milliseconds): if GAB has not responded after 15 seconds, the High Availability Daemon (HAD) is restarted.
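Because the timeout is expressed in milliseconds, it is easy to be off by a factor of 1000. The following Python sketch converts a value in seconds to milliseconds and rejects anything that is not higher than the 15000 default. The helper name `gab_timeout_ms` is illustrative only and is not part of any VCS tooling.

```python
DEFAULT_GAB_TIMEOUT_MS = 15000  # default stated in this article (15 seconds)


def gab_timeout_ms(seconds):
    """Convert seconds to the millisecond value used for the GAB timeout.

    Hypothetical helper: raises ValueError unless the result is higher
    than the default.
    """
    ms = int(round(seconds * 1000))
    if ms <= DEFAULT_GAB_TIMEOUT_MS:
        raise ValueError(
            f"{ms} ms is not higher than the default {DEFAULT_GAB_TIMEOUT_MS} ms"
        )
    return ms


print(gab_timeout_ms(30))  # 30 seconds -> 30000
```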

In versions prior to SFW HA 4.x, VCS logs messages in the engine log indicating that there are issues. One such error is listed below:

VCS:10485:EXCESSIVE DELAY BETWEEN SUCCESSIVE CALLS TO GAB HEARTBEAT (X SECONDS)
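On those earlier versions, the delays reported in the engine log can give a rough idea of how far the heartbeat is falling behind. The Python sketch below extracts the largest reported delay from a list of log lines. It assumes the message has exactly the shape shown above, with X as a whole number. The sample lines are made up from that template, and real engine log lines may carry extra fields.

```python
import re

# Assumed message shape, based on the example in this article.
DELAY_RE = re.compile(
    r"EXCESSIVE DELAY BETWEEN SUCCESSIVE CALLS TO GAB HEARTBEAT \((\d+) SECONDS\)",
    re.IGNORECASE,
)


def max_heartbeat_delay(log_lines):
    """Return the largest delay in seconds found in log_lines, or None."""
    delays = [
        int(match.group(1))
        for line in log_lines
        for match in DELAY_RE.finditer(line)
    ]
    return max(delays) if delays else None


# Made-up sample lines following the template above.
sample = [
    "VCS:10485:EXCESSIVE DELAY BETWEEN SUCCESSIVE CALLS TO GAB HEARTBEAT (9 SECONDS)",
    "VCS:10485:EXCESSIVE DELAY BETWEEN SUCCESSIVE CALLS TO GAB HEARTBEAT (13 SECONDS)",
    "unrelated line",
]
print(max_heartbeat_delay(sample))  # 13
```

The largest observed delay is only a starting point for choosing a timeout. It does not replace testing under the real system load.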

In Storage Foundation for Windows with High Availability 4.x and 5.x versions, the application log contains the following entry:

Event Type: Information, Event Source: Gab, Event ID: 18

The system log may contain the following entry:

Event Type: Error, Event Source: Service Control Manager, Event ID: 7034
Description:
The VERITAS High availability engine service terminated unexpectedly. It has done this 1 time(s).

In these instances, it may be necessary to raise the default timeout value to alleviate the issue. Please be advised that while this allows for longer gaps in response time for GAB, it will NOT correct the underlying resource issue on the client's system. To modify the GAB timeout value, perform the following steps on ALL clustered nodes:

1. Right-click My Computer and select Properties.
2. On the System Properties window, select the Advanced tab.
3. On the Advanced tab, click the Environment Variables button.
4. On the Environment Variables window, locate System Variables and select New. (Figure 1)

Figure 1
 

5. On the New System Variable window, enter VCS_GAB_TIMEOUT for the Variable Name.
6. For the Variable Value, enter a number of milliseconds that is higher than the default of 15000* (15 seconds), then click OK. (Figure 2)

Figure 2
 

7. On the Environment Variables window, click OK.
8. On the System Properties window, click OK.
9. Reboot the servers.

*The appropriate VCS_GAB_TIMEOUT value may take some experimentation, as it depends entirely on the system load.
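When many nodes must be changed, the same system variable can be set from a script instead of the System Properties dialog. The Python sketch below builds the equivalent `setx` command, where `/M` writes the variable to the system environment rather than the user environment. This is an alternative to the documented steps, not part of them. It assumes `setx` is available on the node and that the script runs with administrative rights. The function names and the value 30000 are illustrative only.

```python
import subprocess

VARIABLE_NAME = "VCS_GAB_TIMEOUT"  # variable name given in step 5
DEFAULT_MS = 15000                 # default stated in this article


def build_setx_command(timeout_ms):
    """Return the setx argument list that sets the system-wide variable.

    /M targets the system (machine) environment instead of the user's.
    """
    if timeout_ms <= DEFAULT_MS:
        raise ValueError("value must be higher than the default of 15000")
    return ["setx", VARIABLE_NAME, str(timeout_ms), "/M"]


def apply_timeout(timeout_ms):
    """Run setx on the local Windows node (assumes an elevated session)."""
    subprocess.run(build_setx_command(timeout_ms), check=True)


# 30000 is only an example value, not a recommendation.
print(" ".join(build_setx_command(30000)))  # setx VCS_GAB_TIMEOUT 30000 /M
```

As with the manual procedure, the command has to be run on every clustered node, and the servers must still be rebooted afterwards.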
