Event ID: 18 is reported by VCS GAB in the Windows Event Viewer

Article: 100018553
Last Published: 2023-09-20
Product(s): InfoScale & Storage Foundation

Problem

On a Veritas Cluster Server (VCS) for Windows node, the Windows System event log reports Event ID 18 from the VCS module named Global Atomic Broadcast (GAB).

These events can appear in batches of 3-4 consecutive events or at random, and are followed by a restart of the Veritas High Availability Daemon (HAD).

 

Error Message

Log Name: System

Source: GAB

Event ID: 18

Level: Information

Description:

The description for Event ID 18 from source GAB cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

The Details tab for the event shows the following data in the Friendly View:

0000: 00 00 3c 00 01 00 88 00 ..<...ˆ.
0008: 00 00 00 00 12 00 07 40 .......@
0010: 04 00 00 00 00 00 00 00 ........
0018: 00 00 00 00 00 00 00 00 ........
0020: 00 00 00 00 00 00 00 00 ........
0028: 47 41 42 20 57 41 52 4e GAB WARN
0030: 49 4e 47 20 56 2d 31 35 ING V-15
0038: 2d 31 2d 32 30 30 35 37 -1-20057
0040: 20 50 6f 72 74 20 68 20 Port h
0048: 70 72 6f 63 65 73 73 20 process
0050: 31 20 69 6e 61 63 74 69 1 inacti
0058: 76 65 20 37 20 73 65 63 ve 7 sec
0060: 0a 00 00 00 ....

As seen above, the event data contains the message GAB WARNING V-15-1-20057 Port h process 1 inactive 7 sec.
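Because the event description cannot be resolved, the message has to be read out of the raw data. The following Python snippet is a minimal sketch of that decoding, not a Veritas tool: it assumes the Friendly View rows have been copied into a string in the "offset: hex bytes text" layout shown above, and the function names dump_to_bytes and extract_gab_message are hypothetical.

```python
import string

# Tail of the Friendly View dump quoted above (rows 0028 through 0060).
SAMPLE_DUMP = """\
0028: 47 41 42 20 57 41 52 4e GAB WARN
0030: 49 4e 47 20 56 2d 31 35 ING V-15
0038: 2d 31 2d 32 30 30 35 37 -1-20057
0040: 20 50 6f 72 74 20 68 20 Port h
0048: 70 72 6f 63 65 73 73 20 process
0050: 31 20 69 6e 61 63 74 69 1 inacti
0058: 76 65 20 37 20 73 65 63 ve 7 sec
0060: 0a 00 00 00 ....
"""


def dump_to_bytes(dump):
    """Rebuild the raw event bytes from 'offset: hex bytes text' rows."""
    raw = bytearray()
    for line in dump.splitlines():
        _, sep, rest = line.partition(":")
        if not sep:
            continue  # blank line or a line without an offset prefix
        count = 0
        for token in rest.split():
            is_hex_byte = len(token) == 2 and all(c in string.hexdigits for c in token)
            # Each row holds at most 8 bytes; anything after that is the text column.
            if count == 8 or not is_hex_byte:
                break
            raw.append(int(token, 16))
            count += 1
    return bytes(raw)


def extract_gab_message(dump):
    """Return the GAB message text embedded in the event data, or None."""
    raw = dump_to_bytes(dump)
    start = raw.find(b"GAB")
    if start < 0:
        return None
    # The message ends at the first newline; trailing bytes are padding.
    text = raw[start:].split(b"\n", 1)[0]
    return text.decode("ascii", errors="replace").rstrip("\x00").strip()


print(extract_gab_message(SAMPLE_DUMP))
# prints: GAB WARNING V-15-1-20057 Port h process 1 inactive 7 sec
```

The sketch searches for the literal text "GAB" instead of relying on a fixed offset, since the article does not state that the message always starts at 0028.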

Other messages commonly seen in this sequence, in the Friendly View of the Details tab for Event ID 18, can include any or all of the following:

GAB WARNING V-15-1-20057 Port h process 1 inactive 8 sec

GAB WARNING V-15-1-20057 Port h process 1 inactive 15 sec

GAB WARNING V-15-1-20058 Port h process 1: heart beat failed, killing process....

GAB INFO V-15-1-20059 Port h heartbeat interval 30000 msec. Statistics:.

GAB INFO V-15-1-20129 Port h: heartbeats in 0 ~ 6000 msec: xxxx

GAB INFO V-15-1-20129 Port h: heartbeats in 6000 ~ 12000 msec: 0.

GAB INFO V-15-1-20129 Port h: heartbeats in 12000 ~ 18000 msec: 0.

GAB INFO V-15-1-20129 Port h: heartbeats in 18000 ~ 24000 msec: 0

GAB INFO V-15-1-20129 Port h: heartbeats in 24000 ~ 30000 msec: 0.

GAB INFO V-15-1-20041 Port h: client process failure: killing process...

GAB INFO V-15-1-20032 Port h closed.

GAB INFO V-15-1-20032 Port c closed.
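When many of these events have been collected, it can help to reduce them to two facts: the longest inactivity GAB reported for port h, and whether GAB went on to kill the process. The Python sketch below illustrates this under the assumption that the decoded messages follow the formats listed above; the function name summarize_gab_messages and the result keys are hypothetical.

```python
import re

# Matches the message layout listed above: "GAB <LEVEL> <ID> <text>".
GAB_LINE = re.compile(r"GAB (?P<level>WARNING|INFO) (?P<id>V-15-1-\d+) (?P<text>.*)")
INACTIVE = re.compile(r"Port h process \d+ inactive (?P<sec>\d+) sec")


def summarize_gab_messages(messages):
    """Report the longest port h inactivity and whether HAD was killed."""
    longest = 0
    killed = False
    for msg in messages:
        m = GAB_LINE.search(msg)
        if not m:
            continue  # not a GAB message in the expected layout
        if m["id"] == "V-15-1-20057":
            hit = INACTIVE.search(m["text"])
            if hit:
                longest = max(longest, int(hit["sec"]))
        elif m["id"] == "V-15-1-20058":
            killed = True  # "heart beat failed, killing process"
    return {"longest_inactive_sec": longest, "had_killed": killed}


# Messages taken from the list above.
sequence = [
    "GAB WARNING V-15-1-20057 Port h process 1 inactive 7 sec",
    "GAB WARNING V-15-1-20057 Port h process 1 inactive 8 sec",
    "GAB WARNING V-15-1-20057 Port h process 1 inactive 15 sec",
    "GAB WARNING V-15-1-20058 Port h process 1: heart beat failed, killing process....",
    "GAB INFO V-15-1-20032 Port h closed.",
]
print(summarize_gab_messages(sequence))
# prints: {'longest_inactive_sec': 15, 'had_killed': True}
```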

 

The termination of the High Availability Daemon is reported by the event below.

Log Name: System

Source: Service Control Manager

Event ID: 7034

Level: Error

Description: 

The VERITAS High availability engine service terminated unexpectedly. It has done this 1 time(s).

 

The following message is also likely to be logged in Engine_A.log under the %VCS_HOME%/log path:

VCS WARNING V-16-1-10485 Excessive delay between successive calls to GAB heartbeat (XX seconds)
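To see how large the delays were, the V-16-1-10485 lines can be pulled out of the engine log. The Python sketch below assumes that real log lines carry a number where the message above shows XX, and that any timestamp or other prefix precedes the message ID; the function name heartbeat_delays is hypothetical, and the sample line uses an invented value purely for illustration.

```python
import re

# Assumes the delay is printed as "(<number> seconds)" at the end of the message.
DELAY = re.compile(r"V-16-1-10485 .*\((?P<sec>\d+) seconds\)")


def heartbeat_delays(lines):
    """Yield the delay, in seconds, from each V-16-1-10485 warning line."""
    for line in lines:
        m = DELAY.search(line)
        if m:
            yield int(m["sec"])


# Illustrative input only: the value 12 is invented, not taken from a real log.
sample = [
    "VCS WARNING V-16-1-10485 Excessive delay between successive calls to GAB heartbeat (12 seconds)",
    "VCS WARNING V-16-1-10485 Excessive delay between successive calls to GAB heartbeat (XX seconds)",
]
print(list(heartbeat_delays(sample)))
# prints: [12]
```

To run this against a real log, pass an open file object for Engine_A.log as the lines argument; the placeholder line with XX is skipped because it contains no number.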

 

Cause

This issue occurs when GAB has not received heartbeats from the HAD process within the timeout period.

GAB is a kernel process that monitors the user-mode process had.exe. When the server is under heavy load or is low on virtual memory or other resources, HAD may be unable to send or acknowledge heartbeats in a timely manner.

The GAB process will always be able to run and maintain cluster membership. If GAB determines that HAD is not responding to heartbeats, it closes port h, which effectively terminates the HAD process.

The event reporting the termination of the High Availability Daemon follows the full sequence of GAB event messages above.

 

Solution

1. Review the Windows Event Viewer for reports of low memory, or for any activity at the time of the GAB events that could have made the server busy.

2. If the issue is occurring in real time, tools such as Windows Performance Monitor can be used to track processes, processor, and memory usage on the affected server.

3. As a workaround, the GAB timeout threshold can be increased. For further information, see the article "How to modify the Global Atomic Broadcast (GAB) timeout".
 
NOTE: Increasing the GAB timeout is a VCS-level tuning that only tolerates the delayed response. The root cause of the server being busy or low on resources should still be addressed at the server level.
 
