Problem
On Veritas Cluster Server (VCS) for Windows, the Windows System event log reports Event ID 18 for a VCS module named Global Atomic Broadcast (GAB).
These events can appear in batches of 3-4 consecutive events or at random, and are followed by a restart of the Veritas High Availability Daemon (HAD).
Error Message
Log Name: System
Source: GAB
Event ID: 18
Level: Information
Description:
The description for Event ID 18 from source GAB cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
The Details tab for the event shows the following data in the Friendly View:
0000: 00 00 3c 00 01 00 88 00 ..<...ˆ.
0008: 00 00 00 00 12 00 07 40 .......@
0010: 04 00 00 00 00 00 00 00 ........
0018: 00 00 00 00 00 00 00 00 ........
0020: 00 00 00 00 00 00 00 00 ........
0028: 47 41 42 20 57 41 52 4e GAB WARN
0030: 49 4e 47 20 56 2d 31 35 ING V-15
0038: 2d 31 2d 32 30 30 35 37 -1-20057
0040: 20 50 6f 72 74 20 68 20 Port h
0048: 70 72 6f 63 65 73 73 20 process
0050: 31 20 69 6e 61 63 74 69 1 inacti
0058: 76 65 20 37 20 73 65 63 ve 7 sec
0060: 0a 00 00 00 ....
As seen above, the event data reports GAB WARNING V-15-1-20057 Port h process 1 inactive 7 sec.
Other common messages in this sequence, shown in the Friendly View of the Details tab, can include any or all of the following:
GAB WARNING V-15-1-20057 Port h process 1 inactive 8 sec
GAB WARNING V-15-1-20057 Port h process 1 inactive 15 sec
GAB WARNING V-15-1-20058 Port h process 1: heart beat failed, killing process....
GAB INFO V-15-1-20059 Port h heartbeat interval 30000 msec. Statistics:.
GAB INFO V-15-1-20129 Port h: heartbeats in 0 ~ 6000 msec: xxxx
GAB INFO V-15-1-20129 Port h: heartbeats in 6000 ~ 12000 msec: 0.
GAB INFO V-15-1-20129 Port h: heartbeats in 12000 ~ 18000 msec: 0.
GAB INFO V-15-1-20129 Port h: heartbeats in 18000 ~ 24000 msec: 0
GAB INFO V-15-1-20129 Port h: heartbeats in 24000 ~ 30000 msec: 0.
GAB INFO V-15-1-20041 Port h: client process failure: killing process...
GAB INFO V-15-1-20032 Port h closed.
GAB INFO V-15-1-20032 Port c closed.
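Once the messages have been extracted as text, the sequence can be summarized programmatically. The sketch below is one possible approach in Python, assuming one message per line in the form listed above. The function name `summarize_gab_messages` and the choice to treat V-15-1-20058 and V-15-1-20041 as the "process killed" indicators are assumptions drawn from the message texts in this article, not from a Veritas specification.

```python
import re

# Severity, message ID, and remaining text of a GAB message.
GAB_LINE = re.compile(
    r"GAB (?P<severity>[A-Z]+) (?P<msg_id>V-\d+-\d+-\d+) (?P<text>.*)"
)
# Inactivity warnings such as "Port h process 1 inactive 7 sec".
INACTIVE = re.compile(r"Port (?P<port>\w) process \d+ inactive (?P<secs>\d+) sec")

# Assumption: these IDs mark the point where GAB kills the client process.
KILL_IDS = {"V-15-1-20058", "V-15-1-20041"}

SAMPLE_LINES = [
    "GAB WARNING V-15-1-20057 Port h process 1 inactive 7 sec",
    "GAB WARNING V-15-1-20057 Port h process 1 inactive 8 sec",
    "GAB WARNING V-15-1-20057 Port h process 1 inactive 15 sec",
    "GAB WARNING V-15-1-20058 Port h process 1: heart beat failed, killing process....",
    "GAB INFO V-15-1-20032 Port h closed.",
]


def summarize_gab_messages(lines):
    """Collect inactivity warnings and note whether a kill message was seen."""
    inactive = []
    killed = False
    for line in lines:
        match = GAB_LINE.search(line)
        if not match:
            continue  # not a GAB message
        if match.group("msg_id") in KILL_IDS:
            killed = True
        hit = INACTIVE.match(match.group("text"))
        if hit:
            inactive.append((hit.group("port"), int(hit.group("secs"))))
    return {"inactive": inactive, "killed": killed}


if __name__ == "__main__":
    print(summarize_gab_messages(SAMPLE_LINES))
```

A rising series of inactivity values followed by a kill message matches the pattern described in this article.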
The termination of the High Availability Daemon is reported by the event below.
Log Name: System
Source: Service Control Manager
Event ID: 7034
Level: Error
Description:
The VERITAS High availability engine service terminated unexpectedly. It has done this 1 time(s).
The following message is also likely to be logged in Engine_A.log under the %VCS_HOME%/log path:
VCS WARNING V-16-1-10485 Excessive delay between successive calls to GAB heartbeat (XX seconds)
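To check whether this warning has been logged and how long the reported delays were, the engine log can be searched for the message ID. The following Python sketch assumes the message text matches the line above with a number in place of XX. The helper names `heartbeat_delays` and `scan_engine_log` are hypothetical, and the log encoding is not confirmed here, so undecodable bytes are replaced rather than raising an error.

```python
import re

# Matches the warning text with the delay captured as a number.
DELAY = re.compile(
    r"V-16-1-10485 Excessive delay between successive calls "
    r"to GAB heartbeat \((\d+) seconds\)"
)


def heartbeat_delays(lines):
    """Return the delay, in seconds, from every matching log line."""
    delays = []
    for line in lines:
        match = DELAY.search(line)
        if match:
            delays.append(int(match.group(1)))
    return delays


def scan_engine_log(path):
    """Read a log file (for example Engine_A.log) and return its delays."""
    # Assumption: the file is text; unknown bytes are replaced, not fatal.
    with open(path, "r", errors="replace") as handle:
        return heartbeat_delays(handle)


if __name__ == "__main__":
    # Hypothetical line for illustration; the value 12 is made up.
    demo = [
        "VCS WARNING V-16-1-10485 Excessive delay between successive "
        "calls to GAB heartbeat (12 seconds)"
    ]
    print(heartbeat_delays(demo))
```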
Cause
This issue occurs when GAB has not received heartbeats from the HAD process within the timeout period.
GAB is a kernel process that monitors the user mode process had.exe. In situations where the server is under heavy load or is low on virtual memory/resources, HAD may be unable to send or acknowledge heartbeats in a timely manner.
The GAB process will always be able to run and maintain cluster membership. If GAB determines that HAD is not responding to heartbeats, it closes port h, which effectively terminates the HAD process.
The event reporting the termination of the High Availability Daemon follows the full sequence of GAB messages shown above.
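The timeout behavior described in this section can be illustrated with a toy model. The Python sketch below is not GAB's implementation. It only shows the general watchdog idea: a monitor gives up on a client when the gap since the last heartbeat exceeds a timeout. The function name `first_timeout` and all numeric values are illustrative assumptions.

```python
def first_timeout(heartbeat_times, timeout, end_time):
    """Return the time at which a watchdog would give up, or None.

    heartbeat_times: times (in seconds) at which heartbeats arrived
    timeout:         longest gap tolerated between heartbeats
    end_time:        end of the observation window
    """
    last = 0.0  # monitoring is assumed to start at time 0
    for moment in sorted(heartbeat_times):
        if moment - last > timeout:
            return last + timeout  # gap too long: watchdog fires
        last = moment
    if end_time - last > timeout:
        return last + timeout  # heartbeats stopped before the window ended
    return None


if __name__ == "__main__":
    # Regular heartbeats: the watchdog never fires.
    print(first_timeout([5, 10, 15], timeout=30, end_time=40))
    # Heartbeats stop after t=10 (for example, a starved process).
    print(first_timeout([5, 10], timeout=30, end_time=60))
```

In this model, a process that is starved of CPU or memory simply stops producing heartbeats, and the monitor acts once the tolerated gap has elapsed, regardless of why the heartbeats stopped.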
Solution
2. If the issue is occurring in real time, tools such as Windows Performance Monitor can be used to track process, processor, and memory usage on the affected server.
3. As a workaround, the GAB timeout threshold can be increased. For further information, see "How to modify the Global Atomic Broadcast (GAB) timeout".