Replicated file systems become unresponsive when latency protection is enabled in VVR (Volume Replicator)

Problem

Intermittent or persistent hangs in a replicated file system can cause upper-level applications (such as a database) to fail with a timeout. One possible cause is the latency protection feature, when the high_water_mark and low_water_mark settings are too far apart and the network bandwidth is low relative to the write rate generated by the upper-level application.

Solution

In an asynchronous replication configuration, the latency protection feature is not advised (contrary to the implied advice in the Admin Guide) and should be left OFF. If it is used as advised in the Admin Guide, the high_water_mark and low_water_mark values (each expressed as a number of updates) should be set close together. On a slow link the SRL drains slowly, and with widely spaced values it may not reach the low_water_mark in time to prevent an application timeout.
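
The effect of the spacing between the two marks can be estimated with simple arithmetic. The following is a minimal Python sketch, not a VVR tool. It assumes that writes are blocked once the number of outstanding updates reaches high_water_mark, that they resume only when the backlog has drained to low_water_mark, and that the link drains updates at a constant rate while writes are blocked. The function names, the drain rate, and the application timeout are hypothetical values chosen for illustration.

```python
def estimated_stall_seconds(high_water_mark, low_water_mark, drain_rate):
    """Estimate how long writes stay blocked after the high mark is reached.

    high_water_mark, low_water_mark: thresholds, in number of updates.
    drain_rate: assumed constant rate at which the link drains the
                backlog, in updates per second (hypothetical input).
    """
    if low_water_mark > high_water_mark:
        raise ValueError("low_water_mark must not exceed high_water_mark")
    if drain_rate <= 0:
        raise ValueError("drain_rate must be positive")
    # No new updates arrive while writes are blocked, so the backlog
    # only has to shrink by the gap between the two marks.
    return (high_water_mark - low_water_mark) / drain_rate


def risks_application_timeout(high_water_mark, low_water_mark,
                              drain_rate, app_timeout_seconds):
    """Return True if the estimated stall reaches the application timeout."""
    stall = estimated_stall_seconds(high_water_mark, low_water_mark, drain_rate)
    return stall >= app_timeout_seconds


# Illustrative numbers only: a link draining 100 updates per second
# and an application that gives up after 30 seconds.
wide_gap_stall = estimated_stall_seconds(10000, 5000, 100)    # 50.0 seconds
close_gap_stall = estimated_stall_seconds(10000, 9900, 100)   # 1.0 second
```

With these example numbers, the widely spaced marks block writes for longer than the assumed 30-second application timeout, while the closely spaced marks block them for about one second. Real drain rates vary with network quality, so the result is only a rough guide.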

Stopping or pausing replication clears the apparent hang, but the correct solution is either to leave the latency protection feature disabled or to experiment with high_water_mark and low_water_mark values that are closer to each other. Wide variations in network quality or throughput can cause this feature to fail to meet stability expectations.

Applies To

VERITAS Volume Replicator (VVR) with latency protection enabled in an Asynchronous Replication configuration.
