Clustered File Server unresponsive in a hung state following large number of FSA Placeholder recalls.
Problem
When a large number of archived items are retrieved through placeholder recalls, the cluster node on which the file server is online becomes unresponsive because of a deadlock.
Investigation into the cause of the deadlock has led to a functional change in the methods used by the FSA mini-filter driver.
If Client Side Caching (CSC) is running on the client, the file server sees the client's file I/O requests. On the file server, the Cache Manager's read-ahead mechanism initiates a paged read operation, and the Cache Manager acquires the PagingIoResource lock in shared mode. The paged read operation then triggers the PreRead callback in the FSA driver.
In the PreRead and PreWrite callbacks, the file is recalled and converted back to a normal file. During this conversion, the reparse point is removed by calling FltUntagFile() and the placeholder is replaced with the actual file.
The call to FltUntagFile() is synchronous and attempts to acquire the same lock, which is already held for the paged read, so a deadlock results.
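The lock pattern described above can be illustrated with a small user-mode model. This is only a sketch under stated assumptions, not the FSA driver's code: `SharedExclusiveLock`, `recall_inline` and `recall_deferred` are hypothetical names, the untag step is assumed to need the lock exclusively, and a short timeout stands in for what would be an indefinite hang on the server. The deferred variant shows one generic way to avoid the pattern; it is not necessarily how the driver was changed.

```python
import threading


class SharedExclusiveLock:
    """Toy stand-in for a shared/exclusive resource such as PagingIoResource."""

    def __init__(self):
        self._cond = threading.Condition()
        self._shared_holders = 0
        self._exclusive_held = False

    def acquire_shared(self):
        with self._cond:
            while self._exclusive_held:
                self._cond.wait()
            self._shared_holders += 1

    def release_shared(self):
        with self._cond:
            self._shared_holders -= 1
            self._cond.notify_all()

    def acquire_exclusive(self, timeout=None):
        """Return True if acquired, False if the wait timed out."""
        with self._cond:
            free = self._cond.wait_for(
                lambda: not self._exclusive_held and self._shared_holders == 0,
                timeout,
            )
            if free:
                self._exclusive_held = True
            return free

    def release_exclusive(self):
        with self._cond:
            self._exclusive_held = False
            self._cond.notify_all()


def recall_inline(lock, timeout=0.2):
    """Paged read holds the lock shared, then the recall path synchronously
    asks for the same lock (assumed exclusive) before the shared hold ends."""
    lock.acquire_shared()
    try:
        acquired = lock.acquire_exclusive(timeout)  # timeout models the hang
        if acquired:
            lock.release_exclusive()
        return acquired
    finally:
        lock.release_shared()


def recall_deferred(lock, timeout=0.2):
    """Hypothetical alternative: the untag step runs only after the shared
    hold taken for the paged read has been released."""
    lock.acquire_shared()
    lock.release_shared()
    acquired = lock.acquire_exclusive(timeout)
    if acquired:
        lock.release_exclusive()
    return acquired


if __name__ == "__main__":
    print("inline recall acquired lock:", recall_inline(SharedExclusiveLock()))
    print("deferred recall acquired lock:", recall_deferred(SharedExclusiveLock()))
```

In the inline case the exclusive request can never be satisfied, because the shared hold it is waiting on belongs to the same call path. In the kernel there is no timeout, so the thread waits indefinitely.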
The FSA driver has been modified so that a deadlock no longer occurs in this situation.
This issue has been addressed in the following releases:
Enterprise Vault 10.0.3 - Release Details
Enterprise Vault 9.0.5 - Release Details
Enterprise Vault (EV) File System Archiving (FSA) installed on Microsoft Windows 2008 R2 Failover Cluster