VMware snapshots may fail to back up Microsoft SQL Server databases due to a VSS timeout

Article: 100039869
Last Published: 2020-03-27
Product(s): NetBackup & Alta Data Protection

Problem

If a VMware snapshot takes longer than 60 seconds to quiesce a virtual machine running Microsoft SQL Server, the database backups may fail.

 

Error Message

The following informational event appears in the Windows Application log for each running database:

Information    22/07/2017 12:39:12    MSSQLSERVER    3197    Server    I/O is frozen on database <database name>. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.

After exactly 60 seconds, the following error events can be seen:

Error    22/07/2017 12:40:12    SQLVDI    1    None    SQLVDI: Loc=SignalAbort. Desc=Client initiates abort. ErrorCode=(0). Process=380. Thread=12084. Client. Instance=. VD=Global\{6B8F9287-4802-4C88-B30B-EAAC0ED768E5}1_SQLVDIMemoryName_0. 

Error    22/07/2017 12:40:12    MSSQLSERVER    3041    Backup    BACKUP failed to complete the command BACKUP DATABASE <database name>. Check the backup application log for detailed messages.

Error    22/07/2017 12:40:12    MSSQLSERVER    18210    Server    BackupVirtualDeviceFile::TakeSnapshot:  failure on backup device '{6B8F9287-4802-4C88-B30B-EAAC0ED768E5}1'. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).

Error    22/07/2017 12:40:12    SQLVDI    1    None    SQLVDI: Loc=TriggerAbort. Desc=invoked. ErrorCode=(0). Process=2260. Thread=9244. Server. Instance=MSSQLSERVER. VD=Global\{6B8F9287-4802-4C88-B30B-EAAC0ED768E5}1_SQLVDIMemoryName_0. 

Error    22/07/2017 12:40:12    MSSQLSERVER    18210    Server    BackupVirtualDeviceFile::TakeSnapshot:  failure on backup device '{6B8F9287-4802-4C88-B30B-EAAC0ED768E5}1'. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).

Information    22/07/2017 12:40:12    MSSQLSERVER    3198    Server    I/O was resumed on database <database name>. No user action is required.

Error    22/07/2017 12:40:13    SQLWRITER    24583    None    Sqllib error: OLEDB Error encountered calling ICommandText::Execute. hr = 0x80040e14. SQLSTATE: 42000, Native Error: 3013
Error state: 1, Severity: 16
Source: Microsoft SQL Server Native Client 11.0
Error message: BACKUP DATABASE is terminating abnormally.
SQLSTATE: 42000, Native Error: 3271
Error state: 1, Severity: 16
Source: Microsoft SQL Server Native Client 11.0
Error message: A nonrecoverable I/O error occurred on file "{6B8F9287-4802-4C88-B30B-EAAC0ED768E5}1:" 995(The I/O operation has been aborted because of either a thread exit or an application request.).

 

Cause

The Microsoft VSS writer freeze operation did not complete within the writer's fixed 60-second window (the interval between the Freeze and Thaw events), so the databases could not be brought to a consistent state for the snapshot.
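
To confirm that a failed job hit this limit, the freeze and resume events can be compared directly. The following Python sketch is illustrative only: it assumes the Application log has been exported as text lines in the layout shown above (level, date as DD/MM/YYYY, time, source, event ID, category, message), and the names freeze_durations and hit_freeze_limit are hypothetical helpers, not part of NetBackup or SQL Server.

```python
import re
from datetime import datetime

# Assumed layout of one exported event line:
# level, DD/MM/YYYY, HH:MM:SS, source, event ID, then the rest of the message.
EVENT_RE = re.compile(
    r"^(?P<level>\w+)\s+(?P<date>\d{2}/\d{2}/\d{4})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<source>\S+)\s+(?P<event_id>\d+)\s+"
)
# Database name as it appears in the 3197 and 3198 message text.
DB_RE = re.compile(r"on database (?P<db>.+?)\. No user action")

FREEZE_EVENT_ID = 3197  # "I/O is frozen on database ..."
RESUME_EVENT_ID = 3198  # "I/O was resumed on database ..."
VSS_FREEZE_LIMIT_SECONDS = 60  # writer default quoted in the Microsoft article


def freeze_durations(lines):
    """Return (database, seconds) for each freeze event paired with its resume event."""
    frozen_at = {}
    durations = []
    for line in lines:
        event = EVENT_RE.match(line)
        db = DB_RE.search(line)
        if not event or not db or event["source"] != "MSSQLSERVER":
            continue
        stamp = datetime.strptime(
            f"{event['date']} {event['time']}", "%d/%m/%Y %H:%M:%S"
        )
        name = db["db"]
        event_id = int(event["event_id"])
        if event_id == FREEZE_EVENT_ID:
            frozen_at[name] = stamp
        elif event_id == RESUME_EVENT_ID and name in frozen_at:
            durations.append((name, (stamp - frozen_at.pop(name)).total_seconds()))
    return durations


def hit_freeze_limit(lines, limit=VSS_FREEZE_LIMIT_SECONDS):
    """Return the databases whose I/O stayed frozen for at least `limit` seconds."""
    return [name for name, seconds in freeze_durations(lines) if seconds >= limit]
```

Applied to the events listed in the Error Message section, the freeze at 12:39:12 and the resume at 12:40:12 give a 60-second interval, which matches the timeout described here.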

 

Solution

This Microsoft article describes the timeout: https://msdn.microsoft.com/en-us/library/windows/desktop/aa384615(v=vs.85).aspx
 
  • "The full sequence from PreCommitSnapshots to the return of PostCommitSnapshots maps to the window between writers receiving the Freeze and Thaw events. The writer default for this window is 60 seconds, but a writer may override this value with a smaller timeout. For example, the Microsoft Exchange Server writer changes the timeout to 20 seconds. Providers should not spend more than a second or two in this method."

If this error is encountered, attempt any of the following:

  • Retry the operation; it may succeed on a later attempt.
  • Reschedule the policy to run when the server is less busy.
  • Switch to an application backup (as if the application were hosted on a physical machine).
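
The retry option can be scripted. The Python sketch below is a minimal illustration, not a NetBackup feature: run_backup is a hypothetical callable that starts the backup by whatever means the site uses and returns True on success, and the attempt count and delay are arbitrary assumptions.

```python
import time


def retry_backup(run_backup, attempts=3, delay_seconds=300, sleep=time.sleep):
    """Call run_backup until it succeeds or the attempts are used up.

    run_backup is a hypothetical callable supplied by the caller; it must
    return True when the backup succeeded. Returns the number of the
    successful attempt, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if run_backup():
            return attempt
        if attempt < attempts:
            # Wait before retrying so the server has a chance to become less busy.
            sleep(delay_seconds)
    return None
```

If every attempt fails, rescheduling the policy or switching to an application backup is likely the better option.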

 
