Oracle RMAN backup fails with status code 13 or 6

Article: 100011691
Last Published: 2015-02-03
Product(s): NetBackup & Alta Data Protection

Problem

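An Oracle RMAN incremental level 0 (full) backup frequently fails with NetBackup status code 13 (file read failed) or status code 6 (the backup failed to back up the requested files).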
 

Error Message

The RMAN output reports a failure during sbtwrite() or sbtwrite2() processing.

RMAN-03009: failure of backup command on ch02 channel at 01/20/2014 14:43:56
ORA-27192: skgfcls: sbtclose2 returned error - failed to close file
ORA-19511: Error received from media manager layer, error text:
   Failed to process backup file <bk_21535_1_837350379>
ORA-19502: write error on file "bk_21535_1_837350379", blockno 265380097 (blocksize=1024)
ORA-27030: skgfwrt: sbtwrite2 returned error
ORA-19511: Error received from media manager layer, error text:
   VxBSASendData: Failed with error:

Cause

In this particular case, the Oracle data files had been pre-initialized to a size much larger than the data they contained. While Oracle was searching the files for additional real data to send, the media server received nothing, concluded that the backup was hung, and timed out.

The same symptom can occur if the network connection to the media server suffers significant packet loss and retransmission, so that the backup appears to hang.
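The effect of a large, mostly empty data file can be illustrated with some rough arithmetic. The following Python sketch is an illustration only: the 200 GiB of empty space, the 100 MiB/s scan rate, and the 300-second timeout are hypothetical figures, and the function names are not part of NetBackup or Oracle. It assumes that no data reaches the media server while Oracle scans the empty region.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def idle_gap_seconds(empty_bytes: int, scan_bytes_per_sec: float) -> float:
    """Estimate how long the media server receives no data while
    Oracle scans a region of the data file that holds no real data."""
    if scan_bytes_per_sec <= 0:
        raise ValueError("scan rate must be positive")
    return empty_bytes / scan_bytes_per_sec


def would_time_out(empty_bytes: int, scan_bytes_per_sec: float,
                   timeout_sec: float) -> bool:
    """True when the estimated idle gap is longer than the timeout."""
    return idle_gap_seconds(empty_bytes, scan_bytes_per_sec) > timeout_sec


if __name__ == "__main__":
    # Hypothetical figures: 200 GiB of empty space scanned at 100 MiB/s,
    # checked against an illustrative 300-second timeout.
    print(idle_gap_seconds(200 * GIB, 100 * MIB))     # 2048.0
    print(would_time_out(200 * GIB, 100 * MIB, 300))  # True
```

With these assumed figures the gap is about 34 minutes, far longer than the illustrative timeout, which matches the behavior described in this article.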
 

Solution

The best solution is to isolate and remove the Oracle or network delays.

Workaround

If the external delays cannot be minimized, a workaround is available to avoid the status 13 failure; the delays will still occur but will no longer cause the job to fail.

The goal is to extend the timer in bpbrm to a value long enough for Oracle to find, and the network to deliver, more of the backup piece. This can be done in one of two ways.

  • If the client is NetBackup version 7.6.1.1 or higher, then configure the Server Read Timeout on the client to a sufficiently large value.  The timeout change will affect only this Oracle backup job.  See the related article for details.
  • Otherwise, configure the Client Read Timeout on the media server to a sufficiently large value.  This will affect all jobs using that media server, and hang conditions for those jobs will not be detected for a longer period of time.
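For the second option, the sketch below shows what the change could look like on a UNIX media server, assuming that the setting is stored as a CLIENT_READ_TIMEOUT entry in the media server's bp.conf file. The helper name set_bp_conf_entry, the sample file contents, and the value of 1800 seconds are hypothetical; choose a value that covers the longest delay seen in your environment. The helper only transforms text and does not write to any file.

```python
import re


def set_bp_conf_entry(conf_text: str, key: str, value: int) -> str:
    """Return conf_text with a single 'KEY = value' line for key.

    An existing entry is replaced in place (any duplicates are dropped);
    if the key is absent, the entry is appended at the end.
    """
    new_line = f"{key} = {value}"
    pattern = re.compile(rf"^\s*{re.escape(key)}\s*=")
    out = []
    seen = False
    for line in conf_text.splitlines():
        if pattern.match(line):
            if not seen:
                out.append(new_line)
                seen = True
            continue  # skip the old entry and any duplicates
        out.append(line)
    if not seen:
        out.append(new_line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical bp.conf contents; 1800 seconds is an example value only.
    sample = "SERVER = master01\nCLIENT_READ_TIMEOUT = 300\n"
    print(set_bp_conf_entry(sample, "CLIENT_READ_TIMEOUT", 1800), end="")
```

The same setting is normally also available as Client Read Timeout in the host properties of the media server, which avoids editing the file by hand.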

Be aware that increasing the timeout can mask other issues in the environment. If the backup suddenly takes much longer than it did before, investigate and address the cause of the slowdown rather than changing the timeout.
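One simple way to notice a sudden slowdown is to compare the latest backup duration with the median of recent runs. The Python sketch below is an illustration under stated assumptions: the durations are invented, the 1.5x threshold is arbitrary, and the function name is hypothetical. Actual durations would have to be taken from your own job history.

```python
from statistics import median


def is_sudden_slowdown(previous_minutes, latest_minutes, factor=1.5):
    """True when the latest run took more than factor times the median
    duration of the previous runs."""
    if not previous_minutes:
        raise ValueError("need at least one previous duration")
    return latest_minutes > factor * median(previous_minutes)


if __name__ == "__main__":
    # Hypothetical durations, in minutes, of earlier level 0 backups.
    history = [410, 395, 420, 405]
    print(is_sudden_slowdown(history, 415))  # False
    print(is_sudden_slowdown(history, 900))  # True
```

A run flagged in this way is a reason to look for new Oracle or network delays before raising the timeout any further.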

 


Applies To

Any version of NetBackup.
Any version of Oracle, but typically with bigfile tablespaces.
 
