"Allocation of sparsearray failed" error during a backup of multiple sparse files.

Article: 100029989
Last Published: 2011-12-30
Product(s): NetBackup

Problem

BUG REPORT: "Allocation of sparsearray failed" error occurs during a backup of multiple sparse files.

Bug: 1069999
 
Symptoms:
Backing up large or multiple sparse files can result in an "allocation failed" error (exit status 10).

 

Log Files:

The bpbkar log on the client shows the following errors:

18:02:28.247 [352294] <2> bpbkar process_file: INF - <file> is sparse: stat.st_size = 253108813824, stat.st_blocks * 512 = 253108760576
18:20:06.057 [352294] <16> allocate_sparsearray_space: ERR - allocation of 33554432 entry sparsearray failed.
18:20:06.103 [352294] <16> bpbkar Exit: ERR - bpbkar FATAL exit status = 10:allocation failed

Error Message

18:20:06.057 [352294] <16> allocate_sparsearray_space: ERR - allocation of 33554432 entry sparsearray failed.
18:20:06.103 [352294] <16> bpbkar Exit: ERR - bpbkar FATAL exit status = 10:allocation failed

Cause

The issue is caused by the way memory is handled during a backup of sparse files.

The sparse file handling in bpbkar builds the entire sparse header in memory rather than
writing out each sparse header as it fills. NetBackup starts with a header of 256 entries of
512 bytes each. Every time the header size is exceeded, the header is doubled in size:
a new, larger header is allocated in memory and the previous one is freed. When the backup of a
sparse file finishes, the sparse header is freed from memory.

When multiple sparse files are backed up, this growth process restarts for each file.
Repeatedly reallocating a new, larger sparse header and freeing the previous smaller one
fragments the memory space until, at some point, no free fragment is large enough
to satisfy the next allocation.
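
The following minimal C sketch (not NetBackup source; the entry size and initial count are taken from the description above) illustrates the grow-by-doubling pattern and why the freed blocks accumulate as heap fragments:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ENTRY_SIZE 512                    /* each sparse header entry is 512 bytes */

int main(void)
{
    size_t entries = 256;                 /* initial header: 256 entries */
    char *header = malloc(entries * ENTRY_SIZE);
    if (header == NULL)
        return 1;

    /* Double the header a few times, as happens while a large sparse
     * file is backed up.  Each round allocates a new, larger block and
     * frees the old one; the freed blocks are what fragment the heap
     * when this repeats across many files. */
    for (int i = 0; i < 10; i++) {
        size_t grown = entries * 2;
        char *bigger = malloc(grown * ENTRY_SIZE);
        if (bigger == NULL) {
            /* This is the failure the log reports:
             * "allocation of <n> entry sparsearray failed." */
            fprintf(stderr, "allocation of %zu entry sparsearray failed\n", grown);
            free(header);
            return 1;
        }
        memcpy(bigger, header, entries * ENTRY_SIZE);
        free(header);                     /* old block left behind as a free fragment */
        header = bigger;
        entries = grown;
    }

    printf("final header: %zu entries (%zu bytes)\n",
           entries, entries * (size_t)ENTRY_SIZE);
    free(header);
    return 0;
}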


Definition:
NetBackup treats a file as sparse if its size in bytes is greater than the product of
its number of blocks and the block size in bytes. Use ls -l to list the size in bytes
and ls -s to list the number of blocks.

For example:

To list the size of the file in bytes, use ls -l:
# ls -l <file>
-rw-r--r--   1 root     system   264850374656 Jun  8 03:35 <file>

To list the number of blocks use ls -s:
# ls -s <file>
258642920 <file>

Take the number of blocks from the above output (258642920) and multiply it by
the block size (1024), which equals 264850350080. This result is smaller
than the size in bytes from the ls -l output, so the file is treated as sparse by NetBackup.
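
The same test can be expressed directly against the stat fields that appear in the bpbkar log above. This is a minimal sketch of the check, assuming st_blocks counts 512-byte units as the log lines do:

#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    struct stat st;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    if (stat(argv[1], &st) != 0) {
        perror("stat");
        return 1;
    }

    /* A file is treated as sparse when its logical size exceeds the
     * space its allocated blocks occupy (st_blocks in 512-byte units,
     * matching "stat.st_blocks * 512" in the bpbkar log above). */
    if ((long long)st.st_size > (long long)st.st_blocks * 512)
        printf("%s is sparse: st_size = %lld, st_blocks * 512 = %lld\n",
               argv[1], (long long)st.st_size, (long long)st.st_blocks * 512);
    else
        printf("%s is not sparse\n", argv[1]);
    return 0;
}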

Solution

Note about AIX clients:
If the problem is encountered on AIX clients and the files being backed up were previously
restored using the AIX restore command, use the restore -e option to restore the non-sparse
files as non-sparse. If the -e option is not used, the files are restored as sparse even if
they were not sparse before the AIX backup command was run.
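
For example, a by-name restore using the -e option might look like the following (the archive device and file name here are placeholders; check the AIX restore man page for the exact syntax on your system):

# restore -xevf /dev/rmt0 /path/to/file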
 
 
Fix:
Contact Veritas Technical Support to request a binary for ET1069999.
 
After applying the binary, create a file which determines the minimum size of holes in sparse files.
The file is called: /usr/openv/netbackup/HOLE_GFACTOR
 
Place an integer value in this file (for example, 4 or 8). This value is multiplied by
512 bytes to give the minimum size of hole NetBackup will detect in sparse files.
 
For example, if a "4" is put in the file, then the minimum size of a hole will be 2048 bytes (4 * 512 = 2048).
Set the value so that the resulting minimum hole size makes sense relative to the file
system block size, since holes smaller than a file system block are still stored on disk.
 
If setting HOLE_GFACTOR to "4" still produces errors, try setting it to "8" next.
 
Without the HOLE_GFACTOR file, the default minimum hole size is 256 bytes.
 

Explanation of how backup works with HOLE_GFACTOR in place:

bpbkar preprocesses each file, looking for runs of sparse data (all zeros) inside the file.
HOLE_GFACTOR sets the size of the smallest hole that bpbkar will look for during this preprocessing.

These all-zero sparse regions are what bpbkar looks for and tries to omit from the backup.
HOLE_GFACTOR determines how large a hole (of all zeros) must be for bpbkar to omit it from the backup.


For example, if the size of a hole is 1024 bytes but the HOLE_GFACTOR threshold is
2 KB (that is, 2048 bytes), that 1024-byte hole gets backed up because it is smaller than the threshold.
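
As a minimal sketch (not NetBackup source), the preprocessing can be modeled as a scan for zero runs that reports only those at least HOLE_GFACTOR * 512 bytes long:

#include <stdio.h>
#include <string.h>

/* Report runs of zero bytes at least hole_min bytes long; shorter
 * zero runs are treated as ordinary data and would be backed up. */
static void report_holes(const unsigned char *buf, size_t len, size_t hole_min)
{
    size_t i = 0;

    while (i < len) {
        if (buf[i] != 0) {
            i++;
            continue;
        }
        size_t start = i;
        while (i < len && buf[i] == 0)
            i++;
        if (i - start >= hole_min)
            printf("hole at offset %zu, length %zu: omitted from backup\n",
                   start, i - start);
    }
}

int main(void)
{
    unsigned char data[8192];

    memset(data, 0, sizeof data);
    data[0] = 1;                          /* a byte of real data at the front */
    data[4096] = 1;                       /* splits the zeros into two runs */

    /* HOLE_GFACTOR = 4  ->  minimum hole size 4 * 512 = 2048 bytes */
    report_holes(data, sizeof data, 4 * 512);
    return 0;
}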


However, if HOLE_GFACTOR is set too large, bpbkar may find no holes large enough to omit and end up backing up the entire file, zeros included, as ordinary data.
The recommendation is to set HOLE_GFACTOR to as small a number as possible to avoid the allocation failures, and only increase it if allocation failures still occur.


The backup image differs from the source file because the sparse areas of the file are omitted. The file still gets backed up, just without some of its sparse areas.


References

Etrack: 1069999
