VMware Linux client backups with ext3/4 filesystems consume excessive amounts of memory on a NetBackup appliance backup host
Problem
On a NetBackup Appliance used as a VMware backup host, where the client being backed up is Linux-based and uses either the ext3 or ext4 filesystem, the bpbkarv process can consume excessive amounts of memory. When this happens, system memory can become exhausted and the operating system's out-of-memory handler (oom-killer) starts to terminate processes (normally bpbkarv or spoold, depending on which currently has the largest memory footprint). The result can be individual backup jobs failing, or all jobs going to the MSDP disk pool failing if the spoold process is terminated.
Error Message
An oom-killer message could be recorded in the /var/log/messages file for the first process which is unable to allocate memory (in this case, mongod, but note it is not the process using the memory):
Feb 4 16:11:31 nbu-app2 kernel: mongod invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0
<snip>
Feb 4 16:11:31 doh1-nbu-app2 kernel: [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Feb 4 16:12:11 doh1-nbu-app2 kernel: [283282] 0 283282 11046483 10826375 21 0 0 bpbkarv
The rss (resident memory usage) column for this bpbkarv process reads 10826375; the value before it, 11046483, is total_vm. Both are counted in 4 KB pages, so the resident size is around 41 GB of memory for this one process.
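The page-to-size conversion above can be sketched as follows. This is a minimal illustration, assuming 4 KB pages (as stated in the text) and the column order given by the header line in the log excerpt. The function name `parse_oom_task_line` is hypothetical and is not part of NetBackup or any Veritas tooling.

```python
import re

PAGE_SIZE_BYTES = 4096  # assumption: 4 KB pages, as stated in the text

# Matches a process-table row from the oom-killer dump, for example:
# "... kernel: [283282] 0 283282 11046483 10826375 21 0 0 bpbkarv"
# Column order follows the header: pid uid tgid total_vm rss cpu oom_adj oom_score_adj name
TASK_ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+"
    r"(?P<total_vm>\d+)\s+(?P<rss>\d+)\s+(?P<cpu>\d+)\s+"
    r"(?P<oom_adj>-?\d+)\s+(?P<oom_score_adj>-?\d+)\s+(?P<name>\S+)"
)

def parse_oom_task_line(line):
    """Return (name, pid, rss_gib) for a process-table row, or None if no match."""
    match = TASK_ROW.search(line)
    if match is None:
        return None  # e.g. the "[ pid ] uid tgid ..." header line
    rss_pages = int(match.group("rss"))
    rss_gib = rss_pages * PAGE_SIZE_BYTES / 2**30
    return match.group("name"), int(match.group("pid")), rss_gib

line = ("Feb 4 16:12:11 doh1-nbu-app2 kernel: "
        "[283282] 0 283282 11046483 10826375 21 0 0 bpbkarv")
name, pid, rss_gib = parse_oom_task_line(line)
print(f"{name} (pid {pid}) rss = {rss_gib:.1f} GiB")  # bpbkarv (pid 283282) rss = 41.3 GiB
```

Running the same function over every process-table row in /var/log/messages and sorting by the third element of the result would show which process actually held the memory, rather than the one that happened to invoke oom-killer.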
Checking the live system with the top -a command (which sorts the output in order of decreasing memory usage) will also show that one or more bpbkarv processes are consuming a large percentage of the system memory (see the RES and %MEM columns):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22639 root 20 0 62.6g 61g 92m S 36.6 49.1 127:24.96 bpbkarv
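For checking captured top output in bulk, the RES column can be read programmatically. The sketch below assumes the default top column layout shown above and that RES is reported in KiB unless suffixed with m, g, or t. The helper names and the 8 GiB threshold are illustrative assumptions, not values recommended by Veritas.

```python
# Multipliers to convert a top memory value to KiB (assumption: binary units)
UNIT_KIB = {"": 1, "k": 1, "m": 1024, "g": 1024**2, "t": 1024**3}

def res_to_kib(res):
    """Convert a top RES value such as '61g', '92m', or '1234' to KiB."""
    res = res.strip().lower()
    suffix = res[-1] if res[-1].isalpha() else ""
    number = res[:-1] if suffix else res
    return float(number) * UNIT_KIB[suffix]

def heavy_processes(top_rows, command="bpbkarv", threshold_gib=8.0):
    """Yield (pid, res_gib) for rows of `command` whose RES exceeds the threshold."""
    for row in top_rows:
        fields = row.split()
        # Columns: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        if len(fields) < 12 or fields[11] != command:
            continue  # skips the header row and other commands
        res_gib = res_to_kib(fields[5]) / 1024**2
        if res_gib > threshold_gib:
            yield int(fields[0]), res_gib

rows = ["22639 root 20 0 62.6g 61g 92m S 36.6 49.1 127:24.96 bpbkarv"]
for pid, res_gib in heavy_processes(rows):
    print(f"bpbkarv pid {pid} holds {res_gib:.0f} GiB resident")
```

The threshold is deliberately a parameter: what counts as excessive depends on the appliance's total memory and the number of concurrent backup streams.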
Cause
There is a memory leak within the VxMS mapping function for ext3/ext4 filesystems where the filesystem contains a large number (millions) of files.
Solution
Veritas Technologies LLC has acknowledged this issue (Etrack 3910424). We are committed to product quality and satisfied customers.
This issue may be resolved in a future major revision of the software at a later time. However, this particular issue is not currently scheduled for any release. If you feel this issue has a direct business impact for you and your continued use of the product, please contact your Veritas Sales representative or the Veritas Sales group to discuss these concerns. For information on how to contact Veritas Sales, please see http://www.veritas.com
Workaround:
If this issue is experienced, please contact Veritas technical support, referencing this document ID and Etrack 3926633 to determine Emergency Engineering Binary (EEB) availability to resolve this issue.