Improving poor Media Server Deduplication (MSDP) performance on systems with large memory

Article: 100047479
Last Published: 2025-08-20
Product(s): Appliances, NetBackup

Problem

Jobs writing to MSDP either run slower than expected or fail with memory allocation errors.

 

Error Message

CRQP failures in /<storage>/log/spoold/storaged.log:

March 25 02:59:23 ERR [140581301278464]: 25022: CRQP_ProcessTlog: Could not process spool entry 624331 in /msdp/data/dp1/pdvol/queue/41364.tlog: out of memory
March 25 02:59:23 ERR [140581301278464]: 25022: AddEntry: out of memory.
March 25 02:59:23 ERR [140581301278464]: 25022: processDcTRefImpl: Failed to add entry to reference swap file

Also in storaged.log:

April 05 14:50:26 ERR [140350948976384]: 25004: ProcessBuffer: exception happens while processing tlog 55628 ~ 55942 (Some CRQP thread failed to update refdb.).
April 05 14:50:27 ERR [140350948976384]: 25022: BatchProcessTlogRefOp: Failed to update refdb with REFOP entries in tlog [55628 ~ 55942] out of memory.
April 05 14:50:27 ERR [140350948976384]: 25022: __storageclassSinglePass: Could not process tlog files 55628-55942:out of memory

A third symptom in storaged.log:

April 05 14:45:17 ERR [140267995903744]: 22: RefDBEngine::write_prepare fail to malloc 1101868bytes buffer to serialize reference
April 05 14:45:17 ERR [140267995903744]: 4: RefDBManager::write_prepare fail to prepare CRQP.0 transaction for refdb 11381

vxdmp also reports memory allocation failures in /var/log/messages:

Mar 26 05:27:07 kernel: VxVM vxdmp V-5-3-0 memory allocation failed for size =0xe8000

During fingerprint cache load in /<storage>/log/spoold/spoold.log:

April 01 03:37:31 INFO [140371459786496]: cacheLoadMain: Data Store 4 loaded 3998000 containers into index cache, current container is 101679.
April 01 03:37:38 ERR [140371459786496]: 25012: preallocator_slab_add: could not allocate memory (../preallocator.c:266)
April 01 03:37:38 ERR [140371459786496]: 25012: preallocator_alloc: could not allocate memory

spoold may also restart itself during cache load, or seemingly at random under load.

 

Cause

By default, spoold allocates memory in 2 MB chunks. Most Linux-based operating systems set vm.max_map_count to 65530, which is the maximum number of memory map areas a single process may have. With one 2 MB chunk per map, 65530 x 2 MB is about 128 GB, which is the most memory spoold can use. On a system with 1.5 TB of memory, more than 1 TB of it would be unavailable to spoold.
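
The arithmetic behind this ceiling can be sketched in shell. This is a rough model only: it assumes one allocation unit per map, as the explanation does, and the function name spoold_ceiling_gib is a hypothetical helper, not part of NetBackup.

```shell
#!/bin/sh
# Approximate upper bound on spoold memory, in GB, under the simplified
# model of one allocation unit per memory map.
# Usage: spoold_ceiling_gib MAX_MAP_COUNT UNIT_SIZE_MB
spoold_ceiling_gib() {
    # Adding 512 before dividing by 1024 rounds to the nearest whole GB.
    echo $(( ($1 * $2 + 512) / 1024 ))
}

spoold_ceiling_gib 65530 2     # default map limit, default 2 MB unit: prints 128
spoold_ceiling_gib 262144 32   # raised map limit, 32 MB unit: prints 8192
```

Under the same model, raising both the map limit and the allocation unit moves the ceiling well above the 1.5 TB of the example system.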

 

Solution

To diagnose whether this is the issue:

1. Check the vm.max_map_count setting of the host:

  $ grep vm.max_map_count /etc/sysctl.conf
  vm.max_map_count = 65530


  $ cat /proc/sys/vm/max_map_count
  65530

2. Next, find the PID of spoold and check how many maps it is using. The count includes an overhead of 2. In the output below (PID 180690), spoold is using the maximum number of maps (65530 + 2 = 65532), which confirms the issue:

  $ wc /proc/180690/maps
  65532 1310837 12851554
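
The check in step 2 can be scripted as a small sketch. The function name map_usage is hypothetical, and the usage line in the comment assumes the process is named exactly spoold and that pgrep is available on the host.

```shell
#!/bin/sh
# Compare the number of mapped regions of a process against a limit.
# Usage: map_usage MAPS_FILE LIMIT
map_usage() {
    maps_file=$1
    limit=$2
    # Each line of a maps file describes one mapped region.
    used=$(grep -c '' "$maps_file")
    if [ "$used" -ge "$limit" ]; then
        echo "$used of $limit maps in use: at the limit"
    else
        echo "$used of $limit maps in use: below the limit"
    fi
}

# Possible invocation on the MSDP host (not run here):
#   map_usage "/proc/$(pgrep -o -x spoold)/maps" "$(cat /proc/sys/vm/max_map_count)"
```

A result at the limit matches the 65532-line output shown in step 2.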

 

3. Further evidence comes from /proc/buddyinfo, which shows how many free memory blocks exist at each size. The first column counts single pages (typically 4 KB), and the block size doubles with each column to the right. The rows containing 'Normal' are of interest. Note the run of zeroes in the example below: Node 0 had no free blocks of 32 KB or larger, and Node 1 had no free blocks of 64 KB or larger.

$ cat /proc/buddyinfo
Node 0, zone      DMA      1      0      1      0      0      1      1      0      1      1      3
Node 0, zone    DMA32     13     11     10     16     10      5      7      7      8     10    311
Node 0, zone   Normal 2760370 1699017  51484      0      0      0      0      0      0      0      0
Node 1, zone   Normal 1293146 1898701 1993623   9694      0      0      0      0      0      0      0

More information on how to read this can be found at the link below.

https://www.supportsages.com/what-is-proc-buddyinfo/
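
Reading the columns by eye is error-prone, so the following sketch summarizes the largest free block in each Normal zone. It assumes a 4 KB page size unless another value is passed, and largest_free_block is a hypothetical helper name.

```shell
#!/bin/sh
# Print the largest free block size for each 'Normal' zone.
# Usage: largest_free_block BUDDYINFO_FILE [PAGE_SIZE_KB]
largest_free_block() {
    awk -v page_kb="${2:-4}" '
        $3 == "zone" && $4 == "Normal" {
            largest = -1
            # Fields 5..NF hold free-block counts for order 0, 1, 2, ...
            for (i = 5; i <= NF; i++)
                if ($i > 0) largest = i - 5
            node = $2
            sub(/,$/, "", node)   # "0," becomes "0"
            if (largest < 0)
                printf "Node %s: no free blocks\n", node
            else
                printf "Node %s: largest free block is %d KB\n", node, page_kb * 2 ^ largest
        }
    ' "$1"
}

# Possible invocation on the MSDP host (not run here):
#   largest_free_block /proc/buddyinfo
```

For the example output in step 3, this reports 16 KB for Node 0 and 32 KB for Node 1, which is far smaller than the 2 MB chunks spoold requests by default.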

4. To resolve the issue, raise vm.max_map_count and increase the allocation unit size for spoold so that it allocates memory in larger contiguous pieces.

a. Set vm.max_map_count to 262144. This requires root privileges.

If /etc/sysctl.conf has no vm.max_map_count line, add the following line; otherwise, update the existing line:

vm.max_map_count = 262144

Then run the following command to apply the change:

sysctl -p 
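
The add-or-update logic of step a can be sketched as a shell function. The name set_max_map_count is hypothetical, and the function takes the file path as an argument so it can be tried on a copy before touching /etc/sysctl.conf.

```shell
#!/bin/sh
# Add or update the vm.max_map_count line in a sysctl configuration file.
# Usage: set_max_map_count CONF_FILE VALUE
set_max_map_count() {
    conf=$1
    value=$2
    pattern='^[[:space:]]*vm\.max_map_count[[:space:]]*='
    if grep -Eq "$pattern" "$conf"; then
        # Keep a copy of the previous file, then rewrite the existing line.
        cp -p "$conf" "$conf.bak" || return 1
        sed -E "s/$pattern.*/vm.max_map_count = $value/" "$conf.bak" > "$conf"
    else
        # No existing line: append one.
        printf 'vm.max_map_count = %s\n' "$value" >> "$conf"
    fi
}

# Possible invocation as root (not run here):
#   set_max_map_count /etc/sysctl.conf 262144 && sysctl -p
```

Afterwards, /proc/sys/vm/max_map_count should report the new value.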
 

b. Make a backup copy of /<storage>/etc/puredisk/contentrouter.cfg

c. Edit the file: locate the line containing 'AllocationUnitSize' and change it to a higher value (32 MB is used below):

    AllocationUnitSize=32MiB

d. Save the file, exit the editor, and restart spoold:
    $ /usr/openv/pdde/pdconfigure/etc/init.d/RedHat/pdservice restart spoold
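
Steps b and c can be combined in a short sketch that backs up the file and rewrites the setting. The function name set_allocation_unit and the .orig backup suffix are illustrative choices, and the function deliberately stops if the setting is not already present rather than guessing where it belongs.

```shell
#!/bin/sh
# Back up a contentrouter.cfg file and change its AllocationUnitSize value.
# Usage: set_allocation_unit CFG_FILE SIZE     (for example: 32MiB)
set_allocation_unit() {
    cfg=$1
    size=$2
    pattern='^[[:space:]]*AllocationUnitSize[[:space:]]*='
    if ! grep -Eq "$pattern" "$cfg"; then
        echo "AllocationUnitSize not found in $cfg" >&2
        return 1
    fi
    # Step b: keep a backup copy next to the original.
    cp -p "$cfg" "$cfg.orig" || return 1
    # Step c: rewrite only the AllocationUnitSize line.
    sed -E "s/$pattern.*/AllocationUnitSize=$size/" "$cfg.orig" > "$cfg"
}

# Possible invocation (not run here; /<storage> is the MSDP storage path):
#   set_allocation_unit /<storage>/etc/puredisk/contentrouter.cfg 32MiB
```

The new value takes effect only after spoold is restarted as shown in step d.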

5. The change in behavior should be visible soon after the restart. buddyinfo should now show larger free blocks of memory, which spoold is able to use.

$ cat /proc/buddyinfo
Node 0, zone      DMA      0      0      1      0      2      1      1      0      0      1      3
Node 0, zone    DMA32     17     16     16     13     12      9     10     13      8     10    306
Node 0, zone   Normal   1212    115 1187136 1691773 1422004 949352 515242 165411  33413  14674    313
Node 1, zone   Normal 5232448 8389951 4961591 1385326 497407 312687 163422  60920  18253   2732    331
