The DMP I/O statistics thread may exhaust memory, invoking the Linux OOM (Out Of Memory) killer and causing a system panic.

Problem

The system panics while the DMP I/O statistics queue is being expanded. The following stack trace can be observed in syslog before the panic:

oom_kill_process+0x8a/0x2c0
select_bad_process+0xe1/0x120
out_of_memory+0x220/0x3c0
__alloc_pages_nodemask+0x89e/0x940
alloc_pages_current+0xaa/0x110
__vmalloc_area_node+0xe6/0x190
dmp_alloc+0x176/0x220 [vxdmp]
__vmalloc_node+0xa2/0xb0
dmp_alloc+0x176/0x220 [vxdmp]
vmalloc_32+0x2c/0x30
dmp_alloc+0x176/0x220
dmp_zalloc+0x1e/0x50
dmp_iostatq_add+0xef/0x690
dmp_iostatq_op+0x2cc/0x800
dmp_process_stats+0x0/0xe60
dmp_daemons_loop+0x1d7/0x260

Error Message

From syslog:

Sep 19 12:31:55 node4 kernel: dmp_daemon invoked oom-killer: gfp_mask=0xd4, order=0, oom_adj=0, oom_score_adj=0
Sep 19 12:31:55 node4 kernel: Pid: 12488, comm: dmp_daemon Tainted: P           ----------------   2.6.32-220.7.1.el6.x86_64#1
Sep 19 12:31:55 node4 kernel: Call Trace:
Sep 19 12:31:55 node4 kernel: [<ffffffff810c2c61>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Sep 19 12:31:55 node4 kernel: [<ffffffff811139e0>] ? dump_header+0x90/0x1b0
Sep 19 12:31:55 node4 kernel: [<ffffffff8120d7ac>] ? security_real_capable_noaudit+0x3c/0x70
Sep 19 12:31:55 node4 kernel: [<ffffffff81113e6a>] ? oom_kill_process+0x8a/0x2c0
Sep 19 12:31:55 node4 kernel: [<ffffffff81113da1>] ? select_bad_process+0xe1/0x120
Sep 19 12:31:55 node4 kernel: [<ffffffff811142c0>] ? out_of_memory+0x220/0x3c0
Sep 19 12:31:55 node4 kernel: [<ffffffff81123fde>] ? __alloc_pages_nodemask+0x89e/0x940
Sep 19 12:31:55 node4 kernel: [<ffffffff81158b2a>] ? alloc_pages_current+0xaa/0x110
Sep 19 12:31:55 node4 kernel: [<ffffffff81149c36>] ? __vmalloc_area_node+0xe6/0x190
Sep 19 12:31:55 node4 kernel: [<ffffffffa07c6656>] ? dmp_alloc+0x176/0x220 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffff81149b42>] ? __vmalloc_node+0xa2/0xb0
Sep 19 12:31:55 node4 kernel: [<ffffffffa07c6656>] ? dmp_alloc+0x176/0x220 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffff81149d9c>] ? vmalloc_32+0x2c/0x30
Sep 19 12:31:55 node4 kernel: [<ffffffffa07c6656>] ? dmp_alloc+0x176/0x220 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffffa07c671e>] ? dmp_zalloc+0x1e/0x50 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffffa07f8b5f>] ? dmp_iostatq_add+0xef/0x690 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffff81279000>] ? __bitmap_shift_right+0x130/0x160
Sep 19 12:31:55 node4 kernel: [<ffffffffa07fa66c>] ? dmp_iostatq_op+0x2cc/0x800 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffffa07fb845>] ? dmp_process_stats+0xaa5/0xe60 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffff81012b59>] ? read_tsc+0x9/0x20
Sep 19 12:31:55 node4 kernel: [<ffffffff8109b310>] ? getnstimeofday+0x60/0xf0
Sep 19 12:31:55 node4 kernel: [<ffffffffa07fada0>] ? dmp_process_stats+0x0/0xe60 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffffa0801ca7>] ? dmp_daemons_loop+0x1d7/0x260 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
Sep 19 12:31:55 node4 kernel: [<ffffffffa0801ad0>] ? dmp_daemons_loop+0x0/0x260 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
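
To check whether a system has already hit this signature, the syslog can be searched for the two distinctive lines above. The following is a minimal sketch, not a Symantec tool: the function names are hypothetical, and the default path /var/log/messages is an assumption that depends on the local syslog configuration.

```shell
#!/bin/sh
# Hypothetical helpers for spotting the OOM signature shown in this article.
# Assumption: kernel messages are written to /var/log/messages by default.

# Print the number of lines where dmp_daemon invoked the OOM killer.
dmp_oom_hits() {
    logfile="${1:-/var/log/messages}"
    # grep -c prints the count of matching lines (0 if none match).
    grep -c 'dmp_daemon invoked oom-killer' "$logfile" 2>/dev/null
}

# Succeed if the vxdmp queue-expansion frame from the call trace is present.
dmp_oom_trace_seen() {
    logfile="${1:-/var/log/messages}"
    # In a basic regular expression the "+" is literal; "\[" matches a bracket.
    grep -q 'dmp_iostatq_add+.*\[vxdmp\]' "$logfile" 2>/dev/null
}
```

For example, `dmp_oom_hits /var/log/messages` prints the number of matching OOM-killer lines, and `dmp_oom_trace_seen /var/log/messages` returns success only if a `dmp_iostatq_add` frame from the vxdmp module is logged.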
 

Cause

This issue is tracked via Symantec etrack incident # 2943637.

While expanding the DMP I/O statistics queue, the driver allocates memory with a sleeping (blocking) allocation. If the Linux kernel cannot satisfy the request, for example when the system is under heavy load and the total size of the per-CPU memory chunks is large because of the number of CPUs, it invokes the OOM killer to kill other processes or threads and free memory, which may cause a system panic.

Solution

Symantec has changed the code to use a non-sleeping allocation when expanding the DMP I/O statistics queue. If the system cannot satisfy the request, the allocation now fails quickly instead of invoking the OOM killer.

The fix is included in the VxVM 5.1SP1RP3P1 patch, which is available from the SORT website:

https://sort.symantec.com/patch/detail/6984

 

Until the patch is installed, use the workaround below.

Workaround

To stop DMP IO statistics collection:

# vxdmpadm iostat stop
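
When the workaround is applied from a script across several nodes, it can help to fail clearly on hosts where the command is missing. The wrapper below is a sketch under that assumption; the function name, message, and return code are illustrative and not part of VxVM. It only runs the command given above.

```shell
#!/bin/sh
# Hypothetical wrapper around the workaround command from this article.
# Run as root on a host with VxVM installed.
stop_dmp_iostat() {
    # Bail out early if vxdmpadm is not on PATH (for example, VxVM not installed).
    if ! command -v vxdmpadm >/dev/null 2>&1; then
        echo "vxdmpadm not found in PATH; is VxVM installed?" >&2
        return 127
    fi
    # Stop DMP I/O statistics collection; the exit status is that of vxdmpadm.
    vxdmpadm iostat stop
}
```

Calling `stop_dmp_iostat` returns the exit status of `vxdmpadm iostat stop`, or 127 if the command cannot be found.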

 


Applies To

This issue has been observed only on Linux systems running:

- VxVM 5.1SP1 and above
