Linux RedHat / Suse Kernel Panic analysis with kdump and crash.

  • Article ID:000009942


Some VxVM commands can be very I/O intensive and occasionally panic the system.  The root cause of the panic can be determined by examining the system core file.  However, the system must be configured beforehand to generate and save the system core file.

Error Message

If the root cause is an I/O related hang, there may be no indication in the messages file about the hang or panic.  A core analysis may be needed.


If the problem is repeatable, enable kdump and install the crash and kernel debuginfo RPMs on your machine.  In this example we are running Red Hat Enterprise Linux Server release 5.5 (Tikanga).  Verify that the kernel headers, kernel-debuginfo-common, kernel-debuginfo, kdump and crash are installed.
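
One way to check this is to query the RPM database for each package.  The sketch below is only an illustration: the package names are assumptions based on the list above (on RHEL 5, kdump is normally delivered in the kexec-tools package), and the kernel-debuginfo packages must match the running kernel version.

```shell
#!/bin/sh
# Print the name of every listed package that is NOT installed.
missing_pkgs() {
    for pkg in "$@"; do
        # rpm -q exits non-zero when the package is not installed
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Package names are assumptions; adjust them for your distribution.
missing_pkgs kernel-headers kernel-debuginfo-common kernel-debuginfo \
             kexec-tools crash
```

No output means that every package in the list is installed.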


Some of these RPMs are on the install disk; others, namely the kernel-debuginfo and kernel-debuginfo-common RPMs, must be downloaded from Red Hat.

First, configure kdump.  Few administrators seem to do this on Linux systems, but it can be done with the graphical tool /usr/bin/system-config-kdump.  This utility reserves 128 MB of system memory for the "crash kernel" that performs the dump.
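
Whether the reservation took effect can be checked after a reboot by looking for the crashkernel= parameter on the kernel command line.  This is a minimal sketch, assuming the tool above wrote the parameter to the boot loader configuration; the function name is illustrative.

```shell
#!/bin/sh
# Print the value of the crashkernel= parameter found in a kernel
# command line string, or nothing if the parameter is absent.
crashkernel_arg() {
    for arg in $1; do
        case "$arg" in
            crashkernel=*) echo "${arg#crashkernel=}" ;;
        esac
    done
}

# /proc/cmdline holds the command line of the running kernel.
crashkernel_arg "$(cat /proc/cmdline 2>/dev/null)"
```

An empty result means that no memory is reserved, so kdump cannot capture a dump.  On RHEL 5 the kdump service also has to be running, which can usually be checked with "service kdump status".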

Edit /etc/sysctl.conf to add or modify these parameters:


# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 1

# Enable auto system reboot after system crash
kernel.panic = 60

If desired, set these parameters interactively on the OS command line:

[]# sysctl -w kernel.sysrq=1
[]# sysctl -w kernel.panic=60
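
To confirm that the running kernel picked up the values, they can be read back with sysctl -n.  The helper below is a sketch; its name is hypothetical and the expected values are the ones used in this article.

```shell
#!/bin/sh
# Compare the running value of a kernel parameter with an expected value.
check_sysctl() {
    # Fall back to "unknown" if sysctl is missing or the key does not exist.
    actual=$(sysctl -n "$1" 2>/dev/null) || actual="unknown"
    if [ "$actual" = "$2" ]; then
        echo "$1 ok"
    else
        echo "$1 is '$actual', expected '$2'"
    fi
}

check_sysctl kernel.sysrq 1
check_sysctl kernel.panic 60
```

Values set with sysctl -w do not survive a reboot; the entries in /etc/sysctl.conf are applied at boot or with "sysctl -p".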

-----  After the Crash ----

In this test case, the panic problem was recreated and the system core was written to /var/crash/<date:time>/vmcore.  Crash analysis can begin with the following command:

# crash /boot/ /usr/lib/debug/lib/modules/2.6.18-194.el5/vmlinux ./vmcore 

The summary that crash prints by default at startup shows that the process running when the panic occurred was the OS "vol_id" command.

  SYSTEM MAP: /boot/
DEBUG KERNEL: /usr/lib/debug/lib/modules/2.6.18-194.el5/vmlinux (2.6.18-194.el5)
    DUMPFILE: ./vmcore
        CPUS: 4
        DATE: Thu Nov 11 13:09:34 2010
      UPTIME: 00:05:03
LOAD AVERAGE: 0.05, 0.23, 0.12
       TASKS: 313
     RELEASE: 2.6.18-194.el5
     VERSION: #1 SMP Tue Mar 16 21:52:39 EDT 2010
     MACHINE: x86_64  (1596 Mhz)
      MEMORY: 2 GB
       PANIC: "Oops: 0000 [1] SMP " (check log for details)
         PID: 7957
     COMMAND: "vol_id"
        TASK: ffff81005214b820  [THREAD_INFO: ffff810051fe8000]
         CPU: 3
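
If several dumps have accumulated, the most recent vmcore can be found by modification time before crash is started.  This is a small sketch, assuming the /var/crash layout described above and that the analysis runs on the same kernel version that crashed; the function name is illustrative.  It passes only the debug vmlinux and the dump file, which crash accepts on their own.

```shell
#!/bin/sh
# Print the most recently modified vmcore below a crash directory.
latest_vmcore() {
    ls -t "${1:-/var/crash}"/*/vmcore 2>/dev/null | head -n 1
}

core=$(latest_vmcore)
if [ -n "$core" ]; then
    # Only print the command; the debug vmlinux path assumes the
    # kernel-debuginfo package for the running kernel is installed.
    echo "crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux $core"
else
    echo "no vmcore found under /var/crash"
fi
```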

The most useful piece of information is the stack trace, or "backtrace."  Typing "bt" at the crash prompt prints one:

crash> bt
PID: 7957   TASK: ffff81005214b820  CPU: 3   COMMAND: "vol_id"
 #0 [ffff810051fe9730] crash_kexec at ffffffff800aeb6b
 #1 [ffff810051fe97f0] __die at ffffffff80066157
 #2 [ffff810051fe9830] do_page_fault at ffffffff80067dd7
 #3 [ffff810051fe9920] error_exit at ffffffff8005ede9
    [exception RIP: part_round_stats+19]
    RIP: ffffffff801447a1  RSP: ffff810051fe99d8  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: ffff81007ab57ac0  RCX: d600000000000000
    RDX: 0000000000000000  RSI: 8000000000000000  RDI: ffff81007ab57ac0
    RBP: 0000000100000d7e   R8: 000000000000000f   R9: 0000000000000000
    R10: ffff810009930388  R11: ffffffff8014c80a  R12: 0000000000000000
    R13: 0000000000000001  R14: 00000000013efd00  R15: 0000000000800032
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #4 [ffff810051fe99f0] drive_stat_acct at ffffffff80144969


The backtrace shows that the exception occurred in part_round_stats, called from drive_stat_acct.  Searching Red Hat Bugzilla for these function names gives a possible match for a known bug.
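
When the backtrace has been saved to a file, the faulting function can be pulled out of the "exception RIP" line and used as a search term.  This is a sketch using sed; the function name, and the idea of saving the bt output to a file first, are assumptions for illustration.

```shell
#!/bin/sh
# Read crash "bt" output on stdin and print the symbol named on the
# "[exception RIP: symbol+offset]" line, without the offset.
exception_symbol() {
    sed -n 's/.*\[exception RIP: \([^]+]*\).*/\1/p'
}

# Example using the exception line from the backtrace in this article.
echo '    [exception RIP: part_round_stats+19]' | exception_symbol
```

For the backtrace in this article, this prints part_round_stats.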

This concludes the research necessary to find the cause of the crash.  In this case, Red Hat will provide a fix or workaround for this problem.





Applies To

This document uses the following configuration:

Red Hat 5.5, SF 5.1, EMC Clariion Disk with multipath.
