Enable KDUMP on RHEL server

Pre-requisites :

For dumping cores to a network target, access to a server over NFS or SSH is required.

Whether dumping locally or to a network target, a device or directory with enough free disk space is needed to hold the core.

To configure kdump on a system running a Xen kernel, a regular kernel of the same version as the running Xen kernel must be installed. (If the system is 32-bit with more than 4GB of RAM, install kernel-pae alongside kernel-xen instead of kernel.) Note: The kernel only needs to be installed. You can continue running the Xen kernel, and no reboot is required.
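
As a rough pre-flight check for the disk space prerequisite, the sketch below compares installed RAM with the free space on the dump filesystem. It assumes the worst case of an unfiltered, uncompressed vmcore about the size of RAM. The function names are illustrative and not part of kexec-tools.

```shell
#!/bin/bash
# Hypothetical helpers (not part of kexec-tools).

# Print total RAM in kB, read from a meminfo-style file (default /proc/meminfo).
mem_total_kb() {
    awk '/^MemTotal:/ {print $2}' "${1:-/proc/meminfo}"
}

# Print free space in kB on the filesystem holding directory $1.
avail_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Succeed if $2 kB of free space can hold a dump of $1 kB of RAM.
dump_fits() {
    [ "$2" -ge "$1" ]
}
```

Possible usage: dump_fits "$(mem_total_kb)" "$(avail_kb /var/crash)" && echo "enough space"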


Installation :

1) Install the kexec-tools package

   yum install kexec-tools

2) Add boot parameters :

The crashkernel option must be added to the kernel command line parameters in order to reserve memory for the kdump kernel.

The following is an example of /boot/grub/grub.conf with the kdump options added for RHEL 5:

# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You do not have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /, eg.
#          root (hd0,0)
#          kernel /boot/vmlinuz-version ro root=/dev/hda1
#          initrd /boot/initrd-version.img
#boot=/dev/hda
default=0
timeout=5
splashimage=(hd0,0)/boot/grub/splash.xpm.gz
hiddenmenu
title Red Hat Enterprise Linux Client (2.6.17-1.2519.4.21.el5)
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.17-1.2519.4.21.el5 ro root=LABEL=/ rhgb quiet crashkernel=128M@16M
        initrd /boot/initrd-2.6.17-1.2519.4.21.el5.img


For RHEL 6:

# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/mapper/vg_example-lv_root
#          initrd /initrd-[generic-]version.img
# boot=/dev/vda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Red Hat Enterprise Linux Server (2.6.32-71.7.1.el6.x86_64)
       root (hd0,0)
       kernel /vmlinuz-2.6.32-71.7.1.el6.x86_64 ro root=/dev/mapper/vg_example-lv_root rd_LVM_LV=vg_example/lv_root rd_LVM_LV=vg_example/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us crashkernel=128M rhgb quiet
       initrd /initramfs-2.6.32-71.7.1.el6.x86_64.img

For RHEL 5 when using a Xen kernel:

# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You do not have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /, eg.
#          root (hd0,0)
#          kernel /boot/vmlinuz-version ro root=/dev/hda1
#          initrd /boot/initrd-version.img
# boot=/dev/hda
default=0
timeout=5
splashimage=(hd0,0)/boot/grub/splash.xpm.gz
hiddenmenu
title Red Hat Enterprise Linux Server (2.6.18-194.17.1.el5xen)
       root (hd0,0)
       kernel /xen.gz-2.6.18-194.17.1.el5 crashkernel=128M@16M
       module /vmlinuz-2.6.18-194.17.1.el5xen ro root=/dev/myvg/rootvol
       module /initrd-2.6.18-194.17.1.el5xen.img
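
After rebooting with the new parameter, it is worth checking that the running kernel actually received it. The sketch below extracts the crashkernel= value from a command line string. The function name is hypothetical. On a Xen host the parameter sits on the hypervisor line, so it may not show up in /proc/cmdline.

```shell
#!/bin/bash
# Hypothetical helper: print the crashkernel= value found in a kernel
# command line string ($1), or fail if the parameter is absent.
crashkernel_value() {
    local word
    for word in $1; do
        case "$word" in
            crashkernel=*) echo "${word#crashkernel=}"; return 0 ;;
        esac
    done
    return 1
}
```

Possible usage: crashkernel_value "$(cat /proc/cmdline)" || echo "crashkernel not set"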

3) Specifying the kdump location :

The location where kdump stores the vmcore is set in /etc/kdump.conf.
Default kdump store location : /var/crash
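
To see which directory a given configuration will actually use, a small sketch like the one below can read the path directive and fall back to the default when it is unset. The function name is an assumption made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the "path" directive from a kdump.conf-style
# file (default /etc/kdump.conf), falling back to /var/crash when unset.
# Commented-out lines such as "#path /var/crash" are ignored.
kdump_path() {
    local p
    p=$(awk '$1 == "path" {print $2}' "${1:-/etc/kdump.conf}" | tail -n 1)
    echo "${p:-/var/crash}"
}
```

Possible usage: kdump_path /etc/kdump.conf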

Dumping to a SAN device (RHEL 6 with a multipath device)

Note: This method is supported by Red Hat, subject to the requirement below.


This configuration is only valid from kexec-tools-2.0.0-245.el6.x86_64 onward. With an older kexec-tools package, a multipath device cannot be used as the kdump target.



# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.3 (Santiago)

# uname -a
Linux xxxxx 2.6.32-279.22.1.el6.x86_64 #1 SMP Sun Jan 13 09:21:40 EST 2013 x86_64 x86_64 x86_64 GNU/Linux

# rpm -qa | grep kexec
kexec-tools-2.0.0-245.el6.x86_64

# rpm -qa | grep multipath
device-mapper-multipath-0.4.9-56.el6_3.1.x86_64
device-mapper-multipath-libs-0.4.9-56.el6_3.1.x86_64
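
The version requirement can be checked in a script. The sketch below is a minimal example that assumes the upstream version stays at 2.0.0, as in the rpm output above, and only compares the release number. The function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed if an RPM name such as
# kexec-tools-2.0.0-245.el6.x86_64 has a release number >= 245.
# Assumes the upstream version is 2.0.0; anything else is rejected.
kexec_supports_multipath() {
    local rel
    rel=$(echo "$1" | sed -n 's/^kexec-tools-2\.0\.0-\([0-9][0-9]*\)\..*$/\1/p')
    [ -n "$rel" ] && [ "$rel" -ge 245 ]
}
```

Possible usage: kexec_supports_multipath "$(rpm -q kexec-tools)" || echo "kexec-tools too old"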

1. Check the multipath status


# multipath -ll
mpathf (3600144f08c3d8b000000511a51b10002) dm-7
size=100G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 12:0:0:1 sdk 8:160 active ready running
  |- 13:0:0:1 sdm 8:192 active ready running
  |- 14:0:0:1 sdo 8:224 active ready running
  |- 15:0:0:1 sdq 65:0  active ready running
  `- 16:0:0:1 sds 65:32 active ready running
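
For a quick health check before continuing, the number of usable paths can be counted from this output. This is only a sketch; the function name is made up for illustration and it simply matches the "active ready running" state text shown above.

```shell
#!/bin/bash
# Hypothetical helper: count paths reported as "active ready running"
# in `multipath -ll` output read from standard input.
count_ready_paths() {
    grep -c 'active ready running'
}
```

Possible usage: multipath -ll | count_ready_paths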

2. Create a partition on the LUN. Make sure you have selected the right device.


# fdisk -l 
# fdisk /dev/mapper/mpath<x>

3. Create a Linux partition on the disk in fdisk, then re-read the partition table


# partprobe /dev/mapper/mpath<x>
# multipath -r

4. Validate that the partition is there


# fdisk -l

5. Create an ext3 filesystem on it (ext4 would probably work as well)


# mkfs.ext3  /dev/mapper/mpath<x>p1

6. Add an entry for the new filesystem to the end of /etc/fstab, identifying it by UUID.

7. Check the UUID with the blkid command.


# blkid


8. Make the entries in /etc/fstab

# vi /etc/fstab
  Ex:
        UUID=b2d74f2e-2dbf-4714-9787-ba1c147c4386           /var/crash            ext3     defaults,_netdev 0 0    <--- for iSCSI multipath

        UUID=b2d74f2e-2dbf-4714-9787-ba1c147c4386           /var/crash            ext3     defaults               0 0    <--- for SAN multipath
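
To avoid typing the UUID by hand, the fstab line can be generated. The sketch below only prints the line; appending it to /etc/fstab is left to you. The function name and its third argument are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print an /etc/fstab line that mounts the filesystem
# with UUID $1 and type $2 on /var/crash. Pass "iscsi" as $3 to add _netdev.
crash_fstab_line() {
    local opts=defaults
    if [ "$3" = "iscsi" ]; then
        opts="defaults,_netdev"
    fi
    printf 'UUID=%s /var/crash %s %s 0 0\n' "$1" "$2" "$opts"
}
```

Possible usage: crash_fstab_line "$(blkid -s UUID -o value /dev/mapper/mpath<x>p1)" ext3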

9. Validate that the partition mounts automatically


# mount -a
# mount
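
Instead of reading the mount output by eye, the check can be scripted. This is a minimal sketch that looks for the directory in the mounts table; the function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed if directory $1 is listed as a mount point
# in a mounts table (default /proc/mounts, or the file given as $2).
is_mount_point() {
    awk -v dir="$1" '$2 == dir {found = 1} END {exit !found}' "${2:-/proc/mounts}"
}
```

Possible usage: is_mount_point /var/crash && echo "/var/crash is mounted"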

10. Now edit /etc/kdump.conf accordingly


ext3 UUID=b2d74f2e-2dbf-4714-9787-ba1c147c4386

Restart kdump and enable it at boot with chkconfig.

# service kdump restart

# chkconfig kdump on
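
After restarting the service, one way to confirm that the crash kernel was actually loaded is to read /sys/kernel/kexec_crash_loaded, which the kernel sets to 1 when a crash kernel is in place. The function below is a sketch with a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the crash kernel is loaded, judged by a
# "1" in /sys/kernel/kexec_crash_loaded (or in the file given as $1).
crash_kernel_loaded() {
    [ "$(cat "${1:-/sys/kernel/kexec_crash_loaded}" 2>/dev/null)" = "1" ]
}
```

Possible usage: crash_kernel_loaded && echo "kdump is armed"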

11. Make sure SysRq is enabled and test the crash. This will crash the system, so schedule it carefully if this is a production system.

#############################################
How do I enable and disable the SysRq key?

    For security reasons, Red Hat Enterprise Linux disables the SysRq key by default. To enable it, run:

    # echo 1 > /proc/sys/kernel/sysrq

    To disable it:

    # echo 0 > /proc/sys/kernel/sysrq  

    To enable it permanently, set the kernel.sysrq value in /etc/sysctl.conf to 1.

    # grep sysrq /etc/sysctl.conf
    kernel.sysrq = 1

    To load the settings from /etc/sysctl.conf into the running kernel, run:

    # sysctl -p
#################################################
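
The permanent setting can also be applied without opening an editor. The sketch below rewrites an existing kernel.sysrq line or appends one if none exists. It relies on GNU sed -i, the function name is hypothetical, and it is best tried on a copy of the file first.

```shell
#!/bin/bash
# Hypothetical helper: make sure file $1 (a sysctl.conf-style file)
# contains "kernel.sysrq = 1", rewriting an existing setting if present
# and appending the line otherwise.
set_sysrq_persistent() {
    local conf=$1
    if grep -q '^[[:space:]]*kernel\.sysrq[[:space:]]*=' "$conf"; then
        sed -i 's/^[[:space:]]*kernel\.sysrq[[:space:]]*=.*/kernel.sysrq = 1/' "$conf"
    else
        echo 'kernel.sysrq = 1' >> "$conf"
    fi
}
```

Possible usage: set_sysrq_persistent /etc/sysctl.conf && sysctl -p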

# echo 'c' > /proc/sysrq-trigger

12. Once the system boots back up, confirm that it worked


# tree /var/crash/
/var/crash/
├── 127.0.0.1-2013-02-12-21:11:03
│   └── vmcore
└── lost+found
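
If tree is not installed, find gives the same confirmation and also shows how large each dump is. This is a minimal sketch; the function name is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: list vmcore files under directory $1
# (default /var/crash) together with their sizes in kB.
list_vmcores() {
    find "${1:-/var/crash}" -type f -name vmcore -exec du -k {} +
}
```

Possible usage: list_vmcores /var/crash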


