For a long time, Linux did not have a standard mechanism for capturing crash dumps, and different distributions shipped different techniques. With the 2.6 kernels we finally have a common method for managing crash dumps, built on kexec and kdump. This talk will describe the technologies involved and how to configure and manage a production environment to capture vmcores.

Kdump is a new kernel crash dumping mechanism. It is very reliable because the crash dump is captured from the context of a freshly booted kernel, not from the context of the crashed kernel. Kdump uses kexec to boot into a second kernel whenever the system crashes. This second kernel, often called the capture kernel, boots with very little memory and captures the dump image. The first kernel reserves a section of memory that the second kernel uses to boot. Kexec boots the capture kernel without going through the BIOS, so the contents of the first kernel's memory are preserved; that preserved memory is essentially the kernel crash dump.

About Subhendu Ghosh: Subhendu Ghosh is currently a Sr. Solutions Architect with Red Hat and has been involved in Open Source for a while, including as Project Admin for Nagios-Plugins. Prior to Red Hat, Subhendu worked at Qwest Communications and AT&T, among other companies. He previously presented to NYLUG in April 2008 on the Cobbler Provisioning System.

The slides used in this meeting are here:
Presentation slides
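
As a rough illustration of the reserved-memory mechanism described in the abstract, the sketch below checks whether a running system has set aside memory for a capture kernel and whether one is currently loaded. It is not taken from the talk. It assumes the reservation is requested with a `crashkernel=` option on the kernel command line (read from `/proc/cmdline`) and that the kernel exposes `/sys/kernel/kexec_crash_loaded`; the function names are hypothetical, and the value is returned as raw text rather than interpreted.

```python
from pathlib import Path
from typing import Optional


def crashkernel_param(cmdline: str) -> Optional[str]:
    """Return the raw value of a crashkernel= option, or None if absent.

    Simplification: if the option appears more than once, the last
    occurrence is returned, and the value is not parsed any further.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            value = token[len("crashkernel="):]
    return value


def capture_kernel_loaded(
    path: str = "/sys/kernel/kexec_crash_loaded",
) -> Optional[bool]:
    """Return True or False from sysfs, or None if the file cannot be read."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        return None


if __name__ == "__main__":
    try:
        cmdline = Path("/proc/cmdline").read_text()
    except OSError:
        cmdline = ""  # not on Linux, or /proc is unavailable
    print("crashkernel reservation:", crashkernel_param(cmdline))
    print("capture kernel loaded:", capture_kernel_loaded())
```

A result of `None` for the reservation suggests that the first kernel was booted without reserving memory for a capture kernel, in which case kdump would have nowhere to boot the second kernel.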