-
Story
-
Resolution: Done
-
Normal
-
rhel-10.0.beta
-
kexec-tools-2.0.28-10.el10
-
None
-
FutureFeature, TestOnly
-
2
-
rhel-kernel-debug
-
ssg_core_kernel
-
16
-
22
-
3
-
False
-
False
-
-
No
-
CK-June-2024, CK-July-2024
-
Pass
-
RegressionOnly
-
If docs needed, set a value
-
-
x86_64
-
None
-
57,005
This bug was initially created as a copy of Bug #2118897
I am copying this bug because:
Corresponding to the kernel change, user space needs adjustment too for the feature to take effect. For kexec_file_load, the following needs to be done:
- Prevent udev from updating kdump crash kernel on hot un/plug changes.
Add the following as the first lines to the udev rule file
/usr/lib/udev/rules.d/98-kexec.rules:
# The kernel handles updates to crash elfcorehdr for cpu and memory changes
SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
These lines will cause cpu and memory hot un/plug events to be
skipped within this rule file, if the kernel has these changes
enabled.
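As a rough illustration of the step above, the following sketch prepends the two rules to a udev rules file and does nothing if they are already the first lines. The helper name `prepend_skip_rules` is hypothetical, and writing the descriptive line as a `#` comment is an assumption; this is not how kexec-tools itself ships the change.

```python
from pathlib import Path

# The rule lines quoted in this report. Treating the first line as a
# '#' comment is an assumption made for this sketch.
SKIP_RULES = [
    "# The kernel handles updates to crash elfcorehdr for cpu and memory changes",
    'SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"',
    'SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"',
]


def prepend_skip_rules(rules_path):
    """Insert SKIP_RULES at the top of rules_path unless already there.

    Hypothetical helper. Returns True if the file was modified and
    False if the rules were already the first lines.
    """
    path = Path(rules_path)
    lines = path.read_text().splitlines()
    if lines[: len(SKIP_RULES)] == SKIP_RULES:
        return False
    path.write_text("\n".join(SKIP_RULES + lines) + "\n")
    return True
```

It would be called against a copy of /usr/lib/udev/rules.d/98-kexec.rules; the skip only works if that file defines the `kdump_reload_end` label that the GOTO targets.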
Description of problem:
When the kdump service is loaded, if a CPU or memory is hot un/plugged, the crash elfcorehdr, which describes the CPUs and memory in the system, must also be updated; otherwise the resulting vmcore is inaccurate (e.g. missing either CPU context or memory regions).
The current solution utilizes udev to initiate an unload-then-reload of the kdump image (i.e. kernel, initrd, boot_params, purgatory and elfcorehdr) by the userspace kexec utility. Offloading this activity to userspace causes significant performance problems.
Upstream, a patchset introduces a generic crash hot un/plug handler that registers with the CPU and memory notifiers. Upon CPU or memory changes, this generic handler is invoked and in turn invokes an architecture-specific handler to do the appropriate updates.
In the case of x86_64, the arch specific handler generates a new
elfcorehdr, and overwrites the old one in memory. No involvement
with userspace needed.
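To check whether a running kernel advertises this in-kernel handling, one could read the `crash_hotplug` attribute that the udev rules match on. The sketch below assumes the attribute is exposed at `/sys/devices/system/cpu/crash_hotplug` and `/sys/devices/system/memory/crash_hotplug`; those paths and the function name `crash_hotplug_enabled` are assumptions, not taken from this report.

```python
from pathlib import Path


def crash_hotplug_enabled(subsystem, sysfs_root="/sys"):
    """Return True if the crash_hotplug attribute for subsystem reads "1".

    The path layout <sysfs_root>/devices/system/<subsystem>/crash_hotplug
    is an assumption made for this sketch.
    """
    attr = Path(sysfs_root) / "devices" / "system" / subsystem / "crash_hotplug"
    try:
        return attr.read_text().strip() == "1"
    except OSError:
        # Missing or unreadable attribute: treat as not handled in-kernel.
        return False
```

Calling `crash_hotplug_enabled("cpu")` and `crash_hotplug_enabled("memory")` on the test machine gives a quick indication of whether the udev skip rules would match.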
[PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug
https://lore.kernel.org/all/20220721181747.1640-1-eric.devolder@oracle.com/T/#u
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Additional info:
- external trackers
- links to
-
RHBA-2024:131448
kexec-tools bug fix and enhancement update