  1. RHEL
  2. RHEL-8732

[RHEL 10] crash: Kernel handling of CPU and memory hot un/plug -- kexec-tools part


    • rhel-kernel-debug
    • ssg_core_kernel
    • 16
    • 22
    • 3
    • False
    • False
    • No
    • CK-June-2024, CK-July-2024
    • If docs needed, set a value
    • None
    • 57,005

      This bug was initially created as a copy of Bug #2118897

      I am copying this bug because:
      Corresponding to the kernel change, user space needs adjustment too for the feature to take effect. For kexec_file_load, the following needs to be done:

      • Prevent udev from updating kdump crash kernel on hot un/plug changes.
        Add the following as the first lines to the udev rule file
        /usr/lib/udev/rules.d/98-kexec.rules:
        # The kernel handles updates to crash elfcorehdr for cpu and memory changes
        SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
        SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

      These lines will cause cpu and memory hot un/plug events to be
      skipped within this rule file, if the kernel has these changes
      enabled.
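
      As a rough illustration of the rule-file change described above, the sketch below prepends the skip rules to the text of a udev rule file and does nothing if they are already the first lines. The helper name prepend_skip_rules and the idempotency check are assumptions made for illustration; they are not part of kexec-tools. The GOTO target kdump_reload_end is taken from the rules quoted in this report and is assumed to exist as a LABEL later in the same file.

```python
# Sketch only: helper name and idempotency check are illustrative assumptions,
# not part of kexec-tools or its packaging.
SKIP_RULES = [
    "# The kernel handles updates to crash elfcorehdr for cpu and memory changes",
    'SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"',
    'SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"',
]


def prepend_skip_rules(rules_text: str) -> str:
    """Return rules_text with the skip rules as its first lines.

    If the skip rules already lead the file, the text is returned unchanged.
    """
    lines = rules_text.splitlines()
    if lines[:len(SKIP_RULES)] == SKIP_RULES:
        return rules_text
    return "\n".join(SKIP_RULES + lines) + "\n"
```

      In practice the function would be applied to the contents of /usr/lib/udev/rules.d/98-kexec.rules and the result written back, followed by a udev rules reload.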

      Description of problem:
      When the kdump service is loaded and a CPU or memory is hot un/plugged, the crash elfcorehdr, which describes the CPUs and memory in the system, must also be updated; otherwise the resulting vmcore is inaccurate (e.g. missing either CPU context or memory regions).

      The current solution uses udev to initiate an unload-then-reload of the kdump image (i.e. kernel, initrd, boot_params, purgatory and elfcorehdr) by the userspace kexec utility. Offloading this activity to userspace causes significant performance problems.

      Upstream, a patchset introduces a generic crash hot un/plug handler that registers with the CPU and memory notifiers. Upon CPU or memory changes, this generic handler is invoked and in turn calls an architecture-specific handler to do the appropriate updates.

      In the case of x86_64, the arch-specific handler generates a new
      elfcorehdr and overwrites the old one in memory. No userspace
      involvement is needed.
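
      The udev rules in this report match on a crash_hotplug attribute of the cpu and memory subsystems. A minimal sketch of checking that attribute from userspace follows; the sysfs locations (a crash_hotplug file under /sys/devices/system/cpu and /sys/devices/system/memory) and the function name are assumptions and should be verified against the running kernel.

```python
from pathlib import Path


def crash_hotplug_enabled(subsystem: str, sysfs_root: str = "/sys/devices/system") -> bool:
    """Report whether the kernel says it handles crash hot un/plug updates.

    Assumption: the attribute is exposed as <sysfs_root>/<subsystem>/crash_hotplug
    and contains "1" when kernel handling is enabled. A missing or unreadable
    file is treated as "not enabled".
    """
    attr = Path(sysfs_root) / subsystem / "crash_hotplug"
    try:
        return attr.read_text().strip() == "1"
    except OSError:
        return False
```

      For example, crash_hotplug_enabled("cpu") and crash_hotplug_enabled("memory") could be checked before deciding whether the udev-triggered kdump reload is still required.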

      [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug
      https://lore.kernel.org/all/20220721181747.1640-1-eric.devolder@oracle.com/T/#u

      Version-Release number of selected component (if applicable):

      How reproducible:

      Steps to Reproduce:
      1.
      2.
      3.

      Actual results:

      Expected results:

      Additional info:

              bhe@redhat.com Baoquan He
              Baoquan He
              Xiaoying Yan