CoreOS OCP / COS-2747

Impact: Upgrading a bare-metal UPI cluster with different CPUs fails; nodes won't boot with the new kernel

    • Type: Story
    • Resolution: Done
    • Priority: Critical

      Impact assessment for OCPBUGS-31320

      Which 4.y.z to 4.y'.z' updates increase vulnerability?

      • Any update to 4.12.45 through 4.12.51
      • Any update to 4.11.54 through 4.11.58
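
The affected ranges above can be checked mechanically. The following is a minimal sketch, assuming plain `x.y.z` version strings; the function names are hypothetical and are not part of any OpenShift tooling.

```python
# Affected update targets, copied from the impact statement above.
AFFECTED_RANGES = [
    ((4, 12, 45), (4, 12, 51)),
    ((4, 11, 54), (4, 11, 58)),
]


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a plain 'x.y.z' string; pre-release suffixes are not handled."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def is_affected_target(version: str) -> bool:
    """Return True if updating to `version` lands in an affected range."""
    parsed = parse_version(version)
    return any(low <= parsed <= high for low, high in AFFECTED_RANGES)
```

For example, `is_affected_target("4.12.47")` is true, while `is_affected_target("4.12.52")` is false.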

      Which types of clusters?

      • Clusters with nodes using AMD 19h family CPUs (Bergamo, Milan, etc.)
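
One way to tell whether a node falls into this group is to inspect `/proc/cpuinfo`, where family 19h appears as decimal `25`. The sketch below is illustrative only: the function name is hypothetical, and it assumes all CPUs in a node share the vendor and family of the first processor entry.

```python
def is_amd_family_19h(cpuinfo_text: str) -> bool:
    """Return True if the first processor entry is AMD family 19h (25 decimal)."""
    vendor = None
    family = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor entries
        key, value = key.strip(), value.strip()
        # Only the first occurrence of each field is recorded.
        if key == "vendor_id" and vendor is None:
            vendor = value
        elif key == "cpu family" and family is None:
            family = value
    return vendor == "AuthenticAMD" and family == str(0x19)
```

On a node, the input would be the contents of `/proc/cpuinfo`, read as text.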

      What is the impact? Is it serious enough to warrant removing update recommendations?

      • Nodes fail to boot after applying OS updates

      How involved is remediation?

      • Apply the cluster update 4.12.52 or later. Then, on any nodes which fail to boot, append the kernel parameter `dis_ucode_ldr` to disable microcode (CPU firmware) loading for that boot. Once the node has subsequently been updated, verify that the kernel parameter is no longer present.
      • Note: we are not recommending that 4.11 clusters update to 4.11.59 because OCP 4.11 is EOL; however, that version also includes the fix.
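
The final verification step in the remediation above (confirming that `dis_ucode_ldr` is gone after the node updates) can be sketched as a check against the kernel command line. The helper name is hypothetical; it assumes the command line text comes from `/proc/cmdline` on the node.

```python
def has_kernel_parameter(cmdline: str, name: str) -> bool:
    """Return True if `name` appears as a kernel argument in `cmdline`."""
    for arg in cmdline.split():
        # Arguments are either bare flags ("dis_ucode_ldr") or key=value pairs.
        if arg.split("=", 1)[0] == name:
            return True
    return False
```

After a successful update, `has_kernel_parameter(cmdline, "dis_ucode_ldr")` is expected to be false, since the parameter was meant for a single boot.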

      Is this a regression?

      • Yes. linux-firmware-20220210-112.git6342082c.el8_6, used in the versions referenced above, may cause nodes to fail to boot. This is fixed in linux-firmware-20220210-114.git6342082c.el8_6 or later, which was introduced in 4.12.52.
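
To check an installed package against the fixed build named above, a rough comparison of the version date and leading release number may be enough. This is a simplified sketch with a hypothetical function name, not a full RPM version comparison, and it assumes that any package with a later date or release number also carries the fix.

```python
import re

# Fixed build from the statement above: linux-firmware-20220210-114.git6342082c.el8_6
FIXED = (20220210, 114)

_NVR_RE = re.compile(r"^linux-firmware-(\d{8})-(\d+)\.")


def firmware_has_fix(nvr: str) -> bool:
    """Compare (version date, release number) of an NVR string against FIXED."""
    match = _NVR_RE.match(nvr)
    if match is None:
        raise ValueError(f"unrecognised linux-firmware NVR: {nvr!r}")
    version, release = int(match.group(1)), int(match.group(2))
    return (version, release) >= FIXED
```

With the two builds named above, the `-112` package is reported as lacking the fix and the `-114` package as having it.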

            sdodson_jira Scott Dodson
            trking W. Trevor King