Uploaded image for project: 'RHEL'
  1. RHEL
  2. RHEL-81029

RHEL9-gcp SEV-SNP boot flaky, stuck at bootloader

Linking RHIVOS CVEs to...Migration: Automation ...SWIFT: POC ConversionSync from "Extern...XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Won't Do
    • Icon: Normal Normal
    • None
    • rhel-9.5.z
    • grub2
    • None
    • Yes
    • None
    • rhel-bootloader
    • ssg_core_services
    • 8
    • False
    • False
    • Hide

      None

      Show
      None
    • None
    • Red Hat Enterprise Linux
    • None
    • None
    • None
    • Unspecified
    • Unspecified
    • Unspecified
    • x86_64
    • None

      What were you trying to do that didn't work?

      We have a grid of partner-provided OS images that we routinely test for quality. When AMD SEV-SNP is enabled, we see that the boot periodically gets stuck in the bootloader waiting for a keypress ("Press Enter to boot")

      • Red Hat Enterprise Linux (5.14.0-503.23.2.el9_5.x86_64) 9.5 (Plow)
      • Red Hat Enterprise Linux (0-rescue-37889fda123049acabf06d4beaf19121) 9.4

      It will time out after 900 seconds.
      This flake is not good for customer boot times on instance creation.

      '' "  Booting `Red Hat Enterprise Linux (5.14.0-503.23.2.el9_5.x86_64) 9.5 (Plow)'"
      '' 'error: ../../grub-core/disk/efi/efidisk.c:615:failure reading sector 0xd59840'
      "from `hd0'." 'error: ../../grub-core/loader/i386/efi/linux.c:258:you need to load the kernel' 'first.' ''
      'Press any key to continue...' '\x1b[0m\x1b[30m\x1b[40m\x1b[2J\x1b[01;01H\x1b[0m\x1b[37m\x1b[40m\x1b[02;32HGRUB version 2.06'
      '' '\x1b[04;02HÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿\x1b[05;02H³\x1b[05;79H³\x1b[06;02H³\x1b[06;79H³\x1b[07;02H³\x1b[07;79H³\x1b[08;02H³\x1b[08;79H³\x1b[09;02H³\x1b[09;79H³\x1b[10;02H³\x1b[10;79H³\x1b[11;02H³\x1b[11;79H³\x1b[12;02H³\x1b[12;79H³\x1b[13;02H³\x1b[13;79H³\x1b[14;02H³\x1b[14;79H³\x1b[15;02H³\x1b[15;79H³\x1b[16;02H³\x1b[16;79H³\x1b[17;02H³\x1b[17;79H³\x1b[18;02H³\x1b[18;79H³\x1b[19;02H³\x1b[19;79H³\x1b[20;02H³\x1b[20;79H³\x1b[21;02H³\x1b[21;79H³\x1b[22;02H³\x1b[22;79H³\x1b[23;02H³\x1b[23;79H³\x1b[24;02H³\x1b[24;79H³\x1b[25;02H³\x1b[25;79H³\x1b[26;02H³\x1b[26;79H³\x1b[27;02H³\x1b[27;79H³\x1b[28;02H³\x1b[28;79H³\x1b[29;02H³\x1b[29;79H³\x1b[30;02H³\x1b[30;79H³\x1b[31;02H³\x1b[31;79H³\x1b[32;02H³\x1b[32;79H³\x1b[33;02H³\x1b[33;79H³\x1b[34;02H³\x1b[34;79H³\x1b[35;02H³\x1b[35;79H³\x1b[36;02H³\x1b[36;79H³\x1b[37;02H³\x1b[37;79H³\x1b[38;02H³\x1b[38;79H³\x1b[39;02H³\x1b[39;79H³\x1b[40;02H³\x1b[40;79H³\x1b[41;02H³\x1b[41;79H³\x1b[42;02HÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ\x1b[43;02H\x1b[44;02H     Use the ^ and v keys to select which entry is highlighted.          '
      "      Press enter to boot the selected OS, `e' to edit the commands       "
      "      before booting or `c' for a command-line. ESC to return previous    "
      '      menu.                                                               \x1b[05;80H \x1b[0m\x1b[30m\x1b[47m\x1b[05;03H*Red Hat Enterprise Linux (5.14.0-503.23.2.el9_5.x86_64) 9.5 (Plow)         \x1b[0m\x1b[37m\x1b[40m\x1b[05;78H\x1b[06;03H Red Hat Enterprise Linux (5.14.0-427.13.1.el9_4.x86_64) 9.4 (Plow)         \x1b[06;78H\x1b[07;03H Red Hat Enterprise Linux (0-rescue-37889fda123049acabf06d4beaf19121) 9.4 (>'

      What is the impact of this issue to you?

      Normal impact. We have low usage of RHEL9 on SEV-SNP.

      Please provide the package NVR for which the bug is seen:

      N/A

      How reproducible is this bug?:

      5% of runs fail.

      Steps to reproduce

      gcloud compute instances create test-boot --image-project=rhel-cloud --image-family=rhel-9 --zone=us-central1-a --confidential-compute-type=SEV_SNP --maintenance-policy=TERMINATE --min-cpu-platform="AMD Milan"

      Try `gcloud compute ssh test-boot` for a few minutes.
      If successful, delete the instance and try again.

      Expected results

      Consistent boot behavior that doesn't get stuck in the boot menu.

      Actual results

      Flaky boot behavior that gets stuck in the boot menu. This is not a failure we see in other Linux distros that use grub2.

      summary

      RHEL9-gcp SEV-SNP boot flaky, stuck at bootloader

              bootloader-eng-team bootloader -eng-team
              dionnaglaze Dionna Glaze (Inactive)
              bootloader -eng-team bootloader -eng-team
              Release Test Team Release Test Team
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Created:
                Updated:
                Resolved: