CoreOS OCP / COS-2120

[coreos/fedora-coreos-tracker] old bootloader versions don't boot new aarch64 6.2+ kernels

    • Sprint 233 - Team FirstBoot

      [1626176406] Upstream Reporter: Dusty Mabe
      Upstream issue status: Closed
      Upstream description:

      I just proactively updated my `t4g.medium` AWS instance to `38.20230310.1.0` and it didn't come back. On inspecting the serial console I see:

      ```
      error: ../../grub-core/loader/arm64/linux.c:58:invalid magic number.
      error: ../../grub-core/loader/arm64/linux.c:278:you need to load the kernel
      first.

      Press any key to continue...
      ```

      Pressing a key and selecting the older boot entry (thankfully I had console access) allowed me to re-connect with my system.

      This system was provisioned a long time ago with `34.20210904.2.0` (`testing` stream; later moved over to the `next` stream to allow for earlier testing).

      The problem is that, by default, the bootloader on a machine is never updated, so the machine keeps the bootloader it was first installed with. [bootupd](https://github.com/coreos/bootupd) was created to solve this problem, but it is [still a work in progress](https://github.com/coreos/bootupd#status) and so is not widely used.

      Here's what it shows on my system:

      ```
      [core@dustymabe ~]$ sudo bootupctl status
      Component EFI
      Installed: grub2-efi-aa64-1:2.06-2.fc34.aarch64,shim-aa64-15.4-4.aarch64
      Update: Available: grub2-efi-aa64-1:2.06-88.fc37.aarch64,shim-aa64-15.6-2.aarch64
      No components are adoptable.
      CoreOS aleph image ID: fedora-coreos-34.20210904.2.0-qemu.aarch64.qcow2
      Boot method: EFI
      ```
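The status output above can also be checked mechanically, for example before letting a long-lived node take a kernel update. The following is a minimal sketch, assuming the text layout shown in this transcript (an `Installed:` line and an `Update: Available:` line); the function names `parse_bootupctl_status` and `bootloader_is_stale` are hypothetical, and the output format of `bootupctl status` is not guaranteed to stay the same across bootupd versions.

```python
def parse_bootupctl_status(text):
    """Return (installed, available) package lists parsed from status text.

    Assumption: the layout matches the transcript in this issue.
    """
    installed, available = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Installed:"):
            # Split only on the first colon; package names contain colons too.
            installed = line.split(":", 1)[1].strip().split(",")
        elif line.startswith("Update: Available:"):
            available = line[len("Update: Available:"):].strip().split(",")
    return installed, available


def bootloader_is_stale(text):
    """True when an update is listed and it differs from what is installed."""
    installed, available = parse_bootupctl_status(text)
    return bool(available) and installed != available


# Sample input copied from the transcript above.
SAMPLE = """\
Component EFI
Installed: grub2-efi-aa64-1:2.06-2.fc34.aarch64,shim-aa64-15.4-4.aarch64
Update: Available: grub2-efi-aa64-1:2.06-88.fc37.aarch64,shim-aa64-15.6-2.aarch64
No components are adoptable.
CoreOS aleph image ID: fedora-coreos-34.20210904.2.0-qemu.aarch64.qcow2
Boot method: EFI
"""

if __name__ == "__main__":
    installed, available = parse_bootupctl_status(SAMPLE)
    print("installed:", installed)
    print("available:", available)
    print("stale:", bootloader_is_stale(SAMPLE))
```

On a real machine the text would come from running `sudo bootupctl status` and capturing its standard output; the sketch deliberately works on a string so that it can be tried without root access.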

      After updating the bootloader...

      ```
      [core@dustymabe ~]$ sudo bootupctl update
      Updated EFI: grub2-efi-aa64-1:2.06-88.fc37.aarch64,shim-aa64-15.6-2.aarch64
      ```

      I am able to boot the system:

      ```
      [core@dustymabe ~]$ rpm-ostree status
      State: idle
      AutomaticUpdatesDriver: Zincati
      DriverState: active; periodically polling for updates (last checked Wed 2023-03-15 19:38:43 UTC)
      Deployments:
      ● fedora:fedora/aarch64/coreos/next
      Version: 38.20230310.1.0 (2023-03-10T22:51:50Z)
      Commit: b0fdf736cdbbd3971380d5549635e30155f07af6100925d987de623b4722637f
      GPGSignature: Valid signature by 6A51BBABBA3D5467B6171221809A8D7CEB10B464

      fedora:fedora/aarch64/coreos/next
      Version: 37.20230303.1.0 (2023-03-06T18:55:26Z)
      Commit: 0e785d34bddf7ff985fe49a4a9bdf2e88050c366f02b19f28df74b67fb3792ae
      GPGSignature: Valid signature by ACB5EE4E831C74BB7C168D27F55AD3FB5323552A

      ```

      This is most likely due to recent changes to aarch64 kernels around [EFI_ZBOOT](https://cateee.net/lkddb/web-lkddb/EFI_ZBOOT.html), which we also think is the root cause of https://github.com/coreos/fedora-coreos-tracker/issues/1430.

            rhn-coreos-bgilbert Benjamin Gilbert (Inactive)
            upstream-sync Upstream Sync