  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-13538

FFU upgrade from 16.2.6 to 17.1.4 - openstack leapp upgrade of compute with NVIDIA GPU card failed


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • rhos-17.1.z
    • rhos-17.1.5
    • documentation
    • None
    • 2
    • False
    • False
    • ?
    • None
    • .Leapp does not support NVIDIA drivers for operating system upgrades
      If you attempt a Leapp OS upgrade on a node that has the NVIDIA proprietary driver loaded, the system upgrade fails with the following error in `/var/log/leapp/leapp-report.txt`:
      ----
      Summary: Leapp has detected that the NVIDIA proprietary driver has been loaded, which also means the nouveau driver is blacklisted. If you upgrade now, you will end up without a graphical session, as the newer kernel won't be able to load the NVIDIA driver module and nouveau will still be blacklisted.
      Please uninstall the NVIDIA graphics driver before upgrading to make sure you have a graphical session after upgrading.
      ----
      *Workaround:*

      . Remove the NVIDIA driver. For example:
      +
      ----
      $ sudo dnf remove -y NVIDIA-vGPU-rhel-8.4-525.105.14.x86_64
      ----
      . Unload the NVIDIA kernel modules that are still loaded:
      +
      ----
      $ sudo rmmod nvidia_vgpu_vfio
      $ sudo rmmod nvidia
      ----
      . Upgrade the Compute node:
      +
      ----
      $ openstack overcloud upgrade run --tag system_upgrade --limit <compute-0>
      ----
      . After the server reboots, reinstall the NVIDIA drivers for the target operating system (RHEL 9.2).
      . If necessary, re-create the `mdev` devices.
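
The first two workaround steps above run on the Compute node itself and can be sketched as a small helper. This is only an illustration under stated assumptions: the `DRY_RUN` switch and the function names `run` and `remove_nvidia_driver` are hypothetical, and the package name has to be the one actually installed on the node. By default the script only prints the commands it would execute.

```shell
#!/usr/bin/env bash
# Sketch of workaround steps 1 and 2, meant for the Compute node.
# Assumptions (not part of the original note): the DRY_RUN switch and the
# function names are illustrative only.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

remove_nvidia_driver() {
  local pkg="$1"
  # Step 1: remove the driver package.
  run sudo dnf remove -y "$pkg"
  # Step 2: unload the modules, in the same order as in the steps above.
  run sudo rmmod nvidia_vgpu_vfio
  run sudo rmmod nvidia
}
```

For example, `remove_nvidia_driver NVIDIA-vGPU-rhel-8.4-525.105.14.x86_64` prints the three commands in dry-run mode; setting `DRY_RUN=0` executes them. The `openstack overcloud upgrade run` step is left out because it is issued separately from the commands on the Compute node.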
    • Known Issue
    • Done
    • Important

      After the Controller and Ceph Storage nodes were upgraded to RHEL 9.2, the Leapp upgrade of the Compute nodes with an NVIDIA GPU card failed with the following error:

      ~~~
      Summary: Leapp has detected that the NVIDIA proprietary driver has been loaded, which also means the nouveau driver is blacklisted. If you upgrade now, you will end up without a graphical session, as the newer kernel won't be able to load the NVIDIA driver module and nouveau will still be blacklisted.

      Please uninstall the NVIDIA graphics driver before upgrading to make sure you have a graphical session after upgrading.
      ~~~

      This requirement should be documented in the FFU procedure (16.2 to 17.1) so that all customers who have NVIDIA GPU cards installed and in use can identify the required steps and perform them preventively, before starting the FFU rather than in the middle of it.

      In the meantime, do you know which steps are required to resolve this issue?
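
The preventive check requested above could be as simple as looking for loaded NVIDIA kernel modules on each Compute node before the FFU starts. The sketch below is an assumption about how such a check might look, not part of any documented procedure; the function names `nvidia_modules_loaded` and `check_node` are hypothetical. It reads `lsmod`-formatted text on standard input so that it can be tried without GPU hardware.

```shell
#!/usr/bin/env bash
# Pre-upgrade check sketch: report loaded kernel modules whose name starts
# with "nvidia". Input is lsmod-formatted text on stdin.
# The function names are illustrative, not part of any existing tool.

nvidia_modules_loaded() {
  # Skip the lsmod header line and print the first column of matching rows.
  awk 'NR > 1 && $1 ~ /^nvidia/ { print $1 }'
}

check_node() {
  local mods
  mods="$(nvidia_modules_loaded)"
  if [ -n "$mods" ]; then
    # $mods is left unquoted on purpose so the names join on one line.
    echo "NVIDIA modules loaded:" $mods
    return 1
  fi
  echo "No NVIDIA modules loaded"
}
```

On a node, `lsmod | check_node` returns a non-zero status when an NVIDIA module is loaded, which signals that the driver should be removed before the Leapp upgrade is started.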

        1. leapp-report.json
          61 kB
          Riccardo Bruzzone
        2. leapp-report.txt
          14 kB
          Riccardo Bruzzone
        3. leapp-upgrade.log
          4.20 MB
          Riccardo Bruzzone

              kgilliga@redhat.com Katie Gilligan
              rhn-support-rbruzzon Riccardo Bruzzone
              rhos-dfg-upgrades