OpenShift Bugs / OCPBUGS-8967

Requirements for NVIDIA GPU driver container for driver toolkit


    • Critical
    • None
    • Rejected
    • Unspecified

      Description of problem:

      This bug is created with reference to 1976302.

      The erratum was released in version 4.7.21, but it is not working as expected.

      The NVIDIA GPU operator currently requires entitlements to build the driver container, but the driver-toolkit could be used to prevent this. The problem is that the driver-toolkit is currently missing the gcc package needed for this driver-container. This is needed for RHODS.

      The fix is to install gcc, specifically the version used to compile the kernel when available.
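
      As a rough sketch of how the compiler version that built the running kernel could be identified, the snippet below parses a /proc/version-style banner. The function name kernel_gcc_version is hypothetical, and the parsing assumes the two common banner layouts ("gcc version X.Y.Z" on older kernels, "gcc (GCC) X.Y.Z" on newer ones). It is only an illustration, not how the driver-toolkit itself selects a compiler.

```shell
# Hypothetical helper: print the gcc version found in a kernel banner string.
# Uses the first argument as the banner, or /proc/version if none is given.
kernel_gcc_version() {
    banner="${1:-$(cat /proc/version)}"
    # Match "gcc version 8.4.1" (older banners) or "gcc (GCC) 11.2.1" (newer banners).
    printf '%s\n' "$banner" |
        sed -n -E 's/.*gcc( version| \([^)]*\)) ([0-9]+(\.[0-9]+)*).*/\2/p'
}
```

      Called with no argument on a node, the helper reads /proc/version. The result could then be compared against the output of `gcc --version` inside the driver-toolkit image.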

      Version-Release number of selected component (if applicable):

      4.7.21

      Steps to Reproduce:
      1. Install NVIDIA GPU drivers as part of the node hardware.
      2. Install the Node Feature Discovery and NVIDIA GPU operators, create the NodeFeatureDiscovery operand, and then create the ClusterPolicy. After that, the nvidia-driver-daemonset-* pods fail.

      Actual results:

      The following errors are observed in the nvidia-driver-daemonset-* pod logs, and the pods are in CrashLoopBackOff status.
      ...
      + echo 'Installing elfutils...'
      + dnf install -q -y elfutils-libelf.x86_64 elfutils-libelf-devel.x86_64
      Error: Unable to find a match: elfutils-libelf-devel.x86_64
      ...
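
      To narrow down which driver pods are affected, the STATUS column of `oc get pods` output can be filtered. The sketch below assumes the default column layout of `oc get pods --no-headers` (NAME, READY, STATUS, RESTARTS, AGE). The function name and the namespace placeholder in the usage comment are hypothetical.

```shell
# Hypothetical helper: read `oc get pods --no-headers` output on stdin and
# print the names of nvidia-driver-daemonset-* pods in CrashLoopBackOff.
crashlooping_driver_pods() {
    awk '$1 ~ /^nvidia-driver-daemonset-/ && $3 == "CrashLoopBackOff" { print $1 }'
}

# Usage (the namespace is an assumption; substitute the one the GPU operator uses):
#   oc get pods -n <gpu-operator-namespace> --no-headers | crashlooping_driver_pods
```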

      Expected results: The nvidia-driver-daemonset-* pods should be up and running without any errors.

      The following errors should not be seen:

      ...
      + echo 'Installing elfutils...'
      + dnf install -q -y elfutils-libelf.x86_64 elfutils-libelf-devel.x86_64
      Error: Unable to find a match: elfutils-libelf-devel.x86_64
      ...

      Additional info:

      Reference KCS : https://access.redhat.com/solutions/5820151

              tocampbe@redhat.com Tony Campbell
              rhn-support-skanniha1 Sphoorthi Kanni Hanumantharya
              Tony Campbell Tony Campbell
              Red Hat Employee