-
Bug
-
Resolution: Unresolved
-
Undefined
-
None
-
4.7
-
Critical
-
None
-
Rejected
-
Unspecified
Description of problem:
This bug was created with reference to 1976302.
The erratum was released in version 4.7.21, but it is not working as expected.
The NVIDIA GPU operator currently requires entitlements to build the driver container, but the driver-toolkit could be used to avoid this requirement. The problem is that the driver-toolkit is currently missing the gcc package needed for this driver container. This is needed for RHODS.
The fix is to install gcc, specifically the version used to compile the kernel when available.
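One way to check whether a cluster's driver-toolkit image actually ships the required packages is sketched below. It assumes that `oc` is logged in to the cluster, that the release payload includes a driver-toolkit image, that `podman` is available, and that the image lets `rpm` be run as the container command. The function name `dtk_has_packages` and the `PULL_SECRET` variable are hypothetical and not part of any product.

```shell
# Hypothetical helper: query the driver-toolkit image of the current
# release for the given RPM packages.
# Assumptions: `oc` is logged in, `podman` is installed, and PULL_SECRET
# names a pull-secret file (defaults to ./pull-secret.json).
dtk_has_packages() {
  # Resolve the driver-toolkit image referenced by the release payload.
  image=$(oc adm release info --image-for=driver-toolkit) || return 1
  # `rpm -q` exits non-zero if any listed package is not installed.
  podman run --rm --authfile "${PULL_SECRET:-pull-secret.json}" \
    "$image" rpm -q "$@"
}
```

For the packages named in this report, the call would be `dtk_has_packages gcc elfutils-libelf-devel`; a non-zero exit status suggests at least one of them is missing from the image.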
Version-Release number of selected component (if applicable):
4.7.21
Steps to Reproduce:
1. Install NVIDIA GPU drivers as part of the node hardware.
2. Install the Node Feature Discovery and NVIDIA GPU operators, create the NodeFeatureDiscovery operand, and then create the ClusterPolicy. After that, the nvidia-driver-daemonset-* pods fail.
Actual results:
The following errors are observed in the nvidia-driver-daemonset-* pod logs, and the pods are in CrashLoopBackOff status.
...
+ echo 'Installing elfutils...'
+ dnf install -q -y elfutils-libelf.x86_64 elfutils-libelf-devel.x86_64
Error: Unable to find a match: elfutils-libelf-devel.x86_64
...
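To pull the unmatched package names out of a longer pod log, a small filter like the following could be used. This is a minimal sketch that relies only on the `Error: Unable to find a match:` line format shown above; the function name `missing_packages` is hypothetical.

```shell
# Hypothetical helper: read pod log text on stdin and print each package
# that dnf reported as unmatched, one package per line.
missing_packages() {
  # Keep only the dnf error lines and strip the fixed prefix, then split
  # a space-separated package list into separate lines.
  sed -n 's/^Error: Unable to find a match: //p' | tr -s ' ' '\n'
}
```

The log of a failing pod can be piped into it, for example `oc logs <pod> -n <namespace> | missing_packages`, where the pod name and namespace are placeholders to be filled in for the cluster at hand.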
Expected results: The nvidia-driver-daemonset-* pods should be up and running without any errors. The "Unable to find a match: elfutils-libelf-devel.x86_64" error shown under Actual results should not appear in the pod logs.
Additional info:
Reference KCS: https://access.redhat.com/solutions/5820151