-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
4.17
-
None
-
None
-
False
-
Description
Installing AMD GPU operator (amd-gpu-operator.v1.1.1) from certified-operators (channel 'alpha') in an air-gapped env (AWS).
Problem
The problem seems to be with the DeviceConfig which triggers a "build" pod (pls correct me if I'm wrong), in my case
dc-internal-registry-build-6rvp6-build
Unfortunately, one of the steps in that build is
[1/2] STEP 5/9: RUN dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm -y && crb enable && sed -i "s/\$releasever/9/g" /etc/yum.repos.d/epel*.repo && dnf install dnf-plugin-config-manager -y && dnf clean all
which fails on
Curl error (28): Timeout was reached for https://cdn-ubi.redhat.com/content/public/ubi/dist/ubi9/9/x86_64/baseos/os/repodata/repomd.xml [Failed to connect to cdn-ubi.redhat.com port 443: Connection timed out] Error: Failed to download metadata for repo 'ubi-9-baseos': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried error: build error: building at STEP "RUN dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm -y && crb enable && sed -i "s/\$releasever/9/g" /etc/yum.repos.d/epel*.repo && dnf install dnf-plugin-config-manager -y && dnf clean all": while running runtime: exit status 1
Obviously, we cannot do such steps in a disconnected env.
EDIT:
To avoid the building step, we could use a pre-compiled image as described at
https://dcgpu.docs.amd.com/projects/gpu-operator/en/main/drivers/precompiled-driver.html#using-pre-compiled-images