Bug
Resolution: Won't Do
rhelai-1.3
To Reproduce
Steps to reproduce the behavior:
- Enter a RHEL AI container image, such as registry.redhat.io/rhelai1/instructlab-nvidia-rhel9:1.3.1-1733951397
- Try to install a package within the container: dnf install bc
- The install errors out because dnf cannot find the repository metadata for the RHEL AI 1.3 RHEL 9 RPMs
Expected behavior
- DNF should not error out within InstructLab containers
Screenshots
Device Info (please complete the following information):
- Hardware Specs: Intel Icelake Xeon, 1.3T Mem, IBM Cloud system
- OS Version: RHEL 9.4, RHELAI Bootc 1.3.1
- InstructLab Version: ilab, version 0.21.2
- Provide the output of these two commands:
- sudo bootc status --format json | jq .status.booted.image.image.image to print the name and tag of the bootc image, should look like registry.stage.redhat.io/rhelai1/bootc-intel-rhel9:1.3-1732894187
- registry.redhat.io/rhelai1/bootc-nvidia-rhel9:1.3.1
- Also observed on RHEL when starting an InstructLab container
- ilab system info to print detailed information about InstructLab version, OS, and hardware – including GPU / AI accelerator hardware
- Platform:
sys.version: 3.11.7 (main, Oct 9 2024, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)]
sys.platform: linux
os.name: posix
platform.release: 5.14.0-427.48.1.el9_4.x86_64
platform.machine: x86_64
platform.node: 8c85982038d8
platform.python_version: 3.11.7
os-release.ID: rhel
os-release.VERSION_ID: 9.4
os-release.PRETTY_NAME: Red Hat Enterprise Linux 9.4 (Plow)
memory.total: 1259.37 GB
memory.available: 1245.61 GB
memory.used: 3.55 GB
InstructLab:
instructlab.version: 0.22.1
instructlab-dolomite.version: 0.2.0
instructlab-eval.version: 0.4.2
instructlab-quantize.version: 0.1.0
instructlab-schema.version: 0.4.1
instructlab-sdg.version: 0.6.2
instructlab-training.version: 0.6.1
Torch:
torch.version: 2.4.1
torch.backends.cpu.capability: AVX512
torch.version.cuda: 12.4
torch.version.hip: None
torch.cuda.available: False
torch.backends.cuda.is_built: True
torch.backends.mps.is_built: False
torch.backends.mps.is_available: False
llama_cpp_python:
llama_cpp_python.version: 0.2.79
llama_cpp_python.supports_gpu_offload: True
Bug impact
- Customers cannot install additional tools in the InstructLab container if they need to. This hinders debugging when the required tooling is not built into the InstructLab image.
- This breaks automation that needs to install dependencies within the running container when the system is outside the Red Hat network (such as in public clouds).
Known workaround
- Delete, move, or disable /etc/yum.repos.d/rhelai.repo
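A minimal sketch of the "move" variant of this workaround, for illustration only. The helper name disable_repo_file is hypothetical and not part of any Red Hat tooling. It relies on the fact that dnf only reads files ending in .repo from /etc/yum.repos.d, so renaming the file takes it out of play without deleting it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename a repo file so dnf no longer loads it.
# dnf only picks up files whose names end in ".repo", so appending
# ".disabled" is enough, and the change is easy to revert.
disable_repo_file() {
    local repo_file="$1"
    if [ ! -f "$repo_file" ]; then
        echo "repo file not found: $repo_file" >&2
        return 1
    fi
    mv -- "$repo_file" "${repo_file}.disabled"
}

# Inside the container, as root, the workaround would then be:
#   disable_repo_file /etc/yum.repos.d/rhelai.repo
#   dnf install bc
```

The rename is lost when the container is recreated from the image, so automation would need to repeat it on each start.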
Additional context
- There is no public DNS record for rhsm-pulp.corp.redhat.com; it resolves only within the Red Hat network or VPN.
- I've observed this behavior on all released InstructLab containers, including prerelease containers.
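One way to confirm the DNS observation from an affected system is sketched below. The helper name host_resolves is hypothetical. It assumes getent is available in the environment and that rhsm-pulp.corp.redhat.com is the host that rhelai.repo points at, which this report implies but does not show.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the given hostname resolves
# through the system resolver (getent consults /etc/hosts and DNS).
host_resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Example check, run inside the container or on the host:
#   if host_resolves rhsm-pulp.corp.redhat.com; then
#       echo "resolvable: likely inside the Red Hat network or VPN"
#   else
#       echo "not resolvable: dnf cannot reach the rhelai repo from here"
#   fi
```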