Bug
Resolution: Done
Critical
RHELAI 1.3 GA
Approved
RHEL 9.5 was released very recently and all standard repos have switched from 9.4 to 9.5 content. RHELAI 1.3 is not updating to 9.5. Instead it uses containers and RPMs from RHEL 9.4 EUS (extended update support). EUS needs different repo configuration for two reasons:
- The default 9.4 repos are frozen. Updates and security fixes are delivered to special EUS repos.
- Repo URLs with $releasever now point to 9.5 repos. By default, $releasever resolves to 9, which is an alias for the latest RHEL 9 repo.
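The second point can be illustrated with a small sketch. It only mimics how a $releasever placeholder in a repo URL is filled in; real DNF does this internally. The function name expand_releasever, the vars directory argument, and the example.invalid URL are hypothetical stand-ins, not taken from the affected repos.

```shell
# Illustration only: fill in $releasever the way the description above explains it.
# expand_releasever URL VARS_DIR
#   VARS_DIR stands in for /etc/dnf/vars (assumption for this sketch).
expand_releasever() {
    url=$1
    vars_dir=$2
    if [ -r "$vars_dir/releasever" ]; then
        # An explicit override, e.g. "9.4", wins.
        ver=$(cat "$vars_dir/releasever")
    else
        # Without an override the value is 9, i.e. the latest RHEL 9 content.
        ver=9
    fi
    printf '%s\n' "$url" | sed "s/\\\$releasever/$ver/g"
}

# Usage (hypothetical URL):
#   expand_releasever 'https://example.invalid/rhel9/$releasever/x86_64/baseos/os' /etc/dnf/vars
```

Without an override file the placeholder becomes 9, which now means 9.5 content; with a file containing 9.4 the same URL stays on 9.4.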
The 1.3 branches of instructlab-amd, instructlab-nvidia, and nvidia-bootc were updated to EUS around November 15. Several other git repos were not updated, though. As a consequence, those repos have been generating container images with a Frankenstein mix of RHEL 9.4 and 9.5 content.
Kudos to ebelarte@redhat.com and lmilbaum for figuring out the bug.
To Reproduce
Steps to reproduce the behavior:
- examine Containerfile and repos/redhat.repo files of a repo
- look for the string eus in the repo file, e.g. rhel-9-for-x86_64-baseos-eus-rpms. If there are fewer than three EUS repos (or none at all), it's a problem.
- look for string echo "9.4" > /etc/dnf/vars/releasever in the Containerfile. If the container does not override the DNF var, then there might be a problem.
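The steps above could be scripted roughly as follows. This is a minimal sketch, assuming a local checkout with Containerfile and repos/redhat.repo at the paths named in the steps; the function name check_eus_config is made up for illustration. It counts repo sections whose id contains eus but does not verify that they are enabled.

```shell
# Sketch: run the manual checks against one git checkout.
# check_eus_config CHECKOUT_DIR  -> returns 0 if both checks pass, 1 otherwise.
check_eus_config() {
    repo_dir=$1
    status=0

    # Count section headers such as [rhel-9-for-x86_64-baseos-eus-rpms].
    eus_count=$(grep -c '^\[.*eus' "$repo_dir/repos/redhat.repo" 2>/dev/null || true)
    eus_count=${eus_count:-0}   # empty when the repo file is missing
    if [ "$eus_count" -lt 3 ]; then
        echo "PROBLEM: found $eus_count EUS repo sections, expected at least 3"
        status=1
    fi

    # Look for the exact override string from the steps above.
    if ! grep -qF 'echo "9.4" > /etc/dnf/vars/releasever' "$repo_dir/Containerfile" 2>/dev/null; then
        echo "WARNING: Containerfile does not set the releasever DNF var to 9.4"
        status=1
    fi

    return "$status"
}

# Usage: check_eus_config /path/to/checkout
```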
Expected behavior
- Containerfile should configure releasever DNF var to be 9.4
- repo file should contain enabled EUS repos for BaseOS, AppStream, and CRB
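For the first point, a hedged sketch of pinning the DNF var. The helper name pin_releasever and its root argument are assumptions made so the step can be tried outside a container build; inside a Containerfile the equivalent is the RUN line quoted in the comment.

```shell
# Sketch: write the releasever override into an image root.
# Equivalent Containerfile step, using the string from the reproduction steps:
#   RUN echo "9.4" > /etc/dnf/vars/releasever
# pin_releasever ROOT VERSION
pin_releasever() {
    root=$1      # image root directory (hypothetical parameter for this sketch)
    version=$2   # e.g. 9.4
    mkdir -p "$root/etc/dnf/vars"
    printf '%s\n' "$version" > "$root/etc/dnf/vars/releasever"
}

# Usage: pin_releasever /path/to/image/root 9.4
```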
Affected git repos
- https://gitlab.com/redhat/rhel-ai/wheels/builder/ (main, 1.3)
- https://gitlab.com/redhat/rhel-ai/containers/amd-bootc/ (1.3 only, main is fixed)
- https://gitlab.com/redhat/rhel-ai/containers/intel-bootc/ (main, 1.3)
- https://gitlab.com/redhat/rhel-ai/containers/instructlab-intel/ (main, 1.3)
Additionally, the 1.3 amd-bootc container is pulling content from the RHELAI 1.2 RPM repo instead of the 1.3 one: https://gitlab.com/redhat/rhel-ai/containers/amd-bootc/-/blob/1.3/repos/rhelai-1.2.repo
How to fix
The fixes for amd-bootc and the Intel repos are trivial: update the Containerfile, copy in the new redhat.repo, and rebuild the containers. The builder repo is a much bigger concern. It is possible that we have compiled Python wheels against the RHEL 9.5 ABI (headers and shared libraries). This scenario is not supported, and we might have to recompile all wheels for all three variants.
We have a system in place to force recreation of all wheels. Let's not do this now. It would take a very long time and block build pipelines for other updates. Let's monitor the situation and keep an eye out for import errors and new crashers.
causes
- RHELAI-2431 redhat.repo use hard-coded arch x86_64 in repo id and name (New)
is related to
- RHELAI-2381 RHELAI container files must not override "releasever" DNF var (New)
- RHELAI-2251 Enable RHEL 9.4 EUS repositories for NVIDIA bootc (Closed)