- Bug
- Resolution: Done-Errata
- Critical
- None
- 4.13.z, 4.14
- Critical
- No
- MCO Sprint 239
- 1
- Rejected
- False
Updated Description:
Over a node's lifespan, the MCD can go through multiple iterations of RHEL8 and RHEL9. This was not a problem until we turned on FIPS-enabled golang with dynamic linking, which requires the running MCD binary (whether in the container or on the host) to always match the version built for the host. Two things complicate this further. First, the early boot process (machine-config-daemon-pull/firstboot.service) can be at a different version from the rest of the cluster's nodes, because the bootimage version is not updated. Second, we chroot in the container, which dynamically moves the daemon from rhel8 to rhel9. We therefore need a better process to ensure the right binary is always used.
This flow is currently being tested in https://github.com/openshift/machine-config-operator/pull/3799
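The requirement that the running binary match the host can be sketched as a lookup keyed on the host's RHEL major version. This is only an illustration, not the MCO's implementation: the function names and both binary paths are hypothetical, and it assumes the host's os-release exposes either a RHEL_VERSION field or, failing that, a usable VERSION_ID.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseOSRelease turns KEY=value lines into a map, dropping surrounding quotes.
func parseOSRelease(content string) map[string]string {
	fields := make(map[string]string)
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		parts := strings.SplitN(line, "=", 2)
		if len(parts) != 2 {
			continue
		}
		fields[parts[0]] = strings.Trim(parts[1], `"'`)
	}
	return fields
}

// hostMajor returns the major OS version, preferring RHEL_VERSION
// (assumed to be present on the host) and falling back to VERSION_ID.
func hostMajor(content string) (string, error) {
	fields := parseOSRelease(content)
	for _, key := range []string{"RHEL_VERSION", "VERSION_ID"} {
		if v := fields[key]; v != "" {
			return strings.SplitN(v, ".", 2)[0], nil
		}
	}
	return "", fmt.Errorf("no RHEL_VERSION or VERSION_ID in os-release")
}

// binaryByMajor maps a RHEL major version to a daemon binary.
// Both paths are hypothetical placeholders.
var binaryByMajor = map[string]string{
	"8": "/usr/bin/example-daemon.rhel8",
	"9": "/usr/bin/example-daemon.rhel9",
}

// binaryFor picks the binary built for the given major version.
func binaryFor(major string) (string, error) {
	path, ok := binaryByMajor[major]
	if !ok {
		return "", fmt.Errorf("no binary built for RHEL %s", major)
	}
	return path, nil
}

func main() {
	data, err := os.ReadFile("/etc/os-release")
	if err != nil {
		fmt.Println("cannot read os-release:", err)
		return
	}
	major, err := hostMajor(string(data))
	if err != nil {
		fmt.Println(err)
		return
	}
	path, err := binaryFor(major)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("would run:", path)
}
```

The point of the sketch is that the choice is made at run time from the host's own os-release, so it stays correct when the bootimage and the rest of the cluster are at different RHEL versions.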
Description of problem:
MCO CI started failing this week, and the failure has also made it into 4.14 nightlies. See also: https://issues.redhat.com/browse/TRT-1143.
The failure manifests as a warning in the MCO. Looking at an MCD log, you will see a failure like:
W0712 08:52:15.475268 7971 daemon.go:1089] Got an error from auxiliary tools: kubelet health check has failed 3 times: Get "http://localhost:10248/healthz": dial tcp: lookup localhost: device or resource busy
The root cause so far seems to be that 4.14 switched from a regular golang 1.20.3 to golang 1.20.5 with FIPS and dynamic linking in the builder, which is when the failures began. Most functionality is not broken, but the daemon subroutine that does the kubelet health check appears to be unable to reach the localhost endpoint.
One possibility is that the rhel8 daemon chroot'ing into the rhel9 host and running these commands is causing the issue. Regardless, there are a number of issues with the rhel8/rhel9 duality in the MCD that we would need to address in 4.13/4.14.
Also tangentially related: https://issues.redhat.com/browse/MCO-663
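For readers unfamiliar with this kind of check, the following is a minimal sketch of a health probe that warns after a number of consecutive failures. It is not the MCD's actual code: the function names, the threshold handling, and the timeouts are assumptions, and only the endpoint URL and the wording of the error message are taken from the log line above.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"time"
)

// errBadStatus marks a probe that got an HTTP response other than 200 OK.
var errBadStatus = errors.New("unexpected HTTP status")

// httpProbe returns a probe that issues one GET against url.
func httpProbe(url string, timeout time.Duration) func() error {
	client := &http.Client{Timeout: timeout}
	return func() error {
		resp, err := client.Get(url)
		if err != nil {
			// Name resolution and dial errors surface here.
			return err
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return fmt.Errorf("%w: %d", errBadStatus, resp.StatusCode)
		}
		return nil
	}
}

// watchHealth runs probe up to attempts times and returns an error once
// maxFailures consecutive probes have failed. A success resets the counter.
func watchHealth(probe func() error, attempts, maxFailures int, interval time.Duration) error {
	failures := 0
	for i := 0; i < attempts; i++ {
		if err := probe(); err != nil {
			failures++
			if failures >= maxFailures {
				return fmt.Errorf("kubelet health check has failed %d times: %w", failures, err)
			}
		} else {
			failures = 0
		}
		time.Sleep(interval)
	}
	return nil
}

func main() {
	probe := httpProbe("http://localhost:10248/healthz", 2*time.Second)
	if err := watchHealth(probe, 5, 3, time.Second); err != nil {
		fmt.Println("warning:", err)
	}
}
```

In this sketch a failed lookup of localhost is returned by `client.Get` and counted like any other probe failure, which is consistent with the log showing a resolver error wrapped inside the health-check warning.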
Version-Release number of selected component (if applicable):
4.14
How reproducible:
Always
Steps to Reproduce:
1. 2. 3.
Actual results:
Expected results:
Additional info:
- blocks
  - OTA-1000 Impact of RHSB-2023-001 OpenShift misconfiguration of FIPS cryptographic library (Closed)
- clones
  - OCPBUGS-16128 4.13/4.14 MCDs do not work with FIPS enabled golang builders (Closed)
- depends on
  - OCPBUGS-16128 4.13/4.14 MCDs do not work with FIPS enabled golang builders (Closed)
- links to
  - RHSA-2023:4456 OpenShift Container Platform 4.13.z security update