Bug
Resolution: Done
Normal
4.17.z
Quality / Stability / Reliability
False
3
Moderate
MCO Sprint 270, MCO Sprint 271, MCO Sprint 272, MCO Sprint 273
4
I am working with a customer that is attempting to test the OCP Image Based Install method on a SNO in a disconnected environment. At a high level, the method involves creating a seed image on a server, placing that installed seed image onto an identical server, shipping that server to a remote location, and then adding the final configuration while the system is remote. In this scenario the configuration ISO is installed before the server is connected to the network (the server relies on this configuration to make connectivity possible). The SNO comes up without issue during the process, but when we push the machine configuration for the remote network configuration, the machine-config-daemon goes into a crash loop and the node stays in a degraded state. In the daemon logs we see it try to reach the remote registry prior to crashing. Since we are in a test environment this can be fixed by reconnecting the SNO to the network that hosts the remote registry, but that is not a possible workaround in production.
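For context, the trigger is simply a MachineConfig that lays down the remote-site networking (NetworkManager keyfiles plus the IPsec pieces). A minimal, hypothetical sketch of that kind of object is below; the name, role label, file path, and contents are illustrative only, not the customer's actual configuration:

$ cat <<'EOF' | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-master-remote-network
  labels:
    machineconfiguration.openshift.io/role: master
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        - path: /etc/NetworkManager/system-connections/remote-uplink.nmconnection
          mode: 0600
          overwrite: true
          contents:
            # keyfile body omitted; the real config carries the NM/IPsec settings as a data URL
            source: data:,
EOF

The machine-config-daemon log excerpt from the resulting crash loop: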
[2025-04-18T12:15:17Z INFO nmstatectl::persist_nic] /etc/systemd/network does not exist, no need to clean up
I0418 12:15:17.061053 863964 daemon.go:1591] Previous boot ostree-finalize-staged.service appears successful
I0418 12:15:17.061062 863964 daemon.go:1708] Current+desired config: rendered-master-d1f75bd73f470aa323026b76394ae706
I0418 12:15:17.061067 863964 daemon.go:1723] state: Degraded
I0418 12:15:17.061080 863964 update.go:2607] Running: rpm-ostree cleanup -r
Deployments unchanged.
I0418 12:15:17.092223 863964 daemon.go:2127] Validating against current config rendered-master-d1f75bd73f470aa323026b76394ae706
I0418 12:15:17.092309 863964 daemon.go:2039] SSH key location ("/home/core/.ssh/authorized_keys.d/ignition") up-to-date!
I0418 12:16:16.096354 863964 certificate_writer.go:303] Certificate was synced from controllerconfig resourceVersion 832384
time="2025-04-18T12:17:17Z" level=warning msg="Failed, retrying in 1s ... (1/2). Error: (Mirrors also failed: [quay-registry.apps.cmghub.cmglab1.t-mobile.lab/ocp41722/openshift/release@sha256:74392ac770144406bc790e908bf3d12d8d25988beadc1cc3821780fb9f4b101d: pinging container registry quay-registry.apps.cmghub.cmglab1.t-mobile.lab: Get \"https://quay-registry.apps.cmghub.cmglab1.t-mobile.lab/v2/\": dial tcp: lookup quay-registry.apps.cmghub.cmglab1.t-mobile.lab: i/o timeout]): quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:74392ac770144406bc790e908bf3d12d8d25988beadc1cc3821780fb9f4b101d: pinging container registry quay.io: Get \"https://quay.io/v2/\": dial tcp: lookup quay.io: i/o timeout"
I0418 12:18:15.448701 863964 daemon.go:1391] Shutting down MachineConfigDaemon
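For reference, the node state and the log above were gathered with the usual MCO diagnostics, roughly as follows (the node and pod names are placeholders):

$ oc get machineconfigpools
$ oc describe node <sno-node> | grep machineconfiguration.openshift.io
$ oc -n openshift-machine-config-operator get pods -o wide | grep machine-config-daemon
$ oc -n openshift-machine-config-operator logs <mcd-pod> -c machine-config-daemon

Output from the node itself: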
rpm-ostree status
State: idle
Deployments:
● ostree-unverified-registry:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:74392ac770144406bc790e908bf3d12d8d25988beadc1cc3821780fb9f4b101d
                   Digest: sha256:74392ac770144406bc790e908bf3d12d8d25988beadc1cc3821780fb9f4b101d
                  Version: 417.94.202503172033-0 (2025-03-17T20:38:31Z)
                StateRoot: rhcos_4.17.22
          LayeredPackages: libreswan NetworkManager-libreswan
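To double-check that the digest the daemon is failing to pull is the same one the node is already booted from, something like the following can be run from a node debug shell (the node name is a placeholder, and the JSON field name may vary slightly by rpm-ostree version):

$ oc debug node/<sno-node> -- chroot /host rpm-ostree status --booted
$ oc debug node/<sno-node> -- chroot /host rpm-ostree status --json | grep -i container-image-reference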
One additional note: the libreswan layered packages shown in the rpm-ostree output are there because IPsec is part of the final networking configuration; the server is essentially air-gapped until that configuration is in place. We know the image being pulled is part of the base RHCOS install on the SNO and is already available locally, so we are wondering what causes the MCO to have a dependency on reaching the remote registry. It seems like it should not need to; maybe we are misunderstanding something.
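If it helps narrow this down, what the MCO thinks it needs from a registry can be read straight off the rendered config (the rendered config name below is taken from the daemon log above; whether this fully explains the pull attempt is exactly our open question):

$ oc get machineconfig rendered-master-d1f75bd73f470aa323026b76394ae706 -o jsonpath='{.spec.osImageURL}'
$ oc get machineconfig rendered-master-d1f75bd73f470aa323026b76394ae706 -o jsonpath='{.spec.extensions}'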