Bug
Resolution: Done
Normal
4.17.z
Quality / Stability / Reliability
False
3
Moderate
MCO Sprint 270, MCO Sprint 271, MCO Sprint 272, MCO Sprint 273
4
I am working with a customer that is attempting to test the OCP Image Based Install method on a SNO in a disconnected environment. At a high level, the method involves creating a seed image on a server, placing that installed seed image onto an identical server, shipping that server to a remote location, and then adding the final configuration while the system is remote. In this scenario the configuration ISO is installed before the server is connected to the network (the server relies on this configuration to make connectivity possible). The SNO comes up without issue during the process, but when we push the machine configuration for the remote network configuration, the machine-config-daemon goes into a crash loop and the node stays in a degraded state. In the daemon logs we see it try to reach the remote registry prior to crashing. Since we are in a test environment this can be fixed by reconnecting the SNO to the network that hosts the remote registry, but that is not a possible workaround in production.
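For context, the trigger is simply a MachineConfig that lays down the remote-site networking (NetworkManager keyfiles plus the IPsec pieces). A minimal, hypothetical sketch of that kind of object is below; the name, role label, file path, and contents are illustrative only, not the customer's actual configuration:

$ cat <<'EOF' | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-master-remote-network
  labels:
    machineconfiguration.openshift.io/role: master
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        - path: /etc/NetworkManager/system-connections/remote-uplink.nmconnection
          mode: 0600
          overwrite: true
          contents:
            # keyfile body omitted; the real config carries the NM/IPsec settings as a data URL
            source: data:,
EOF

The machine-config-daemon log excerpt from the resulting crash loop: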
[2025-04-18T12:15:17Z INFO nmstatectl::persist_nic] /etc/systemd/network does not exist, no need to clean up
I0418 12:15:17.061053 863964 daemon.go:1591] Previous boot ostree-finalize-staged.service appears successful
I0418 12:15:17.061062 863964 daemon.go:1708] Current+desired config: rendered-master-d1f75bd73f470aa323026b76394ae706
I0418 12:15:17.061067 863964 daemon.go:1723] state: Degraded
I0418 12:15:17.061080 863964 update.go:2607] Running: rpm-ostree cleanup -r
Deployments unchanged.
I0418 12:15:17.092223 863964 daemon.go:2127] Validating against current config rendered-master-d1f75bd73f470aa323026b76394ae706
I0418 12:15:17.092309 863964 daemon.go:2039] SSH key location ("/home/core/.ssh/authorized_keys.d/ignition") up-to-date!
I0418 12:16:16.096354 863964 certificate_writer.go:303] Certificate was synced from controllerconfig resourceVersion 832384
time="2025-04-18T12:17:17Z" level=warning msg="Failed, retrying in 1s ... (1/2). Error: (Mirrors also failed: [quay-registry.apps.cmghub.cmglab1.t-mobile.lab/ocp41722/openshift/release@sha256:74392ac770144406bc790e908bf3d12d8d25988beadc1cc3821780fb9f4b101d: pinging container registry quay-registry.apps.cmghub.cmglab1.t-mobile.lab: Get \"https://quay-registry.apps.cmghub.cmglab1.t-mobile.lab/v2/\": dial tcp: lookup quay-registry.apps.cmghub.cmglab1.t-mobile.lab: i/o timeout]): quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:74392ac770144406bc790e908bf3d12d8d25988beadc1cc3821780fb9f4b101d: pinging container registry quay.io: Get \"https://quay.io/v2/\": dial tcp: lookup quay.io: i/o timeout"
I0418 12:18:15.448701 863964 daemon.go:1391] Shutting down MachineConfigDaemon
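For reference, the node state and the log above were gathered with the usual MCO diagnostics, roughly as follows (the node and pod names are placeholders):

$ oc get machineconfigpools
$ oc describe node <sno-node> | grep machineconfiguration.openshift.io
$ oc -n openshift-machine-config-operator get pods -o wide | grep machine-config-daemon
$ oc -n openshift-machine-config-operator logs <mcd-pod> -c machine-config-daemon

Output from the node itself: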
rpm-ostree status
State: idle
Deployments:
● ostree-unverified-registry:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:74392ac770144406bc790e908bf3d12d8d25988beadc1cc3821780fb9f4b101d
                   Digest: sha256:74392ac770144406bc790e908bf3d12d8d25988beadc1cc3821780fb9f4b101d
                  Version: 417.94.202503172033-0 (2025-03-17T20:38:31Z)
                StateRoot: rhcos_4.17.22
          LayeredPackages: libreswan NetworkManager-libreswan
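To double-check that the digest the daemon is failing to pull is the same one the node is already booted from, something like the following can be run from a node debug shell (the node name is a placeholder, and the JSON field name may vary slightly by rpm-ostree version):

$ oc debug node/<sno-node> -- chroot /host rpm-ostree status --booted
$ oc debug node/<sno-node> -- chroot /host rpm-ostree status --json | grep -i container-image-reference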
One additional note: the libreswan layered packages shown in the rpm-ostree output are there because IPsec is part of the final networking configuration; the server is essentially air-gapped until that configuration is in place. We know the image being pulled is part of the base RHCOS install on the SNO and is already available locally, so we are wondering what causes the MCO to have a dependency on reaching the remote registry. It seems like it should not need to; maybe we are misunderstanding something.
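If it helps narrow this down, what the MCO thinks it needs from a registry can be read straight off the rendered config (the rendered config name below is taken from the daemon log above; whether this fully explains the pull attempt is exactly our open question):

$ oc get machineconfig rendered-master-d1f75bd73f470aa323026b76394ae706 -o jsonpath='{.spec.osImageURL}'
$ oc get machineconfig rendered-master-d1f75bd73f470aa323026b76394ae706 -o jsonpath='{.spec.extensions}'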