Type: Bug
Resolution: Can't Do
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.18
Component/s: Machine Config Operator
Labels:
- MCD
- cee.neXT
- mcd
- mco-triaged
- openshift-4.18

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
1
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
MCO Sprint 277
sprint_count:
1

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

An OCP 4.18 hub cluster fails to upgrade because the component running rpm-ostree on HCP worker nodes does not retry when it encounters an error, so a single failed image pull leaves the cluster stuck mid-upgrade. The customer uses a pull-through registry cache: the first image pull on the first node times out and is never retried, and as a result the entire node pool becomes stuck.
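
To illustrate the retry behavior this report asks for, here is a minimal sketch in Python. It is not the MCO or rpm-ostree implementation: `pull_with_retry`, `PullError`, and the delay parameters are hypothetical names chosen for the example, and the `pull` callable stands in for whatever actually fetches the image.

```python
import time
from typing import Callable, Optional


class PullError(Exception):
    """Hypothetical error raised when an image pull fails or times out."""


def pull_with_retry(
    pull: Callable[[], str],
    attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call pull() until it succeeds or the attempts are exhausted.

    The wait between attempts doubles each time, capped at max_delay.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = base_delay
    last_error: Optional[PullError] = None
    for attempt in range(1, attempts + 1):
        try:
            return pull()
        except PullError as err:
            last_error = err
            if attempt == attempts:
                break  # no point sleeping after the final attempt
            sleep(min(delay, max_delay))
            delay *= 2
    raise PullError(f"image pull failed after {attempts} attempts") from last_error
```

With a wrapper of this kind, a first pull that times out while the pull-through cache is still cold would be attempted again after a short wait instead of leaving the node pool stuck.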

In addition, because the MCD pods in HyperShift are unnecessarily transient, it is impossible to capture pod logs. This makes the issue extremely difficult to diagnose, and the transient design appears to offer very little benefit in return.

During troubleshooting, it was noted that the `machineconfig` daemon `pod` does not run continuously as it does in a regular cluster. Instead, when the desired `mcd` config differs from the current config, we deploy the `pod` onto the worker `node` to perform the upgrade in place.

Please refer to this.

The inspect of the upgrade namespace from the hosted cluster is available here.

Please also refer to MCO-358, as it seems to be related to this request.

Assignee: David Joshy

Reporter: Rafael de Oliveira Rosa

Involved: Tim Dawson

Need Info From: None

Votes: 2

Watchers: 8

Created: 2025/06/03 2:08 AM

Updated: 2025/09/22 1:21 PM

Resolved: 2025/09/22 1:21 PM
