Type: Bug
Resolution: Unresolved
Priority: Normal
Fix Version/s: 4.22.0
Affects Version/s: 4.18, 4.19, 4.20, 4.21, 4.22
Component/s: Machine Config Operator
Labels:
- mco-triaged
- pre-merge-tested

Activity Type: Quality / Stability / Reliability
Blocked: False
Blocked Reason: None
Story Points: 3
Severity: Moderate
Regression: None
Target Backport Versions: None
Target Version: 4.22.0
Release Blocker: None
Sprint: MCO Sprint 281
sprint_count: 1
SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:
PX Impact Score:
Release Note Status: None
Release Note Type: None
Release Note Text: None
Escape Reason: None
Escape Impact: None
Corrective Measures: None
SDLC stage when should've been found: None

Description of problem:

When a pivot error happens (MCO cannot apply a new osImage), the MCP should be degraded and a PivotError alert should be triggered.

The alert is triggered, but the MCP is not degraded; it remains in the "working" status instead.

The state is eventually corrected, but only after a long delay (about 15 minutes in our cluster):

{noformat}
E1013 11:57:13.938404    2596 writer.go:231] Marking Degraded due to: "Failed to update OS to quay.io/mcoqe/layering@sha256:5177a092968e50b2be8d98c15a68bc65016de18dacfc693f99187d2a1457ac85 after retries: timed out waiting for the condition"
I1013 11:57:13.950238    2596 daemon.go:784] Transitioned from state: Done -> Working


I1013 12:12:16.822035    2596 daemon.go:784] Transitioned from state: Working -> Degraded
{noformat}
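
For reference, the delay between the two relevant log lines can be computed directly from their klog-style timestamps. This is a minimal sketch, not part of the MCO: the helper name `parse_klog_time` is hypothetical, and the year is an assumption taken from the issue's creation date because the log header itself carries only month and day.

```python
from datetime import datetime

def parse_klog_time(line: str, year: int = 2025) -> datetime:
    """Parse the timestamp of a klog-style line such as 'E1013 11:57:13.938404 ...'.

    The header is a severity letter followed by MMDD. The year is not in the
    log, so it is passed in (assumed to be 2025 here).
    """
    header, clock = line.split()[:2]
    return datetime.strptime(f"{year}{header[1:]} {clock}", "%Y%m%d %H:%M:%S.%f")

# Headers copied from the log excerpt above (first message truncated for brevity).
marked = "E1013 11:57:13.938404    2596 writer.go:231] Marking Degraded due to: (truncated)"
degraded = "I1013 12:12:16.822035    2596 daemon.go:784] Transitioned from state: Working -> Degraded"

delay = parse_klog_time(degraded) - parse_klog_time(marked)
print(delay)  # 0:15:02.883631
```

The result matches the roughly 15 minutes observed between "Marking Degraded" and the actual transition to Degraded.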

Version-Release number of selected component (if applicable):

4.20.0-0.nightly-2025-10-13-053645

How reproducible:

Frequently

Steps to Reproduce:

    1. Break the "rpm-ostree upgrade" using the steps defined in https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-63866

    2. Apply a new osImage to the worker pool
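
As an illustration of step 2, the sketch below builds a MachineConfig manifest that sets `osImageURL` for the worker role. It assumes the new osImage is applied through a MachineConfig, which the report does not state explicitly; the function name and the MachineConfig name `99-worker-custom-osimage` are hypothetical, and the image reference is the one that appears in the log above.

```python
import json

def os_image_machineconfig(name: str, image: str, role: str = "worker") -> dict:
    """Build a MachineConfig manifest that overrides the OS image for one role."""
    return {
        "apiVersion": "machineconfiguration.openshift.io/v1",
        "kind": "MachineConfig",
        "metadata": {
            "name": name,  # hypothetical name, chosen for illustration only
            "labels": {"machineconfiguration.openshift.io/role": role},
        },
        "spec": {"osImageURL": image},
    }

manifest = os_image_machineconfig(
    "99-worker-custom-osimage",
    "quay.io/mcoqe/layering@sha256:"
    "5177a092968e50b2be8d98c15a68bc65016de18dacfc693f99187d2a1457ac85",
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could then be piped to `oc apply -f -`, since `oc apply` accepts JSON as well as YAML.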

Actual results:

The pivot alert is raised as expected, but the MCO does not degrade the worker MCP; the pool stays in the "working" status.

Expected results:

The worker MCP should be properly degraded with the right message.
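
One way to verify the expected result is to inspect the `Degraded` condition in the pool's status. The following is a rough sketch under assumptions: the helper names are hypothetical, `get_pool` assumes the `oc` CLI is logged in to the cluster, and `sample_pool` is made-up data that only mimics the shape of the status (it is not output captured from the affected cluster).

```python
import json
import subprocess

def get_pool(name: str = "worker") -> dict:
    """Fetch a MachineConfigPool as a dict via the oc CLI (must be logged in)."""
    result = subprocess.run(
        ["oc", "get", "machineconfigpool", name, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)

def degraded_condition(pool: dict):
    """Return the pool's Degraded condition, or None if it is absent."""
    for cond in pool.get("status", {}).get("conditions", []):
        if cond.get("type") == "Degraded":
            return cond
    return None

def is_degraded(pool: dict) -> bool:
    """True only when the Degraded condition is present with status 'True'."""
    cond = degraded_condition(pool)
    return cond is not None and cond.get("status") == "True"

# Hypothetical sample resembling the buggy state: updating, but not degraded.
sample_pool = {"status": {"conditions": [
    {"type": "Updating", "status": "True"},
    {"type": "Degraded", "status": "False", "message": ""},
]}}
print(is_degraded(sample_pool))  # False
```

On a live cluster, `is_degraded(get_pool("worker"))` should return `True` shortly after the pivot alert fires, and the condition's `message` field should describe the pivot failure.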

Additional info:

blocks

OCPBUGS-68403 MCP is not correctly degraded when a pivotError happens

Closed

is cloned by

OCPBUGS-68403 MCP is not correctly degraded when a pivotError happens

Closed

links to

openshift/machine-config-operator#5492: OCPBUGS-62984: MCP is not correctly degraded when a pivotError happens

Assignee: Dalia Khater

Reporter: Sergio Regidor de la Rosa

QA Contact: Sergio Regidor de la Rosa

Need Info From: None

Votes: 0

Watchers: 5

Created: 2025/10/13 12:24 PM

Updated: 2025/12/18 9:14 PM
