Type: Bug
Resolution: Unresolved
Priority: Undefined
Fix Version/s: None
Affects Version/s: 4.20
Component/s: Machine Config Operator
Labels:
- mco-triaged

Activity Type: Quality / Stability / Reliability
Blocked: False
Blocked Reason: None
Story Points: 3
Severity: None
Regression: None

Target Backport Versions: None
Target Version: None
Release Blocker: None
Sprint: MCO Sprint 276
sprint_count: 1

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status: None
Release Note Type: None
Release Note Text: None

Escape Reason: None
Escape Impact: None
Corrective Measures: None
SDLC stage when should've been found: None

Description of problem:

  On an OCL-based cluster, the MCP becomes degraded while it is updating, with an error saying that it failed to update the OS image.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

I don't know exactly how to reproduce this error, but I have seen it multiple times in CI jobs:
https://qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gs/qe-private-deck/logs/periodic-ci-openshift-openshift-tests-private-release-4.20-amd64-nightly-gcp-ipi-longduration-tp-mco-p3-f7/1957093328067497984

https://qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gs/qe-private-deck/logs/periodic-ci-openshift-openshift-tests-private-release-4.20-amd64-nightly-aws-ipi-longduration-mco-critical-f7/1955520016115830784

While verifying the PR, I hit the error by following the steps below, but I am not sure this is the exact way to reproduce it.

    1. Apply a MOSC with an incorrect Containerfile
oc create -f - << EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineOSConfig
metadata:
  name: worker
spec:
  machineConfigPool:
    name: worker
  imageBuilder:
    imageBuilderType: Job
  baseImagePullSecret:
    name: $(oc get -n openshift-machine-config-operator sa builder -ojsonpath='{.secrets[0].name}')
  renderedImagePushSecret:
    name: $(oc get -n openshift-machine-config-operator sa builder -ojsonpath='{.secrets[0].name}')
  renderedImagePushSpec: "image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image:latest"
  containerFile:
  - content: |-
      FROM alpine:3.18
      RUN apt update && apt install -y cowsay
EOF
Error from server (AlreadyExists): error when creating "STDIN": machineosconfigs.machineconfiguration.openshift.io "worker" already exists
    2. The MOSB fails and the MCP degrades too, but with a different error, which is expected
    3. Correct the Containerfile in the MOSC above
    4. The MOSB builds successfully
    5. The MCP is degraded with the error shown under "Actual results"

Actual results:

    The MCP reports the following degraded conditions:

  - lastTransitionTime: "2025-08-20T06:51:17Z"
    message: 'Node ip-10-0-9-181.us-east-2.compute.internal is reporting: "Node ip-10-0-9-181.us-east-2.compute.internal
      upgrade failure. Failed to update OS to image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image@sha256:8b12f9092364afc2b2116f9dcf7cb3f0cffe8753c13ccb73332ae2b88650fcd1
      after retries: timed out waiting for the condition", Node ip-10-0-9-181.us-east-2.compute.internal
      is reporting: "Failed to update OS to image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image@sha256:8b12f9092364afc2b2116f9dcf7cb3f0cffe8753c13ccb73332ae2b88650fcd1
      after retries: timed out waiting for the condition"'
    reason: 1 nodes are reporting degraded status on sync
    status: "True"
    type: NodeDegraded
  - lastTransitionTime: "2025-08-20T06:51:17Z"
    message: 'Node ip-10-0-9-181.us-east-2.compute.internal is reporting: "Node ip-10-0-9-181.us-east-2.compute.internal
      upgrade failure. Failed to update OS to image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image@sha256:8b12f9092364afc2b2116f9dcf7cb3f0cffe8753c13ccb73332ae2b88650fcd1
      after retries: timed out waiting for the condition", Node ip-10-0-9-181.us-east-2.compute.internal
      is reporting: "Failed to update OS to image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image@sha256:8b12f9092364afc2b2116f9dcf7cb3f0cffe8753c13ccb73332ae2b88650fcd1
      after retries: timed out waiting for the condition"'
    reason: ""
    status: "True"
    type: Degraded
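
When comparing this failure across CI runs, it can help to reduce the long condition message to the node name and the image digest. The following is only a rough sketch: `extract_nodes` and `extract_digests` are hypothetical helper names, not part of `oc` or the MCO, and the sample message is copied from the condition above. The commented `oc get mcp` line assumes the pool is named `worker`, as in the MOSC from the reproduction steps.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of oc or the MCO): reduce a degraded
# condition message to the reporting node names and the image digests.

# Print each distinct node named in a 'Node <name> is reporting' clause.
extract_nodes() {
  grep -oE 'Node [^ ]+ is reporting' | awk '{print $2}' | sort -u
}

# Print each distinct sha256 digest found in the message.
extract_digests() {
  grep -oE 'sha256:[0-9a-f]{64}' | sort -u
}

# On a live cluster the message could be read with (assumes pool "worker"):
#   oc get mcp worker \
#     -o jsonpath='{.status.conditions[?(@.type=="NodeDegraded")].message}'
# Here a sample copied from this report is used instead.
msg='Node ip-10-0-9-181.us-east-2.compute.internal is reporting: "Failed to update OS to image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image@sha256:8b12f9092364afc2b2116f9dcf7cb3f0cffe8753c13ccb73332ae2b88650fcd1 after retries: timed out waiting for the condition"'

printf '%s\n' "$msg" | extract_nodes
printf '%s\n' "$msg" | extract_digests
```

The digest printed this way can then be compared with the digest of the image produced by the successful MOSB, to check whether the node is trying to pull the image from the corrected build.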

Expected results:

Additional info:

must-gather: https://drive.google.com/drive/folders/1SwyfNWYHZ-PQECU2l5KE-tnCwU9NRhEt?usp=sharing

Assignee: Urvashi Mohnani

Reporter: Prachiti Talgulkar

Need Info From: None

Contributors: None

QA Contact: Sergio Regidor de la Rosa

Doc Contact: None

Votes: 0

Watchers: 2

Created: 2025/08/20 7:18 AM

Updated: 2025/10/01 3:19 PM
