Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: 4.20
Component/s: Machine Config Operator
Labels:
- mco-triaged

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
Rejected
Sprint:
MCO Sprint 279
sprint_count:
1

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

On an OCL-enabled cluster with a realtime kernel MachineConfig applied, deleting the MOSC causes the MCP to become degraded.

Version-Release number of selected component (if applicable):

How reproducible:

    Always

Steps to Reproduce:

1. Apply the realtime MC

oc apply -f - << EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "infra"
  name: 99-infra-realtime
spec:
  kernelType: realtime
EOF
machineconfig.machineconfiguration.openshift.io/99-infra-realtime created

2. Wait for the MCP to update, then check the kernel packages on the node

oc debug node/ip-10-0-73-37.us-east-2.compute.internal -- chroot /host rpm -qa | grep kernel
Starting pod/ip-10-0-73-37us-east-2computeinternal-debug-n48pd ...
To use host binaries, run `chroot /host`
Removing debug pod ...
kernel-rt-modules-core-5.14.0-570.49.1.el9_6.x86_64
kernel-rt-core-5.14.0-570.49.1.el9_6.x86_64
kernel-rt-modules-5.14.0-570.49.1.el9_6.x86_64
kernel-rt-modules-extra-5.14.0-570.49.1.el9_6.x86_64
kernel-rt-kvm-5.14.0-570.49.1.el9_6.x86_64
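
The wait in this step could be scripted with `oc wait` instead of polling by hand. A minimal sketch, assuming the pool is named `infra` as in this report; the function name `wait_for_mcp` and the 30-minute default timeout are illustrative and not part of the original reproduction:

```shell
# Hypothetical helper: block until a MachineConfigPool reports Updated=True.
# $1 = pool name, $2 = timeout (optional; 30m is an arbitrary default).
wait_for_mcp() {
  pool="$1"
  timeout="${2:-30m}"
  oc wait "machineconfigpool/${pool}" --for=condition=Updated=True --timeout="${timeout}"
}

# Usage (requires a logged-in cluster session):
#   wait_for_mcp infra
```

The same helper would apply to the wait in step 4.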

3. Apply the MOSC

oc create -f - << EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineOSConfig
metadata:
  name: infra
spec:
  machineConfigPool:
    name: infra
  imageBuilder:
    imageBuilderType: Job
  baseImagePullSecret:
    name: $(oc get secret -n openshift-config pull-secret -o json | jq "del(.metadata.namespace, .metadata.creationTimestamp, .metadata.resourceVersion, .metadata.uid, .metadata.name)" | jq '.metadata.name="pull-copy"' | oc -n openshift-machine-config-operator create -f - &> /dev/null; echo -n "pull-copy")
  renderedImagePushSecret:
    name: $(oc get -n openshift-machine-config-operator sa builder -ojsonpath='{.secrets[0].name}')
  renderedImagePushSpec: "image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image:latest"
      
EOF
machineosconfig.machineconfiguration.openshift.io/infra created

4. Wait for the MCP to update, then check the kernel packages and the booted image on the node

oc debug node/ip-10-0-73-37.us-east-2.compute.internal -- chroot /host rpm -qa | grep kernel
Starting pod/ip-10-0-73-37us-east-2computeinternal-debug-mx5vx ...
To use host binaries, run `chroot /host`
kernel-rt-modules-core-5.14.0-570.49.1.el9_6.x86_64
kernel-rt-core-5.14.0-570.49.1.el9_6.x86_64
kernel-rt-modules-5.14.0-570.49.1.el9_6.x86_64
kernel-rt-modules-extra-5.14.0-570.49.1.el9_6.x86_64
kernel-rt-kvm-5.14.0-570.49.1.el9_6.x86_64
Removing debug pod ...

oc debug node/ip-10-0-73-37.us-east-2.compute.internal -- chroot /host rpm-ostree status
Starting pod/ip-10-0-73-37us-east-2computeinternal-debug-npjf6 ...
To use host binaries, run `chroot /host`
State: idle
Deployments:
* ostree-unverified-registry:image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image@sha256:19134e67ac9cfefbe510182d90112404a8514c1df8868d6701c5395f412ee3a1
                   Digest: sha256:19134e67ac9cfefbe510182d90112404a8514c1df8868d6701c5395f412ee3a1
                  Version: 9.6.20250925-0 (2025-09-30T14:54:03Z)
Removing debug pod ...
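
To compare what the node is booted into before and after the MOSC is deleted, the image reference can be pulled out of the `rpm-ostree status` text. A small sketch, assuming the booted deployment is marked with a leading `*` as in the output above; `booted_image` is a hypothetical helper name:

```shell
# Hypothetical helper: print the booted deployment's image reference from
# `rpm-ostree status` text on stdin. Only lines whose first non-blank
# character is "*" are printed, with the marker stripped.
booted_image() {
  sed -n 's/^[[:space:]]*\*[[:space:]]*//p'
}

# Usage (requires cluster access; node name taken from this report):
#   oc debug node/ip-10-0-73-37.us-east-2.compute.internal -- chroot /host rpm-ostree status | booted_image
```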

5. Delete the MOSC

oc delete machineosconfig infra
machineosconfig.machineconfiguration.openshift.io "infra" deleted

After a while, the MCP becomes degraded:

  - lastTransitionTime: "2025-09-30T15:06:47Z"
    message: 'Node ip-10-0-73-37.us-east-2.compute.internal is reporting: "Node ip-10-0-73-37.us-east-2.compute.internal
      upgrade failure. failed to update OS from local storage: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f9d1f649ed9bc155cd6b43aa43a606846b996b6577a7bcc3d872a91e415abc98:
      error running rpm-ostree rebase --experimental ostree-unverified-image:containers-storage:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f9d1f649ed9bc155cd6b43aa43a606846b996b6577a7bcc3d872a91e415abc98:
      error: Updating rpm-md repo ''rhel-9-for-x86_64-appstream-rpms'': cannot update
      repo ''rhel-9-for-x86_64-appstream-rpms'': Cannot download repomd.xml: Cannot
      download repodata/repomd.xml: All mirrors were tried; Last error: Curl error
      (58): Problem with the local SSL certificate for https://cdn.redhat.com/content/dist/rhel9/9/x86_64/appstream/os/repodata/repomd.xml
      [could not load PEM client certificate, OpenSSL error error:80000002:system
      library::No such file or directory, (no key found, wrong pass phrase, or wrong
      file format?)]\n: exit status 1", Node ip-10-0-73-37.us-east-2.compute.internal
      is reporting: "failed to update OS from local storage: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f9d1f649ed9bc155cd6b43aa43a606846b996b6577a7bcc3d872a91e415abc98:
      error running rpm-ostree rebase --experimental ostree-unverified-image:containers-storage:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f9d1f649ed9bc155cd6b43aa43a606846b996b6577a7bcc3d872a91e415abc98:
      error: Updating rpm-md repo ''rhel-9-for-x86_64-appstream-rpms'': cannot update
      repo ''rhel-9-for-x86_64-appstream-rpms'': Cannot download repomd.xml: Cannot
      download repodata/repomd.xml: All mirrors were tried; Last error: Curl error
      (58): Problem with the local SSL certificate for https://cdn.redhat.com/content/dist/rhel9/9/x86_64/appstream/os/repodata/repomd.xml
      [could not load PEM client certificate, OpenSSL error error:80000002:system
      library::No such file or directory, (no key found, wrong pass phrase, or wrong
      file format?)]\n: exit status 1"'
    reason: 1 nodes are reporting degraded status on sync
    status: "True"
    type: NodeDegraded
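
The condition above can be read straight from the pool rather than from the full YAML. A rough sketch, assuming the `infra` pool from this report; `node_degraded_message` and `failing_repos` are hypothetical helper names, and the repo pattern only matches the `rpm-md repo 'NAME'` wording seen in this message:

```shell
# Hypothetical helper: print the NodeDegraded condition message of a pool.
# $1 = MachineConfigPool name.
node_degraded_message() {
  oc get machineconfigpool "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="NodeDegraded")].message}'
}

# Hypothetical helper: list the distinct rpm-md repo names mentioned in an
# error message read from stdin.
failing_repos() {
  grep -o "rpm-md repo '*[^':]*" | sed "s/^rpm-md repo '*//" | sort -u
}

# Usage (requires a logged-in cluster session):
#   node_degraded_message infra | failing_repos
```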

Expected results:

    The MCP should update without any errors.

Additional info:

  The same error occurs when a different MC is applied first and the same steps are followed, so the issue is not limited to kernel changes.

Assignee: Isabella Janssen

Reporter: Prachiti Talgulkar

Need Info From: None

Contributors: None

QA Contact: Sergio Regidor de la Rosa

Doc Contact: None

Votes: 0

Watchers: 4

Created: 2025/09/30 3:14 PM

Updated: 2025/10/21 2:16 PM
