OpenShift Bugs / OCPBUGS-62479

MCP getting degraded in kernel supported OCL enabled cluster after deleting MOSC


    • Quality / Stability / Reliability
    • Rejected
    • MCO Sprint 279

      Description of problem:

       On an OCL-enabled cluster with a kernel type MachineConfig applied, deleting the MOSC causes the MCP to become degraded.

      Version-Release number of selected component (if applicable):

          

      How reproducible:

          Always

      Steps to Reproduce:

      1. Apply a realtime kernel MC

      oc apply -f - << EOF
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      metadata:
        labels:
          machineconfiguration.openshift.io/role: "infra"
        name: 99-infra-realtime
      spec:
        kernelType: realtime
      EOF
      machineconfig.machineconfiguration.openshift.io/99-infra-realtime created 

      2. Wait for the MCP to update, then check the kernel packages on the node

      oc debug node/ip-10-0-73-37.us-east-2.compute.internal -- chroot /host rpm -qa | grep kernel
      Starting pod/ip-10-0-73-37us-east-2computeinternal-debug-n48pd ...
      To use host binaries, run `chroot /host`
      Removing debug pod ...
      kernel-rt-modules-core-5.14.0-570.49.1.el9_6.x86_64
      kernel-rt-core-5.14.0-570.49.1.el9_6.x86_64
      kernel-rt-modules-5.14.0-570.49.1.el9_6.x86_64
      kernel-rt-modules-extra-5.14.0-570.49.1.el9_6.x86_64
      kernel-rt-kvm-5.14.0-570.49.1.el9_6.x86_64 

      3. Apply the MOSC

      oc create -f - << EOF
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineOSConfig
      metadata:
        name: infra
      spec:
        machineConfigPool:
          name: infra
        imageBuilder:
          imageBuilderType: Job
        baseImagePullSecret:
          name: $(oc get secret -n openshift-config pull-secret -o json | jq "del(.metadata.namespace, .metadata.creationTimestamp, .metadata.resourceVersion, .metadata.uid, .metadata.name)" | jq '.metadata.name="pull-copy"' | oc -n openshift-machine-config-operator create -f - &> /dev/null; echo -n "pull-copy")
        renderedImagePushSecret:
          name: $(oc get -n openshift-machine-config-operator sa builder -ojsonpath='{.secrets[0].name}')
        renderedImagePushSpec: "image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image:latest"
            
      EOF
      machineosconfig.machineconfiguration.openshift.io/infra created 

      4. Wait for the MCP to update, then check the kernel packages and the booted image on the node

      oc debug node/ip-10-0-73-37.us-east-2.compute.internal -- chroot /host rpm -qa | grep kernel
      Starting pod/ip-10-0-73-37us-east-2computeinternal-debug-mx5vx ...
      To use host binaries, run `chroot /host`
      kernel-rt-modules-core-5.14.0-570.49.1.el9_6.x86_64
      kernel-rt-core-5.14.0-570.49.1.el9_6.x86_64
      kernel-rt-modules-5.14.0-570.49.1.el9_6.x86_64
      kernel-rt-modules-extra-5.14.0-570.49.1.el9_6.x86_64
      kernel-rt-kvm-5.14.0-570.49.1.el9_6.x86_64
      Removing debug pod ...
      
      oc debug node/ip-10-0-73-37.us-east-2.compute.internal -- chroot /host rpm-ostree status
      Starting pod/ip-10-0-73-37us-east-2computeinternal-debug-npjf6 ...
      To use host binaries, run `chroot /host`
      State: idle
      Deployments:
      * ostree-unverified-registry:image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image@sha256:19134e67ac9cfefbe510182d90112404a8514c1df8868d6701c5395f412ee3a1
                         Digest: sha256:19134e67ac9cfefbe510182d90112404a8514c1df8868d6701c5395f412ee3a1
                        Version: 9.6.20250925-0 (2025-09-30T14:54:03Z)
      Removing debug pod ...

      5. Delete the MOSC

      oc delete machineosconfig infra
      machineosconfig.machineconfiguration.openshift.io "infra" deleted
      6. Wait a while. The MCP becomes degraded with the following condition:
       - lastTransitionTime: "2025-09-30T15:06:47Z"
          message: 'Node ip-10-0-73-37.us-east-2.compute.internal is reporting: "Node ip-10-0-73-37.us-east-2.compute.internal
            upgrade failure. failed to update OS from local storage: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f9d1f649ed9bc155cd6b43aa43a606846b996b6577a7bcc3d872a91e415abc98:
            error running rpm-ostree rebase --experimental ostree-unverified-image:containers-storage:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f9d1f649ed9bc155cd6b43aa43a606846b996b6577a7bcc3d872a91e415abc98:
            error: Updating rpm-md repo ''rhel-9-for-x86_64-appstream-rpms'': cannot update
            repo ''rhel-9-for-x86_64-appstream-rpms'': Cannot download repomd.xml: Cannot
            download repodata/repomd.xml: All mirrors were tried; Last error: Curl error
            (58): Problem with the local SSL certificate for https://cdn.redhat.com/content/dist/rhel9/9/x86_64/appstream/os/repodata/repomd.xml
            [could not load PEM client certificate, OpenSSL error error:80000002:system
            library::No such file or directory, (no key found, wrong pass phrase, or wrong
            file format?)]\n: exit status 1", Node ip-10-0-73-37.us-east-2.compute.internal
            is reporting: "failed to update OS from local storage: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f9d1f649ed9bc155cd6b43aa43a606846b996b6577a7bcc3d872a91e415abc98:
            error running rpm-ostree rebase --experimental ostree-unverified-image:containers-storage:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f9d1f649ed9bc155cd6b43aa43a606846b996b6577a7bcc3d872a91e415abc98:
            error: Updating rpm-md repo ''rhel-9-for-x86_64-appstream-rpms'': cannot update
            repo ''rhel-9-for-x86_64-appstream-rpms'': Cannot download repomd.xml: Cannot
            download repodata/repomd.xml: All mirrors were tried; Last error: Curl error
            (58): Problem with the local SSL certificate for https://cdn.redhat.com/content/dist/rhel9/9/x86_64/appstream/os/repodata/repomd.xml
            [could not load PEM client certificate, OpenSSL error error:80000002:system
            library::No such file or directory, (no key found, wrong pass phrase, or wrong
            file format?)]\n: exit status 1"'
          reason: 1 nodes are reporting degraded status on sync
          status: "True"
          type: NodeDegraded 
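The condition shown in the last reproduction step can be read directly from the pool instead of scanning the full YAML. The following is a minimal sketch, assuming the pool is named `infra` as in this report; the function names are hypothetical helpers, not part of `oc`.

```shell
# Sketch: read the NodeDegraded condition of a MachineConfigPool.
# Function names and the default pool name "infra" are illustrative assumptions.

# Print the status ("True" or "False") of the NodeDegraded condition.
mcp_node_degraded_status() {
  pool="${1:-infra}"
  oc get mcp "$pool" \
    -o jsonpath='{.status.conditions[?(@.type=="NodeDegraded")].status}'
}

# Print the message of the NodeDegraded condition, i.e. the node error text.
mcp_node_degraded_message() {
  pool="${1:-infra}"
  oc get mcp "$pool" \
    -o jsonpath='{.status.conditions[?(@.type=="NodeDegraded")].message}'
}
```

For example, `mcp_node_degraded_status infra` is expected to print `True` once the pool reaches the state described in this report.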

      Expected results:

          The MCP should update without any errors after the MOSC is deleted.
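One way to check the expected result from a script is to block until the pool reports the `Updated` condition. This is a sketch only: the function name, the default pool `infra`, and the 30 minute timeout are assumptions chosen for illustration.

```shell
# Sketch: wait until a MachineConfigPool reports condition Updated.
# The function name, default pool, and default timeout are assumptions.
wait_for_mcp_updated() {
  pool="${1:-infra}"
  timeout="${2:-30m}"
  # oc wait exits non-zero if the condition is not met before the timeout.
  oc wait "mcp/$pool" --for=condition=Updated --timeout="$timeout"
}
```

With the behavior described in this report, a call such as `wait_for_mcp_updated infra 30m` after deleting the MOSC would be expected to time out, because the pool degrades instead of finishing the update.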

      Additional info:

        The same error occurs when a different MC is applied first and the same steps are followed, so the issue is not limited to kernel MCs.
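To confirm that a run with a different MC hits the same failure, the condition message can be matched against the error text quoted in this report. The sketch below is a hypothetical helper; the matched substrings are copied from the message above and are not a stable API.

```shell
# Sketch: return 0 if the text on stdin looks like the failure in this report,
# i.e. an rpm-ostree rebase that fails while refreshing an rpm-md repo with
# curl error 58. The function name is an illustrative assumption.
is_rpm_md_cert_failure() {
  msg="$(cat)"
  case "$msg" in
    # "Curl error" and "(58)" are matched separately because the YAML output
    # wraps the message across lines.
    *"rpm-ostree rebase"*"Updating rpm-md repo"*"Curl error"*"(58)"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

The helper reads the message from standard input, so it can be fed the output of any command that prints the `NodeDegraded` message.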

              rh-ee-ijanssen Isabella Janssen
              rh-ee-ptalgulk Prachiti Talgulkar
              Sergio Regidor de la Rosa