OpenShift Bugs / OCPBUGS-52656

Pinnedimageset fails when there is a custom MCP

    • Quality / Stability / Reliability
    • MCO Sprint 269

      Description of problem:

      When a PinnedImageSet (PIS) is applied to the worker pool and a custom pool exists in the cluster, the nodes that belong to the custom pool report a failure in the MCD (machine-config-daemon) logs:
      
      I0307 17:59:22.343736    2544 pinned_image_set.go:304] Reconciling pinned image set: tc-73623-worker-pinned-images: generation: 1
      I0307 17:59:22.343864    2544 pinned_image_set.go:304] Reconciling pinned image set: tc-73623-worker-pinned-images: generation: 1
      I0307 17:59:22.345356    2544 pinned_image_set.go:529] CRI-O config file is up to date, no reload required
      E0307 17:59:22.357626    2544 upgrade_monitor.go:289] Error applying MCN status: failed to create typed patch object (/ip-10-0-23-33.us-east-2.compute.internal; machineconfiguration.openshift.io/v1alpha1, Kind=MachineConfigNode): .status.pinnedImageSets: duplicate entries for key [name="tc-73623-worker-pinned-images"]
      E0307 17:59:22.357638    2544 pinned_image_set.go:607] Failed to updated machine config node: failed to create typed patch object (/ip-10-0-23-33.us-east-2.compute.internal; machineconfiguration.openshift.io/v1alpha1, Kind=MachineConfigNode): .status.pinnedImageSets: duplicate entries for key [name="tc-73623-worker-pinned-images"]
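
The `duplicate entries for key` message suggests that the same PinnedImageSet name ended up in `.status.pinnedImageSets` more than once, which an apply patch rejects for a list keyed by `name`. One possible cause (an assumption, not confirmed in this report) is that the PIS is collected once through the worker pool and once through the infra pool. A minimal Python sketch of de-duplicating such a list by name before the status is built; `dedupe_by_name` and the sample entries are hypothetical and are not taken from the MCO code:

```python
def dedupe_by_name(entries):
    """Keep the first entry seen for each name, preserving the original order."""
    seen = set()
    unique = []
    for entry in entries:
        name = entry["name"]
        if name in seen:
            continue  # a later duplicate of an already-kept entry
        seen.add(name)
        unique.append(entry)
    return unique


# Hypothetical status list in which the same PIS was collected twice
status_entries = [
    {"name": "tc-73623-worker-pinned-images", "desiredGeneration": 1},
    {"name": "tc-73623-worker-pinned-images", "desiredGeneration": 1},
]
deduped = dedupe_by_name(status_entries)
print([entry["name"] for entry in deduped])
```

Whether the duplication should be resolved here or earlier, when the PIS list for the node is assembled, is a design decision for the actual fix.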
      
      
      The node's MachineConfigNode (MCN) status reports a failure as well:
      
        pinnedImageSets:
        - currentGeneration: 0
          desiredGeneration: 1
          lastFailedGeneration: 1
          lastFailedGenerationErrors:
          - 'failed to get image status for "quay.io/openshifttest/busybox@sha256:0415f56ccc05526f2af5a7ae8654baec97d4a614f24736e8eef41a4591f08019":
            rpc error: code = Canceled desc = context canceled'
          name: tc-73623-worker-pinned-images
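
A status like the one above can be checked mechanically. The following Python sketch flags entries whose `currentGeneration` is behind `desiredGeneration` and returns their recorded errors. It assumes the status has already been loaded into a dictionary, and it assumes that a lagging generation means the PIS is not yet reconciled; the helper name `unreconciled_pinned_image_sets` is hypothetical:

```python
def unreconciled_pinned_image_sets(mcn_status):
    """Return (name, errors) for entries whose current generation lags the desired one."""
    lagging = []
    for entry in mcn_status.get("pinnedImageSets", []):
        current = entry.get("currentGeneration", 0)
        desired = entry.get("desiredGeneration", 0)
        if current < desired:
            errors = entry.get("lastFailedGenerationErrors", [])
            lagging.append((entry["name"], errors))
    return lagging


# Status copied from the MCN shown in this report
mcn_status = {
    "pinnedImageSets": [
        {
            "currentGeneration": 0,
            "desiredGeneration": 1,
            "lastFailedGeneration": 1,
            "lastFailedGenerationErrors": [
                'failed to get image status for '
                '"quay.io/openshifttest/busybox@sha256:'
                '0415f56ccc05526f2af5a7ae8654baec97d4a614f24736e8eef41a4591f08019": '
                "rpc error: code = Canceled desc = context canceled"
            ],
            "name": "tc-73623-worker-pinned-images",
        }
    ]
}

for name, errors in unreconciled_pinned_image_sets(mcn_status):
    print(name, len(errors))
```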
      
      
      
      Nevertheless, the images are pinned on the nodes:
      
      sh-5.1# crictl images --pinned
      IMAGE                                                    TAG                 IMAGE ID            SIZE                PINNED
      quay.io/openshifttest/alpine                             <none>              45683da4f97c2       5.87MB              true
      quay.io/openshifttest/busybox                            <none>              b97242f89c8a2       1.45MB              true
      
      
          

      Version-Release number of selected component (if applicable):

      IPI on AWS
      4.19.0-0.test-2025-03-06-132141-ci-ln-2hfyl8b-latest
      
      We use the pre-merge version corresponding to the PR that enables the use of custom pools for MCN: https://issues.redhat.com/browse/MCO-1501
      
          

      How reproducible:

      Always
          

      Steps to Reproduce:

          1. Create an infra custom pool
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfigPool
      metadata:
        name: infra
      spec:
        machineConfigSelector:
          matchExpressions:
            - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,infra]}
        nodeSelector:
          matchLabels:
            node-role.kubernetes.io/infra: ""
      
          2. Add a node to the infra pool
      
      oc label node $(oc get nodes -l node-role.kubernetes.io/worker -ojsonpath="{.items[0].metadata.name}") node-role.kubernetes.io/infra=
      
          3. Create a PIS to pin some images in all the nodes with the "worker" label
      
       oc create -f - << EOF
      apiVersion: machineconfiguration.openshift.io/v1alpha1
      kind: PinnedImageSet
      metadata:
        labels:
          machineconfiguration.openshift.io/role: worker
        name: tc-73623-worker-pinned-images
      spec:
        pinnedImages:
        - name: "quay.io/openshifttest/busybox@sha256:0415f56ccc05526f2af5a7ae8654baec97d4a614f24736e8eef41a4591f08019"
        - name: quay.io/openshifttest/alpine@sha256:be92b18a369e989a6e86ac840b7f23ce0052467de551b064796d67280dfa06d5
      EOF
      
      
          

      Actual results:

      The images are correctly pinned in the nodes belonging to the worker pool. However, the node belonging to the infra pool reports an error in the MCD logs and in its MCN.
      
      Images are pinned in the custom pool too. The problem seems to be the reported errors only.
      
          

      Expected results:

      Now that we support custom pools in MCN, we can support custom pools in PIS too.
      
      The MCO uses each pool's machineConfigSelector to find which PIS should be applied to that pool. Hence, the PIS used to reproduce this issue should be applied to the infra pool as well, and its images are indeed correctly pinned on the custom pool's node.
      
      However, the nodes belonging to custom pools should not report a failure in the MCD due to duplicate pinnedImageSets entries, nor a "context canceled" error in the MCN.
      
          

      Additional info:

      Note that there seems to be no way to apply a PIS to the worker pool but not to the custom pools, since PISs are selected the same way as MCs and, by design, every custom pool has to use all of the worker pool's MCs.
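
To illustrate why the same PIS is selected by both pools, here is a small Python sketch of label-selector matching applied to the objects from the reproduction steps. The infra selector and the PIS labels are copied from this report; the worker pool's selector is an assumption (a plain `matchLabels` on the role label), and `matches_selector` is a hypothetical helper, not MCO code:

```python
def matches_selector(labels, selector):
    """Evaluate a label selector with matchLabels and matchExpressions against labels."""
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for expr in selector.get("matchExpressions", []):
        key, operator = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if operator == "In":
            if key not in labels or labels[key] not in values:
                return False
        elif operator == "NotIn":
            if key in labels and labels[key] in values:
                return False
        elif operator == "Exists":
            if key not in labels:
                return False
        elif operator == "DoesNotExist":
            if key in labels:
                return False
        else:
            raise ValueError(f"unsupported operator: {operator}")
    return True


ROLE = "machineconfiguration.openshift.io/role"

# Labels of the PIS created in step 3
pis_labels = {ROLE: "worker"}

# machineConfigSelector of the infra pool created in step 1
infra_selector = {
    "matchExpressions": [
        {"key": ROLE, "operator": "In", "values": ["worker", "infra"]},
    ]
}

# Assumed machineConfigSelector of the default worker pool
worker_selector = {"matchLabels": {ROLE: "worker"}}

print(matches_selector(pis_labels, worker_selector))
print(matches_selector(pis_labels, infra_selector))
```

Because the infra selector has to include `worker` in order to inherit the worker MCs, any PIS labeled for the worker role matches the custom pool too.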
          

              rh-ee-rsaini Rishabh Saini
              sregidor@redhat.com Sergio Regidor de la Rosa
              Sergio Regidor de la Rosa Sergio Regidor de la Rosa