NodePool fails to start if a management cluster has ImageContentSourcePolicy with specific mirrors

    • Critical
    • None
    • Approved
    • False
      * Previously, when an IDMS or ICSP in the management cluster defined a source that pointed to `registry.redhat.io` or `registry.redhat.io/redhat`, and the mirror registry did not contain the required OLM catalog images, the provisioning of the `HostedCluster` object stalled due to unauthorized image pulls. As a consequence, the `HostedCluster` object was not deployed, and was blocked from pulling essential catalog images from the mirrored registry.
      +
      With this release, the provisioning explicitly fails and blocks if a required image cannot be pulled due to authorization errors. In addition, the logic for registry overrides is improved to allow matches on the root of the registry, such as `registry.redhat.io`, for OLM CatalogSource image resolution. Also, a fallback mechanism is introduced to use the original image reference if the registry override does not yield a working image.
      +
      As a result, the `HostedCluster` object is deployed, even in scenarios where the mirror registry lacks the required OLM catalog images, because the system correctly falls back to pull from the original source when appropriate. (link:https://issues.redhat.com/browse/OCPBUGS-56792[OCPBUGS-56792]).
    • Bug Fix
    • Done

      This is a clone of issue OCPBUGS-56492. The following is the description of the original issue:

      Description of problem:

      Using an ImageContentSourcePolicy on the management cluster whose mirrors do not contain the catalog images (for example, community-operator-index) blocks hosted cluster creation. This can be seen in this CI run.

      This seems to have been occurring since May 15, after this commit was merged.

      Version-Release number of selected component (if applicable):

          4.19

      How reproducible:

          Always

      Steps to Reproduce:

      1) Create a management cluster that has an ImageContentSourcePolicy pointing to Brew images:

      apiVersion: operator.openshift.io/v1alpha1
      kind: ImageContentSourcePolicy
      metadata:
        name: brew-registry
      spec:
        repositoryDigestMirrors:
        - mirrors:
          - brew.registry.redhat.io
          source: registry.redhat.io
        - mirrors:
          - brew.registry.redhat.io
          source: registry.stage.redhat.io
        - mirrors:
          - brew.registry.redhat.io
          source: registry-proxy.engineering.redhat.com    

      2) Create a hosted cluster with a release payload from May 15 (https://amd64.ocp.releases.ci.openshift.org/releasestream/4.19.0-0.nightly/release/4.19.0-0.nightly-2025-05-15-043906) or newer:

      hypershift create cluster ... --release-image=registry.ci.openshift.org/ocp/release:4.19.0-0.nightly-2025-05-15-043906 

      Actual results:

      NodePool reporting status:

      - lastTransitionTime: "2025-05-19T19:44:24Z"
        message: expected 2 core ignition configs, found 0
        observedGeneration: 1
        reason: ValidationFailed
        status: "False"
        type: ValidMachineConfig   

      ControlPlaneOperator logs:

      {"level":"info","ts":"2025-05-19T06:18:54Z","msg":"registry override found (root registry match)","controller":"hostedcontrolplane","controllerGroup":"hypershift.openshift.io","controllerKind":"HostedControlPlane","HostedControlPlane":{"name":"ef7f955d44027ba7b210","namespace":"clusters-ef7f955d44027ba7b210"},"namespace":"clusters-ef7f955d44027ba7b210","name":"ef7f955d44027ba7b210","reconcileID":"cdef128c-7550-4cea-9b07-a17d769982ae","original":"registry.redhat.io/redhat/community-operator-index:v4.19","mirror":"brew.registry.redhat.io","composed":"brew.registry.redhat.io/redhat/community-operator-index:v4.19"}
      
      {"level":"error","ts":"2025-05-19T06:18:54Z","msg":"Reconciler error","controller":"hostedcontrolplane","controllerGroup":"hypershift.openshift.io","controllerKind":"HostedControlPlane","HostedControlPlane":{"name":"ef7f955d44027ba7b210","namespace":"clusters-ef7f955d44027ba7b210"},"namespace":"clusters-ef7f955d44027ba7b210","name":"ef7f955d44027ba7b210","reconcileID":"cdef128c-7550-4cea-9b07-a17d769982ae","error":"failed to update control plane: failed to reconcile olm: failed to get catalog images: failed to get image digest: Head \"https://brew.registry.redhat.io/v2/redhat/community-operator-index/manifests/v4.19\": unauthorized: Please login to the Red Hat Registry using your Customer Portal credentials.
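
The first log line reports a "root registry match": the ICSP source `registry.redhat.io` is a bare registry host, and the override replaces only that host in the original reference. The following Go sketch illustrates how such a composition could work. `composeMirrorRef` is a hypothetical helper written for this illustration, not HyperShift's actual function, and the boundary rules it applies are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// composeMirrorRef is a hypothetical helper (not HyperShift's real code).
// It rewrites image to point at mirror when source is either the root
// registry host or a repository path prefix of image.
func composeMirrorRef(image, source, mirror string) (string, bool) {
	if image == source {
		return mirror, true
	}
	if !strings.HasPrefix(image, source) {
		return "", false
	}
	// rest is non-empty here because image is longer than source.
	rest := image[len(source):]
	switch rest[0] {
	case '/':
		// Path boundary: covers the root registry match from the log.
		return mirror + rest, true
	case ':', '@':
		// A tag or digest separator only makes sense after a repository
		// path; after a bare host, ':' would introduce a port instead.
		if strings.Contains(source, "/") {
			return mirror + rest, true
		}
	}
	return "", false
}

func main() {
	composed, ok := composeMirrorRef(
		"registry.redhat.io/redhat/community-operator-index:v4.19",
		"registry.redhat.io",
		"brew.registry.redhat.io",
	)
	fmt.Println(composed, ok)
}
```

With the values from the log, the sketch yields the same composed reference, `brew.registry.redhat.io/redhat/community-operator-index:v4.19`, which is the reference that the subsequent `Head` request fails on.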

      Expected results:

          Hosted cluster starts successfully

      Additional info:

      I did some additional verification.

      Get the image digest:

      ᐅ skopeo inspect --authfile=/home/mgencur/hypershift/mgencur-pullsecret/pullsecret.json docker://registry.redhat.io/redhat/community-operator-index:v4.19 | head
      {
          "Name": "registry.redhat.io/redhat/community-operator-index",
          "Digest": "sha256:9112c6e76f8bdc1ad76c0c24c8f64425bbde9a2b341a97067d9b553252ee1386",
          "RepoTags": [
              "latest",
              "latest_old",
              "v4.10",
              "v4.10_tmp",
              "v4.11",
              "v4.11-1702914346",
      

      Pull it from Brew registry:

      ᐅ podman pull --authfile=/home/mgencur/repos/ocf-ci-ops/keys/brew-registry-redhat-io.auth.json brew.registry.redhat.io/redhat/community-operator-index@sha256:9112c6e76f8bdc1ad76c0c24c8f64425bbde9a2b341a97067d9b553252ee1386
      Trying to pull brew.registry.redhat.io/redhat/community-operator-index@sha256:9112c6e76f8bdc1ad76c0c24c8f64425bbde9a2b341a97067d9b553252ee1386...
      Error: initializing source docker://brew.registry.redhat.io/redhat/community-operator-index@sha256:9112c6e76f8bdc1ad76c0c24c8f64425bbde9a2b341a97067d9b553252ee1386: reading manifest sha256:9112c6e76f8bdc1ad76c0c24c8f64425bbde9a2b341a97067d9b553252ee1386 in brew.registry.redhat.io/redhat/community-operator-index: name unknown: Digest not found

      The image is not in the Brew registry which serves as the mirror.

      The OpenShift docs for ImageContentSourcePolicy state: "The order of mirrors in this list is treated as the user's desired priority, while source is by default considered lower priority than all mirrors."

      The mirrors should take precedence, and if none of them contains the desired image, the original source should be used.

      However, HyperShift fails immediately if the image is not in the mirror: https://github.com/openshift/hypershift/blob/main/support/catalogs/images.go#L104

              jparrill@redhat.com Juan Manuel Parrilla Madrid
              openshift-crt-jira-prow OpenShift Prow Bot
              Martin Gencur Martin Gencur