OCPBUGS-57123

NodePool fails to start if a management cluster has ImageContentSourcePolicy with specific mirrors

    • Bug
    • Resolution: Unresolved
    • Critical
    • 4.19.0
    • HyperShift
    • Quality / Stability / Reliability
    • False
    • Critical
    • In Progress
    • Bug Fix
      Cause: When an IDMS or ICSP in the management OpenShift cluster defines a source pointing to registry.redhat.io or registry.redhat.io/redhat, and the mirror registry does not contain the required OLM catalog images, the HostedCluster provisioning gets stuck due to unauthorized image pulls.

      Consequence: The HostedCluster deployment fails to complete and remains in a blocked state when it cannot pull essential catalog images from the mirrored registry.

      Fix:
      • The provisioning now explicitly fails and blocks if a required image cannot be pulled due to authorization errors.
      • The logic for registry override has been improved to allow matches on the root of the registry (e.g., registry.redhat.io) for OLM CatalogSource image resolution.
      • A fallback mechanism is introduced to use the original ImageReference if the registry override does not yield a working image.

      Result: The HostedCluster deployment now completes successfully, even in scenarios where the mirror registry lacks the required OLM catalog images, as the system correctly falls back to pulling from the original source when appropriate.

      This is a clone of issue OCPBUGS-56955. The following is the description of the original issue:

      This is a clone of issue OCPBUGS-56792. The following is the description of the original issue:

      This is a clone of issue OCPBUGS-56492. The following is the description of the original issue:

      Description of problem:

      Using an ImageContentSourcePolicy on the management cluster whose mirrors do not contain the catalog images (e.g. community-operator-index) blocks hosted cluster creation. The failure can be seen in this CI run.

      This seems to have started on May 15, after this commit was merged.

      Version-Release number of selected component (if applicable):

          4.19

      How reproducible:

          Always

      Steps to Reproduce:

      1) Create a management cluster that has ImageContentSourcePolicy pointing to Brew images:

      apiVersion: operator.openshift.io/v1alpha1
      kind: ImageContentSourcePolicy
      metadata:
        name: brew-registry
      spec:
        repositoryDigestMirrors:
        - mirrors:
          - brew.registry.redhat.io
          source: registry.redhat.io
        - mirrors:
          - brew.registry.redhat.io
          source: registry.stage.redhat.io
        - mirrors:
          - brew.registry.redhat.io
          source: registry-proxy.engineering.redhat.com    

      2) Create a hosted cluster with a release payload from May 15 (https://amd64.ocp.releases.ci.openshift.org/releasestream/4.19.0-0.nightly/release/4.19.0-0.nightly-2025-05-15-043906) or newer:

      hypershift create cluster ... --release-image=registry.ci.openshift.org/ocp/release:4.19.0-0.nightly-2025-05-15-043906 

      Actual results:

      NodePool reporting status:

      - lastTransitionTime: "2025-05-19T19:44:24Z"
        message: expected 2 core ignition configs, found 0
        observedGeneration: 1
        reason: ValidationFailed
        status: "False"
        type: ValidMachineConfig   

      ControlPlaneOperator logs:

      {"level":"info","ts":"2025-05-19T06:18:54Z","msg":"registry override found (root registry match)","controller":"hostedcontrolplane","controllerGroup":"hypershift.openshift.io","controllerKind":"HostedControlPlane","HostedControlPlane":{"name":"ef7f955d44027ba7b210","namespace":"clusters-ef7f955d44027ba7b210"},"namespace":"clusters-ef7f955d44027ba7b210","name":"ef7f955d44027ba7b210","reconcileID":"cdef128c-7550-4cea-9b07-a17d769982ae","original":"registry.redhat.io/redhat/community-operator-index:v4.19","mirror":"brew.registry.redhat.io","composed":"brew.registry.redhat.io/redhat/community-operator-index:v4.19"}
      
      {"level":"error","ts":"2025-05-19T06:18:54Z","msg":"Reconciler error","controller":"hostedcontrolplane","controllerGroup":"hypershift.openshift.io","controllerKind":"HostedControlPlane","HostedControlPlane":{"name":"ef7f955d44027ba7b210","namespace":"clusters-ef7f955d44027ba7b210"},"namespace":"clusters-ef7f955d44027ba7b210","name":"ef7f955d44027ba7b210","reconcileID":"cdef128c-7550-4cea-9b07-a17d769982ae","error":"failed to update control plane: failed to reconcile olm: failed to get catalog images: failed to get image digest: Head \"https://brew.registry.redhat.io/v2/redhat/community-operator-index/manifests/v4.19\": unauthorized: Please login to the Red Hat Registry using your Customer Portal credentials.
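The "root registry match" in the first log line can be illustrated with a small Go sketch. This is not HyperShift's code: `composeMirrorRef` is a hypothetical helper, and the boundary rules it applies (match only at `/`, `@`, or a tag separator after a repository path) are assumptions made for illustration. It only shows how a source that names a bare registry host could be rewritten to the mirror, yielding the "composed" value seen in the log.

```go
package main

import (
	"fmt"
	"strings"
)

// composeMirrorRef is a hypothetical helper (not HyperShift's actual function).
// It rewrites original to point at mirror when source is a prefix of original
// that ends on a reference boundary. A bare registry host such as
// "registry.redhat.io" therefore matches every repository beneath it.
func composeMirrorRef(original, source, mirror string) (string, bool) {
	source = strings.TrimSuffix(source, "/")
	mirror = strings.TrimSuffix(mirror, "/")
	if source == "" || !strings.HasPrefix(original, source) {
		return "", false
	}
	rest := original[len(source):]
	switch {
	// Exact match, path boundary, or digest separator.
	case rest == "", strings.HasPrefix(rest, "/"), strings.HasPrefix(rest, "@"):
		return mirror + rest, true
	// A ":" is treated as a tag separator only when source already names a
	// repository; after a bare host it would be a port, so it does not match.
	case strings.HasPrefix(rest, ":") && strings.Contains(source, "/"):
		return mirror + rest, true
	}
	return "", false
}

func main() {
	composed, ok := composeMirrorRef(
		"registry.redhat.io/redhat/community-operator-index:v4.19",
		"registry.redhat.io",
		"brew.registry.redhat.io",
	)
	// prints: brew.registry.redhat.io/redhat/community-operator-index:v4.19 true
	fmt.Println(composed, ok)
}
```

With the values from the log, the sketch produces the same composed reference that the control plane operator then fails to query on the Brew registry.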

      Expected results:

          Hosted cluster starts successfully

      Additional info:

      I did some additional verification.

      Get the image digest:

      ᐅ skopeo inspect --authfile=/home/mgencur/hypershift/mgencur-pullsecret/pullsecret.json docker://registry.redhat.io/redhat/community-operator-index:v4.19 | head
      {
          "Name": "registry.redhat.io/redhat/community-operator-index",
          "Digest": "sha256:9112c6e76f8bdc1ad76c0c24c8f64425bbde9a2b341a97067d9b553252ee1386",
          "RepoTags": [
              "latest",
              "latest_old",
              "v4.10",
              "v4.10_tmp",
              "v4.11",
              "v4.11-1702914346",
      

      Pull it from Brew registry:

      ᐅ podman pull --authfile=/home/mgencur/repos/ocf-ci-ops/keys/brew-registry-redhat-io.auth.json brew.registry.redhat.io/redhat/community-operator-index@sha256:9112c6e76f8bdc1ad76c0c24c8f64425bbde9a2b341a97067d9b553252ee1386
      Trying to pull brew.registry.redhat.io/redhat/community-operator-index@sha256:9112c6e76f8bdc1ad76c0c24c8f64425bbde9a2b341a97067d9b553252ee1386...
      Error: initializing source docker://brew.registry.redhat.io/redhat/community-operator-index@sha256:9112c6e76f8bdc1ad76c0c24c8f64425bbde9a2b341a97067d9b553252ee1386: reading manifest sha256:9112c6e76f8bdc1ad76c0c24c8f64425bbde9a2b341a97067d9b553252ee1386 in brew.registry.redhat.io/redhat/community-operator-index: name unknown: Digest not found

      The image is not in the Brew registry, which serves as the mirror.

      The OpenShift docs for ImageContentSourcePolicy state:
      "The order of mirrors in this list is treated as the user's desired priority, while source is by default considered lower priority than all mirrors."

      So the mirrors should take precedence, and if they do not contain the desired image, the original source should be used.

      However, HyperShift fails immediately if the image is not in the mirror: https://github.com/openshift/hypershift/blob/main/support/catalogs/images.go#L104
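The expected ordering (mirrors first, original source last) could look roughly like the Go sketch below. It is an illustration only, not the code in support/catalogs/images.go: `resolveWithFallback`, `digestLookup`, and `fakeRegistry` are hypothetical names, and the sketch assumes that any lookup error, including an authorization error, should move on to the next candidate. The actual fix may treat error types differently.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// digestLookup stands in for a registry query such as a manifest HEAD request.
// It is a hypothetical abstraction used only for this sketch.
type digestLookup func(ref string) (string, error)

// resolveWithFallback tries the mirrored references in their listed order and
// falls back to the original reference last. It returns the first reference
// that resolves together with its digest.
func resolveWithFallback(original string, mirrored []string, lookup digestLookup) (string, string, error) {
	candidates := append(append([]string{}, mirrored...), original)
	var failures []string
	for _, c := range candidates {
		digest, err := lookup(c)
		if err == nil {
			return c, digest, nil
		}
		// Assumption: every error is non-fatal while candidates remain.
		failures = append(failures, fmt.Sprintf("%s: %v", c, err))
	}
	return "", "", fmt.Errorf("no candidate resolved: %s", strings.Join(failures, "; "))
}

// fakeRegistry simulates the situation in this report: only the original
// source has the image, and every other reference is rejected.
func fakeRegistry(ref string) (string, error) {
	if ref == "registry.redhat.io/redhat/community-operator-index:v4.19" {
		return "sha256:9112c6e76f8bdc1ad76c0c24c8f64425bbde9a2b341a97067d9b553252ee1386", nil
	}
	return "", errors.New("unauthorized")
}

func main() {
	ref, digest, err := resolveWithFallback(
		"registry.redhat.io/redhat/community-operator-index:v4.19",
		[]string{"brew.registry.redhat.io/redhat/community-operator-index:v4.19"},
		fakeRegistry,
	)
	fmt.Println(ref, digest, err)
}
```

In this simulated run the Brew mirror is tried first and rejected, and the original registry.redhat.io reference is used instead of failing the reconcile.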

              jparrill@redhat.com Juan Manuel Parrilla Madrid
              openshift-crt-jira-prow OpenShift Prow Bot
              Martin Gencur