OpenShift Bugs / OCPBUGS-36596

IDMS with multiple OCP references is not properly processed by the ignition server


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Undefined
    • Affects Version/s: 4.14, 4.15
    • Component/s: HyperShift
    • Quality / Stability / Reliability
    • Severity: Moderate

      Description of problem:

      In our local lab, we mirror multiple OCP versions into different namespaces. Once a hub is installed, we want to deploy hosted clusters with the same version as the hub, or any other version supported by HyperShift.
      
      The HCP is deployed properly, but the nodes fail because the ignition data is not generated correctly.
      
      The issue does not happen if all the OCP releases are stored in the same namespace in the local registry.

      Version-Release number of selected component (if applicable):

      OCP version: 4.14
      HC Provider: kubevirt

      How reproducible:

      Always

      Steps to Reproduce:

      1. Deploy an ACM hub.
      2. Mirror other OCP releases into separate orgs in the local registry, e.g. <registry>/ocp/ocp-4.30, <registry>/ocp/ocp-4.14.20 (see the mirroring sketch after this list).
      3. Apply the generated IDMS to the hub cluster.
      4. Deploy HC clusters using the additionally mirrored releases.
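
      As a reference, step 2 can be done per release with oc adm release mirror, pointing each release at its own repo. This is only a sketch of the mirroring flow; the registry host follows the report, and the exact pull spec and tag are illustrative:

      # Mirror one release into its own version-specific repo; repeat per release,
      # changing --to/--to-release-image, then merge the resulting mirror entries
      # into the single IDMS that is applied to the hub in step 3.
      $ oc adm release mirror \
          --from=quay.io/openshift-release-dev/ocp-release:4.14.31-x86_64 \
          --to=registry.local.lab:4443/ocp-4.14/4.14.31 \
          --to-release-image=registry.local.lab:4443/ocp-4.14/4.14.31:4.14.31-x86_64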

       

      Actual results:

      The control plane completes the installation, but the NodePool instances get stuck because the ignition server is unable to get the proper OCP image digest.

       

      Expected results:

      The control plane and the NodePool instances should deploy successfully.

      Additional info:

      # Hub cluster version is 4.14.0-0.nightly-2024-07-01-185610
      
      $ oc get idms image-digest-mirror -o yaml
      apiVersion: config.openshift.io/v1
      kind: ImageDigestMirrorSet
      metadata:
        creationTimestamp: "2024-07-03T05:27:46Z"
        generation: 3
        name: image-digest-mirror
        resourceVersion: "1159967"
        uid: e4aea507-78da-44ae-8632-48f244f09f8c
      spec:
        imageDigestMirrors:
        - mirrors:
          - registry.local.lab:4443/ocp-4.14/4.14.0-0.nightly-2024-07-01-185610
          - registry.local.lab:4443/ocp-4.14/4.14.31
          source: quay.io/openshift-release-dev/ocp-release
        - mirrors:
          - registry.local.lab:4443/ocp-4.14/4.14.0-0.nightly-2024-07-01-185610
          - registry.local.lab:4443/ocp-4.14/4.14.31    
          source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
      
      # Try to create a HC cluster using 4.14.31
      $ hcp create cluster kubevirt --name=7cl6ajews8 --pull-secret=/tmp/hps_JxTYoj1x.json --namespace=clusters --cluster-cidr=10.132.0.0/14 --memory="8Gi" --node-pool-replicas=2 --release-image=registry.local.lab:4443/ocp-4.14/4.14.31@sha256:e4424eeec8a386241a5348d556bdd6dd82ea68f4f19f30f71d18963fb5924e9e --image-content-sources /tmp/hcp_tmp0sbpwl51/hcp_ics --additional-trust-bundle /tmp/hcp_tmp0sbpwl51/hcp_ca_bundle.pem --annotations hypershift.openshift.io/control-plane-operator-image=registry.local.lab:4443/ocp-4.14/4.14.31@sha256:5bd7cded05470d84bc61dc0f7cbcc83015bc5cc9c7c6ce39bc544f9ba32192b5  --ssh-key /tmp/hcp_tmp0sbpwl51/hcp_pub_key.pem
      
      # Control plane is complete
      $ oc get hc -A
      NAMESPACE   NAME         VERSION   KUBECONFIG                    PROGRESS   AVAILABLE   PROGRESSING   MESSAGE
      clusters    7cl6ajews8             7cl6ajews8-admin-kubeconfig   Partial    True        False         The hosted control plane is available
      
      # Generated OPENSHIFT_IMG_OVERRIDES for the ignition-server
      $ oc exec -t ignition-server-57dc8ff4dd-hrff9 -c ignition-server -- env
      <snip>
      OPENSHIFT_IMG_OVERRIDES=quay.io/openshift-release-dev/ocp-release=registry.local.lab:4443/ocp-4.14/4.14.0-0.nightly-2024-07-01-185610,quay.io/openshift-release-dev/ocp-release=registry.local.lab:4443/ocp-4.14/4.14.31,quay.io/openshift-release-dev/ocp-v4.0-art-dev=registry.local.lab:4443/ocp-4.14/4.14.0-0.nightly-2024-07-01-185610,quay.io/openshift-release-dev/ocp-v4.0-art-dev=registry.local.lab:4443/ocp-4.14/4.14.31,
      <snip>
      
      # Ignition server logs: unable to get the release image because it only looks at one of the mirrors listed in the IDMS
      $ oc logs ignition-server-57dc8ff4dd-hrff9
      <snip>
      {"level":"info","ts":"2024-07-04T03:05:18Z","msg":"Reconciling","controller":"secret","controllerGroup":"","controllerKind":"Secret","Secret":{"name":"token-7cl6ajews8-4cd8fcbe","namespace":"clusters-7cl6ajews8"},"namespace":"clusters-7cl6ajews8","name":"token-7cl6ajews8-4cd8fcbe","reconcileID":"a08e002d-00b7-402e-ab9e-82c2d46b8529"}
      {"level":"error","ts":"2024-07-04T03:05:18Z","msg":"Reconciler error","controller":"secret","controllerGroup":"","controllerKind":"Secret","Secret":{"name":"token-7cl6ajews8-4cd8fcbe","namespace":"clusters-7cl6ajews8"},"namespace":"clusters-7cl6ajews8","name":"token-7cl6ajews8-4cd8fcbe","reconcileID":"a08e002d-00b7-402e-ab9e-82c2d46b8529","error":"error getting ignition payload: failed to determine if image is manifest listed: failed to retrieve manifest registry.local.lab:4443/ocp-4.14/4.14.0-0.nightly-2024-07-01-185610@sha256:eb0b2484d1868eca54ea7f489d90d1d4275a1ea0770411cd1e2319e91a5353f2: manifest unknown: manifest unknown","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:326\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234"}
      2024/07/04 03:05:20 User Agent: Ignition/2.16.2. Requested: /ignition
      2024/07/04 03:05:20 Token not found
      {"level":"debug","ts":"2024-07-04T03:05:20Z","logger":"events","msg":"Token not found in cache","type":"Warning","object":{"kind":"Secret","namespace":"clusters-7cl6ajews8","name":"token-7cl6ajews8-4cd8fcbe","apiVersion":"v1"},"reason":"GetPayloadFailed"}
      2024/07/04 03:05:25 User Agent: Ignition/2.16.2. Requested: /ignition
      2024/07/04 03:05:25 Token not found
      <snip>
      
      # The VMs do not get IP addresses because the ignition config was not generated.
      $ oc get vmi
      NAME                        AGE     PHASE     IP    NODENAME   READY
      7cl6ajews8-09e45d40-gcl9v   8m23s   Running         worker-2   True
      7cl6ajews8-09e45d40-zldpc   8m23s   Running         worker-1   True

      # VM console outputs
      $ virtctl console 7cl6ajews8-09e45d40-gcl9v 
      Successfully connected to 7cl6ajews8-09e45d40-gcl9v console. The escape sequence is ^]
      [  507.932306] ignition[840]: GET https://ignition-server-clusters-7cl6ajews8.apps.cluster21.local.lab/ignition: attempt #105
      [  507.948927] ignition[840]: GET result: Network Authentication Required
      <snip>
      
      

      Maybe the ignition server should go through every mirror entry with the same <source> until it finds the image with the required digest, as happens with the entries in the /etc/containers/registries.conf file.

      quay.io/openshift-release-dev/ocp-release=registry.local.lab:4443/ocp-4.14/4.14.0-0.nightly-2024-07-01-185610,quay.io/openshift-release-dev/ocp-release=registry.local.lab:4443/ocp-4.14/4.14.31,quay.io/openshift-release-dev/ocp-v4.0-art-dev=registry.local.lab:4443/ocp-4.14/4.14.0-0.nightly-2024-07-01-185610,quay.io/openshift-release-dev/ocp-v4.0-art-dev=registry.local.lab:4443/ocp-4.14/4.14.31
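
      For illustration only (this is not how the ignition server is implemented today), the requested fallback is what a manual probe of the mirrors would do: try each mirror listed for the source, in order, until one serves the requested digest. The digest and mirror repos below are taken from the report above; which mirror actually holds the manifest depends on which repo the release was mirrored into:

      # Probe each mirror configured for quay.io/openshift-release-dev/ocp-release
      # until one of them serves the manifest for the requested digest.
      # (Add --authfile/--tls-verify options as needed for the local registry.)
      $ DIGEST=sha256:eb0b2484d1868eca54ea7f489d90d1d4275a1ea0770411cd1e2319e91a5353f2
      $ for mirror in \
            registry.local.lab:4443/ocp-4.14/4.14.0-0.nightly-2024-07-01-185610 \
            registry.local.lab:4443/ocp-4.14/4.14.31; do
          if skopeo inspect --raw "docker://${mirror}@${DIGEST}" >/dev/null 2>&1; then
            echo "found ${DIGEST} in ${mirror}"; break
          fi
          echo "${DIGEST} not found in ${mirror}, trying next mirror"
        done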

      The deployment works if we have an IDMS like the following and we mirror all the releases into ocp-4.14/4.14.30:

      $ oc get idms  image-digest-mirror  -o yaml
      apiVersion: config.openshift.io/v1
      kind: ImageDigestMirrorSet
      spec:
        imageDigestMirrors:
        - mirrors:
          - registry.local.lab:4443/ocp-4.14/4.14.30
          source: quay.io/openshift-release-dev/ocp-release
        - mirrors:
          - registry.local.lab:4443/ocp-4.14/4.14.30
          source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
      
      $ oc get clusterversion
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.14.30   True        False         11h     Cluster version is 4.14.30
      $ oc get hc -A
      NAMESPACE   NAME         VERSION   KUBECONFIG                    PROGRESS    AVAILABLE   PROGRESSING   MESSAGE
      clusters    79yd1ahcss   4.15.20   79yd1ahcss-admin-kubeconfig   Completed   True        False         The hosted control plane is available
      clusters    gxtjxkdt9r   4.14.31   gxtjxkdt9r-admin-kubeconfig   Completed   True        False         The hosted control plane is available
      clusters    m906079yja   4.14.25   m906079yja-admin-kubeconfig   Completed   True        False         The hosted control plane is available
      clusters    wlfc3u4u03   4.14.30   wlfc3u4u03-admin-kubeconfig   Completed   True        False         The hosted control plane is available
      $ oc get vmi -A
      NAMESPACE             NAME                        AGE     PHASE     IP             NODENAME   READY
      clusters-79yd1ahcss   79yd1ahcss-732e09b4-2fmkx   124m    Running   10.128.2.168   worker-0   True
      clusters-79yd1ahcss   79yd1ahcss-732e09b4-p8fnn   124m    Running   10.129.2.188   worker-1   True
      clusters-gxtjxkdt9r   gxtjxkdt9r-10b319b0-h96gw   3h56m   Running   10.131.0.162   worker-2   True
      clusters-gxtjxkdt9r   gxtjxkdt9r-10b319b0-w587x   3h56m   Running   10.129.2.86    worker-1   True
      clusters-m906079yja   m906079yja-2739d400-ctkbm   3h30m   Running   10.131.1.1     worker-2   True
      clusters-m906079yja   m906079yja-2739d400-g9jl8   3h30m   Running   10.129.2.129   worker-1   True
      clusters-wlfc3u4u03   wlfc3u4u03-3631152e-p5tmr   3h31m   Running   10.129.2.127   worker-1   True
      clusters-wlfc3u4u03   wlfc3u4u03-3631152e-xlx2q   3h31m   Running   10.131.0.253   worker-2   True

      Thanks

              Assignee: Unassigned
              Reporter: Jose Alberto Rodriguez (josearod@redhat.com)
              QA Contact: Jie Zhao