OCPBUGS-48092

CGU completes before MNO cluster is ready, so no policies are applied to the spoke when ZTP is done

    • Bug
    • Resolution: Not a Bug
    • 4.17, 4.18
    • TALM Operator
    • Quality / Stability / Reliability
    • Moderate

      Description of problem:

      During ZTP, the CGU sometimes reaches the completed state before the spoke cluster is ready, so no policies are applied to the spoke.

      Version-Release number of selected component (if applicable):

      Hub OCP: 4.18 rc3
      ACM: 2.12.2
      MCE: 2.7.3
      Spoke OCP: 4.18 rc3
      TALM: 4.18/4.17

      How reproducible:

      Intermittent. So far seen only a couple of times, on an MNO spoke cluster.

      Steps to Reproduce:

          1. An MNO cluster was previously installed via ZTP, with all nodes up.
          2. Clean up resources on the hub via cascade deletion of the argocd apps (spoke namespace, managedcluster, policies, etc.).
          3. Trigger a new ZTP deployment by pushing content to git and deploying/configuring the argocd apps.
          4. Wait for ZTP to complete.
          

      Actual results:

          The OCP payload is installed, but no policies are applied to the spoke. The CGU shows as completed before the cluster installation started.

      Expected results:

          The CGU should not start before the spoke is ready.

      Additional info:

      The CGU started at 19:01:23Z and completed at 19:02:04Z, about 30 minutes before the cluster installation started at 19:32:31Z.
      
          history:
          - completionTime: "2025-01-06T20:02:15Z"
            image: registry.hlxcl11.lab.eng.tlv2.redhat.com:5000/openshift-release-dev/ocp-release@sha256:668c92b06279cb5c7a2a692860b297eeb9013af10d49d2095f2c3fe9ad02baaa
            startedTime: "2025-01-06T19:32:31Z"
      
      
      [kni@registry ~]$ oc get cgu -n ztp-install kni-qe-26 -o yaml
      apiVersion: ran.openshift.io/v1alpha1
      kind: ClusterGroupUpgrade
      ...
      spec:
        actions:
          afterCompletion:
            addClusterLabels:
              ztp-done: ""
            deleteObjects: true
            removeClusterLabels:
            - ztp-running
          beforeEnable:
            addClusterLabels:
              ztp-running: ""
        backup: false
        clusters:
        - kni-qe-26
        enable: true
        managedPolicies:
        - common-config-policy
        - common-subscriptions-policy
        - group-du-standard-config-policy
        - group-du-standard-site-config-policy
        - group-du-standard-validator-du-policy
        preCaching: false
        preCachingConfigRef: {}
        remediationStrategy:
          maxConcurrency: 1
          timeout: 240
      status:
      ...
        status:
          completedAt: "2025-01-06T19:02:04Z"
          startedAt: "2025-01-06T19:01:23Z"    
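
The ordering problem can be confirmed by comparing the timestamps quoted above. The following is a minimal sketch, not part of TALM. The helper names `parse_utc` and `cgu_finished_before_install` are hypothetical, and it assumes the `history` entry describes the spoke installation, as the report implies.

```python
from datetime import datetime, timedelta


def parse_utc(ts: str) -> datetime:
    """Parse a timestamp such as "2025-01-06T19:02:04Z" (always UTC here)."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def cgu_finished_before_install(cgu_completed_at: str, install_started_at: str) -> bool:
    """Return True when the CGU completed before the installation began."""
    return parse_utc(cgu_completed_at) < parse_utc(install_started_at)


# Values copied from the CGU status and the history entry in this report.
cgu_started = "2025-01-06T19:01:23Z"
cgu_completed = "2025-01-06T19:02:04Z"
install_started = "2025-01-06T19:32:31Z"

# How long the CGU ran, and how long before the install it finished.
cgu_duration: timedelta = parse_utc(cgu_completed) - parse_utc(cgu_started)
gap: timedelta = parse_utc(install_started) - parse_utc(cgu_completed)

print("CGU finished before install:", cgu_finished_before_install(cgu_completed, install_started))
print("CGU duration (s):", int(cgu_duration.total_seconds()))
print("Gap before install start (s):", int(gap.total_seconds()))
```

With the values from this report, the CGU ran for 41 seconds and finished 1827 seconds (30 minutes 27 seconds) before the installation started.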

              jche@redhat.com Jun Chen
              rhn-support-yliu1 Yang Liu
              Dwaine Gonyier