  1. OpenShift Bugs
  2. OCPBUGS-9984

cloud-controller-manager operator degraded after upgrade on nutanix

    • Quality / Stability / Reliability
    • Critical
    • Approved

      Description of problem:

      During the upgrade, the cloud-controller-manager cluster operator went into a Degraded state with the message "Failed to resync".

      Version-Release number of selected component (if applicable):

      4.13.0-0.nightly-2023-03-08-014115

      How reproducible:

      Reproduced in 1 of 1 attempts.

      Steps to Reproduce:

      1. Install an OVN-Kubernetes cluster on Nutanix.
      install-config:
      
      ---
       apiVersion: v1
       controlPlane:
         architecture: amd64
         hyperthreading: Enabled
         name: master
         platform: {}
         replicas: 3
       compute:
       - architecture: amd64
         hyperthreading: Enabled
         name: worker
         platform: {}
         replicas: 2
       metadata:
         name: skordas-39
       platform:
         nutanix:
           apiVIP: 10.0.132.9
           ingressVIP: 10.0.132.10
           subnetUUIDs:
           - efe26e93-f6cf-4d89-8104-009e85201fa8
           prismCentral:
             username: sgao
             password: HIDDEN
             endpoint:
               address: prismcentral.lts-cluster.nutanix-dev.devcluster.openshift.com
               port: 9440
           prismElements:
           - uuid: 0005d9a4-8e4f-7c33-58d1-e9d0e2d48853
             endpoint:
               address: 10.0.128.159
               port: 9440
       pullSecret: HIDDEN
       networking:
         clusterNetwork:
         - cidr: 10.128.0.0/14
           hostPrefix: 23
         serviceNetwork:
         - 172.30.0.0/16
         machineNetwork:
         - cidr: 10.0.0.0/16
         networkType: OVNKubernetes
       publish: External
       credentialsMode: Manual
       fips: true
       baseDomain: qe.devcluster.openshift.com
       sshKey: HIDDEN
      
      Jenkins job: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/184353/ 
      
      2. Scale the cluster up to 40 worker nodes.
      Jenkins job: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/scale-ci/job/e2e-benchmarking-multibranch-pipeline/job/cluster-workers-scaling/2271/ 
      
      3. Load cluster with 160 (4 * 40 nodes) projects.
      Jenkins job: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/scale-ci/job/e2e-benchmarking-multibranch-pipeline/job/kube-burner/2182/ 
      
      4. Upgrade OCP to version 4.12.6.
      5. Upgrade OCP to 4.13.0-0.nightly-2023-03-08-014115

      Actual results:

      The cloud-controller-manager cluster operator is in a Degraded state:
      
      Message:               Failed to resync for operator: 4.13.0-0.nightly-2023-03-08-014115 because &{%!e(string=failed to apply resources because CloudConfigControllerDegraded condition is set to True)}
      Reason:                SyncingFailed
      Status:                True
      Type:                  Degraded

              mimccune@redhat.com Michael McCune
              skordas Simon Kordas (Inactive)
              Zhaohua Sun Zhaohua Sun