OpenShift Bugs / OCPBUGS-32023

[4.13] - Revision controllers spinning 10+ revisions

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Normal
    • None
    • 4.13
    • Node / Kubelet
    • Important
    • No
    • False

      Description of problem:

      
      The kube-controller-manager operator (KCM-O) is failing in our CI runs with the error below:
      
        - lastTransitionTime: "2024-04-08T20:46:54Z"
          message: "MissingStaticPodControllerDegraded: static pod lifecycle failure - static
            pod: \"kube-controller-manager\" in namespace: \"openshift-kube-controller-manager\"
            for revision: 12 on node: \"ci-op-g6x1qf87-6198f-7ktkb-master-1\" didn't show
            up, waited: 3m0s\nStaticPodsDegraded: pod/kube-controller-manager-ci-op-g6x1qf87-6198f-7ktkb-master-1
            container \"cluster-policy-controller\" is terminated: Completed: \nStaticPodsDegraded:
            pod/kube-controller-manager-ci-op-g6x1qf87-6198f-7ktkb-master-1 container \"kube-controller-manager\"
            is terminated: Completed: \nStaticPodsDegraded: pod/kube-controller-manager-ci-op-g6x1qf87-6198f-7ktkb-master-1
            container \"kube-controller-manager-cert-syncer\" is terminated: Error: 2168
            \      1 certsync_controller.go:66] Syncing configmaps: [{aggregator-client-ca
            false} {client-ca false} {trusted-ca-bundle true}]\nStaticPodsDegraded: I0408
            20:43:54.572531       1 certsync_controller.go:170] Syncing secrets: [{kube-controller-manager-client-cert-key
            false} {csr-signer false}]\nStaticPodsDegraded: I0408 20:43:55.567744       1
            certsync_controller.go:66] Syncing configmaps: [{aggregator-client-ca false}
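
The condition message above packs several sub-conditions into one string, separated by `\n`. The following is a minimal triage sketch, not part of the operator or any OpenShift tooling. It groups the lines of such a message by their leading `...Degraded:` prefix. The function name `group_degraded_message` is hypothetical, and the assumption that every line starts with a `<Name>Degraded: ` prefix is inferred only from the excerpt in this report. The sample string is a shortened, hand-unescaped copy of that excerpt.

```python
from collections import defaultdict
from typing import Dict, List


def group_degraded_message(message: str) -> Dict[str, List[str]]:
    """Group the lines of a condition message by their '<Name>Degraded' prefix.

    Assumption (from the excerpt in this report only): each line of the
    message looks like '<Name>Degraded: <text>'. Lines that do not match
    are collected under '(unprefixed)'.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for line in message.split("\n"):
        if not line.strip():
            continue  # skip empty lines
        prefix, sep, rest = line.partition(": ")
        if sep and prefix.endswith("Degraded"):
            groups[prefix].append(rest)
        else:
            groups["(unprefixed)"].append(line)
    return dict(groups)


# Shortened sample based on the condition message quoted in this report.
sample = (
    "MissingStaticPodControllerDegraded: static pod lifecycle failure - static "
    'pod: "kube-controller-manager" in namespace: "openshift-kube-controller-manager" '
    'for revision: 12 on node: "ci-op-g6x1qf87-6198f-7ktkb-master-1" didn\'t show '
    "up, waited: 3m0s\n"
    "StaticPodsDegraded: pod/kube-controller-manager-ci-op-g6x1qf87-6198f-7ktkb-master-1 "
    'container "cluster-policy-controller" is terminated: Completed: \n'
    "StaticPodsDegraded: pod/kube-controller-manager-ci-op-g6x1qf87-6198f-7ktkb-master-1 "
    'container "kube-controller-manager" is terminated: Completed: '
)

grouped = group_degraded_message(sample)
for reason, entries in grouped.items():
    print(reason, len(entries))
```

Grouping this way separates the single `MissingStaticPodControllerDegraded` entry (the revision 12 pod that did not show up) from the per-container `StaticPodsDegraded` entries, which makes the long message easier to scan.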
      
          

      Version-Release number of selected component (if applicable):

           4.13.0-0.nightly-2024-04-04-020752
      
          

      How reproducible:

           Seen twice so far
          

      Steps to Reproduce:

          1. No manual steps: the failure was observed in CI runs.
          

      Actual results:

      
        - lastTransitionTime: "2024-04-08T20:46:54Z"
          message: "MissingStaticPodControllerDegraded: static pod lifecycle failure - static
            pod: \"kube-controller-manager\" in namespace: \"openshift-kube-controller-manager\"
            for revision: 12 on node: \"ci-op-g6x1qf87-6198f-7ktkb-master-1\" didn't show
            up, waited: 3m0s\nStaticPodsDegraded: pod/kube-controller-manager-ci-op-g6x1qf87-6198f-7ktkb-master-1
            container \"cluster-policy-controller\" is terminated: Completed: \nStaticPodsDegraded:
            pod/kube-controller-manager-ci-op-g6x1qf87-6198f-7ktkb-master-1 container \"kube-controller-manager\"
            is terminated: Completed: \nStaticPodsDegraded: pod/kube-controller-manager-ci-op-g6x1qf87-6198f-7ktkb-master-1
            container \"kube-controller-manager-cert-syncer\" is terminated: Error: 2168
            \      1 certsync_controller.go:66] Syncing configmaps: [{aggregator-client-ca
            false} {client-ca false} {trusted-ca-bundle true}]\nStaticPodsDegraded: I0408
            20:43:54.572531       1 certsync_controller.go:170] Syncing secrets: [{kube-controller-manager-client-cert-key
            false} {csr-signer false}]\nStaticPodsDegraded: I0408 20:43:55.567744       1
            certsync_controller.go:66] Syncing configmaps: [{aggregator-client-ca false}
          

      Expected results:

      
           KCM-O should run without becoming Degraded
          

      Additional info:

      KCM-O log link: https://gcsweb-qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/qe-private-deck/logs/periodic-ci-openshift-openshift-tests-private-release-4.13-amd64-nightly-gcp-ipi-oidc-rt-fips-f28-destructive/1777139617133236224/artifacts/gcp-ipi-oidc-rt-fips-f28-destructive/gather-extra/artifacts/pods/openshift-kube-controller-manager-operator_kube-controller-manager-operator-68788bf76f-qgx2j_kube-controller-manager-operator.log
      
      Must-gather.log link: https://gcsweb-qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/qe-private-deck/logs/periodic-ci-openshift-openshift-tests-private-release-4.13-amd64-nightly-gcp-ipi-oidc-rt-fips-f28-destructive/1777139617133236224/artifacts/gcp-ipi-oidc-rt-fips-f28-destructive/gather-must-gather/artifacts/must-gather.tar
      
      I wonder whether this is similar to the issue addressed in https://github.com/openshift/cluster-kube-controller-manager-operator/pull/797
          

            aos-node@redhat.com Node Team Bot Account
            knarra@redhat.com Rama Kasturi Narra
            ying zhou ying zhou