OpenShift Bugs / OCPBUGS-30195

Failed revision-pruner pods after OCP 4.16.0-ec.3 deployment on Power


Details

    • ppc64le

    Description

      Description of problem:

      Multiple revision-pruner pods are in the Error state after deploying an OCP 4.16.0-ec.3 cluster on the Power (ppc64le) architecture.
      
      # oc get pods -A | grep -v "Completed\|Running"
      NAMESPACE                                          NAME                                                              READY   STATUS      RESTARTS        AGE
      openshift-kube-controller-manager                  revision-pruner-27-master-0                                       0/1     Error       0               3d3h
      openshift-kube-controller-manager                  revision-pruner-27-master-2                                       0/1     Error       0               3d3h
      openshift-kube-controller-manager                  revision-pruner-28-master-2                                       0/1     Error       0               3d3h
      openshift-kube-scheduler                           revision-pruner-11-master-0                                       0/1     Error       0               3d3h
      openshift-kube-scheduler                           revision-pruner-11-master-1                                       0/1     Error       0               3d3h
      openshift-kube-scheduler                           revision-pruner-11-master-2                                       0/1     Error       0               3d3h
      openshift-kube-scheduler                           revision-pruner-12-master-0                                       0/1     Error       0               3d3h
      openshift-kube-scheduler                           revision-pruner-12-master-1                                       0/1     Error       0               3d3h
      openshift-kube-scheduler                           revision-pruner-12-master-2                                       0/1     Error       0               3d3h
      openshift-kube-scheduler                           revision-pruner-13-master-0                                       0/1     Error       0               3d3h
      openshift-kube-scheduler                           revision-pruner-13-master-1                                       0/1     Error       0               3d3h
      openshift-kube-scheduler                           revision-pruner-13-master-2                                       0/1     Error       0               3d3h
      openshift-kube-scheduler                           revision-pruner-14-master-2                                       0/1     Error       0               3d3h
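To make the listing above easier to compare across deployments, the failed pods can be counted per namespace. The following is a minimal sketch, not part of the original report: the function name `count_failed_pruners` is hypothetical, and it assumes the default column layout of `oc get pods -A` (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: count revision-pruner pods in the Error state per namespace.
# Reads `oc get pods -A` output on stdin; column 1 is NAMESPACE, column 2 is NAME,
# column 4 is STATUS. Header and unrelated lines are ignored because they do not
# match the name pattern.
count_failed_pruners() {
  awk '$2 ~ /^revision-pruner-/ && $4 == "Error" { n[$1]++ }
       END { for (ns in n) print ns, n[ns] }' | sort
}

# Usage against a live cluster (assumes `oc` is logged in):
#   oc get pods -A | count_failed_pruners
```

Applied to the listing in this report, this would give 3 for openshift-kube-controller-manager and 10 for openshift-kube-scheduler.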

      Version-Release number of selected component (if applicable):

      # oc version
      Client Version: 4.16.0-ec.3
      Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
      Server Version: 4.16.0-ec.3
      Kubernetes Version: v1.29.1+edc2c12

      How reproducible:

      Always

      Steps to Reproduce:

      1. Deploy OCP 4.16.0-ec.3
      2. Check the pods in the openshift-kube-controller-manager and openshift-kube-scheduler namespaces

      Actual results:

      Multiple revision-pruner pods are in the Error state.

      Expected results:

      All revision-pruner pods should be in the Running or Completed state.

      Additional info:

      Pod logs (container termination state of a failed pod):

      
          State:      Terminated
            Reason:   Error
            Message:  I0301 06:23:18.545953       1 cmd.go:41] &{<nil> true {false} prune true map[cert-dir:0xc00088ae60 max-eligible-revision:0xc00088abe0 protected-revisions:0xc00088ac80 resource-dir:0xc00088ad20 static-pod-name:0xc00088adc0 v:0xc00088b540] [0xc00088b540 0xc00088abe0 0xc00088ac80 0xc00088ad20 0xc00088ae60 0xc00088adc0] [] map[cert-dir:0xc00088ae60 help:0xc00088b900 log-flush-frequency:0xc00088b4a0 max-eligible-revision:0xc00088abe0 protected-revisions:0xc00088ac80 resource-dir:0xc00088ad20 static-pod-name:0xc00088adc0 v:0xc00088b540 vmodule:0xc00088b5e0] [0xc00088abe0 0xc00088ac80 0xc00088ad20 0xc00088adc0 0xc00088ae60 0xc00088b4a0 0xc00088b540 0xc00088b5e0 0xc00088b900] [0xc00088ae60 0xc00088b900 0xc00088b4a0 0xc00088abe0 0xc00088ac80 0xc00088ad20 0xc00088adc0 0xc00088b540 0xc00088b5e0] map[104:0xc00088b900 118:0xc00088b540] [] -1 0 0xc00088fd10 true 0x11dca510 []}
      I0301 06:23:18.546140       1 cmd.go:42] (*prune.PruneOptions)(0xc0006c3f40)({
       MaxEligibleRevision: (int) 12,
       ProtectedRevisions: ([]int) (len=6 cap=6) {
        (int) 7,
        (int) 8,
        (int) 9,
        (int) 10,
        (int) 11,
        (int) 12
       },
       ResourceDir: (string) (len=36) "/etc/kubernetes/static-pod-resources",
       CertDir: (string) (len=20) "kube-scheduler-certs",
       StaticPodName: (string) (len=18) "kube-scheduler-pod"
      })
      F0301 06:23:18.546198       1 cmd.go:48] open /etc/kubernetes/static-pod-resources: no such file or directory
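The decisive line in the output above is the fatal (`F`-prefixed) klog entry, which reports that /etc/kubernetes/static-pod-resources could not be opened. As a rough sketch for collecting the same line from the other failed pods, the hypothetical helper `fatal_lines` below keeps only fatal entries and strips the klog header. It assumes the standard klog header format (`Lmmdd hh:mm:ss.uuuuuu threadid file:line] msg`).

```shell
# Hypothetical helper: print only the messages of fatal klog entries read on stdin.
# A fatal entry starts with "F" followed by mmdd and a timestamp; the header ends
# at the first "] ", which sed removes so only the message remains.
fatal_lines() {
  grep -E '^[[:space:]]*F[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]+' |
    sed -E 's/^[^]]*\] //'
}

# Usage against a live cluster (assumes `oc` is logged in and the pod logs are
# still retained):
#   oc logs -n openshift-kube-scheduler revision-pruner-12-master-0 | fatal_lines
```

If every failed pod reports the same message, that would suggest a single cause on the nodes (the resource directory being absent when the pruner ran) rather than separate failures per revision.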
      
      

      Must-gather logs: https://drive.google.com/file/d/1Uk1WEtHaKKBUNsgKD-gzVeYIKeZrbg98/view?usp=sharing

            People

              team-mco Team MCO
              vahirwad Varad Ahirwadkar
              Rama Kasturi Narra Rama Kasturi Narra