OpenShift Bugs / OCPBUGS-34161

kubelet-serving CSRs in Pending state on SNO with Telco DU with disabled capabilities


    • Critical
    • No
    • 1
    • OCPEDGE Sprint 254
    • 1
    • Proposed
    • False
    • Release Note Not Required
    • In Progress

      This is a clone of issue OCPBUGS-33644. The following is the description of the original issue:

      Description of problem:

      After running tests for a couple of hours on an SNO with the Telco DU profile, kubernetes.io/kubelet-serving CSRs in Pending state start showing up and accumulate over time.

      Version-Release number of selected component (if applicable):

      4.16.0-rc.1    

      How reproducible:

      once so far    

      Steps to Reproduce:

      1. Deploy SNO with DU profile with disabled capabilities:

         installConfigOverrides:  "{\"capabilities\":{\"baselineCapabilitySet\": \"None\", \"additionalEnabledCapabilities\": [ \"NodeTuning\", \"ImageRegistry\", \"OperatorLifecycleManager\" ] }}"

      2. Leave the node running tests overnight for a couple of hours

      3. Check for Pending CSRs
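
To make the capability set from step 1 easier to read, here is a minimal Python sketch that decodes the installConfigOverrides value (with the YAML-level backslash escaping removed) and checks whether the Machine API is among the enabled capabilities. The capability name "MachineAPI" and the helper name `enabled_capabilities` are assumptions made for illustration; they do not appear in this report.

```python
import json

# installConfigOverrides value from step 1, with the outer YAML escaping removed.
OVERRIDES = (
    '{"capabilities":{"baselineCapabilitySet": "None", '
    '"additionalEnabledCapabilities": [ "NodeTuning", "ImageRegistry", '
    '"OperatorLifecycleManager" ] }}'
)


def enabled_capabilities(overrides_json):
    """Return (baseline set name, set of additionally enabled capabilities)."""
    caps = json.loads(overrides_json)["capabilities"]
    return caps["baselineCapabilitySet"], set(caps["additionalEnabledCapabilities"])


baseline, extra = enabled_capabilities(OVERRIDES)
print("baseline:", baseline)
print("additional:", sorted(extra))
# Assumption: "MachineAPI" is the capability that provides machine.openshift.io.
print("MachineAPI explicitly enabled:", "MachineAPI" in extra)
```

With the baseline set to None, only the three listed capabilities are explicitly enabled, and the Machine API is not one of them.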
      

      Actual results:

      oc get csr -A | grep Pending | wc -l 
      27    
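
The grep above counts every Pending CSR regardless of signer. As a rough sketch, the same check can be grouped by signer name from the JSON form of the list (for example the output of `oc get csr -o json`). The function name `pending_by_signer` and the sample CSR names are hypothetical, and a CSR is treated as Pending here when it carries neither an Approved nor a Denied condition.

```python
from collections import Counter


def pending_by_signer(csr_list):
    """Count CSRs with no Approved or Denied condition, grouped by signer name."""
    counts = Counter()
    for item in csr_list.get("items", []):
        conditions = (item.get("status") or {}).get("conditions") or []
        decided = any(c.get("type") in ("Approved", "Denied") for c in conditions)
        if not decided:
            counts[item["spec"]["signerName"]] += 1
    return counts


# Illustrative input only: hypothetical CSR names, shaped like a CSR list in JSON.
sample = {
    "items": [
        {"metadata": {"name": "csr-example-1"},
         "spec": {"signerName": "kubernetes.io/kubelet-serving"},
         "status": {}},
        {"metadata": {"name": "csr-example-2"},
         "spec": {"signerName": "kubernetes.io/kubelet-serving"},
         "status": {"conditions": [{"type": "Approved", "status": "True"}]}},
        {"metadata": {"name": "csr-example-3"},
         "spec": {"signerName": "kubernetes.io/kubelet-serving"}},
    ]
}

print(dict(pending_by_signer(sample)))
```

On a live cluster the dictionary would be loaded with `json.load` from the command output instead of the hand-written sample.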

      Expected results:

      No pending CSRs    
      
      Also observed: oc logs fails with a TLS internal error:
      
      oc -n openshift-cluster-machine-approver --insecure-skip-tls-verify-backend=true logs machine-approver-866c94c694-7dwks 
      Defaulted container "kube-rbac-proxy" out of: kube-rbac-proxy, machine-approver-controller
      Error from server: Get "https://[2620:52:0:8e6::d0]:10250/containerLogs/openshift-cluster-machine-approver/machine-approver-866c94c694-7dwks/kube-rbac-proxy": remote error: tls: internal error
      

      Additional info:

      The machine-approver-controller container logs on the node show that reconciliation fails because the controller cannot find the Machine API, which is disabled through the capabilities setting.
      
      I0514 13:25:09.266546       1 controller.go:120] Reconciling CSR: csr-dw9c8
      E0514 13:25:09.275585       1 controller.go:138] csr-dw9c8: Failed to list machines in API group machine.openshift.io/v1beta1: no matches for kind "Machine" in version "machine.openshift.io/v1beta1"
      E0514 13:25:09.275665       1 controller.go:329] "Reconciler error" err="Failed to list machines: no matches for kind \"Machine\" in version \"machine.openshift.io/v1beta1\"" controller="certificatesigningrequest" controllerGroup="certificates.k8s.io" controllerKind="CertificateSigningRequest" CertificateSigningRequest="csr-dw9c8" namespace="" name="csr-dw9c8" reconcileID="6f963337-c6f1-46e7-80c4-90494d21653c"
      I0514 13:25:43.792140       1 controller.go:120] Reconciling CSR: csr-jvrvt
      E0514 13:25:43.798079       1 controller.go:138] csr-jvrvt: Failed to list machines in API group machine.openshift.io/v1beta1: no matches for kind "Machine" in version "machine.openshift.io/v1beta1"
      E0514 13:25:43.798128       1 controller.go:329] "Reconciler error" err="Failed to list machines: no matches for kind \"Machine\" in version \"machine.openshift.io/v1beta1\"" controller="certificatesigningrequest" controllerGroup="certificates.k8s.io" controllerKind="CertificateSigningRequest" CertificateSigningRequest="csr-jvrvt" namespace="" name="csr-jvrvt" reconcileID="decbc5d9-fa10-45d1-92f1-1c999df956ff" 
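
To see which CSRs are affected by this failure, a small Python sketch can pull the CSR names out of the per-CSR error lines shown above. The regular expression is written against the exact message format in this excerpt and is an assumption about its stability across versions; the function name is hypothetical.

```python
import re

# Matches the per-CSR error line from machine-approver-controller shown in the excerpt.
MISSING_MACHINE_KIND = re.compile(
    r'\] (?P<csr>[\w.-]+): Failed to list machines in API group [^:]+: '
    r'no matches for kind "Machine"'
)


def csrs_blocked_by_missing_machine_kind(log_text):
    """Return sorted, de-duplicated CSR names whose reconcile hit the missing Machine kind."""
    names = {m.group("csr") for m in MISSING_MACHINE_KIND.finditer(log_text)}
    return sorted(names)


# Lines copied from the log excerpt in this report.
log_excerpt = '''\
I0514 13:25:09.266546       1 controller.go:120] Reconciling CSR: csr-dw9c8
E0514 13:25:09.275585       1 controller.go:138] csr-dw9c8: Failed to list machines in API group machine.openshift.io/v1beta1: no matches for kind "Machine" in version "machine.openshift.io/v1beta1"
I0514 13:25:43.792140       1 controller.go:120] Reconciling CSR: csr-jvrvt
E0514 13:25:43.798079       1 controller.go:138] csr-jvrvt: Failed to list machines in API group machine.openshift.io/v1beta1: no matches for kind "Machine" in version "machine.openshift.io/v1beta1"
'''

print(csrs_blocked_by_missing_machine_kind(log_excerpt))
```

Comparing this list with the names of the Pending CSRs indicates whether every pending request traces back to the same missing-kind error.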

              bzamalut@redhat.com Bulat Zamalutdinov
              openshift-crt-jira-prow OpenShift Prow Bot
              Milind Yadav Milind Yadav