-
Bug
-
Resolution: Done-Errata
-
Major
-
4.16
-
Critical
-
No
-
1
-
OCPEDGE Sprint 254
-
1
-
Proposed
-
False
-
-
Release Note Not Required
-
In Progress
-
This is a clone of issue OCPBUGS-33644. The following is the description of the original issue:
—
Description of problem:
After running tests on an SNO with the Telco DU profile for a couple of hours, kubernetes.io/kubelet-serving CSRs in Pending state start showing up and accumulate over time.
Version-Release number of selected component (if applicable):
4.16.0-rc.1
How reproducible:
once so far
Steps to Reproduce:
1. Deploy SNO with DU profile with disabled capabilities:
installConfigOverrides: "{\"capabilities\":{\"baselineCapabilitySet\": \"None\", \"additionalEnabledCapabilities\": [ \"NodeTuning\", \"ImageRegistry\", \"OperatorLifecycleManager\" ] }}"
2. Leave the node running tests overnight for a couple of hours
3. Check for Pending CSRs
Actual results:
oc get csr -A | grep Pending | wc -l
27
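The count above includes every Pending CSR regardless of signer. A minimal sketch for narrowing it to the kubelet-serving signer, assuming the default `oc get csr` table layout (signer name in its own column, condition in the last column). The function name `count_pending_kubelet_serving` is hypothetical and not part of `oc`:

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get csr` table output on stdin and prints how
# many kubernetes.io/kubelet-serving CSRs have "Pending" as their last column.
# Assumption: the condition is the last whitespace-separated field on each row.
count_pending_kubelet_serving() {
  awk '
    $NF == "Pending" {
      for (i = 1; i <= NF; i++)
        if ($i == "kubernetes.io/kubelet-serving") { n++; break }
    }
    END { print n + 0 }
  '
}

# Usage against a live cluster (requires oc and a logged-in session):
#   oc get csr --no-headers | count_pending_kubelet_serving
```

Matching on the exact signer column avoids counting other Pending CSRs, such as those for kubernetes.io/kube-apiserver-client-kubelet.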
Expected results:
No pending CSRs
Also oc logs will return a tls internal error:
oc -n openshift-cluster-machine-approver --insecure-skip-tls-verify-backend=true logs machine-approver-866c94c694-7dwks
Defaulted container "kube-rbac-proxy" out of: kube-rbac-proxy, machine-approver-controller
Error from server: Get "https://[2620:52:0:8e6::d0]:10250/containerLogs/openshift-cluster-machine-approver/machine-approver-866c94c694-7dwks/kube-rbac-proxy": remote error: tls: internal error
Additional info:
Checking the machine-approver-controller container logs on the node, we can see that reconciliation is failing because it cannot find the Machine API, which is disabled through the capabilities.
I0514 13:25:09.266546 1 controller.go:120] Reconciling CSR: csr-dw9c8
E0514 13:25:09.275585 1 controller.go:138] csr-dw9c8: Failed to list machines in API group machine.openshift.io/v1beta1: no matches for kind "Machine" in version "machine.openshift.io/v1beta1"
E0514 13:25:09.275665 1 controller.go:329] "Reconciler error" err="Failed to list machines: no matches for kind \"Machine\" in version \"machine.openshift.io/v1beta1\"" controller="certificatesigningrequest" controllerGroup="certificates.k8s.io" controllerKind="CertificateSigningRequest" CertificateSigningRequest="csr-dw9c8" namespace="" name="csr-dw9c8" reconcileID="6f963337-c6f1-46e7-80c4-90494d21653c"
I0514 13:25:43.792140 1 controller.go:120] Reconciling CSR: csr-jvrvt
E0514 13:25:43.798079 1 controller.go:138] csr-jvrvt: Failed to list machines in API group machine.openshift.io/v1beta1: no matches for kind "Machine" in version "machine.openshift.io/v1beta1"
E0514 13:25:43.798128 1 controller.go:329] "Reconciler error" err="Failed to list machines: no matches for kind \"Machine\" in version \"machine.openshift.io/v1beta1\"" controller="certificatesigningrequest" controllerGroup="certificates.k8s.io" controllerKind="CertificateSigningRequest" CertificateSigningRequest="csr-jvrvt" namespace="" name="csr-jvrvt" reconcileID="decbc5d9-fa10-45d1-92f1-1c999df956ff"
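The "no matches for kind" errors indicate that the Machine resource is not served on this cluster. A minimal sketch for confirming that from the CLI, assuming the input is the one-name-per-line output of `oc api-resources -o name`. The function name `machine_api_served` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: reads resource names (one per line, as printed by
# `oc api-resources -o name`) on stdin and reports whether the Machine
# resource of the machine.openshift.io group is served.
machine_api_served() {
  # -x matches the whole line, so machinesets.machine.openshift.io does not count.
  if grep -qx 'machines\.machine\.openshift\.io'; then
    echo "Machine API present"
  else
    echo "Machine API absent"
  fi
}

# Usage against a live cluster (requires oc and a logged-in session):
#   oc api-resources --api-group=machine.openshift.io -o name | machine_api_served
```

On a cluster deployed with the capability set from the reproduction steps, this would be expected to report the API as absent, which is consistent with the reconciler errors in the logs.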
- clones
-
OCPBUGS-33644 kubelet-serving CSRs in Pending state on SNO with Telco DU with disabled capabilities
- Closed
- is blocked by
-
OCPBUGS-33644 kubelet-serving CSRs in Pending state on SNO with Telco DU with disabled capabilities
- Closed
- links to
-
RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update