Type: Bug
Resolution: Unresolved
Priority: Normal
Affects Version: 4.22
Severity: Moderate
Description of problem:
When we configure a TLS security profile in the "cluster" apiserver resource, some operator pods start crash-looping:
$ omc get pods
NAME READY STATUS RESTARTS AGE
etcd-operator-5fb7c9bc-r66js 0/1 CrashLoopBackOff 63 10h
$ omc logs etcd-operator-5fb7c9bc-r66js
2026-02-16T22:15:29.874393249Z I0216 22:15:29.874325 1 cmd.go:253] Using service-serving-cert provided certificates
2026-02-16T22:15:29.874393249Z I0216 22:15:29.874375 1 leaderelection.go:121] The leader election gives 4 retries and allows for 30s of clock skew. The kube-apiserver downtime tolerance is 78s. Worst non-graceful lease acquisition is 2m43s. Worst graceful lease acquisition is {26s}.
2026-02-16T22:15:29.874528883Z F0216 22:15:29.874507 1 cmd.go:182] open /var/run/secrets/kubernetes.io/serviceaccount/token: permission denied
Version-Release number of selected component (if applicable):
4.22
How reproducible:
Intermittent; it happens rarely.
Steps to Reproduce:
1. Configure the TLS in the apiserver "cluster" resource
$ oc patch apiserver cluster --type json -p '[{ "op": "add", "path": "/spec/tlsSecurityProfile", "value": {"type": "Old","old": {}}}]'
Actual results:
Many operator pods report a CrashLoopBackOff state and cannot start properly. Since the operators cannot start, they cannot recreate evicted pods, and the update gets stuck because of PodDisruptionBudgets.
Expected results:
No operator pod should report a CrashLoopBackOff state.
Additional info:
Deleting the affected pod manually works around the issue; the replacement pod runs without problems.
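The manual workaround above can be sketched as a small helper that deletes the crash-looping pods so their controllers recreate them. This is only an illustration: the namespace and pod name below are taken from the log excerpt in this report, the `delete_crashing_pods` helper is hypothetical, and `DRY_RUN=1` (the default) only prints the `oc` commands instead of running them against a cluster.

```shell
#!/bin/sh
# Workaround sketch (assumption: you feed it "namespace pod" pairs, one per
# line, for the operator pods stuck in CrashLoopBackOff). With DRY_RUN=1 it
# only prints the commands; unset DRY_RUN=0 to actually delete the pods and
# let each Deployment's ReplicaSet create a fresh replacement.
DRY_RUN="${DRY_RUN:-1}"

delete_crashing_pods() {
    # Reads "namespace pod" pairs from stdin and deletes (or prints) each one.
    while read -r ns pod; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "would run: oc -n $ns delete pod $pod"
        else
            oc -n "$ns" delete pod "$pod"
        fi
    done
}

# Example using the pod from this report (dry-run, prints the command only):
echo "openshift-etcd-operator etcd-operator-5fb7c9bc-r66js" | delete_crashing_pods
```

Deleting the pod (rather than restarting the container) matters here, because the permission error on the projected service-account token is resolved only when the pod sandbox is recreated.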
Relates to: MCO-2110 Migrate mco_security tests (Closed)