- Bug
- Resolution: Unresolved
- Major
- None
- 4.18.0, 4.19.0
- None
- Approved
- False
- Enhancement
- In Progress
Description of problem:
Requests allow up to 30s for etcd to respond, but readiness probes allow only 9s. When etcd latency is between 10s and 30s, standard requests still succeed, but because of the readiness probe configuration we lose every apiserver endpoint at the same time. Correcting this requires changes to the pod definitions and the load balancers. Changing the ongoing readiness check to `readyz?exclude=etcd` should correct the issue.
Off the top of my head, this will include:
- kube-apiserver operator
- authentication operator
- openshift-apiserver operator
- MCO apiserver-watch
- metal LB
- https://github.com/multi-arch/ocp-remote-ci/pull/39
- where LBs are defined for aws, azure, and gcp
This is a low-cost, low-risk, high-benefit change.
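To illustrate the timing mismatch described above, here is a rough Python sketch. It is a toy model, not code from any of the operators listed: the helper names `readyz_path` and `outcome` are hypothetical, the 30s and 9s figures are taken from the description, and the model assumes every non-etcd readiness check passes.

```python
from urllib.parse import urlencode

# Figures taken from the description above.
REQUEST_TIMEOUT_S = 30.0    # requests allow up to 30s for etcd to respond
PROBE_ETCD_BUDGET_S = 9.0   # readiness probes allow only 9s


def readyz_path(exclude=()):
    """Build a readyz path, optionally excluding named checks (hypothetical helper)."""
    if not exclude:
        return "/readyz"
    return "/readyz?" + urlencode([("exclude", name) for name in exclude])


def outcome(etcd_latency_s, probe_excludes_etcd):
    """Toy model: return (request_succeeds, endpoint_stays_ready).

    Assumes all non-etcd readiness checks pass, so excluding etcd
    always leaves the endpoint ready in this model.
    """
    request_ok = etcd_latency_s <= REQUEST_TIMEOUT_S
    ready = probe_excludes_etcd or etcd_latency_s <= PROBE_ETCD_BUDGET_S
    return request_ok, ready


if __name__ == "__main__":
    print(readyz_path(["etcd"]))
    for latency in (5.0, 15.0, 35.0):
        print(latency, outcome(latency, False), outcome(latency, True))
```

In this model, a 15s etcd latency lets requests succeed while the etcd-dependent probe marks the endpoint unready; with etcd excluded from the probe, the endpoint stays in rotation and serves those requests.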
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. 2. 3.
Actual results:
Expected results:
Additional info:
- blocks
- OCPBUGS-49749 Readiness probes must not rely on etcd
- POST
- is cloned by
- OCPBUGS-49749 Readiness probes must not rely on etcd
- POST
- links to