-
Bug
-
Resolution: Unresolved
-
Major
-
4.18
-
Quality / Stability / Reliability
-
False
-
Important
Description of problem
When the kube-apiserver calls a custom mutating admission webhook, it intermittently fails with "tls: failed to verify certificate: x509: certificate signed by unknown authority", even though the certificate returned by the webhook is correct.
The number of errors depends on load: the error becomes more likely as more load is placed on the kube-apiserver and the webhook has to be invoked more often. Note that the calls do not always fail, only some of them (the higher the load, the more frequently), which suggests an internal race that sometimes causes TLS validation of the webhook server to fail when it should not.
A network traffic capture showed that the server certificate returned is always the same, both when the error occurs and when it does not, and that it matches the one in the caBundle (it is self-signed). Together with the intermittent nature of the error, this indicates that something is going wrong inside the kube-apiserver when the TLS error occurs, rather than in the webhook server.
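The offline check described above (captured server certificate versus caBundle) can be sketched in Go roughly as follows. This is a minimal illustration, not the code used by the kube-apiserver: the helper names verifyAgainstBundle and selfSignedPEM and the DNS name webhook.example.svc are hypothetical, and the certificate is generated on the fly as a stand-in for the one taken from the capture.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// verifyAgainstBundle (hypothetical helper) checks that the PEM-encoded
// server certificate chains to a certificate in the PEM-encoded caBundle
// and is valid for dnsName.
func verifyAgainstBundle(caBundlePEM, certPEM []byte, dnsName string) error {
	roots := x509.NewCertPool()
	if !roots.AppendCertsFromPEM(caBundlePEM) {
		return errors.New("caBundle contains no parsable certificates")
	}
	block, _ := pem.Decode(certPEM)
	if block == nil {
		return errors.New("no PEM block in server certificate")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return err
	}
	_, err = cert.Verify(x509.VerifyOptions{Roots: roots, DNSName: dnsName})
	return err
}

// selfSignedPEM (hypothetical helper) creates a throwaway self-signed
// certificate, standing in for the webhook certificate from the capture.
func selfSignedPEM(dnsName string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: dnsName},
		DNSNames:              []string{dnsName},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
		IsCA:                  true,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	const name = "webhook.example.svc" // assumed service DNS name
	certPEM, err := selfSignedPEM(name)
	if err != nil {
		panic(err)
	}
	// Self-signed case: the caBundle is the server certificate itself.
	failures := 0
	for i := 0; i < 1000; i++ {
		if err := verifyAgainstBundle(certPEM, certPEM, name); err != nil {
			failures++
		}
	}
	fmt.Println("verification failures:", failures)
}
```

With the real data, certPEM would be the certificate extracted from the capture and caBundlePEM the decoded caBundle from the webhook configuration. Repeating the verification many times is only meant to show that the check itself gives the same answer for the same inputs.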
Version-Release number of selected component (if applicable):
4.18.22
How reproducible:
Consistently on the customer cluster; how often the error occurs depends on the load.
Steps to Reproduce:
- Install the custom mutating admission webhooks.
- Induce enough load in the kube-apiserver, such that those webhooks have to be invoked many times.
Actual results
TLS validation errors for some of the requests made to the webhooks, even though the server always returns the same correct certificate (which is accepted for all other requests).
Expected results
No spurious TLS validation errors.
Additional info
In comments.