OpenShift Bugs / OCPBUGS-62639

Intermittent "tls: failed to verify certificate: x509: certificate signed by unknown authority" although certificate is valid and always the same


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • 4.18
    • kube-apiserver
    • Quality / Stability / Reliability
    • Important

      Description of problem

      When the kube-apiserver calls a custom mutating admission webhook, the call intermittently fails with "tls: failed to verify certificate: x509: certificate signed by unknown authority", even though the certificate returned by the webhook is correct.

      The number of errors depends on load: the more load is placed on the kube-apiserver, and therefore the more often the webhook is invoked, the more likely the error becomes. The call does not always fail, only sometimes (more frequently under higher load), which suggests an internal race that causes TLS validation of the webhook server certificate to fail when it should not.

      A network traffic capture showed that the server certificate returned is always the same, both for failing and for successful requests, and that it matches the certificate in the caBundle (it is self-signed). Together with the intermittent nature of the error, this strongly suggests that the TLS failure originates inside the kube-apiserver rather than in the webhook server.
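      The comparison between the served certificate and the caBundle can be repeated without a full packet capture. The following is a minimal sketch, not the procedure used for this report: it assumes the caBundle value has been copied from the webhook configuration (a base64-encoded PEM bundle) and that the served certificate is available as PEM. The function names are hypothetical. Because the certificate is self-signed, a fingerprint match is a sufficient sanity check here; it is not a chain validation.

```python
import base64
import hashlib
import re
import ssl

# Matches each PEM certificate block inside a bundle.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def fingerprint(pem_cert: str) -> str:
    """Return the SHA-256 fingerprint of the DER bytes of one PEM certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert.strip())
    return hashlib.sha256(der).hexdigest()


def bundle_fingerprints(ca_bundle_b64: str) -> set:
    """Return the fingerprints of every certificate in a base64-encoded caBundle."""
    pem_text = base64.b64decode(ca_bundle_b64).decode("ascii")
    return {fingerprint(block) for block in PEM_BLOCK.findall(pem_text)}


def served_cert_in_bundle(served_pem: str, ca_bundle_b64: str) -> bool:
    """True if the served certificate is byte-identical to a caBundle entry."""
    return fingerprint(served_pem) in bundle_fingerprints(ca_bundle_b64)
```

      The served certificate can be fetched with ssl.get_server_certificate((host, port)), which returns PEM and does not verify the peer by default; host and port are whatever the webhook service listens on.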

      Version-Release number of selected component (if applicable):

      4.18.22

      How reproducible:

      Consistently on the customer cluster; the failure rate depends on the load.

      Steps to Reproduce:

      • Install the custom mutating admission webhooks.
      • Induce enough load in the kube-apiserver, such that those webhooks have to be invoked many times.
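      As a complementary check, many verified TLS handshakes can be made against the webhook endpoint from outside the kube-apiserver, using the same CA. The following is a rough sketch under stated assumptions, not part of the original reproduction: the function names, the attempt and worker counts, and the result labels are hypothetical, and the host must match a name in the webhook certificate. It uses Python's OpenSSL-based verification rather than Go's crypto/x509, so a clean run only shows that the server behaves consistently under concurrency; it does not exercise the kube-apiserver code path.

```python
import socket
import ssl
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def tls_handshake(host, port, ca_pem, timeout=5.0):
    """Perform one TLS handshake, trusting only the certificates in ca_pem."""
    # With cadata set, the system trust store is not loaded.
    # Note: recent Python versions enable stricter verification flags by
    # default, which may reject certificates that other clients accept.
    ctx = ssl.create_default_context(cadata=ca_pem)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host):
            pass  # Handshake completed and the certificate was verified.


def probe(host, port, ca_pem, attempts=1000, workers=50, handshake=tls_handshake):
    """Run `attempts` handshakes across `workers` threads and count outcomes."""

    def one(_):
        try:
            handshake(host, port, ca_pem)
            return "ok"
        except ssl.SSLCertVerificationError:
            # Must be caught before OSError, which it subclasses.
            return "verify_failed"
        except OSError:
            return "other_error"

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return Counter(pool.map(one, range(attempts)))
```

      A non-zero "verify_failed" count would point at the server side or the certificate itself; a zero count is consistent with the problem being inside the kube-apiserver, as the description argues.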

      Actual results

      Some requests to the webhooks fail TLS validation, although the server always returns the same, correct certificate, which is accepted for all other requests.

      Expected results

      No spurious TLS validation errors.

      Additional info

      In comments.

              Dave Gordon (davegord@redhat.com)
              Pablo Alonso Rodriguez (rhn-support-palonsor)
              Ke Wang