OpenShift Bugs / OCPBUGS-25729

oc hits x509 error even when the API server cert is issued by the well-trusted "Let's Encrypt" production environment


    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • 4.15.0
    • apiserver-auth
    • No
    • Rejected
    • False

      Description of problem:

      oc hits an x509 error even when the API server cert is issued by the well-trusted "Let's Encrypt" production environment.
      

      Version-Release number of selected component (if applicable):

      $ oc version
      Client Version: 4.15.0-0.nightly-2023-12-19-033450
      Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
      Server Version: 4.15.0-0.nightly-2023-12-19-033450
      Kubernetes Version: v1.28.4+7aa0a74
      

      How reproducible:

      Always
      

      Steps to Reproduce:

      1. Install an AWS cluster. Then curl the API server before adding an API server named certificate:
      $ export KUBECONFIG=/path/to/admin/kubeconfig
      The following error is expected in the output:
      $ curl $(oc whoami --show-server)
      curl: (60) SSL certificate problem: self-signed certificate in certificate chain
      More details here: https://curl.se/docs/sslcerts.html
      ...
      
      This is because the original API server certificate is issued by an untrusted CA generated during cluster installation.
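
      To see which CA issued the certificate the API server is currently serving, something like the sketch below may help. The helper names server_endpoint and served_cert_issuer are illustrative (they are not part of oc or openssl), and falling back to port 443 when the URL carries no port is an assumption.

```shell
# Hypothetical helpers; the names are illustrative only.

# Turn a URL such as https://api.example.com:6443 into host:port.
server_endpoint() (
  url="${1#https://}"                  # drop the scheme
  url="${url%%/*}"                     # drop any trailing path
  case "$url" in
    *:*) printf '%s\n' "$url" ;;       # a port is already present
    *)   printf '%s:443\n' "$url" ;;   # assumed default port
  esac
)

# Print subject and issuer of the leaf certificate served at host:port ($1).
served_cert_issuer() (
  openssl s_client -connect "$1" -servername "${1%%:*}" </dev/null 2>/dev/null |
    openssl x509 -noout -subject -issuer
)
```

      Possible usage: served_cert_issuer "$(server_endpoint "$(oc whoami --show-server)")"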
      
      2. Use cert-manager to generate an API server certificate issued by the well-trusted "Let's Encrypt" production environment. The certificate is issued for the same FQDN as in the "server" field of the original admin kubeconfig:
      $ oc create secret generic test-secret --from-literal=secret-access-key=$(grep aws_secret_access_key ~/.aws/credentials | cut -d = -f 2) -n cert-manager
      $ oc patch certmanager/cluster --type=merge -p='{"spec":{"controllerConfig":{"overrideArgs":["--dns01-recursive-nameservers=1.1.1.1:53","--dns01-recursive-nameservers-only"]}}}'
      
      $ FQDN=$(oc whoami --show-server | sed -e "s|https://||" -e "s/:6443//")
      $ echo $FQDN
      api.xxxxxxxxx.<hidden_DNS_zone>
      
      $ oc create -f - << EOF
      apiVersion: cert-manager.io/v1
      kind: ClusterIssuer
      metadata:
        name: test-cluster-issuer
      spec:
        acme:
          preferredChain: ""
          privateKeySecretRef:
            name: example-issuer-account-key
          server: https://acme-v02.api.letsencrypt.org/directory
          solvers:
          - selector:
              dnsZones:
                - <hidden_DNS_zone>
            dns01:
              route53:
                region: us-east-1
                accessKeyID: <hidden_accessKeyID>
                hostedZoneID: <hostedZoneID of hidden_DNS_zone>
                secretAccessKeySecretRef:
                  name: test-secret
                  key: secret-access-key
      EOF
      
      $ oc create -n openshift-config -f - << EOF
      apiVersion: cert-manager.io/v1
      kind: Certificate
      metadata:
        name: api-server-cert
      spec:
        secretName: api-server-cert
        dnsNames:
        - $FQDN
        issuerRef:
          kind: ClusterIssuer
          name: test-cluster-issuer
      EOF
      
      $ oc extract secret/api-server-cert -n openshift-config
      tls.crt
      tls.key
      
      $ openssl crl2pkcs7 -nocrl -certfile tls.crt | openssl pkcs7 -print_certs -text | egrep -A4 Issuer
              Issuer: C=US, O=Let's Encrypt, CN=R3
      ...
              Subject: CN=api.xxxxxxxxx.<hidden_DNS_zone>
      ...
                  X509v3 Subject Alternative Name:
                      DNS:api.xxxxxxxxx.<hidden_DNS_zone>
      --
              Issuer: C=US, O=Internet Security Research Group, CN=ISRG Root X1
      ...
              Subject: C=US, O=Let's Encrypt, CN=R3
      --
              Issuer: O=Digital Signature Trust Co., CN=DST Root CA X3
      ...
              Subject: C=US, O=Internet Security Research Group, CN=ISRG Root X1
      
      3. Follow https://docs.openshift.com/container-platform/4.14/security/certificates/api-server.html to add an API server named certificate
      $ oc patch apiserver/cluster --type=merge -p "
      spec:
        servingCerts:
          namedCertificates:
          - names:
            - $FQDN
            servingCertificate:
              name: api-server-cert
      "
      
      Wait about 15 minutes for kube-apiserver to finish the rollout.
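
      Instead of waiting a fixed time, the rollout could be polled. This is a minimal sketch, assuming the kube-apiserver ClusterOperator reports Progressing=False once the new revision is on all nodes. The function name wait_kas_rollout and the 30m default timeout are illustrative.

```shell
# Hypothetical helper: block until the kube-apiserver ClusterOperator
# reports Progressing=False, or until the timeout ($1, default 30m) expires.
wait_kas_rollout() (
  timeout="${1:-30m}"
  oc wait clusteroperator/kube-apiserver \
    --for=condition=Progressing=False --timeout="$timeout"
)
```

      The operator may need a moment after the patch before it switches to Progressing=True, so an immediate return is worth double-checking against the REVISION column of the kube-apiserver pods.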
      
      curl the API server again now that the API server named certificate has been added:
      $ curl $(oc whoami --show-server)
      {
        "kind": "Status",
      ...
      }
      
      curl no longer reports the earlier "curl: (60) SSL certificate problem: self-signed certificate in certificate chain" error, because the added API server certificate is issued by a trusted CA from the well-trusted "Let's Encrypt" production environment.
      
      But oc now hits an error:
      $ oc get po -n openshift-kube-apiserver -L revision -l apiserver
      Unable to connect to the server: tls: failed to verify certificate: x509: certificate signed by unknown authority
      
      It succeeds only when --insecure-skip-tls-verify is used:
      $ oc get po -n openshift-kube-apiserver -L revision -l apiserver --insecure-skip-tls-verify
      NAME                                                       READY   STATUS    RESTARTS   AGE   REVISION
      kube-apiserver-ip-10-0-19-145.us-east-2.compute.internal   5/5     Running   0          31m   7
      kube-apiserver-ip-10-0-62-250.us-east-2.compute.internal   5/5     Running   0          27m   7
      kube-apiserver-ip-10-0-64-61.us-east-2.compute.internal    5/5     Running   0          23m   7
      

      Actual results:

      oc hits the above error even though the added API server certificate is issued by a trusted CA from the well-trusted "Let's Encrypt" production environment.
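
      One plausible reading, consistent with the workaround under "Additional info", is that oc trusts only the CA bundle embedded in the kubeconfig ("certificate-authority-data") and that this bundle does not contain the Let's Encrypt chain. The sketch below shows one way to check what the bundle contains. The function names are illustrative, and it assumes the current context's cluster entry carries an embedded bundle.

```shell
# Hypothetical helpers; the names are illustrative only.

# Decode the CA bundle embedded in the current kubeconfig context to PEM.
kubeconfig_ca_pem() (
  oc config view --raw --minify \
    -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 -d
)

# List subject and issuer of every certificate in a PEM bundle read from stdin.
bundle_subjects() (
  openssl crl2pkcs7 -nocrl -certfile /dev/stdin |
    openssl pkcs7 -print_certs -noout
)
```

      Possible usage: kubeconfig_ca_pem | bundle_subjects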
      

      Expected results:

      oc should not hit the above error, given that the added API server certificate is issued by a trusted CA from the well-trusted "Let's Encrypt" production environment.
      

      Additional info:

      This bug is not a blocker.
      
      FYI, a workaround for the above x509 error when oc communicates with kube-apiserver is to comment out "certificate-authority-data" in $KUBECONFIG:
      $ sed -i "s/certificate-authority-data/#certificate-authority-data/" $KUBECONFIG
      $ oc get po -n openshift-kube-apiserver -L revision -l apiserver
      NAME                                                       READY   STATUS    RESTARTS   AGE   REVISION
      kube-apiserver-ip-10-0-19-145.us-east-2.compute.internal   5/5     Running   0          38m   7
      kube-apiserver-ip-10-0-62-250.us-east-2.compute.internal   5/5     Running   0          34m   7
      kube-apiserver-ip-10-0-64-61.us-east-2.compute.internal    5/5     Running   0          30m   7
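
      Because sed -i edits the admin kubeconfig in place, a gentler variant is to apply the same substitution to a copy and leave the original untouched. A sketch, with the function name kubeconfig_without_ca and the paths chosen only for illustration:

```shell
# Hypothetical helper: write a copy of kubeconfig $1 to $2 with every
# certificate-authority-data line commented out. The source file is not modified.
kubeconfig_without_ca() (
  umask 077   # the copy still holds credentials, so keep it private
  sed 's/certificate-authority-data/#certificate-authority-data/' "$1" > "$2"
)
```

      Possible usage: kubeconfig_without_ca "$KUBECONFIG" /tmp/kubeconfig-no-ca, then run oc with KUBECONFIG=/tmp/kubeconfig-no-ca.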
      
      But this workaround is incomplete. It introduces a new issue: the command below then hits the error:
      $ oc login -u user -p pass
      error: tls: failed to verify certificate: x509: certificate signed by unknown authority
      
      This is because oc login goes on to communicate with the HTTPS route of the OAuth server:
      $ oc login -u user -p pass --v 6
      ...
      I1220 21:20:02.580555    6122 round_trippers.go:553] GET https://api.xxxxxxxxx.<hidden_DNS_zone>:6443/.well-known/oauth-authorization-server 200 OK in 920 milliseconds
      I1220 21:20:03.464208    6122 request_token.go:578] falling back to kubeconfig CA due to possible x509 error: x509: certificate signed by unknown authority
      I1220 21:20:04.016590    6122 round_trippers.go:553] GET https://oauth-openshift.apps.api.xxxxxxxxx.<hidden_DNS_zone>/oauth/authorize?client_id=openshift-challenging-client&...2gt in 552 milliseconds
      ...
      error: tls: failed to verify certificate: x509: certificate signed by unknown authority
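
      The log suggests that oc login first reads the OAuth metadata from the API server and then contacts the OAuth route under the apps domain, whose certificate was not replaced in step 3. The sketch below extracts that host from the metadata so its certificate can be inspected separately. The name oauth_host is illustrative, and it assumes the metadata JSON has an "issuer" field holding an https URL.

```shell
# Hypothetical helper: read OAuth metadata JSON on stdin and print the host
# (and port, if any) taken from the "issuer" URL.
oauth_host() (
  sed -n 's|.*"issuer"[[:space:]]*:[[:space:]]*"https://\([^"/]*\).*|\1|p'
)
```

      Possible usage: curl -sk "$(oc whoami --show-server)/.well-known/oauth-authorization-server" | oauth_host. The issuer of the certificate served on that host can then be checked with openssl s_client and compared with the one now served by the API server.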
      

            kostrows@redhat.com Krzysztof Ostrowski
            xxia-1 Xingxing Xia
            Xingxing Xia Xingxing Xia