OpenShift Bugs / OCPBUGS-41328

HostedClusterConfigOperator used wrong certificate for Kube certificate authority


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • None
    • 4.16
    • HyperShift
    • Important
    • No
    • False

      Description of problem:

          Rotating the root certificate authority (root CA) requires that multiple CA certificates be trusted during the rotation process, to prevent downtime while the server and client certificates are updated in the control and data planes. Currently, the HostedClusterConfigOperator uses the cluster-signer-ca from the control plane to create the kubelet-serving-ca on the data plane. The cluster-signer-ca contains only a single certificate, which is used for signing certificates for the kube-controller-manager.
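      One way to check whether a given CA bundle holds a single certificate or several is to walk its PEM blocks. The following is a minimal sketch in Go, not code from the HostedClusterConfigOperator: `selfSignedPEM` and `bundleSubjects` are hypothetical helpers, and the bundles are generated in memory rather than read from a cluster.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"time"
)

// selfSignedPEM returns a throwaway self-signed CA certificate in PEM form.
// It stands in for CA data that would normally come from the cluster.
func selfSignedPEM(cn string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

// bundleSubjects parses every CERTIFICATE block in a PEM bundle and
// returns the subject common names in the order they appear.
func bundleSubjects(bundle []byte) ([]string, error) {
	var names []string
	for {
		var block *pem.Block
		block, bundle = pem.Decode(bundle)
		if block == nil {
			break // no more PEM blocks
		}
		if block.Type != "CERTIFICATE" {
			continue
		}
		cert, err := x509.ParseCertificate(block.Bytes)
		if err != nil {
			return nil, err
		}
		names = append(names, cert.Subject.CommonName)
	}
	return names, nil
}

func main() {
	oldPEM, err := selfSignedPEM("old-ca")
	if err != nil {
		panic(err)
	}
	newPEM, err := selfSignedPEM("new-ca")
	if err != nil {
		panic(err)
	}
	// A rotation-safe bundle is the concatenation of both CAs.
	rotated := append(append([]byte{}, oldPEM...), newPEM...)

	single, _ := bundleSubjects(oldPEM)
	both, _ := bundleSubjects(rotated)
	fmt.Println("single-CA bundle:", single)
	fmt.Println("rotation bundle: ", both)
}
```

      A bundle that reports only one subject cannot carry trust across a rotation, because clients lose the old CA the moment the new one is written.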
      
      During a rotation, the kubelet-serving-ca is updated with the new CA, which triggers the metrics-server pod to restart and use the new CA. The metrics-server then fails to scrape metrics because the kubelet has yet to pick up the new certificate:
      
      E0808 16:57:09.829746       1 scraper.go:149] "Failed to scrape node" err="Get \"https://10.240.0.29:10250/metrics/resource\": tls: failed to verify certificate: x509: certificate signed by unknown authority" node="pres-cqogb7a10b7up68kvlvg-rkcpsms0805-default-00000130"
      
      rkc@rmac ~> kubectl get pods -n openshift-monitoring
      NAME                                                     READY   STATUS    RESTARTS   AGE
      metrics-server-594cd99645-g8bj7                          0/1     Running   0          2d20h
      metrics-server-594cd99645-jmjhj                          1/1     Running   0          46h 
      
      The HostedClusterConfigOperator should likely use the KubeletClientCABundle from the control plane for the kubelet-serving-ca in the data plane. This CA bundle contains both the new and old CAs, so all data plane components can remain up during the rotation process.
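      The difference between trusting only the new CA and trusting a bundle with both CAs can be reproduced in isolation. The sketch below uses only the Go standard library and is an illustration under stated assumptions, not the operator's code: `makeCA`, `issueServingCert`, and `trusts` are hypothetical helpers, and the certificates are generated in memory to stand in for the old CA, the rotated CA, and a kubelet serving certificate that is still signed by the old CA.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// makeCA creates a self-signed CA certificate and its private key.
func makeCA(cn string) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// issueServingCert issues a server certificate signed by the given CA.
func issueServingCert(ca *x509.Certificate, caKey *ecdsa.PrivateKey) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: "example-kubelet-serving"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, ca, &key.PublicKey, caKey)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// trusts reports whether a client whose trust store holds exactly the
// given CAs would accept the serving certificate.
func trusts(leaf *x509.Certificate, cas ...*x509.Certificate) bool {
	pool := x509.NewCertPool()
	for _, ca := range cas {
		pool.AddCert(ca)
	}
	_, err := leaf.Verify(x509.VerifyOptions{
		Roots:     pool,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	})
	return err == nil
}

func main() {
	oldCA, oldKey, err := makeCA("old-ca")
	if err != nil {
		panic(err)
	}
	rotatedCA, _, err := makeCA("new-ca")
	if err != nil {
		panic(err)
	}
	// The kubelet has not yet picked up a certificate from the new CA.
	leaf, err := issueServingCert(oldCA, oldKey)
	if err != nil {
		panic(err)
	}
	fmt.Println("new CA only:   ", trusts(leaf, rotatedCA))        // false
	fmt.Println("old+new bundle:", trusts(leaf, oldCA, rotatedCA)) // true
}
```

      The first check corresponds to the metrics-server state shown above, where the client trusts only the new CA and rejects the kubelet's certificate as signed by an unknown authority. The second corresponds to a bundle that carries both CAs for the duration of the rotation.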

      Version-Release number of selected component (if applicable):

          

      How reproducible:

          

      Steps to Reproduce:

          1.
          2.
          3.
          

      Actual results:

          

      Expected results:

          

      Additional info:

          

            rcradick Ryan Cradick
            rcradick Ryan Cradick
            Jie Zhao Jie Zhao
