Machine Config Operator / MCO-499

The MCD should manage certificates via a separate, non-MC path


Details

    • MCD Certificate Improvements
    • To Do
    • TELCOSTRAT-87 - Single Core CPU CaaS Budget for DU Deployment w/ Single-Node OpenShift on Sapphire Rapids Platform
    • OCPSTRAT-832 - MCO should manage certificates via non-MC path

    Description

      Latest status as of 4.14 freeze:

      The MCD no longer uses MachineConfigs to update certs; it reads them directly from our internal "controllerconfig" resource. The MachineConfig path still exists but is a no-op (although the MCO still falsely reports a pending update as a result). The work to remove the MachineConfig path is ready, but is waiting for windows-MCO to change their workflow so that the removal does not break them.

       

      --------------------------------

       

      The logic for handling certificate rotation should live outside of the MachineConfig-files path it uses today. This will allow certs to rotate live, even through paused pools, without generating additional churn in rendered configs. Most, if not all, certificates do not require a drain or reboot of the node.

       

      Context

      The MCO has, since the beginning of time, managed certificates. The general flow is a cluster configmap -> MCO -> controllerconfig -> MCC -> renderedconfig -> MCD -> laid down to disk as a regular file.

       

      When we talk about certs, the MCD actually manages 4 (originally 5) certs: see https://docs.google.com/document/d/1ehdOYDY-SvUU9ffdIKlt7XaoMaZ0ioMNZMu31-Mo1l4/edit (this document is a bit outdated)

      Of these, the only one we care about is "/etc/kubernetes/kubelet-ca.crt", which is a bundle of 5 (now 7) certs. This will be expanded on below.

       

      Unlike regular files though, certificates rotate automatically at some set cadence. Prior to 4.7, this would cause the MCD to seemingly randomly start an update and reboot nodes, much to the annoyance of customers, so we made it disruptionless.

       

      There was still one more problem: a lot of users pause pools for additional safety (which is their way of saying "we don't want you to disrupt our workloads"), and a paused pool still gated the certificate from actually rotating in when it updated. In 4.12 and previous versions, this meant that at 80% of the 1 year mark, a new kube-apiserver-to-kubelet-signer cert would be generated. After ~12 hours, this would affect some operations (oc logs, etc.) since the old signer no longer matched the apiserver's new cert. At the one year mark, this would break the kubelet entirely. To combat this, we added an alert, MachineConfigControllerPausedPoolKubeletCA, to warn users about the effects and the expiry, which was ok since this should only be an annual occurrence.
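
      As a rough illustration of the timing above, the sketch below works out when a replacement signer would appear for a cert with a one-year validity. The 80% figure is from the text; the issue date and the helper name rotation_time are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def rotation_time(not_before, validity, fraction=0.8):
    """Return the point at which a replacement signer would be generated."""
    return not_before + validity * fraction


# Hypothetical issue date; only the 80% / one-year figures come from the text.
issued = datetime(2023, 1, 1, tzinfo=timezone.utc)
validity = timedelta(days=365)

rotate_at = rotation_time(issued, validity)
expires_at = issued + validity

print("new signer generated:", rotate_at.date())
print("old signer expires:  ", expires_at.date())
print("days a paused pool can lag before expiry:", (expires_at - rotate_at).days)
```

      With a paused pool, the window between the two dates is the period in which the alert is the only protection.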

       

      Updates for 4.13

      In 4.13, we realized that the kubelet-ca cert was being read from the wrong location, which updated the kube-apiserver-to-kubelet-signer mentioned above but not some other certs. This was not a problem while nobody depended on them, but in 4.13 monitoring was updated to use the right certs, which caused reports of KubeletDown firing. David Eads fixed this via https://github.com/openshift/machine-config-operator/pull/3458

      So now instead of expired certs we have correct certs, which is great, but we also realized that cert rotation will now happen much more frequently.

       

      Previously on the system, we had:

      admin-kubeconfig-signer, kubelet-signer, kube-apiserver-to-kubelet-signer, kube-control-plane-signer, kubelet-bootstrap-kubeconfig-signer

       

      Now, with the correct certs, right after install we get: admin-kubeconfig-signer, kube-csr-signer_@1675718562, kubelet-signer, kube-apiserver-to-kubelet-signer, kube-control-plane-signer, kubelet-bootstrap-kubeconfig-signer, openshift-kube-apiserver-operator_node-system-admin-signer@1675718563

       

      The most immediate issue was bootstrap drift, which John solved via https://github.com/openshift/machine-config-operator/pull/3513

       

      But the issue here is now we are updating two certs:

      1. kube-csr-signer, rotated every month
      2. openshift-kube-controller-manager-operator_csr-signer-signer (called kubelet-signer until the first rotation), rotated every two months

       

      Meaning that every month we would be generating at least 2 new machineconfigs (new one rotating in, old one rotating out) to manage this.

      During install, due to how the certs are set up (bootstrap ones expire in 24h) this means you get 5 MCs within 24 hours: bootstrap bundle, incluster bundle, incluster bundle with 1 new, incluster bundle with 2 new, incluster bundle with 2 new 2 old removed
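
      The churn described above can be sketched as follows: every distinct bundle content yields a new rendered config. This is only an illustration. The cert names are placeholders, and rendered_name is a hypothetical stand-in for however the MCC actually names rendered configs.

```python
import hashlib


def rendered_name(bundle):
    """Derive a hypothetical rendered-config name from the bundle contents."""
    digest = hashlib.sha256("\n".join(sorted(bundle)).encode()).hexdigest()
    return f"rendered-worker-{digest[:8]}"


# The five bundle states listed in the text, with placeholder cert names.
states = [
    {"bootstrap-1", "bootstrap-2"},          # bootstrap bundle
    {"old-1", "old-2"},                      # in-cluster bundle
    {"old-1", "old-2", "new-1"},             # in-cluster bundle with 1 new
    {"old-1", "old-2", "new-1", "new-2"},    # in-cluster bundle with 2 new
    {"new-1", "new-2"},                      # 2 new, 2 old removed
]

names = [rendered_name(state) for state in states]
unique = list(dict.fromkeys(names))  # de-duplicate while keeping order

print(len(unique), "rendered configs generated")
```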

      On top of this, the cluster previously chugged along through the expiry with only the warning, but now, when the old certs rotate out while the pools are paused, TargetDown and KubeletDown fire after a few hours, making it very bad from a user perspective.

       

      Solutions

      Solution1: don't do anything

      Nothing should badly break, but the user will get critical alerts after ~1 month if they pause and upgrade to 4.13. Not a great UX

      Solution2: revert the monitoring change or mask the alert

      A bit late, but potentially doable? Masking the alert will likely mask real issues, though

      Solution3: MVP MCD changes (Estimate: 1week)

      The MCD update, MCD verification, and MCD config drift monitor all ignore the kubelet-ca cert file. The MCD gets a new routine to update the file, reading from a configmap the MCC manages. The MCC still renders the cert, but the cert will be updated even if the pool is paused.
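
      A minimal sketch of what Solution 3 could look like, written in Python for brevity (the MCO itself is Go). The function names, the file mode, and the idea of passing the configmap data in as bytes are all assumptions for illustration, not the actual MCD code.

```python
import os
import tempfile

KUBELET_CA_PATH = "/etc/kubernetes/kubelet-ca.crt"  # path named in the text


def files_to_verify(rendered_files, ignored=(KUBELET_CA_PATH,)):
    """Drop the cert bundle from the set of files checked for drift."""
    return {path: data for path, data in rendered_files.items() if path not in ignored}


def sync_cert_bundle(path, desired, mode=0o644):
    """Write the desired bundle to path atomically; return True if it changed."""
    try:
        with open(path, "rb") as existing:
            if existing.read() == desired:
                return False  # already up to date, nothing to write
    except FileNotFoundError:
        pass

    # Write to a temp file in the same directory, then rename over the target,
    # so readers never observe a half-written bundle.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(desired)
            handle.flush()
            os.fsync(handle.fileno())
        os.chmod(tmp, mode)
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
    return True
```

      Because the sync routine never consults the pool state, it would run the same way whether or not the pool is paused.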

      Solution4: MVP MCC changes (Estimate: a few days)

      Have the controller splice in changes even when the pool is paused. John has an MVP here: https://github.com/openshift/machine-config-operator/compare/master...jkyros:machine-config-operator:mco-77-bypass-pause
      This is a cleaner solution compared to 3, but will cause the pool to go into updating briefly. If there are other operations causing nodes to be cordoned, etc., we would have to consider overriding that.
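
      One way to picture the pause bypass in Solution 4 is a check that only lets a paused pool move to a new config when the diff is limited to cert files. This is a hypothetical sketch and is not taken from the linked MVP; may_update_pool and changed_paths are invented names.

```python
KUBELET_CA_PATH = "/etc/kubernetes/kubelet-ca.crt"  # path named in the text
CERT_PATHS = frozenset({KUBELET_CA_PATH})


def changed_paths(current, desired):
    """Return every path whose content differs between two file maps."""
    return {
        path
        for path in set(current) | set(desired)
        if current.get(path) != desired.get(path)
    }


def may_update_pool(current, desired, paused, cert_paths=CERT_PATHS):
    """Decide whether the controller should roll the pool to the desired config."""
    diff = changed_paths(current, desired)
    if not diff:
        return False  # nothing to do
    if not paused:
        return True   # normal update path
    # Paused: only cert-only changes are allowed through.
    return diff <= cert_paths
```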

      Solution5: MCD cert management path (full, Estimate: 1 sprint)

      The cert is removed from the rendered-config. The MCC will read it off the controllerconfig and render it into a custom configmap. The MCS will add this additional file when serving content, but it is not part of the rendered-MC otherwise. The MCD will have a new routine to manage the certs live directly.

      The bootstrap MCS will also need to have a way to render it into the initial served configuration without it being part of the MachineConfigs (this is especially important for HyperShift). We will have to make sure the inplace updater doesn't break

      We may also have to solve config drift problems from bootstrap to incluster, for self-driving and hypershift inplace

      We also have to make sure the file isn't deleted during the update to the new management path, so the certs don't disappear for a while: the MCD would otherwise see the diff (the file is no longer in the rendered config) and delete it.
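
      Two pieces of Solution 5 can be sketched together: the MCS adding the bundle at serve time without it being part of the rendered config, and the MCD skipping cert-managed paths when it works out which files to delete. All names here are hypothetical, and the file maps are simplified to path -> content dictionaries.

```python
KUBELET_CA_PATH = "/etc/kubernetes/kubelet-ca.crt"  # path named in the text
CERT_MANAGED_PATHS = frozenset({KUBELET_CA_PATH})


def served_files(rendered_files, cert_bundle):
    """Add the cert bundle to what is served, leaving the rendered config untouched."""
    served = dict(rendered_files)
    served[KUBELET_CA_PATH] = cert_bundle
    return served


def stale_files(old_files, new_files, cert_managed=CERT_MANAGED_PATHS):
    """List files to delete on update, never including cert-managed paths."""
    return sorted(
        path
        for path in old_files
        if path not in new_files and path not in cert_managed
    )
```

      With this guard, the first update after the migration would leave the existing bundle on disk until the new cert routine takes over.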

       

      DOCS (WIP)

       

      https://docs.google.com/document/d/1qXYV9Hj98QhJSKx_2IWBbU_bxu30YQtbR21mmOtBsIg/edit?usp=sharing

            People

              cdoern@redhat.com Charles Doern
              jerzhang@redhat.com Yu Qi Zhang
              John Kyros, Yu Qi Zhang