- Sub-task
- Resolution: Unresolved
- Critical
- OCPSTRAT-1797 - Hitless TLS Certificate Rotation for Kubernetes API
Currently, the EtcdCertSigner controller, which is part of the cluster-etcd-operator (CEO), renews the etcd peer, serving, and serving-metrics certificates roughly every 3 years. However, if the cluster stays offline for longer than a certificate's validity period, the controller cannot renew the certificates when the cluster restarts, because the operator itself will not be running.
It seems that we could apply a similar solution to what has been implemented for kube-apiserver.
The document describing the solution for kube-apiserver in detail is at https://github.com/openshift/enhancements/blob/master/enhancements/kube-apiserver/auto-cert-recovery.md
The Cert rotation controller is at https://github.com/openshift/cluster-kube-apiserver-operator/blob/master/pkg/operator/certrotationcontroller/certrotationcontroller.go#L69
The Recovery controller is at https://github.com/openshift/cluster-kube-apiserver-operator/blob/master/cmd/cluster-kube-apiserver-operator/main.go#LL57C78-L57C86
The Cert syncer is at https://github.com/openshift/cluster-kube-apiserver-operator/blob/master/cmd/cluster-kube-apiserver-operator/main.go#LL56C16-L56C105 and https://github.com/openshift/library-go/tree/master/pkg/operator/staticpod/certsyncpod
Scope:
- Applies to the peer, serving, and serving-metrics certificates
- Does not apply to the etcd signer and etcd client certificates
Acceptance criteria
- KEP reviewed and approved
- The EtcdCertSigner controller is successfully able to renew the peer, serving, and serving-metrics certificates when their validity period expires.
- The renewal process for certificates is automated and does not require manual intervention, ensuring seamless operation
- Proper documentation is provided detailing the renewal process and the steps taken by the EtcdCertSigner controller to renew expired certificates when necessary.
- is related to: ETCD-510 Automatic recovery from expired server and peer certs (Closed)