Bug
Resolution: Unresolved
Normal
4.22
False
Moderate
Description of problem:
When the certificate in loadbalancer-serving-signer expires, the Machine Config Daemon (MCD) pods fail and report the following errors:
E0206 13:19:21.198022 2776 daemon.go:1410] Got an error from auxiliary tools: failed to list *v1.Node: Get "https://api-int.sregidor-exp2pr.qe.devcluster.openshift.com:6443/api/v1/nodes?fieldSelector=metadata.name%3Dip-10-0-77-136.us-east-2.compute.internal&resourceVersion=47182": tls: failed to verify certificate: x509: certificate signed by unknown authority
E0206 13:19:57.391123 2776 reflector.go:205] "Failed to watch" err="failed to list *v1.Node: Get \"https://api-int.sregidor-exp2pr.qe.devcluster.openshift.com:6443/api/v1/nodes?fieldSelector=metadata.name%3Dip-10-0-77-136.us-east-2.compute.internal&resourceVersion=47182\": tls: failed to verify certificate: x509: certificate signed by unknown authority" reflector="k8s.io/client-go/informers/factory.go:160" type="*v1.Node"
E0206 13:19:57.391153 2776 daemon.go:1410] Got an error from auxiliary tools: failed to list *v1.Node: Get "https://api-int.sregidor-exp2pr.qe.devcluster.openshift.com:6443/api/v1/nodes?fieldSelector=metadata.name%3Dip-10-0-77-136.us-east-2.compute.internal&resourceVersion=47182": tls: failed to verify certificate: x509: certificate signed by unknown authority
Version-Release number of selected component (if applicable):
4.22
How reproducible:
Always
Steps to Reproduce:
1. Modify the installer and the cluster-kube-apiserver-operator so that loadbalancer-serving-signer expires after 2 hours.
PRs like the following can be used to build the test image:
https://github.com/openshift/installer/pull/10291/changes
https://github.com/openshift/cluster-kube-apiserver-operator/pull/2030/changes
Use clusterbot to build the image, for example:
build 4.22,openshift/installer#10291,openshift/machine-config-operator#5623
2. Install a cluster using the image generated in step 1
3. Wait 2 hours until the certificate expires
4. Read the logs of the MCD pods
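For step 4, a small script can help pick the relevant entries out of a long MCD log. The following is only a sketch: it assumes the log text has already been saved (for example from `oc logs` on an MCD pod) and that lines use the klog-style header seen in this report. The function names `find_x509_errors` and `count_by_source` are hypothetical helpers, not part of any OpenShift tooling.

```python
import re
from collections import Counter

# klog-style header: severity letter + MMDD, time, PID, "file:line]", message.
KLOG_LINE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<pid>\d+) (?P<src>[^\]]+)\] (?P<msg>.*)$"
)

# Error text reported in this bug.
X509_MARKER = "x509: certificate signed by unknown authority"


def find_x509_errors(log_text):
    """Return (time, source, message) for each error line containing the marker."""
    hits = []
    for line in log_text.splitlines():
        m = KLOG_LINE.match(line.strip())
        if m and m.group("sev") == "E" and X509_MARKER in m.group("msg"):
            hits.append((m.group("time"), m.group("src"), m.group("msg")))
    return hits


def count_by_source(hits):
    """Count matching errors per source location, e.g. 'daemon.go:1410'."""
    return Counter(src for _, src, _ in hits)
```

Running `count_by_source(find_x509_errors(text))` on each pod's log gives a quick view of which code paths (daemon.go, reflector.go) are hitting the verification failure.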
Actual results:
These errors appear in the logs of most MCD pods:
E0206 13:19:21.198022 2776 daemon.go:1410] Got an error from auxiliary tools: failed to list *v1.Node: Get "https://api-int.sregidor-exp2pr.qe.devcluster.openshift.com:6443/api/v1/nodes?fieldSelector=metadata.name%3Dip-10-0-77-136.us-east-2.compute.internal&resourceVersion=47182": tls: failed to verify certificate: x509: certificate signed by unknown authority
E0206 13:19:57.391123 2776 reflector.go:205] "Failed to watch" err="failed to list *v1.Node: Get \"https://api-int.sregidor-exp2pr.qe.devcluster.openshift.com:6443/api/v1/nodes?fieldSelector=metadata.name%3Dip-10-0-77-136.us-east-2.compute.internal&resourceVersion=47182\": tls: failed to verify certificate: x509: certificate signed by unknown authority" reflector="k8s.io/client-go/informers/factory.go:160" type="*v1.Node"
E0206 13:19:57.391153 2776 daemon.go:1410] Got an error from auxiliary tools: failed to list *v1.Node: Get "https://api-int.sregidor-exp2pr.qe.devcluster.openshift.com:6443/api/v1/nodes?fieldSelector=metadata.name%3Dip-10-0-77-136.us-east-2.compute.internal&resourceVersion=47182": tls: failed to verify certificate: x509: certificate signed by unknown authority
Expected results:
When the certificate is rotated, MCDs should add the new certificate to the kubeconfig file and restart kubelet, and should then continue working normally.
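One possible way to check this expectation is to confirm that the rotated signer certificate is present in the CA bundle that the kubeconfig uses. The sketch below assumes both have already been extracted as PEM text; how they are obtained from the node or the cluster is outside this report. The helper names `pem_to_der_list` and `bundle_contains` are hypothetical.

```python
import base64
import re

# Matches the body of each PEM certificate block.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", re.DOTALL
)


def pem_to_der_list(pem_text):
    """Decode every CERTIFICATE block in pem_text to raw DER bytes."""
    return [
        base64.b64decode("".join(body.split()))
        for body in PEM_BLOCK.findall(pem_text)
    ]


def bundle_contains(bundle_pem, cert_pem):
    """True if every certificate in cert_pem also appears in bundle_pem.

    Comparison is on decoded bytes, so PEM line wrapping does not matter.
    Returns False when cert_pem holds no certificate at all.
    """
    bundle = set(pem_to_der_list(bundle_pem))
    wanted = pem_to_der_list(cert_pem)
    return bool(wanted) and all(der in bundle for der in wanted)
```

If `bundle_contains(kubeconfig_ca_pem, rotated_signer_pem)` is false after rotation, the node is still trusting only the expired signer, which is consistent with the "unknown authority" errors above.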
Additional info:
Under normal conditions this certificate expires after 10 years, which is why the installer has to be modified to test this behaviour.