-
Bug
-
Resolution: Unresolved
-
Critical
-
None
-
4.20
-
Quality / Stability / Reliability
-
False
-
-
3
-
None
-
None
-
None
-
None
-
WINC - Sprint 280
-
1
-
In Progress
-
Bug Fix
-
-
None
-
None
-
None
-
None
Description of problem:
hybrid-overlay-node service is restarting due to cert rotation errors
This issue was discussed in OCPBUGS-59637. Problem was expected to be addressed in OCP 4.20/WMCO 10.20. Fix was to include --k8s-cacert
[root@vm-236-67 ~]# oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.20.2 True False 24h Cluster version is 4.20.2
[root@vm-236-67 ~]# oc get configmap windows-services-10.20.0-e3c3dfe -n openshift-windows-machine-config-operator
NAME DATA AGE
windows-services-10.20.0-e3c3dfe 3 24h
administrator@WIN-PKOLG940BKQ C:\k>sc qc hybrid-overlay-node
[SC] QueryServiceConfig SUCCESS
SERVICE_NAME: hybrid-overlay-node
TYPE : 10 WIN32_OWN_PROCESS
START_TYPE : 3 DEMAND_START
ERROR_CONTROL : 0 IGNORE
BINARY_PATH_NAME : C:\k\hybrid-overlay-node.exe --node win-pkolg940bkq --bootstrap-kubeconfig=C:\k\kubeconfig --cert-dir=C:\k\cni\config --cert-duration=24h --windows-service --logfile C:\var\log\hybrid-overlay\hybrid-overlay.log --k8s-cacert C:\k\ca-bundle.crt --hybrid-overlay-vxlan-port 4789 --loglevel 5
LOAD_ORDER_GROUP :
TAG : 0
DISPLAY_NAME : hybrid-overlay-node
DEPENDENCIES : kubelet
SERVICE_START_NAME : LocalSystem
hybrid-overlay-node logs
I1105 01:41:32.000965 1660 certificate_manager.go:422] "Certificate rotation is enabled" logger="kubernetes.io/kube-apiserver-client"
I1105 01:41:32.000965 1660 kube.go:426] Certificate found
I1105 01:41:32.000965 1660 certificate_manager.go:715] "Certificate rotation deadline determined" logger="kubernetes.io/kube-apiserver-client" expiration="2025-11-06 04:50:54 +0000 UTC" deadline="2025-11-06 01:24:15.547806536 +0000 UTC"
I1105 01:41:32.000965 1660 certificate_manager.go:431] "Waiting for next certificate rotation" logger="kubernetes.io/kube-apiserver-client" sleep="15h42m43.546840836s"
--
--
I1105 17:24:16.330619 1660 certificate_manager.go:566] "Rotating certificates" logger="kubernetes.io/kube-apiserver-client"
E1105 17:24:16.341965 1660 certificate_manager.go:596] "Failed while requesting a signed certificate from the control plane" err="cannot create certificate signing request: Post \"http://localhost:8443/apis/certificates.k8s.io/v1/certificatesigningrequests\": dial tcp [::1]:8443: connectex: No connection could be made because the target machine actively refused it." logger="kubernetes.io/kube-apiserver-client.UnhandledError"
E1105 17:24:16.341965 1660 certificate_manager.go:596] "Failed while requesting a signed certificate from the control plane" err="cannot create certificate signing request: Post \"http://localhost:8443/apis/certificates.k8s.io/v1/certificatesigningrequests\": dial tcp [::1]:8443: connectex: No connection could be made because the target machine actively refused it." logger="kubernetes.io/kube-apiserver-client.UnhandledError"
--
--
I1105 20:50:59.533449 1660 certificate_manager.go:387] "Current certificate is expired" logger="kubernetes.io/kube-apiserver-client"
E1105 20:50:59.533449 1660 kube.go:437] The current certificate is invalid, exiting.
E1105 20:50:59.533449 1660 kube.go:437] The current certificate is invalid, exiting.
E1105 20:50:59.533449 1660 kube.go:437] The current certificate is invalid, exiting.
Workaround
It's a temporary workaround until we address this issue
[Environment]::SetEnvironmentVariable('KUBECONFIG','C:\k\kubeconfig','Machine')
After setting the above environment variable , no more restarts were seen
Version-Release number of selected component (if applicable):
OCP 4.20.2
WMCO 10.20
How reproducible:
Always
Steps to Reproduce:
1. Add a windows node to OCP cluster
2. Wait for 24 hours and check hybrid-overlay-node logs for errors
3.
Actual results:
Certificate rotation fails and hybrid-overlay-node service restarts
Expected results:
Certificate rotation should be successful avoiding hybrid-overlay-node service restarts
Additional info:
Application outages are seen which coincides with the hybrid-overlay-node service restarts
- is blocked by
-
OCPBUGS-65856 hybrid-overlay-node fails to Certificate Rotation due to invalid APIServer when BootstrapKubeconfig option is provided
-
- Closed
-
- is duplicated by
-
OCPBUGS-65856 hybrid-overlay-node fails to Certificate Rotation due to invalid APIServer when BootstrapKubeconfig option is provided
-
- Closed
-
- links to