- Bug
- Resolution: Done
- Critical
- 4.13.0
- WINC - Sprint 234, WINC - Sprint 235
This is a clone of issue OCPBUGS-10709. The following is the description of the original issue:
—
This is a clone of issue OCPBUGS-10572. The following is the description of the original issue:
—
Description of problem:
Upgrading BYOH nodes on GCP from WMCO 7.0.1 GA to the latest 8.0.0 build fails: the nodes remain in NotReady,SchedulingDisabled, the kubelet version is not upgraded, and pods are stuck in Pending state. The same upgrade succeeds on nodes created by a MachineSet.
Version-Release number of selected component (if applicable):
4.13.0-0.nightly-2023-03-11-033820, from windows-services-7.0.1-bc9473b to windows-services-8.0.0-0371c56
How reproducible:
100%
Steps to Reproduce:
1. Install WMCO 7.0.1 GA on a 4.12 GCP cluster.
2. Configure a VM running Windows Server 2022 Datacenter.
3. Install SSH on that node.
4. Set the username and IP address in the windows-instances ConfigMap.
5. Wait until WMCO has configured the node and it is in Ready status.
6. Upgrade the cluster to the latest 8.0.0 version.
7. After the cluster has upgraded, modify wmco_index to use the latest 8.0.0 build.
8. Check the CSV to confirm that the version upgraded to the latest 8.0.0 version.
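For step 4, a minimal sketch of generating the windows-instances ConfigMap may help. It assumes the entry format described in the WMCO BYOH documentation (key is the instance address, value is `username=<name>`); the helper name `windows_instances_configmap` and the username `Administrator` are hypothetical, while the address 10.0.128.12 is taken from the WMCO log in this report.

```python
import json


def windows_instances_configmap(instances: dict[str, str]) -> dict:
    """Build a windows-instances ConfigMap manifest.

    `instances` maps an instance address (IP or DNS name) to its SSH username.
    """
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": "windows-instances",
            "namespace": "openshift-windows-machine-config-operator",
        },
        # Assumed entry format: key = address, value = "username=<name>".
        "data": {addr: f"username={user}" for addr, user in instances.items()},
    }


if __name__ == "__main__":
    # "Administrator" is a placeholder username, not taken from the report.
    manifest = windows_instances_configmap({"10.0.128.12": "Administrator"})
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be piped to `oc apply -f -`, since the API accepts JSON manifests as well as YAML.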
Actual results:
Nodes are not ready:
NAME STATUS ROLES AGE VERSION
byoh-winc-0.c.openshift-qe.internal NotReady,SchedulingDisabled worker 115m v1.25.0-2653+a34b9e9499e6c3
byoh-winc-1.c.openshift-qe.internal NotReady,SchedulingDisabled worker 122m v1.25.0-2653+a34b9e9499e6c3
Pods are in Pending status:
win-webserver-745df6565f-dlnl8 0/1 Pending 0 18m <none> <none> <none> <none>
win-webserver-745df6565f-dm7mx 0/1 Pending 0 18m <none> <none> <none> <none>
Nodes are not upgraded.
WMCO log:
{"level":"info","ts":"2023-03-15T09:30:03Z","logger":"wc 10.0.128.12","msg":"deconfiguring"}
{"level":"error","ts":"2023-03-15T09:30:04Z","logger":"wc 10.0.128.12","msg":"error running","cmd":"powershell.exe -NonInteractive -ExecutionPolicy Bypass \"C:\\k\\windows-instance-config-daemon.exe cleanup --api-server https://api-int.rrasouli-113.qe.gcp.devcluster.openshift.com:6443 --sa-ca C:\\k\\sa-ca.crt --sa-token C:\\k\\sa-token --namespace openshift-windows-machine-config-operator\"","out":"I0315 09:30:04.393643 6148 cleanup.go:106] removed services: [\"kube-proxy\" \"hybrid-overlay-node\" \"windows_exporter\" \"kubelet\"]\nI0315 09:30:04.393745 6148 cleanup.go:88] deleting node byoh-winc-1.c.openshift-qe.internal\nF0315 09:30:04.396410 6148 cleanup.go:56] nodes \"byoh-winc-1.c.openshift-qe.internal\" is forbidden: User \"system:serviceaccount:openshift-windows-machine-config-operator:windows-instance-config-daemon\" cannot delete resource \"nodes\" in API group \"\" at the cluster scope\n","error":"Process exited with status 
1","stacktrace":"github.com/openshift/windows-machine-config-operator/pkg/windows.(*windows).Run\n\t/remote-source/build/windows-machine-config-operator/pkg/windows/windows.go:383\ngithub.com/openshift/windows-machine-config-operator/pkg/windows.(*windows).RunWICDCleanup\n\t/remote-source/build/windows-machine-config-operator/pkg/windows/windows.go:401\ngithub.com/openshift/windows-machine-config-operator/pkg/windows.(*windows).Deconfigure\n\t/remote-source/build/windows-machine-config-operator/pkg/windows/windows.go:414\ngithub.com/openshift/windows-machine-config-operator/pkg/nodeconfig.(*nodeConfig).Deconfigure\n\t/remote-source/build/windows-machine-config-operator/pkg/nodeconfig/nodeconfig.go:442\ngithub.com/openshift/windows-machine-config-operator/controllers.(*instanceReconciler).ensureInstanceIsUpToDate\n\t/remote-source/build/windows-machine-config-operator/controllers/controllers.go:81\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).ensureInstancesAreUpToDate\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:315\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).reconcileNodes\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:280\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).Reconcile\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:190\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal
/controller.(*Controller).processNextWorkItem\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235"} {"level":"info","ts":"2023-03-15T09:30:04Z","logger":"wc 10.0.128.12","msg":"failed to cleanup node","command":"C:\\k\\windows-instance-config-daemon.exe cleanup --api-server https://api-int.rrasouli-113.qe.gcp.devcluster.openshift.com:6443 --sa-ca C:\\k\\sa-ca.crt --sa-token C:\\k\\sa-token --namespace openshift-windows-machine-config-operator","output":"I0315 09:30:04.393643 6148 cleanup.go:106] removed services: [\"kube-proxy\" \"hybrid-overlay-node\" \"windows_exporter\" \"kubelet\"]\nI0315 09:30:04.393745 6148 cleanup.go:88] deleting node byoh-winc-1.c.openshift-qe.internal\nF0315 09:30:04.396410 6148 cleanup.go:56] nodes \"byoh-winc-1.c.openshift-qe.internal\" is forbidden: User \"system:serviceaccount:openshift-windows-machine-config-operator:windows-instance-config-daemon\" cannot delete resource \"nodes\" in API group \"\" at the cluster scope\n"} {"level":"error","ts":"2023-03-15T09:30:04Z","msg":"Reconciler error","controller":"configmap","controllerGroup":"","controllerKind":"ConfigMap","ConfigMap":{"name":"windows-instances","namespace":"openshift-windows-machine-config-operator"},"namespace":"openshift-windows-machine-config-operator","name":"windows-instances","reconcileID":"621823e1-bfff-4459-b174-6755821f2085","error":"error configuring host with address 10.0.128.12: error deconfiguring instance: Unable to cleanup the Windows instance: error running powershell.exe -NonInteractive -ExecutionPolicy Bypass \"C:\\k\\windows-instance-config-daemon.exe cleanup --api-server https://api-int.rrasouli-113.qe.gcp.devcluster.openshift.com:6443 --sa-ca 
C:\\k\\sa-ca.crt --sa-token C:\\k\\sa-token --namespace openshift-windows-machine-config-operator\": Process exited with status 1","errorVerbose":"Process exited with status 1\nerror running powershell.exe -NonInteractive -ExecutionPolicy Bypass \"C:\\k\\windows-instance-config-daemon.exe cleanup --api-server https://api-int.rrasouli-113.qe.gcp.devcluster.openshift.com:6443 --sa-ca C:\\k\\sa-ca.crt --sa-token C:\\k\\sa-token --namespace openshift-windows-machine-config-operator\"\ngithub.com/openshift/windows-machine-config-operator/pkg/windows.(*windows).Run\n\t/remote-source/build/windows-machine-config-operator/pkg/windows/windows.go:385\ngithub.com/openshift/windows-machine-config-operator/pkg/windows.(*windows).RunWICDCleanup\n\t/remote-source/build/windows-machine-config-operator/pkg/windows/windows.go:401\ngithub.com/openshift/windows-machine-config-operator/pkg/windows.(*windows).Deconfigure\n\t/remote-source/build/windows-machine-config-operator/pkg/windows/windows.go:414\ngithub.com/openshift/windows-machine-config-operator/pkg/nodeconfig.(*nodeConfig).Deconfigure\n\t/remote-source/build/windows-machine-config-operator/pkg/nodeconfig/nodeconfig.go:442\ngithub.com/openshift/windows-machine-config-operator/controllers.(*instanceReconciler).ensureInstanceIsUpToDate\n\t/remote-source/build/windows-machine-config-operator/controllers/controllers.go:81\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).ensureInstancesAreUpToDate\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:315\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).reconcileNodes\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:280\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).Reconcile\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:190\nsigs.k8s.io/c
ontroller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1594\nUnable to cleanup the Windows instance\ngithub.com/openshift/windows-machine-config-operator/pkg/windows.(*windows).Deconfigure\n\t/remote-source/build/windows-machine-config-operator/pkg/windows/windows.go:415\ngithub.com/openshift/windows-machine-config-operator/pkg/nodeconfig.(*nodeConfig).Deconfigure\n\t/remote-source/build/windows-machine-config-operator/pkg/nodeconfig/nodeconfig.go:442\ngithub.com/openshift/windows-machine-config-operator/controllers.(*instanceReconciler).ensureInstanceIsUpToDate\n\t/remote-source/build/windows-machine-config-operator/controllers/controllers.go:81\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).ensureInstancesAreUpToDate\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:315\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).reconcileNodes\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:280\ngithub.com/openshift/windows-machine-config-operator/cont
rollers.(*ConfigMapReconciler).Reconcile\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:190\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1594\nerror deconfiguring 
instance\ngithub.com/openshift/windows-machine-config-operator/pkg/nodeconfig.(*nodeConfig).Deconfigure\n\t/remote-source/build/windows-machine-config-operator/pkg/nodeconfig/nodeconfig.go:443\ngithub.com/openshift/windows-machine-config-operator/controllers.(*instanceReconciler).ensureInstanceIsUpToDate\n\t/remote-source/build/windows-machine-config-operator/controllers/controllers.go:81\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).ensureInstancesAreUpToDate\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:315\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).reconcileNodes\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:280\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).Reconcile\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:190\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1594\nerror configuring host with address 
10.0.128.12\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).ensureInstancesAreUpToDate\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:322\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).reconcileNodes\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:280\ngithub.com/openshift/windows-machine-config-operator/controllers.(*ConfigMapReconciler).Reconcile\n\t/remote-source/build/windows-machine-config-operator/controllers/configmap_controller.go:190\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1594","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/interna
l/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235"}
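The key failure in the log above is the RBAC denial buried inside the stack traces. As a rough sketch, the denial could be pulled out of a structured WMCO log record like this; the function name `find_rbac_denial` and the regular expression are illustrative assumptions, and the sample record is a shortened copy of the error entry from the log.

```python
import json
import re

# Matches the Kubernetes "is forbidden" message seen in the cleanup output.
FORBIDDEN = re.compile(
    r'(?P<kind>\w+) "(?P<name>[^"]+)" is forbidden: '
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) '
    r'resource "(?P<resource>[^"]+)" in API group "(?P<group>[^"]*)"'
)

# Shortened copy of the error record from the WMCO log in this report.
SAMPLE = json.dumps({
    "level": "error",
    "msg": "error running",
    "out": 'F0315 09:30:04.396410 6148 cleanup.go:56] nodes '
           '"byoh-winc-1.c.openshift-qe.internal" is forbidden: User '
           '"system:serviceaccount:openshift-windows-machine-config-operator:'
           'windows-instance-config-daemon" cannot delete resource "nodes" '
           'in API group "" at the cluster scope\n',
    "error": "Process exited with status 1",
})


def find_rbac_denial(log_line: str):
    """Return the parsed RBAC denial from one JSON log record, or None."""
    entry = json.loads(log_line)
    # The denial text appears under "out" or "output" depending on the record.
    for field in ("out", "output", "error"):
        match = FORBIDDEN.search(str(entry.get(field, "")))
        if match:
            return match.groupdict()
    return None


if __name__ == "__main__":
    print(find_rbac_denial(SAMPLE))
```

For the sample record this reports that the windows-instance-config-daemon service account was denied the delete verb on nodes, which is the permission the cleanup command needed.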
Expected results:
The BYOH node upgrade succeeds: all nodes run the latest kubelet version and are in Ready state, and workloads are running and scheduled to the correct Windows workers.
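One way to check the expected results against a node listing is sketched below. It assumes the five-column layout shown under "Actual results" (NAME, STATUS, ROLES, AGE, VERSION); the function name `failing_nodes` is hypothetical, and the stale version string is the pre-upgrade kubelet version from this report.

```python
# Node listing copied from the "Actual results" section of this report.
LISTING = """\
NAME STATUS ROLES AGE VERSION
byoh-winc-0.c.openshift-qe.internal NotReady,SchedulingDisabled worker 115m v1.25.0-2653+a34b9e9499e6c3
byoh-winc-1.c.openshift-qe.internal NotReady,SchedulingDisabled worker 122m v1.25.0-2653+a34b9e9499e6c3
"""


def failing_nodes(listing: str, stale_version: str) -> dict[str, list[str]]:
    """Map each unhealthy node name to the reasons it fails the check."""
    problems: dict[str, list[str]] = {}
    for line in listing.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a five-column node row.
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        name, status, _roles, _age, version = fields
        reasons = []
        if status != "Ready":
            reasons.append(f"status is {status}")
        if version == stale_version:
            reasons.append(f"kubelet still at {version}")
        if reasons:
            problems[name] = reasons
    return problems


if __name__ == "__main__":
    for node, reasons in failing_nodes(LISTING, "v1.25.0-2653+a34b9e9499e6c3").items():
        print(node, "->", "; ".join(reasons))
```

An empty result would mean every listed node is Ready and has moved off the pre-upgrade kubelet version; it does not verify pod scheduling, which would need a separate check.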
Additional info:
- clones
- OCPBUGS-10709 BYOH upgrade failed Unable to cleanup the Windows instance: error running powershell.exe -NonInteractive -ExecutionPolicy Bypass \"C:\\k\\windows-instance-config-daemon.exe cleanup -
- Closed
- is blocked by
- OCPBUGS-10709 BYOH upgrade failed Unable to cleanup the Windows instance: error running powershell.exe -NonInteractive -ExecutionPolicy Bypass \"C:\\k\\windows-instance-config-daemon.exe cleanup -
- Closed