-
Bug
-
Resolution: Done
-
Major
-
None
-
4.15.0
-
Critical
-
None
-
False
-
Description of problem:
On vSphere, Nutanix, and UPI-AWS, after a seamless WMCO upgrade from 10.15.2 to 10.15.3, a Windows worker node gets stuck in "Ready,SchedulingDisabled", and WMCO reports the error: "expected 1 secret for SA 'windows-instance-config-daemon', found 2"
Version-Release number of selected component (if applicable):
10.15.3-eda454b 4.15.0-0.nightly-2024-08-13-134622
How reproducible:
100% on vSphere, Nutanix, and UPI-AWS. So far 0% on other platforms (AWS, Azure, GCP, node proxy).
Steps to Reproduce:
1. Install the latest WMCO 10.15.2.
2. Perform a seamless upgrade to 10.15.3 (create a CatalogSource with IIB:786342, uninstall the old WMCO, install the new one).
3. Wait for the nodes to become Ready (they will not).
Actual results:
> oc get no
NAME                                STATUS                     ROLES                  AGE     VERSION
weinliu-2574-wkxrp-master-0         Ready                      control-plane,master   4h35m   v1.28.12+396c881
weinliu-2574-wkxrp-master-1         Ready                      control-plane,master   4h35m   v1.28.12+396c881
weinliu-2574-wkxrp-master-2         Ready                      control-plane,master   4h34m   v1.28.12+396c881
weinliu-2574-wkxrp-worker-0-mzgp5   Ready                      worker                 4h23m   v1.28.12+396c881
weinliu-2574-wkxrp-worker-0-pnh92   Ready                      worker                 4h23m   v1.28.12+396c881
winworker-499qm                     Ready,SchedulingDisabled   worker                 3h42m   v1.28.8+8974577
winworker-4dtt2                     Ready                      worker                 3h49m   v1.28.8+8974577
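When re-checking after an upgrade attempt, the stuck node can be picked out of the `oc get no` text automatically. The following is a minimal sketch, not part of WMCO or `oc`; the function name `nodes_with_scheduling_disabled` is hypothetical, and it assumes the default column layout (NAME first, STATUS second) shown in the output above.

```python
def nodes_with_scheduling_disabled(oc_get_nodes_output):
    """Return names of nodes whose STATUS column contains SchedulingDisabled.

    Assumes the default `oc get no` layout: a header row, then one node per
    line with NAME in the first column and STATUS in the second.
    """
    stuck = []
    lines = oc_get_nodes_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:
            continue  # ignore blank or malformed lines
        name, status = fields[0], fields[1]
        # STATUS is a comma-separated list such as "Ready,SchedulingDisabled"
        if "SchedulingDisabled" in status.split(","):
            stuck.append(name)
    return stuck


if __name__ == "__main__":
    sample = """NAME STATUS ROLES AGE VERSION
weinliu-2574-wkxrp-master-0 Ready control-plane,master 4h35m v1.28.12+396c881
winworker-499qm Ready,SchedulingDisabled worker 3h42m v1.28.8+8974577
winworker-4dtt2 Ready worker 3h49m v1.28.8+8974577
"""
    print(nodes_with_scheduling_disabled(sample))
```

With the two Windows nodes from this report, only winworker-499qm is reported, which matches the node carrying the `windowsmachineconfig.openshift.io/upgrading=true` label.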
Expected results:
The Windows nodes become Ready after the upgrade.
Additional info:
WMCO:
ed 1 secret for SA 'windows-instance-config-daemon', found 2","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
{"level":"info","ts":"2024-08-14T08:52:38Z","logger":"controller.windowsmachine","msg":"processing","windowsmachine":{"name":"winworker-4dtt2","namespace":"openshift-machine-api"},"address":"192.168.221.213"}
{"level":"info","ts":"2024-08-14T08:52:40Z","logger":"controller.windowsmachine","msg":"instance requires upgrade","node":"winworker-4dtt2","version":"10.15.2-446d3ec","expected version":"10.15.3-eda454b"}
{"level":"error","ts":"2024-08-14T08:52:40Z","msg":"Reconciler error","controller":"machine","controllerGroup":"machine.openshift.io","controllerKind":"Machine","Machine":{"name":"winworker-4dtt2","namespace":"openshift-machine-api"},"namespace":"openshift-machine-api","name":"winworker-4dtt2","reconcileID":"a17e964c-0570-4b30-a5af-56af4ce152ad","error":"unable to configure instance 423d304a-3446-5023-2160-6306b5f6c887: cannot mark node winworker-4dtt2 as upgrading, maximum number of parallel upgrading nodes reached (1)","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
{"level":"info","ts":"2024-08-14T08:52:41Z","logger":"controller.windowsmachine","msg":"processing","windowsmachine":{"name":"winworker-499qm","namespace":"openshift-machine-api"},"address":"192.168.221.215"}
{"level":"info","ts":"2024-08-14T08:52:43Z","logger":"controller.windowsmachine","msg":"instance requires upgrade","node":"winworker-499qm","version":"10.15.2-446d3ec","expected version":"10.15.3-eda454b"}
{"level":"info","ts":"2024-08-14T08:52:43Z","logger":"nc 192.168.221.215","msg":"deconfiguring"}
{"level":"error","ts":"2024-08-14T08:52:43Z","msg":"Reconciler error","controller":"machine","controllerGroup":"machine.openshift.io","controllerKind":"Machine","Machine":{"name":"winworker-499qm","namespace":"openshift-machine-api"},"namespace":"openshift-machine-api","name":"winworker-499qm","reconcileID":"2444b95f-7dda-4375-a853-af4c7cfd4d43","error":"unable to configure instance 423d11cc-fca4-c7ca-911d-a31ecf8b3e8d: expected 1 secret for SA 'windows-instance-config-daemon', found 2","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
{"level":"info","ts":"2024-08-14T08:54:02Z","logger":"controller.windowsmachine","msg":"processing","windowsmachine":{"name":"winworker-499qm","namespace":"openshift-machine-api"},"address":"192.168.221.215"}
{"level":"info","ts":"2024-08-14T08:54:04Z","logger":"controller.windowsmachine","msg":"instance requires upgrade","node":"winworker-499qm","version":"10.15.2-446d3ec","expected version":"10.15.3-eda454b"}
{"level":"info","ts":"2024-08-14T08:54:04Z","logger":"nc 192.168.221.215","msg":"deconfiguring"}
{"level":"error","ts":"2024-08-14T08:54:04Z","msg":"Reconciler error","controller":"machine","controllerGroup":"machine.openshift.io","controllerKind":"Machine","Machine":{"name":"winworker-499qm","namespace":"openshift-machine-api"},"namespace":"openshift-machine-api","name":"winworker-499qm","reconcileID":"e692966c-f55a-4d7d-95f8-bfaacc0152d3","error":"unable to configure instance 423d11cc-fca4-c7ca-911d-a31ecf8b3e8d: expected 1 secret for SA 'windows-instance-config-daemon', found 2","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
{"level":"info","ts":"2024-08-14T08:54:07Z","logger":"controller.windowsmachine","msg":"processing","windowsmachine":{"name":"winworker-4dtt2","namespace":"openshift-machine-api"},"address":"192.168.221.213"}
{"level":"info","ts":"2024-08-14T08:54:09Z","logger":"controller.windowsmachine","msg":"instance requires upgrade","node":"winworker-4dtt2","version":"10.15.2-446d3ec","expected version":"10.15.3-eda454b"}
{"level":"error","ts":"2024-08-14T08:54:09Z","msg":"Reconciler error","controller":"machine","controllerGroup":"machine.openshift.io","controllerKind":"Machine","Machine":{"name":"winworker-4dtt2","namespace":"openshift-machine-api"},"namespace":"openshift-machine-api","name":"winworker-4dtt2","reconcileID":"690ee17d-33d4-4dd2-80d0-e59cb3b1864c","error":"unable to configure instance 423d304a-3446-5023-2160-6306b5f6c887: cannot mark node winworker-4dtt2 as upgrading, maximum number of parallel upgrading nodes reached (1)","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}

> oc get cm -n openshift-windows-machine-config-operator
NAME                                   DATA   AGE
kube-root-ca.crt                       1      4h4m
openshift-service-ca.crt               1      4h4m
windows-machine-config-operator-lock   0      92m
windows-services-10.15.2-446d3ec       3      4h
windows-services-10.15.3-eda454b       3      92m

> oc describe node winworker-499qm
Name:               winworker-499qm
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-win10server
                    beta.kubernetes.io/os=windows
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=winworker-499qm
                    kubernetes.io/os=windows
                    node-role.kubernetes.io/worker=
                    node.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-win10server
                    node.kubernetes.io/windows-build=10.0.20348
                    node.openshift.io/os_id=Windows
                    windowsmachineconfig.openshift.io/upgrading=true
Annotations:        k8s.ovn.org/hybrid-overlay-distributed-router-gateway-mac: 00-15-5D-F9-53-86
                    k8s.ovn.org/hybrid-overlay-node-subnet: 10.132.1.0/24
                    machine.openshift.io/machine: openshift-machine-api/winworker-499qm
                    volumes.kubernetes.io/controller-managed-attach-detach: true
                    windowsmachineconfig.openshift.io/desired-version: 10.15.2-446d3ec
                    windowsmachineconfig.openshift.io/pub-key-hash: 6b5dbed399c508c1d0edcf1432bce445d9cb3f3c8832f14bd8b5e37c329ccfc5
                    windowsmachineconfig.openshift.io/username: wx4EBwMIyerJ9nJtTuNgaVWe72Ijsupw2Io98hiCmIPS5gFLwRlZNvPsN2xmn95hqv8jZIx9ISpWXspaAuhw6zRwZKqXdVtB3qgCQOqdoAe/SQMQ4yovwj63sGV/2v...
                    windowsmachineconfig.openshift.io/version: 10.15.2-446d3ec
CreationTimestamp:  Wed, 14 Aug 2024 13:15:30 +0800
Taints:             node.kubernetes.io/unschedulable:NoSchedule
                    os=Windows:NoSchedule
Unschedulable:      true
Lease:
  HolderIdentity:  winworker-499qm
  AcquireTime:
  RenewTime:       Wed, 14 Aug 2024 17:02:10 +0800
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Wed, 14 Aug 2024 16:57:47 +0800   Wed, 14 Aug 2024 13:15:30 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Wed, 14 Aug 2024 16:57:47 +0800   Wed, 14 Aug 2024 13:15:30 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Wed, 14 Aug 2024 16:57:47 +0800   Wed, 14 Aug 2024 13:15:30 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Wed, 14 Aug 2024 16:57:47 +0800   Wed, 14 Aug 2024 13:16:31 +0800   KubeletReady                 kubelet is posting ready status
Addresses:
  Hostname:    winworker-499qm
  InternalIP:  192.168.221.215
  ExternalIP:  192.168.221.215
Capacity:
  cpu:                4
  ephemeral-storage:  62400508Ki
  memory:             16776692Ki
  pods:               250
Allocatable:
  cpu:                3500m
  ephemeral-storage:  56434566254
  memory:             15625716Ki
  pods:               250
System Info:
  Machine ID:                 winworker-499qm
  System UUID:                CC113D42-A4FC-CAC7-911D-A31ECF8B3E8D
  Boot ID:                    8
  Kernel Version:             10.0.20348.2461
  OS Image:                   Windows Server 2022 Standard
  Operating System:           windows
  Architecture:               amd64
  Container Runtime Version:  containerd://1.7.9
  Kubelet Version:            v1.28.8+8974577
  Kube-Proxy Version:         v1.28.8+8974577
ProviderID:                   vsphere://423d11cc-fca4-c7ca-911d-a31ecf8b3e8d
Non-terminated Pods:          (0 in total)
  Namespace  Name  CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------  ----  ------------  ----------  ---------------  -------------  ---
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests  Limits
  --------           --------  ------
  cpu                0 (0%)    0 (0%)
  memory             0 (0%)    0 (0%)
  ephemeral-storage  0 (0%)    0 (0%)
Events:
  Type    Reason              Age                  From     Message
  ----    ------              ----                 ----     -------
  Normal  NodeNotSchedulable  91m (x2 over 3h46m)  kubelet  Node winworker-499qm status is now: NodeNotSchedulable

> oc get secrets -n openshift-machine-api
NAME   TYPE   DATA   AGE
builder-dockercfg-245qb   kubernetes.io/dockercfg   1   3h43m
builder-token-gf8s8   kubernetes.io/service-account-token   4   3h43m
cluster-autoscaler-dockercfg-442cm   kubernetes.io/dockercfg   1   3h43m
cluster-autoscaler-operator-cert   kubernetes.io/tls   2   4h36m
cluster-autoscaler-operator-dockercfg-6gk6r   kubernetes.io/dockercfg   1   3h43m
cluster-autoscaler-operator-token-whwp6   kubernetes.io/service-account-token   4   3h43m
cluster-autoscaler-token-szxhr   kubernetes.io/service-account-token   4   3h43m
cluster-baremetal-operator-dockercfg-94xqj   kubernetes.io/dockercfg   1   3h43m
cluster-baremetal-operator-tls   kubernetes.io/tls   2   4h36m
cluster-baremetal-operator-token-8gvlv   kubernetes.io/service-account-token   4   3h43m
cluster-baremetal-webhook-server-cert   kubernetes.io/tls   2   4h36m
control-plane-machine-set-operator-dockercfg-85j59   kubernetes.io/dockercfg   1   3h43m
control-plane-machine-set-operator-tls   kubernetes.io/tls   2   4h36m
control-plane-machine-set-operator-token-qmtxp   kubernetes.io/service-account-token   4   3h43m
default-dockercfg-s6b9l   kubernetes.io/dockercfg   1   3h43m
default-token-fsncf   kubernetes.io/service-account-token   4   3h43m
deployer-dockercfg-6lfxd   kubernetes.io/dockercfg   1   3h43m
deployer-token-gjrpb   kubernetes.io/service-account-token   4   3h43m
machine-api-controllers-dockercfg-9fkfl   kubernetes.io/dockercfg   1   3h43m
machine-api-controllers-tls   kubernetes.io/tls   2   4h36m
machine-api-controllers-token-qkpxb   kubernetes.io/service-account-token   4   3h43m
machine-api-operator-dockercfg-749m7   kubernetes.io/dockercfg   1   3h43m
machine-api-operator-machine-webhook-cert   kubernetes.io/tls   2   4h36m
machine-api-operator-tls   kubernetes.io/tls   2   4h36m
machine-api-operator-token-w5fjp   kubernetes.io/service-account-token   4   3h43m
machine-api-operator-webhook-cert   kubernetes.io/tls   2   4h36m
machine-api-termination-handler-dockercfg-9b96c   kubernetes.io/dockercfg   1   3h43m
machine-api-termination-handler-token-fq4k2   kubernetes.io/service-account-token   4   3h43m
master-user-data   Opaque   2   4h41m
master-user-data-managed   Opaque   2   4h34m
vsphere-cloud-credentials   Opaque   2   4h40m
windows-user-data   Opaque   1   4h1m
worker-user-data   Opaque   2   4h41m
worker-user-data-managed   Opaque   2   4h34m

> oc get serviceaccount -n openshift-windows-machine-config-operator windows-instance-config-daemon -oyaml
apiVersion: v1
imagePullSecrets:
- name: windows-instance-config-daemon-dockercfg-vlm2b
kind: ServiceAccount
metadata:
  creationTimestamp: "2024-08-14T05:01:03Z"
  labels:
    olm.managed: "true"
  name: windows-instance-config-daemon
  namespace: openshift-windows-machine-config-operator
  resourceVersion: "102492"
  uid: 111f5095-9dc0-4db6-8c64-c511da8a1ee5
secrets:
- name: windows-instance-config-daemon-dockercfg-vlm2b
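The WMCO log above interleaves two different reconciler errors across the two Windows machines, which is hard to see in raw form. The following is a rough sketch for summarizing them, assuming the operator log has been saved as text with one JSON object per line; the function name `summarize_reconciler_errors` is hypothetical and not part of WMCO. It relies only on the fields visible in the log entries above (`level`, `msg`, `Machine.name`, `error`).

```python
import json
from collections import Counter


def summarize_reconciler_errors(log_text):
    """Count "Reconciler error" entries per (machine name, reason).

    Lines that are not complete JSON objects (for example a truncated
    first line) are skipped rather than treated as failures.
    """
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if entry.get("level") != "error" or entry.get("msg") != "Reconciler error":
            continue
        machine = entry.get("Machine", {}).get("name", "unknown")
        # Drop the leading "unable to configure instance <id>: " part, if any,
        # so that identical causes group together.
        reason = entry.get("error", "").split(": ", 1)[-1]
        counts[(machine, reason)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (file name is an example): python summarize.py < wmco.log
    for (machine, reason), n in summarize_reconciler_errors(sys.stdin.read()).items():
        print(f"{n}x {machine}: {reason}")
```

Applied to the entries in this report, the summary separates winworker-499qm, which fails on the secret count, from winworker-4dtt2, which is only waiting because the parallel upgrade limit of 1 is already taken by winworker-499qm.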
- duplicates
-
OCPBUGS-38485 vSphere machines are getting into provisioned status "expected 1 secret for SA 'windows-instance-config-daemon', found 2""
- Closed