Details
-
Bug
-
Resolution: Not a Bug
-
Major
-
None
-
4.14
-
None
Description
Description of problem:
After upgrading a csi-proxy PVC cluster, Windows PVC workloads are stuck in ContainerCreating.

From the workload status:

  Warning  FailedAttachVolume  16s  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-110fb11f-1d66-4a9f-a85e-e1c31143532b" : CSINode winworker-tf8tp does not contain driver csi.vsphere.vmware.com

  $ oc describe node winworker-tf8tp | grep csi
  windowsmachineconfig.openshift.io/configured-with-csi=true
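The attach error says the CSINode object for the node has no entry for csi.vsphere.vmware.com. A minimal sketch for confirming that from a shell, assuming `oc` is logged in to the affected cluster. The helper name `has_driver` is hypothetical and exists only for this illustration; the node and driver names are taken from the report.

```shell
#!/bin/sh
# has_driver (hypothetical helper): succeeds when the driver name ($2)
# appears as an exact entry in the space-separated driver list ($1).
has_driver() {
    for d in $1; do
        [ "$d" = "$2" ] && return 0
    done
    return 1
}

# Usage against a live cluster (assumes a logged-in oc session):
#   drivers=$(oc get csinode winworker-tf8tp -o jsonpath='{.spec.drivers[*].name}')
#   has_driver "$drivers" csi.vsphere.vmware.com || echo "driver not registered"
```

An empty driver list here would match the FailedAttachVolume event above, even though the node carries the configured-with-csi=true label.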
Version-Release number of selected component (if applicable):
9.0.1-9038172
How reproducible:
Steps to Reproduce:
1. Install a vSphere cluster with 2 machineset nodes and 2 BYOH nodes.

2. Install the CSI driver:
   https://raw.githubusercontent.com/openshift/windows-machine-config-operator/master/hack/manifests/csi/vsphere/01-example-driver-daemonset.yaml

3. Install the storage class:

   $ cat storageclass.yaml
   apiVersion: storage.k8s.io/v1
   kind: StorageClass
   metadata:
     name: intree
   provisioner: kubernetes.io/vsphere-volume
   parameters:
     fstype: ntfs

   $ oc create -f storageclass.yaml

4. Install the PVC:

   $ cat pvc.yaml
   kind: PersistentVolumeClaim
   apiVersion: v1
   metadata:
     name: winc-pvc
     annotations:
       volume.beta.kubernetes.io/storage-class: intree
   spec:
     accessModes:
       - ReadWriteOnce
     resources:
       requests:
         storage: 500Mi

   $ oc create -f pvc.yaml -n winc-test

5. Create the Windows web server workload:

   $ cat WinWebServerPvc.yaml
   apiVersion: v1
   kind: Service
   metadata:
     name: win-webserver-pvc
     labels:
       app: win-webserver-pvc
   spec:
     ports:
       # the port that this service should serve on
       - port: 80
         targetPort: 80
     selector:
       app: win-webserver-pvc
     type: LoadBalancer
   ---
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     labels:
       app: win-webserver-pvc
     name: win-webserver-pvc
   spec:
     selector:
       matchLabels:
         app: win-webserver-pvc
     replicas: 3
     template:
       metadata:
         labels:
           app: win-webserver-pvc
         name: win-webserver-pvc
       spec:
         tolerations:
           - key: "os"
             value: "Windows"
             Effect: "NoSchedule"
         volumes:
           - name: test-volume
             persistentVolumeClaim:
               claimName: winc-pvc
         containers:
           - name: windowswebserver
             image: mcr.microsoft.com/powershell:lts-nanoserver-ltsc2022
             imagePullPolicy: IfNotPresent
             volumeMounts:
               - mountPath: C:/html/
                 name: test-volume
             securityContext:
               runAsNonRoot: false
               windowsOptions:
                 runAsUserName: "ContainerAdministrator"
             command:
               - pwsh.exe
               - -command
               - echo "<html><body><H1>Windows Container Web Server</H1></body></html>" > C:/html/index.html;$listener = New-Object System.Net.HttpListener; $listener.Prefixes.Add('http://*:80/'); $listener.Start();Write-Host('Listening at http://*:80/'); while ($listener.IsListening) { $context = $listener.GetContext(); $response = $context.Response; $content=Get-Content C:/html/index.html -Raw; $buffer = [System.Text.Encoding]::UTF8.GetBytes($content); $response.ContentLength64 = $buffer.Length; $response.OutputStream.Write($buffer, 0, $buffer.Length); $response.Close(); };
         # Hack to force all the pods from the deployment to land in the
         # same node. This is because the in-tree vsphere volume is
         # ReadWriteOnce, which means that the volume can land only in a
         # single node. Pods landing on a different node won't have access
         # to the persistent storage.
         nodeName: winworker-j72d6

   $ oc create -f WinWebServerPvc.yaml -n winc-test

6. Upgrade from version 8.1.1 to version 9.0.1.
Actual results:
Workloads are stuck in ContainerCreating:

  pod/win-webserver-pvc-5fb9c7bfc9-6qv68   0/1   ContainerCreating   0   120m
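To list every affected pod rather than a single sample row, the pod listing can be filtered on its STATUS column. This is a rough sketch, assuming the default `oc get pods` column order (NAME READY STATUS RESTARTS AGE). The helper name `stuck_pods` is hypothetical.

```shell
#!/bin/sh
# stuck_pods (hypothetical helper): reads `oc get pods`-style rows on stdin
# and prints the NAME of each row whose STATUS column is ContainerCreating.
stuck_pods() {
    awk '$3 == "ContainerCreating" { print $1 }'
}

# Usage against a live cluster (assumes a logged-in oc session):
#   oc get pods -n winc-test --no-headers | stuck_pods
```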
Expected results:
After the upgrade, all vSphere workloads should be recreated and reach Ready status.
Additional info:
$ oc describe node winworker-tf8tp
Name:               winworker-tf8tp
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown
                    beta.kubernetes.io/os=windows
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=winworker-tf8tp
                    kubernetes.io/os=windows
                    node-role.kubernetes.io/worker=
                    node.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown
                    node.kubernetes.io/windows-build=10.0.20348
                    node.openshift.io/os_id=Windows
                    windowsmachineconfig.openshift.io/configured-with-csi=true
Annotations:        k8s.ovn.org/hybrid-overlay-distributed-router-gateway-mac: 00-15-5D-84-60-0F
                    k8s.ovn.org/hybrid-overlay-node-subnet: 10.132.2.0/24
                    k8s.ovn.org/node-gateway-router-lrp-ifaddr: {"ipv4":"100.64.0.3/16"}
                    k8s.ovn.org/node-id: 3
                    k8s.ovn.org/node-transit-switch-port-ifaddr: {"ipv4":"100.88.0.3/16"}
                    machine.openshift.io/machine: openshift-machine-api/winworker-tf8tp
                    volumes.kubernetes.io/controller-managed-attach-detach: true
                    windowsmachineconfig.openshift.io/desired-version: 9.0.1-9038172
                    windowsmachineconfig.openshift.io/pub-key-hash: 1df2c166b1c401180523270e9cf6bc2cd2724b9279ea65668a3b95298525a0f5
                    windowsmachineconfig.openshift.io/username: wx4EBwMIlLcsQVNvLitgW55YTgQEJ+OPx2zqbLKQyhvS5gF0ek9w7bMV3+ijXWeD<wmcoMarker>aEizvVpD5KkDHpNYZRu7QXVmxu17evCkvoQrszlRMvi07qalxw4GpQgihpY1Sw...
                    windowsmachineconfig.openshift.io/version: 9.0.1-9038172
CreationTimestamp:  Mon, 04 Mar 2024 15:48:11 +0200
Taints:             os=Windows:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  winworker-tf8tp
  AcquireTime:     <unset>
  RenewTime:       Mon, 04 Mar 2024 19:21:55 +0200
Conditions:
  Type            Status  LastHeartbeatTime                LastTransitionTime               Reason                      Message
  ----            ------  -----------------                ------------------               ------                      -------
  MemoryPressure  False   Mon, 04 Mar 2024 19:18:47 +0200  Mon, 04 Mar 2024 15:48:11 +0200  KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure    False   Mon, 04 Mar 2024 19:18:47 +0200  Mon, 04 Mar 2024 15:48:11 +0200  KubeletHasNoDiskPressure    kubelet has no disk pressure
  PIDPressure     False   Mon, 04 Mar 2024 19:18:47 +0200  Mon, 04 Mar 2024 15:48:11 +0200  KubeletHasSufficientPID     kubelet has sufficient PID available
  Ready           True    Mon, 04 Mar 2024 19:18:47 +0200  Mon, 04 Mar 2024 15:48:42 +0200  KubeletReady                kubelet is posting ready status
Addresses:
  Hostname:    winworker-tf8tp
  InternalIP:  192.168.221.230
  ExternalIP:  192.168.221.230
Capacity:
  cpu:                4
  ephemeral-storage:  93714428Ki
  memory:             16776244Ki
  pods:               250
Allocatable:
  cpu:                3500m
  ephemeral-storage:  85293474878
  memory:             15625268Ki
  pods:               250
System Info:
  Machine ID:                 winworker-tf8tp
  System UUID:                64403D42-BB0D-C93B-F327-767C456BE690
  Boot ID:                    29
  Kernel Version:             10.0.20348.681
  OS Image:                   Windows Server 2022 Datacenter
  Operating System:           windows
  Architecture:               amd64
  Container Runtime Version:  containerd://1.7.6
  Kubelet Version:            v1.27.10+28ed2d7
  Kube-Proxy Version:         v1.27.10+28ed2d7
ProviderID:                   vsphere://423d4064-0dbb-3bc9-f327-767c456be690
Non-terminated Pods:          (5 in total)
  Namespace  Name                                CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------  ----                                ------------  ----------  ---------------  -------------  ---
  winc-test  win-webserver-6fb6d967df-4hsrm      0 (0%)        0 (0%)      0 (0%)           0 (0%)         3h32m
  winc-test  win-webserver-6fb6d967df-svlss      0 (0%)        0 (0%)      0 (0%)           0 (0%)         3h32m
  winc-test  win-webserver-pvc-5fb9c7bfc9-6qv68  0 (0%)        0 (0%)      0 (0%)           0 (0%)         19m
  winc-test  win-webserver-pvc-5fb9c7bfc9-7lnbb  0 (0%)        0 (0%)      0 (0%)           0 (0%)         19m
  winc-test  win-webserver-pvc-5fb9c7bfc9-xxp7x  0 (0%)        0 (0%)      0 (0%)           0 (0%)         19m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests  Limits
  --------           --------  ------
  cpu                0 (0%)    0 (0%)
  memory             0 (0%)    0 (0%)
  ephemeral-storage  0 (0%)    0 (0%)
Events:              <none>

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
  creationTimestamp: "2024-03-04T07:58:57Z"
  generation: 4
  labels:
    app: win-webserver-pvc
  name: win-webserver-pvc
  namespace: winc-test
  resourceVersion: "283050"
  uid: 7da7e9cf-a66c-47cd-b470-8e351dd2daed
spec:
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: win-webserver-pvc
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: win-webserver-pvc
      name: win-webserver-pvc
    spec:
      containers:
      - command:
        - pwsh.exe
        - -command
        - echo "<html><body><H1>Windows Container Web Server</H1></body></html>" > C:/html/index.html;$listener = New-Object System.Net.HttpListener; $listener.Prefixes.Add('http://*:80/'); $listener.Start();Write-Host('Listening at http://*:80/'); while ($listener.IsListening) { $context = $listener.GetContext(); $response = $context.Response; $content=Get-Content C:/html/index.html -Raw; $buffer = [System.Text.Encoding]::UTF8.GetBytes($content); $response.ContentLength64 = $buffer.Length; $response.OutputStream.Write($buffer, 0, $buffer.Length); $response.Close(); };
        image: mcr.microsoft.com/powershell:lts-nanoserver-ltsc2022
        imagePullPolicy: IfNotPresent
        name: windowswebserver
        resources: {}
        securityContext:
          runAsNonRoot: false
          windowsOptions:
            runAsUserName: ContainerAdministrator
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: C:/html/
          name: test-volume
      dnsPolicy: ClusterFirst
      nodeName: winworker-tf8tp
      nodeSelector:
        kubernetes.io/os: windows