  1. OpenShift Bugs
  2. OCPBUGS-30216

vSphere PVC workloads are stuck in ContainerCreating after upgrading from 8.x.x to 9.0.1


Details

    • Bug
    • Resolution: Not a Bug
    • Major
    • None
    • 4.14
    • Windows Containers
    • None

    Description

      Description of problem:

          After upgrading a cluster that uses csi-proxy for PVC workloads, the Windows PVC workloads are stuck in ContainerCreating.
      
      From workload status:
      Warning  FailedAttachVolume  16s   attachdetach-controller  AttachVolume.Attach failed for volume "pvc-110fb11f-1d66-4a9f-a85e-e1c31143532b" : CSINode winworker-tf8tp does not contain driver csi.vsphere.vmware.com
      
       $ oc describe node winworker-tf8tp | grep csi
       windowsmachineconfig.openshift.io/configured-with-csi=true
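
The event above names a CSINode object that lacks the driver, while the node label only shows that the node was configured with CSI. A minimal sketch for comparing the two is shown here. The helper names `parse_csinode_error` and `list_csinode_drivers` are hypothetical (they are not part of `oc` or WMCO), and the second helper assumes `oc` is on the PATH and logged in to the cluster.

```shell
#!/bin/sh
# Hypothetical helpers for triage; not part of oc or WMCO.

# Print "<node> <driver>" if the event message reports a missing CSI driver,
# and print nothing for any other message.
parse_csinode_error() {
  printf '%s\n' "$1" |
    sed -n 's/.*CSINode \([^ ]*\) does not contain driver \([^ ]*\).*/\1 \2/p'
}

# List the CSI driver names registered in a node's CSINode object.
# Assumes `oc` is on PATH and logged in to the cluster.
list_csinode_drivers() {
  oc get csinode "$1" -o jsonpath='{.spec.drivers[*].name}'
}

# Event message copied from the workload status in this report.
msg='AttachVolume.Attach failed for volume "pvc-110fb11f-1d66-4a9f-a85e-e1c31143532b" : CSINode winworker-tf8tp does not contain driver csi.vsphere.vmware.com'
parse_csinode_error "$msg"
```

With cluster access, `list_csinode_drivers winworker-tf8tp` would show whether `csi.vsphere.vmware.com` is registered on the node at all.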

      Version-Release number of selected component (if applicable):

          9.0.1-9038172

      How reproducible:

          

      Steps to Reproduce:

          1. install vSphere cluster with 2 machineset nodes, 2 BYOH nodes 
          2. Install CSI driver https://raw.githubusercontent.com/openshift/windows-machine-config-operator/master/hack/manifests/csi/vsphere/01-example-driver-daemonset.yaml 
          3. Install storageclass
      cat storageclass.yaml
       apiVersion: storage.k8s.io/v1
       kind: StorageClass
       metadata:
         name: intree
       provisioner: kubernetes.io/vsphere-volume
       parameters:
         fstype: ntfs
      
       $ oc create -f storageclass.yaml
      
         4.  Install PVC
      cat pvc.yaml
       kind: PersistentVolumeClaim
       apiVersion: v1
       metadata:
         name: winc-pvc
         annotations:
           volume.beta.kubernetes.io/storage-class: intree
       spec:
         accessModes:
           - ReadWriteOnce
         resources:
           requests:
             storage: 500Mi
      
       $ oc create -f pvc.yaml -n winc-test   
      
      
         5. Install the workload
      cat WinWebServerPvc.yaml
       apiVersion: v1                
       kind: Service                           
       metadata:                            
         name: win-webserver-pvc
         labels:                    
           app: win-webserver-pvc
       spec:                   
         ports:                     
         # the port that this service should serve on
         - port: 80                                                                              
           targetPort: 80
         selector:       
           app: win-webserver-pvc
      
         type: LoadBalancer
       
      ---                                                                                                                                                                                 
       apiVersion: apps/v1             
       kind: Deployment                                                                          
       metadata:                                                                                 
         labels:                                                                                 
           app: win-webserver-pvc                                                                
         name: win-webserver-pvc         
       spec:                           
         selector:       
           matchLabels:                                                                          
             app: win-webserver-pvc             
         replicas: 3         
         template:                
           metadata:                
             labels:             
               app: win-webserver-pvc
             name: win-webserver-pvc      
           spec:                                                                                 
             tolerations:
             - key: "os"
               value: "Windows"
               effect: "NoSchedule"
             volumes:
             - name: test-volume
               persistentVolumeClaim:
                 claimName: winc-pvc
             containers:
             - name: windowswebserver
               image: mcr.microsoft.com/powershell:lts-nanoserver-ltsc2022
               imagePullPolicy: IfNotPresent
               volumeMounts:
               - mountPath: C:/html/
                 name: test-volume
               securityContext:
                 runAsNonRoot: false
                 windowsOptions:
                   runAsUserName: "ContainerAdministrator"
               command:
               - pwsh.exe
               - -command
               - echo "<html><body><H1>Windows Container Web Server</H1></body></html>" >
                 C:/html/index.html;$listener = New-Object System.Net.HttpListener; $listener.Prefixes.Add('http://*:80/');
                 $listener.Start();Write-Host('Listening at http://*:80/'); while ($listener.IsListening)
                 { $context = $listener.GetContext(); $response = $context.Response; $content=Get-Content
                 C:/html/index.html -Raw; $buffer = [System.Text.Encoding]::UTF8.GetBytes($content);
                 $response.ContentLength64 = $buffer.Length; $response.OutputStream.Write($buffer,
                 0, $buffer.Length); $response.Close(); };
             # Hack to force all the pods from the deployment to land in the
             # same node. This is because the in-tree vsphere volume is
             # ReadWriteOnce, which means that the volume can land only in a
             # single node. Pods landing on a different node won't have access
             # to the persistent storage.
             nodeName: winworker-j72d6
      
       $ oc create -f WinWebServerPvc.yaml -n winc-test  
         6. Upgrade WMCO from version 8.1.1 to 9.0.1
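
Step 4 selects the storage class through the `volume.beta.kubernetes.io/storage-class` annotation, which Kubernetes has deprecated in favor of `spec.storageClassName`. As a sketch only, an equivalent claim could be written as follows. The file name `pvc-storageclassname.yaml` is hypothetical, and this variant was not part of the original reproduction.

```shell
#!/bin/sh
# Write a PVC manifest that names the storage class in spec.storageClassName
# instead of the deprecated beta annotation. The file name is hypothetical.
cat > pvc-storageclassname.yaml <<'EOF'
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: winc-pvc
spec:
  storageClassName: intree
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
EOF

# With cluster access, create it the same way as in step 4:
#   oc create -f pvc-storageclassname.yaml -n winc-test
```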

      Actual results:

          workloads stuck in ContainerCreating
      pod/win-webserver-pvc-5fb9c7bfc9-6qv68   0/1     ContainerCreating   0          120m

      Expected results:

          After the upgrade, all vSphere PVC workloads should be recreated and reach Ready status.
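
One way to check the expected result is to wait for the deployment rollout to complete. The sketch below wraps `oc rollout status`. The function name `wait_for_workload` and the 10 minute default timeout are assumptions chosen for illustration, and the function requires `oc` to be logged in to the cluster.

```shell
#!/bin/sh
# Hypothetical helper: block until the deployment has finished rolling out,
# or fail once the timeout expires.
#   $1 = namespace, $2 = deployment name, $3 = timeout (default 10m)
wait_for_workload() {
  oc rollout status "deployment/$2" -n "$1" --timeout="${3:-10m}"
}
```

For this report the call would be `wait_for_workload winc-test win-webserver-pvc`. A non-zero exit status would correspond to the pods staying in ContainerCreating.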

      Additional info:

          oc describe node winworker-tf8tp
      Name:               winworker-tf8tp
      Roles:              worker
      Labels:             beta.kubernetes.io/arch=amd64
                          beta.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown
                          beta.kubernetes.io/os=windows
                          kubernetes.io/arch=amd64
                          kubernetes.io/hostname=winworker-tf8tp
                          kubernetes.io/os=windows
                          node-role.kubernetes.io/worker=
                          node.kubernetes.io/instance-type=vsphere-vm.cpu-4.mem-16gb.os-unknown
                          node.kubernetes.io/windows-build=10.0.20348
                          node.openshift.io/os_id=Windows
                          windowsmachineconfig.openshift.io/configured-with-csi=true
      Annotations:        k8s.ovn.org/hybrid-overlay-distributed-router-gateway-mac: 00-15-5D-84-60-0F
                          k8s.ovn.org/hybrid-overlay-node-subnet: 10.132.2.0/24
                          k8s.ovn.org/node-gateway-router-lrp-ifaddr: {"ipv4":"100.64.0.3/16"}
                          k8s.ovn.org/node-id: 3
                          k8s.ovn.org/node-transit-switch-port-ifaddr: {"ipv4":"100.88.0.3/16"}
                          machine.openshift.io/machine: openshift-machine-api/winworker-tf8tp
                          volumes.kubernetes.io/controller-managed-attach-detach: true
                          windowsmachineconfig.openshift.io/desired-version: 9.0.1-9038172
                          windowsmachineconfig.openshift.io/pub-key-hash: 1df2c166b1c401180523270e9cf6bc2cd2724b9279ea65668a3b95298525a0f5
                          windowsmachineconfig.openshift.io/username:
                            wx4EBwMIlLcsQVNvLitgW55YTgQEJ+OPx2zqbLKQyhvS5gF0ek9w7bMV3+ijXWeD<wmcoMarker>aEizvVpD5KkDHpNYZRu7QXVmxu17evCkvoQrszlRMvi07qalxw4GpQgihpY1Sw...
                          windowsmachineconfig.openshift.io/version: 9.0.1-9038172
      CreationTimestamp:  Mon, 04 Mar 2024 15:48:11 +0200
      Taints:             os=Windows:NoSchedule
      Unschedulable:      false
      Lease:
        HolderIdentity:  winworker-tf8tp
        AcquireTime:     <unset>
        RenewTime:       Mon, 04 Mar 2024 19:21:55 +0200
      Conditions:
        Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
        ----             ------  -----------------                 ------------------                ------                       -------
        MemoryPressure   False   Mon, 04 Mar 2024 19:18:47 +0200   Mon, 04 Mar 2024 15:48:11 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
        DiskPressure     False   Mon, 04 Mar 2024 19:18:47 +0200   Mon, 04 Mar 2024 15:48:11 +0200   KubeletHasNoDiskPressure     kubelet has no disk pressure
        PIDPressure      False   Mon, 04 Mar 2024 19:18:47 +0200   Mon, 04 Mar 2024 15:48:11 +0200   KubeletHasSufficientPID      kubelet has sufficient PID available
        Ready            True    Mon, 04 Mar 2024 19:18:47 +0200   Mon, 04 Mar 2024 15:48:42 +0200   KubeletReady                 kubelet is posting ready status
      Addresses:
        Hostname:    winworker-tf8tp
        InternalIP:  192.168.221.230
        ExternalIP:  192.168.221.230
      Capacity:
        cpu:                4
        ephemeral-storage:  93714428Ki
        memory:             16776244Ki
        pods:               250
      Allocatable:
        cpu:                3500m
        ephemeral-storage:  85293474878
        memory:             15625268Ki
        pods:               250
      System Info:
        Machine ID:                 winworker-tf8tp
        System UUID:                64403D42-BB0D-C93B-F327-767C456BE690
        Boot ID:                    29
        Kernel Version:             10.0.20348.681
        OS Image:                   Windows Server 2022 Datacenter
        Operating System:           windows
        Architecture:               amd64
        Container Runtime Version:  containerd://1.7.6
        Kubelet Version:            v1.27.10+28ed2d7
        Kube-Proxy Version:         v1.27.10+28ed2d7
      ProviderID:                   vsphere://423d4064-0dbb-3bc9-f327-767c456be690
      Non-terminated Pods:          (5 in total)
        Namespace                   Name                                  CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
        ---------                   ----                                  ------------  ----------  ---------------  -------------  ---
        winc-test                   win-webserver-6fb6d967df-4hsrm        0 (0%)        0 (0%)      0 (0%)           0 (0%)         3h32m
        winc-test                   win-webserver-6fb6d967df-svlss        0 (0%)        0 (0%)      0 (0%)           0 (0%)         3h32m
        winc-test                   win-webserver-pvc-5fb9c7bfc9-6qv68    0 (0%)        0 (0%)      0 (0%)           0 (0%)         19m
        winc-test                   win-webserver-pvc-5fb9c7bfc9-7lnbb    0 (0%)        0 (0%)      0 (0%)           0 (0%)         19m
        winc-test                   win-webserver-pvc-5fb9c7bfc9-xxp7x    0 (0%)        0 (0%)      0 (0%)           0 (0%)         19m
      Allocated resources:
        (Total limits may be over 100 percent, i.e., overcommitted.)
        Resource           Requests  Limits
        --------           --------  ------
        cpu                0 (0%)    0 (0%)
        memory             0 (0%)    0 (0%)
        ephemeral-storage  0 (0%)    0 (0%)
      Events:              <none>
      
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        annotations:
          deployment.kubernetes.io/revision: "2"
        creationTimestamp: "2024-03-04T07:58:57Z"
        generation: 4
        labels:
          app: win-webserver-pvc
        name: win-webserver-pvc
        namespace: winc-test
        resourceVersion: "283050"
        uid: 7da7e9cf-a66c-47cd-b470-8e351dd2daed
      spec:
        progressDeadlineSeconds: 600
        replicas: 3
        revisionHistoryLimit: 10
        selector:
          matchLabels:
            app: win-webserver-pvc
        strategy:
          rollingUpdate:
            maxSurge: 25%
            maxUnavailable: 25%
          type: RollingUpdate
        template:
          metadata:
            creationTimestamp: null
            labels:
              app: win-webserver-pvc
            name: win-webserver-pvc
          spec:
            containers:
            - command:
              - pwsh.exe
              - -command
              - echo "<html><body><H1>Windows Container Web Server</H1></body></html>" >
                C:/html/index.html;$listener = New-Object System.Net.HttpListener; $listener.Prefixes.Add('http://*:80/');
                $listener.Start();Write-Host('Listening at http://*:80/'); while ($listener.IsListening)
                { $context = $listener.GetContext(); $response = $context.Response; $content=Get-Content
                C:/html/index.html -Raw; $buffer = [System.Text.Encoding]::UTF8.GetBytes($content);
                $response.ContentLength64 = $buffer.Length; $response.OutputStream.Write($buffer,
                0, $buffer.Length); $response.Close(); };
              image: mcr.microsoft.com/powershell:lts-nanoserver-ltsc2022
              imagePullPolicy: IfNotPresent
              name: windowswebserver
              resources: {}
              securityContext:
                runAsNonRoot: false
                windowsOptions:
                  runAsUserName: ContainerAdministrator
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              volumeMounts:
              - mountPath: C:/html/
                name: test-volume
            dnsPolicy: ClusterFirst
            nodeName: winworker-tf8tp
            nodeSelector:
              kubernetes.io/os: windows


          People

            team-winc Team WinC
            rrasouli Aharon Rasouli
            Weinan Liu Weinan Liu