OpenShift Virtualization / CNV-50289

Descheduler produces PDB errors when it evicts VMs being migrated or in the parallelOutboundMigrationsPerNode queue

      Description of problem:

      On every loop, the descheduler logs the following error when it tries to evict VMs whose eviction was already requested in the previous cycle and that are either waiting for their turn in the migration queue (parallelOutboundMigrationsPerNode) or have not finished migrating yet.
      
      E1027 23:12:15.092280       1 evictions.go:520] "Error evicting pod" err="error when evicting pod (ignoring) \"virt-launcher-centos-9-3-tzwgg\": Cannot evict pod as it would violate the pod's disruption budget." pod="homelab/virt-launcher-centos-9-3-tzwgg" reason=""
      E1027 23:12:15.092303       1 nodeutilization.go:361] eviction failed: error when evicting pod (ignoring) "virt-launcher-centos-9-3-tzwgg": Cannot evict pod as it would violate the pod's disruption budget.

      Version-Release number of selected component (if applicable):

      4.17

      How reproducible:

      Always

      Steps to Reproduce:

      1. Put 10 tiny VMs on a node with the descheduler annotation
      
      # oc get pods -o wide
      NAME                             READY   STATUS    RESTARTS   AGE     IP             NODE                    NOMINATED NODE   READINESS GATES
      virt-launcher-centos-9-1-nbxpx   1/1     Running   0          12m     10.128.4.99    white.shift.home.arpa   <none>           1/1
      virt-launcher-centos-9-2-nl6qs   1/1     Running   0          87s     10.128.4.124   white.shift.home.arpa   <none>           1/1
      virt-launcher-centos-9-3-9n4lj   1/1     Running   0          69s     10.128.4.125   white.shift.home.arpa   <none>           1/1
      virt-launcher-centos-9-4-mcc5h   1/1     Running   0          11m     10.128.4.106   white.shift.home.arpa   <none>           1/1
      virt-launcher-centos-9-5-j2s5f   1/1     Running   0          8m36s   10.128.4.118   white.shift.home.arpa   <none>           1/1
      virt-launcher-centos-9-6-slx62   1/1     Running   0          8m41s   10.128.4.117   white.shift.home.arpa   <none>           1/1
      virt-launcher-centos-9-7-4r79m   1/1     Running   0          11m     10.128.4.105   white.shift.home.arpa   <none>           1/1
      virt-launcher-centos-9-8-vjngw   1/1     Running   0          10m     10.128.4.115   white.shift.home.arpa   <none>           1/1
      virt-launcher-centos-9-9-dmngj   1/1     Running   0          17m     10.128.4.65    white.shift.home.arpa   <none>           1/1
      virt-launcher-centos-9-nvz87     1/1     Running   0          11m     10.128.4.107   white.shift.home.arpa   <none>           1/1
      
      2. Make sure the descheduler runs with a short cycle, such as 30s
      
      deschedulingIntervalSeconds: 30
      
      3. Start a big VM on that node, so it becomes overutilized
      
      4. In the first descheduler cycle, 2 VMs start migrating:
      
      NAME                                                                    PHASE        VMI
      virtualmachineinstancemigration.kubevirt.io/kubevirt-evacuation-br9sc   Scheduling   centos-9-4
      virtualmachineinstancemigration.kubevirt.io/kubevirt-evacuation-qlsc9   Scheduling   centos-9-1
      
      5. Watch the descheduler logs for the next cycle
      
      I1027 23:12:13.081619       1 lownodeutilization.go:136] "Number of underutilized nodes" totalNumber=1
      ...
      I1027 23:12:13.081635       1 lownodeutilization.go:150] "Number of overutilized nodes" totalNumber=1
      ...
      I1027 23:12:13.081754       1 nodeutilization.go:267] "Pods on node" node="white.shift.home.arpa" allPods=33 nonRemovablePods=23 removablePods=10 
      
      An error is logged for all 10 VMs (the 2 migrating and the 8 waiting):
      
      E1027 23:12:13.095459       1 evictions.go:520] "Error evicting pod" err="error when evicting pod (ignoring) \"virt-launcher-centos-9-1-nxrm7\": Cannot evict pod as it would violate the pod's disruption budget." pod="homelab/virt-launcher-centos-9-1-nxrm7" reason=""
      E1027 23:12:13.095477       1 nodeutilization.go:361] eviction failed: error when evicting pod (ignoring) "virt-launcher-centos-9-1-nxrm7": Cannot evict pod as it would violate the pod's disruption budget.
      E1027 23:12:13.106601       1 evictions.go:520] "Error evicting pod" err="error when evicting pod (ignoring) \"virt-launcher-centos-9-4-t8nxh\": Cannot evict pod as it would violate the pod's disruption budget." pod="homelab/virt-launcher-centos-9-4-t8nxh" reason=""
      E1027 23:12:13.106636       1 nodeutilization.go:361] eviction failed: error when evicting pod (ignoring) "virt-launcher-centos-9-4-t8nxh": Cannot evict pod as it would violate the pod's disruption budget.
      E1027 23:12:13.117194       1 evictions.go:520] "Error evicting pod" err="error when evicting pod (ignoring) \"virt-launcher-centos-9-6-tdrjl\": Cannot evict pod as it would violate the pod's disruption budget." pod="homelab/virt-launcher-centos-9-6-tdrjl" reason=""
      E1027 23:12:13.117211       1 nodeutilization.go:361] eviction failed: error when evicting pod (ignoring) "virt-launcher-centos-9-6-tdrjl": Cannot evict pod as it would violate the pod's disruption budget.
      E1027 23:12:13.127706       1 evictions.go:520] "Error evicting pod" err="error when evicting pod (ignoring) \"virt-launcher-centos-9-7-2ggzl\": Cannot evict pod as it would violate the pod's disruption budget." pod="homelab/virt-launcher-centos-9-7-2ggzl" reason=""
      E1027 23:12:13.127723       1 nodeutilization.go:361] eviction failed: error when evicting pod (ignoring) "virt-launcher-centos-9-7-2ggzl": Cannot evict pod as it would violate the pod's disruption budget.
      E1027 23:12:13.138281       1 evictions.go:520] "Error evicting pod" err="error when evicting pod (ignoring) \"virt-launcher-centos-9-8-sgqcx\": Cannot evict pod as it would violate the pod's disruption budget." pod="homelab/virt-launcher-centos-9-8-sgqcx" reason=""
      E1027 23:12:13.138296       1 nodeutilization.go:361] eviction failed: error when evicting pod (ignoring) "virt-launcher-centos-9-8-sgqcx": Cannot evict pod as it would violate the pod's disruption budget.
      E1027 23:12:13.491618       1 evictions.go:520] "Error evicting pod" err="error when evicting pod (ignoring) \"virt-launcher-centos-9-887lg\": Cannot evict pod as it would violate the pod's disruption budget." pod="homelab/virt-launcher-centos-9-887lg" reason=""
      E1027 23:12:13.491641       1 nodeutilization.go:361] eviction failed: error when evicting pod (ignoring) "virt-launcher-centos-9-887lg": Cannot evict pod as it would violate the pod's disruption budget.
      E1027 23:12:13.891674       1 evictions.go:520] "Error evicting pod" err="error when evicting pod (ignoring) \"virt-launcher-centos-9-5-msf8t\": Cannot evict pod as it would violate the pod's disruption budget." pod="homelab/virt-launcher-centos-9-5-msf8t" reason=""
      E1027 23:12:13.891697       1 nodeutilization.go:361] eviction failed: error when evicting pod (ignoring) "virt-launcher-centos-9-5-msf8t": Cannot evict pod as it would violate the pod's disruption budget.
      E1027 23:12:14.291858       1 evictions.go:520] "Error evicting pod" err="error when evicting pod (ignoring) \"virt-launcher-centos-9-9-l9hgh\": Cannot evict pod as it would violate the pod's disruption budget." pod="homelab/virt-launcher-centos-9-9-l9hgh" reason=""
      E1027 23:12:14.291892       1 nodeutilization.go:361] eviction failed: error when evicting pod (ignoring) "virt-launcher-centos-9-9-l9hgh": Cannot evict pod as it would violate the pod's disruption budget.
      E1027 23:12:14.691355       1 evictions.go:520] "Error evicting pod" err="error when evicting pod (ignoring) \"virt-launcher-centos-9-2-p96zn\": Cannot evict pod as it would violate the pod's disruption budget." pod="homelab/virt-launcher-centos-9-2-p96zn" reason=""
      E1027 23:12:14.691379       1 nodeutilization.go:361] eviction failed: error when evicting pod (ignoring) "virt-launcher-centos-9-2-p96zn": Cannot evict pod as it would violate the pod's disruption budget.
      E1027 23:12:15.092280       1 evictions.go:520] "Error evicting pod" err="error when evicting pod (ignoring) \"virt-launcher-centos-9-3-tzwgg\": Cannot evict pod as it would violate the pod's disruption budget." pod="homelab/virt-launcher-centos-9-3-tzwgg" reason=""
      E1027 23:12:15.092303       1 nodeutilization.go:361] eviction failed: error when evicting pod (ignoring) "virt-launcher-centos-9-3-tzwgg": Cannot evict pod as it would violate the pod's disruption budget.
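
      To confirm how many VMs are affected in a given cycle, the log lines above can be reduced to a list of distinct pods. The following is only a minimal sketch: the helper name pdbFailures and the sampleLog constant are made up for illustration, and the matching relies solely on the line format shown in this report (an "Error evicting pod" message with a pod="namespace/name" field), which other descheduler versions may not share.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// sampleLog holds a few lines copied from this report (illustration only).
const sampleLog = `E1027 23:12:13.095459       1 evictions.go:520] "Error evicting pod" err="error when evicting pod (ignoring) \"virt-launcher-centos-9-1-nxrm7\": Cannot evict pod as it would violate the pod's disruption budget." pod="homelab/virt-launcher-centos-9-1-nxrm7" reason=""
E1027 23:12:13.095477       1 nodeutilization.go:361] eviction failed: error when evicting pod (ignoring) "virt-launcher-centos-9-1-nxrm7": Cannot evict pod as it would violate the pod's disruption budget.
E1027 23:12:15.092280       1 evictions.go:520] "Error evicting pod" err="error when evicting pod (ignoring) \"virt-launcher-centos-9-3-tzwgg\": Cannot evict pod as it would violate the pod's disruption budget." pod="homelab/virt-launcher-centos-9-3-tzwgg" reason=""
E1027 23:12:15.092303       1 nodeutilization.go:361] eviction failed: error when evicting pod (ignoring) "virt-launcher-centos-9-3-tzwgg": Cannot evict pod as it would violate the pod's disruption budget.`

// podField matches the pod="namespace/name" key-value pair carried by the
// evictions.go lines in this report.
var podField = regexp.MustCompile(`pod="([^"]+)"`)

// pdbFailures (hypothetical helper) returns, in order of first appearance,
// the distinct pods for which an "Error evicting pod" line mentions the
// disruption budget. The paired nodeutilization.go lines are skipped because
// they repeat the same failure without a pod= field.
func pdbFailures(log string) []string {
	seen := map[string]bool{}
	var pods []string
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := sc.Text()
		if !strings.Contains(line, `"Error evicting pod"`) ||
			!strings.Contains(line, "disruption budget") {
			continue
		}
		m := podField.FindStringSubmatch(line)
		if m == nil || seen[m[1]] {
			continue
		}
		seen[m[1]] = true
		pods = append(pods, m[1])
	}
	return pods
}

func main() {
	pods := pdbFailures(sampleLog)
	fmt.Printf("%d pods hit the PDB error\n", len(pods))
	for _, p := range pods {
		fmt.Println(p)
	}
}
```

      Run against a full cycle from step 5, a count equal to removablePods would match the observation that every VM on the node, migrating or queued, produces the error.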

      Actual results:

      The logs contain spurious errors even though the system is actually working.
      This is confusing for the customer and for support.

      Expected results:

      Ignore VMs that are already being evicted or are waiting to migrate
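
      One way to picture the expected behaviour is a filter that drops in-flight pods from the candidate list before any eviction call is made. This is a rough sketch under stated assumptions, not the descheduler's actual code: the Pod type, the splitCandidates function and the example.com/eviction-in-progress annotation key are all hypothetical, and this report does not say how an in-flight eviction or queued migration is (or should be) marked on the pod.

```go
package main

import "fmt"

// Pod is a hypothetical, trimmed-down stand-in for a Kubernetes pod object.
type Pod struct {
	Name        string
	Annotations map[string]string
}

// evictionInProgressKey is an ASSUMED marker for pods whose eviction was
// already requested (migrating or queued). The real signal is not named in
// this report.
const evictionInProgressKey = "example.com/eviction-in-progress"

// splitCandidates (hypothetical) separates pods that may still be evicted
// from pods that already have an eviction in flight and should be skipped
// quietly instead of producing a PDB error on every cycle.
func splitCandidates(pods []Pod) (evictable, skipped []Pod) {
	for _, p := range pods {
		if _, inFlight := p.Annotations[evictionInProgressKey]; inFlight {
			skipped = append(skipped, p)
			continue
		}
		evictable = append(evictable, p)
	}
	return evictable, skipped
}

func main() {
	pods := []Pod{
		{Name: "virt-launcher-centos-9-1-nxrm7",
			Annotations: map[string]string{evictionInProgressKey: ""}},
		{Name: "virt-launcher-centos-9-2-p96zn"},
	}
	evictable, skipped := splitCandidates(pods)
	fmt.Println("evictable:", len(evictable), "skipped:", len(skipped))
}
```

      With such a filter, a second cycle would attempt eviction only for pods that were not already handled, and the skipped pods could be reported at info level rather than as errors.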

      Additional info:

       

              sgott@redhat.com Stuart Gott
              rhn-support-gveitmic Germano Veit Michel
              Kedar Bidarkar Kedar Bidarkar