OpenShift Virtualization / CNV-11891

[1957477] Upgrade hung due to a failure to drain with several pods unable to be evicted


    • High
    • No

      Description of problem:
      The OCP upgrade from 4.6.26 to 4.7.8 is stuck while trying to evict pods from the nodes being drained.

      kube-apiserver logs showed `configuration:virt-api-validator,webhook:virt-launcher-eviction-interceptor.kubevirt.io` failing with `context canceled`

      Version-Release number of selected component (if applicable):

      • OCP version 4.6.26
      • OCP Virtualization 2.5.5
      • Fresh cluster with no customer workloads, no OCP-Virt VMs

      How reproducible:
      Currently seen in a customer environment.

      Steps to Reproduce:
      1. Have an OCP 4.6.x cluster with CNV 2.5.x installed and no VMs created yet
      2. Trigger the upgrade to 4.7.x

      Actual results:
      Cluster upgrade is not progressing since nodes cannot be drained.

      Expected results:
      Since the virt-launcher-eviction-interceptor webhook has its failurePolicy set to Ignore, the drain should not fail.
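
The expectation above rests on the webhook's failurePolicy, so it may help to confirm the value actually configured on the cluster. The following is a minimal sketch, not a diagnostic for this bug. It assumes the ValidatingWebhookConfiguration named virt-api-validator has been exported as JSON. The sample manifest is hypothetical and trimmed, the timeoutSeconds value in it is made up for illustration, and the helper name webhook_settings is not part of any tool. timeoutSeconds is reported alongside failurePolicy because it bounds how long the API server waits for the webhook to answer.

```python
import json

# Hypothetical, trimmed manifest for illustration only. A real one would be
# the JSON export of the ValidatingWebhookConfiguration from the cluster.
SAMPLE_MANIFEST = """
{
  "kind": "ValidatingWebhookConfiguration",
  "metadata": {"name": "virt-api-validator"},
  "webhooks": [
    {
      "name": "virt-launcher-eviction-interceptor.kubevirt.io",
      "failurePolicy": "Ignore",
      "timeoutSeconds": 10
    }
  ]
}
"""


def webhook_settings(manifest, webhook_name):
    """Return (failurePolicy, timeoutSeconds) for the named webhook, or None."""
    for hook in manifest.get("webhooks", []):
        if hook.get("name") == webhook_name:
            return hook.get("failurePolicy"), hook.get("timeoutSeconds")
    return None


manifest = json.loads(SAMPLE_MANIFEST)
settings = webhook_settings(
    manifest, "virt-launcher-eviction-interceptor.kubevirt.io"
)
print(settings)
```

If the function returns None, the webhook name was not found in the exported object, which would itself be worth noting in the report.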

      Additional info:

      must-gather couldn't be collected due to apiserver unresponsiveness

              sgott@redhat.com Stuart Gott
              rhn-support-jcoscia Javier Coscia
              Israel Pinto Israel Pinto