OCPBUGS-66890: CAPK does not drain guest node when VM is evicted from host cluster (and cannot live migrate)

    • Type: Bug
    • Priority: Undefined
    • Resolution: Unresolved
    • Affects Version/s: 4.18.z, 4.20.z
    • Severity: Important
    • CPU Architecture: x86_64
    • Sprint: CNV I/U Operators Sprint 281

      Description of problem:

      Create an HCP KubeVirt cluster with a few VMs that cannot live migrate (e.g. because they use local storage or a GPU) and with EvictionStrategy set to External. Then evict a VM that backs a node of the guest cluster and note that the guest node is not drained.

      Step by step:

      1. Stop the guest cluster VMs and set their eviction strategy to External, which is what OCPBUGS-58397 ("feat(KubeVirt): configure External evictionStrategy on VMs") will do.

      apiVersion: kubevirt.io/v1
      kind: VirtualMachine
      metadata:
        name: <vm-name>
      spec:
        template:
          spec:
            evictionStrategy: External
      
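      For illustration only, the same spec change can also be made from Go with controller-runtime and the kubevirt.io/api types. This is a minimal sketch, not what OCPBUGS-58397 actually implements; the namespace is simply the one from this report, and blindly updating every VM in it is an assumption for the example:

      package main

      import (
      	"context"
      	"fmt"

      	"k8s.io/apimachinery/pkg/runtime"
      	kubevirtv1 "kubevirt.io/api/core/v1"
      	"sigs.k8s.io/controller-runtime/pkg/client"
      	"sigs.k8s.io/controller-runtime/pkg/client/config"
      )

      func main() {
      	scheme := runtime.NewScheme()
      	if err := kubevirtv1.AddToScheme(scheme); err != nil {
      		panic(err)
      	}
      	c, err := client.New(config.GetConfigOrDie(), client.Options{Scheme: scheme})
      	if err != nil {
      		panic(err)
      	}

      	ctx := context.Background()
      	var vms kubevirtv1.VirtualMachineList
      	// clusters-hostedcluster-420 is the HCP namespace used in this report.
      	if err := c.List(ctx, &vms, client.InNamespace("clusters-hostedcluster-420")); err != nil {
      		panic(err)
      	}

      	external := kubevirtv1.EvictionStrategyExternal
      	for i := range vms.Items {
      		vm := &vms.Items[i]
      		if vm.Spec.Template == nil {
      			continue
      		}
      		vm.Spec.Template.Spec.EvictionStrategy = &external
      		if err := c.Update(ctx, vm); err != nil {
      			panic(err)
      		}
      		fmt.Println("set evictionStrategy: External on", vm.Name)
      	}
      }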

      2. Bring those VMs up again and ensure everything is fine.

      [HOST] # oc get vmi
      NAME                            AGE   PHASE     IP             NODENAME               READY
      hostedcluster-420-9cqbx-jzq4g   28m   Running   10.129.4.100   cyan.shift.home.arpa   True
      hostedcluster-420-9cqbx-lb2k4   28m   Running   10.129.4.101   cyan.shift.home.arpa   True
      hostedcluster-420-9cqbx-qm6z7   23m   Running   10.129.4.103   cyan.shift.home.arpa   True
      
      [GUEST] # oc get nodes 
      NAME                            STATUS   ROLES    AGE   VERSION
      hostedcluster-420-9cqbx-jzq4g   Ready    worker   72m   v1.33.5
      hostedcluster-420-9cqbx-lb2k4   Ready    worker   65m   v1.33.5
      hostedcluster-420-9cqbx-qm6z7   Ready    worker   72m   v1.33.5
      

      3. Next, let's evict one of the VMs; I'll pick the last one, hostedcluster-420-9cqbx-qm6z7:

      [HOST] # oc adm drain cyan.shift.home.arpa --pod-selector='kubevirt.io/vm=hostedcluster-420-9cqbx-qm6z7' --delete-emptydir-data
      
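      For reference, per pod the drain above boils down to an eviction request. Below is a client-go sketch of that call against the virt-launcher pod named in the step 4 logs, purely as an illustration of what the drain issues:

      package main

      import (
      	"context"
      	"fmt"

      	policyv1 "k8s.io/api/policy/v1"
      	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
      	"k8s.io/client-go/kubernetes"
      	"sigs.k8s.io/controller-runtime/pkg/client/config"
      )

      func main() {
      	// Host-cluster client from the usual kubeconfig/in-cluster config.
      	kubeClient := kubernetes.NewForConfigOrDie(config.GetConfigOrDie())

      	eviction := &policyv1.Eviction{
      		ObjectMeta: metav1.ObjectMeta{
      			Name:      "virt-launcher-hostedcluster-420-9cqbx-qm6z7-tdfb9",
      			Namespace: "clusters-hostedcluster-420",
      		},
      	}
      	// With evictionStrategy External, KubeVirt's eviction webhook denies this
      	// request and flags the VMI for external evacuation (see step 4).
      	err := kubeClient.PolicyV1().Evictions(eviction.Namespace).Evict(context.Background(), eviction)
      	fmt.Println("eviction result:", err)
      }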

      4. CAPK puts the annotation in and the eviction gets blocked for a while

      I1205 02:40:56.487949       1 machine.go:510] "msg"="setting the capk.cluster.x-k8s.io/vmi-deletion-grace-time annotation" "KubevirtMachine"={"name":"hostedcluster-420-9cqbx-qm6z7","namespace":"clusters-hostedcluster-420"} "controller"="kubevirtmachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="KubevirtMachine" "logger"="clusters-hostedcluster-420.hostedcluster-420-9cqbx-qm6z7" "name"="hostedcluster-420-9cqbx-qm6z7" "namespace"="clusters-hostedcluster-420" "reconcileID"="3e3d400d-d535-447d-ad37-e6446af549f1"
      
      ...
      
      error when evicting pods/"virt-launcher-hostedcluster-420-9cqbx-qm6z7-tdfb9" -n "clusters-hostedcluster-420" (will retry after 5s): admission webhook "virt-launcher-eviction-interceptor.kubevirt.io" denied the request: Eviction triggered evacuation of VMI "clusters-hostedcluster-420/hostedcluster-420-9cqbx-qm6z7"
      ...
      
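      That denial is, as I understand it, the External-eviction hand-off: KubeVirt refuses to shut the VMI down itself and only flags it via status.evacuationNodeName, leaving the drain and shutdown to an external controller (CAPK here). A standalone Go sketch that just waits for that flag, with the namespace and VMI name taken from this report:

      package main

      import (
      	"context"
      	"fmt"
      	"time"

      	"k8s.io/apimachinery/pkg/runtime"
      	kubevirtv1 "kubevirt.io/api/core/v1"
      	"sigs.k8s.io/controller-runtime/pkg/client"
      	"sigs.k8s.io/controller-runtime/pkg/client/config"
      )

      func main() {
      	scheme := runtime.NewScheme()
      	if err := kubevirtv1.AddToScheme(scheme); err != nil {
      		panic(err)
      	}
      	c, err := client.New(config.GetConfigOrDie(), client.Options{Scheme: scheme})
      	if err != nil {
      		panic(err)
      	}

      	key := client.ObjectKey{Namespace: "clusters-hostedcluster-420", Name: "hostedcluster-420-9cqbx-qm6z7"}
      	for i := 0; i < 60; i++ {
      		var vmi kubevirtv1.VirtualMachineInstance
      		if err := c.Get(context.Background(), key, &vmi); err != nil {
      			panic(err)
      		}
      		if vmi.Status.EvacuationNodeName != "" {
      			// Note: this names the *host* node being evacuated, not the guest node.
      			fmt.Println("evacuation requested from host node:", vmi.Status.EvacuationNodeName)
      			return
      		}
      		time.Sleep(5 * time.Second)
      	}
      	fmt.Println("no evacuation requested")
      }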

      5. But then CAPK fails to find the guest node it needs to drain

      E1205 02:40:56.496122       1 machine.go:539] "msg"="Could not find node from noderef, it may have already been deleted" "error"="nodes \"cyan.shift.home.arpa\" not found" "KubevirtMachine"={"name":"hostedcluster-420-9cqbx-qm6z7","namespace":"clusters-hostedcluster-420"} "controller"="kubevirtmachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="KubevirtMachine" "logger"="clusters-hostedcluster-420.hostedcluster-420-9cqbx-qm6z7" "name"="hostedcluster-420-9cqbx-qm6z7" "namespace"="clusters-hostedcluster-420" "reconcileID"="3e3d400d-d535-447d-ad37-e6446af549f1"
      

      6. And gives up. The VM now shuts down without a drain as soon as the timeout above expires.

      I1205 02:40:56.532994       1 machine.go:470] "msg"="DrainNode: the virtualMachineInstance is already in deletion process. Nothing to do here" "KubevirtMachine"={"name":"hostedcluster-420-9cqbx-qm6z7","namespace":"clusters-hostedcluster-420"} "controller"="kubevirtmachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="KubevirtMachine" "logger"="clusters-hostedcluster-420.hostedcluster-420-9cqbx-qm6z7" "name"="hostedcluster-420-9cqbx-qm6z7" "namespace"="clusters-hostedcluster-420" "reconcileID"="ee985ad7-1a96-438a-864b-1835b848bb29"
      

      The error in step 5 is in this part of the code:

      func (m *Machine) drainNode(wrkldClstr workloadcluster.WorkloadCluster) (time.Duration, error) {

      	// .....

      	nodeName := m.vmiInstance.Status.EvacuationNodeName // <---- why? this is the host node, not the guest node we want to drain
      	node, err := kubeClient.CoreV1().Nodes().Get(m.machineContext, nodeName, metav1.GetOptions{})
      	if err != nil {
      		if apierrors.IsNotFound(err) {
      			// If an admin deletes the node directly, we'll end up here.
      			m.machineContext.Logger.Error(err, "Could not find node from noderef, it may have already been deleted") // <------ we get here
      			return 0, nil
      		}
      		return 0, fmt.Errorf("unable to get node %q: %w", nodeName, err)
      	}
      

      Link: https://github.com/kubernetes-sigs/cluster-api-provider-kubevirt/blob/b1ad7eddf047dcdde80f46d9cdaece523a15c6a2/pkg/kubevirt/machine.go#L535

      It seems to be looking for:

      nodeName := m.vmiInstance.Status.EvacuationNodeName 

      But this is the host node name, not the guest node name (which is what it wants to drain), so it doesn't find that host node in the guest cluster.

      Look at this; it's the host node:

      [HOST] # oc adm drain cyan.shift.home.arpa --pod-selector='kubevirt.io/vm=hostedcluster-420-9cqbx-qm6z7' --delete-emptydir-data
      [HOST] # oc get vmi hostedcluster-420-9cqbx-qm6z7 -o yaml | yq '.status.evacuationNodeName'
      cyan.shift.home.arpa 

      Shouldn't it be looking for the guest node name there to trigger the drain?
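      For comparison, here is a minimal sketch of the lookup I would expect instead, assuming (as in this environment) that the guest node name equals the VMI/KubevirtMachine name. The helper name and the guest-kubeconfig handling are made up for the sketch; the real change would live in drainNode() in pkg/kubevirt/machine.go:

      package main

      import (
      	"context"
      	"fmt"

      	apierrors "k8s.io/apimachinery/pkg/api/errors"
      	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
      	"k8s.io/client-go/kubernetes"
      	"k8s.io/client-go/tools/clientcmd"
      )

      // findGuestNodeToDrain looks the node up in the guest cluster by the VMI name
      // instead of by status.evacuationNodeName (which names the host node).
      func findGuestNodeToDrain(ctx context.Context, guestClient kubernetes.Interface, vmiName string) (string, error) {
      	node, err := guestClient.CoreV1().Nodes().Get(ctx, vmiName, metav1.GetOptions{})
      	if apierrors.IsNotFound(err) {
      		// Node already gone; nothing to drain.
      		return "", nil
      	}
      	if err != nil {
      		return "", fmt.Errorf("unable to get node %q: %w", vmiName, err)
      	}
      	return node.Name, nil
      }

      func main() {
      	// Guest-cluster kubeconfig path is an assumption for this standalone sketch.
      	cfg, err := clientcmd.BuildConfigFromFlags("", "guest-kubeconfig")
      	if err != nil {
      		panic(err)
      	}
      	guestClient := kubernetes.NewForConfigOrDie(cfg)

      	name, err := findGuestNodeToDrain(context.Background(), guestClient, "hostedcluster-420-9cqbx-qm6z7")
      	if err != nil {
      		panic(err)
      	}
      	fmt.Println("guest node to drain:", name)
      }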

      Version-Release number of selected component (if applicable):

      OCP 4.20.4 (both clusters)
      CNV 4.20.1
      

      How reproducible:

      Always
      

      Steps to Reproduce:

      As above
      

      Actual results:

      Guest node is not drained
      

      Expected results:

      Guest node drains
      

        Assignee: Nahshon Unna Tsameret (nunnatsa)
        Reporter: Germano Veit Michel (rhn-support-gveitmic)
        QA Contact: Ying Zhou