  1. OpenShift Virtualization
  2. CNV-20830

[2121631] Hyper-V enlightenments on NodeSelector makes event message useless to customer if node has no kvm


    • Medium
    • None

      Description of problem:

      If the worker nodes lack vmx/svm and one launches a Windows VM, the virt-launcher pod fails to schedule with:

      status:
        conditions:
        - lastProbeTime: null
          lastTransitionTime: "2022-08-26T04:59:59Z"
          message: '0/7 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn''t tolerate, 4 node(s) didn''t match Pod''s node affinity/selector.'
          reason: Unschedulable
          status: "False"
          type: PodScheduled

      If it's a Linux VM, the message is actually useful to the customer:

      status:
        conditions:
        - lastProbeTime: null
          lastTransitionTime: "2022-08-26T05:09:52Z"
          message: '0/7 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn''t tolerate, 4 Insufficient devices.kubevirt.io/kvm.'
          reason: Unschedulable
          status: "False"
          type: PodScheduled

      This is because the nodeSelector of the Windows VM contains these labels:

      nodeSelector:
        hyperv.node.kubevirt.io/frequencies: "true"
        hyperv.node.kubevirt.io/ipi: "true"
        hyperv.node.kubevirt.io/reenlightenment: "true"
        hyperv.node.kubevirt.io/reset: "true"
        hyperv.node.kubevirt.io/runtime: "true"
        hyperv.node.kubevirt.io/synic: "true"
        hyperv.node.kubevirt.io/synictimer: "true"
        hyperv.node.kubevirt.io/tlbflush: "true"
        hyperv.node.kubevirt.io/vpindex: "true"
        kubevirt.io/schedulable: "true"

      These labels are not present on the node object if the node does not have virtualization.

      So the selector does not match the node, and scheduling fails before checking whether the kvm resource is available, which would in turn produce a useful error event.
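      The ordering described above can be illustrated with a small sketch. This is a simplified model of the behavior reported here, not the actual kube-scheduler code: the function names, the dictionary layout, and the assumption that the node carries kubevirt.io/schedulable but none of the hyperv.node.kubevirt.io labels are hypothetical, chosen only to reproduce the two messages shown earlier.

```python
KVM = "devices.kubevirt.io/kvm"


def first_failure(node, pod):
    """Return the first reason a node rejects a pod, or None if it fits.

    Simplified, assumed filter order: node selector first, then resources.
    """
    for key, value in pod["nodeSelector"].items():
        if node["labels"].get(key) != value:
            return "node(s) didn't match Pod's node affinity/selector"
    for resource, wanted in pod["requests"].items():
        if node["allocatable"].get(resource, 0) < wanted:
            return f"Insufficient {resource}"
    return None


def missing_labels(node, pod):
    """List the selector keys the node does not satisfy (the hidden detail)."""
    return sorted(
        key for key, value in pod["nodeSelector"].items()
        if node["labels"].get(key) != value
    )


# Hypothetical worker node without vmx/svm: no Hyper-V labels, no kvm device.
node = {"labels": {"kubevirt.io/schedulable": "true"}, "allocatable": {}}

# Assumed selectors: the Linux VM only needs a schedulable node,
# the Windows VM additionally requires Hyper-V feature labels.
linux_vm = {
    "nodeSelector": {"kubevirt.io/schedulable": "true"},
    "requests": {KVM: 1},
}
windows_vm = {
    "nodeSelector": {
        "hyperv.node.kubevirt.io/synic": "true",
        "kubevirt.io/schedulable": "true",
    },
    "requests": {KVM: 1},
}

print(first_failure(node, linux_vm))
print(first_failure(node, windows_vm))
print(missing_labels(node, windows_vm))
```

      In this model the Windows VM is rejected at the selector step, so the missing kvm resource never appears in the reason, whereas listing the unmatched selector keys would point the user at the real cause.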

      This sort of basic problem must be clear to the user with proper event messages.

      Version-Release number of selected component (if applicable):
      CNV 4.10.4
      OCP 4.10.26

      How reproducible:
      Always

      Steps to Reproduce:
      1. Disable VMX on Worker Nodes
      2. Start a Linux and a Windows VM
      3. Windows VM fails to start with unclear message

              sgott@redhat.com Stuart Gott
              rhn-support-gveitmic Germano Veit Michel
              Kedar Bidarkar Kedar Bidarkar