RHEL / RHEL-50000

scsi-block: Cannot setup Windows Failover Cluster, qemu crashes on assert [rhel-9.5]

    • Bug
    • Resolution: Done-Errata
    • Undefined
    • rhel-9.5
    • rhel-9.2.0.z, rhel-9.4
    • qemu-kvm / Storage
    • qemu-kvm-9.0.0-8.el9
    • None
    • Important
    • ZStream
    • rhel-sst-virtualization-storage
    • ssg_virtualization
    • 5
    • False
    • None
    • None
    • Red Hat Enterprise Linux, Red Hat OpenShift Virtualization
    • None
    • Approved Blocker
    • x86_64
    • None

      What were you trying to do that didn't work?

      The problem is potentially not limited to this scenario, but it was observed as follows:

      • In OpenShift Virtualization 4.16 (RHEL 9.4 based), set up two Windows 2019 virtual machines.
      • Present a shared LUN from the SAN to each VM, with reservation turned on:
      $ oc get vm windows-2019 -o yaml  | yq '.spec.template.spec.domain.devices.disks'
      [
      ...
        {
          "lun": {
            "bus": "scsi",
            "reservation": true
          },
          "name": "iscsi-pv"
        }
      ]
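
      When checking several VMs, the disk list above can be filtered for LUNs that have reservation enabled. The following is only a minimal sketch: the first entry in the sample is hypothetical and stands in for the disks elided ("...") above, and the helper name reservation_luns is made up for illustration.

```python
import json

# Sample shaped like the yq output above. The "rootdisk" entry is
# hypothetical; it stands in for the elided ("...") disks.
DISKS_JSON = """
[
  {"disk": {"bus": "virtio"}, "name": "rootdisk"},
  {"lun": {"bus": "scsi", "reservation": true}, "name": "iscsi-pv"}
]
"""


def reservation_luns(disks):
    """Return the names of disks defined as LUNs with reservation set to true."""
    return [
        d["name"]
        for d in disks
        if d.get("lun", {}).get("reservation") is True
    ]


names = reservation_luns(json.loads(DISKS_JSON))
print(names)
```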
      

       

      For this, KubeVirt generates the following XML:

      $ oc rsh virt-launcher-windows-2019-hd4h2 virsh dumpxml 1 --xpath //domain//devices//disk
      ...
      <disk type="block" device="lun">
        <driver name="qemu" type="raw" cache="none" error_policy="stop" io="native" discard="unmap"/>
        <source dev="/dev/iscsi-pv" index="1">
          <reservations managed="no">
            <source type="unix" path="/var/run/kubevirt/daemons/pr/pr-helper.sock" mode="client"/>
          </reservations>
        </source>
        <backingStore/>
        <target dev="sdb" bus="scsi"/>
        <alias name="ua-iscsi-pv"/>
        <address type="drive" controller="0" bus="0" target="0" unit="1"/>
      </disk>
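
      To confirm that a given VM is affected by this configuration, the disk element above can be inspected programmatically. This is a rough sketch using only the Python standard library; the function name describe_reservation_disk is an assumption for illustration, and the sample input is the XML shown above.

```python
import xml.etree.ElementTree as ET

# Disk element copied from the virsh dumpxml output above.
DISK_XML = """
<disk type="block" device="lun">
  <driver name="qemu" type="raw" cache="none" error_policy="stop" io="native" discard="unmap"/>
  <source dev="/dev/iscsi-pv" index="1">
    <reservations managed="no">
      <source type="unix" path="/var/run/kubevirt/daemons/pr/pr-helper.sock" mode="client"/>
    </reservations>
  </source>
  <backingStore/>
  <target dev="sdb" bus="scsi"/>
  <alias name="ua-iscsi-pv"/>
  <address type="drive" controller="0" bus="0" target="0" unit="1"/>
</disk>
"""


def describe_reservation_disk(disk_xml):
    """Summarize a disk's reservation settings.

    Returns None when the disk has no <reservations> element.
    """
    disk = ET.fromstring(disk_xml)
    source = disk.find("source")
    reservations = source.find("reservations") if source is not None else None
    if reservations is None:
        return None
    helper = reservations.find("source")
    target = disk.find("target")
    return {
        "device": disk.get("device"),
        "dev": source.get("dev"),
        "target": target.get("dev") if target is not None else None,
        "managed": reservations.get("managed"),
        "helper_socket": helper.get("path") if helper is not None else None,
    }


info = describe_reservation_disk(DISK_XML)
print(info)
```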
      

       

      Once both VMs are up and using the shared LUN, create a Failover Cluster and run the Validation Test. Make sure the disk is offline in the cluster; otherwise the test will not try to fail over the disk.

      When the test tries to fail over the disk, one of the VMs crashes because qemu-kvm fails an assertion, as shown in this strace output:

      341257 04:55:28.245249 write(2<pipe:[3066370]>, "qemu-kvm: ../hw/scsi/scsi-disk.c:558: void scsi_write_data(SCSIRequest *): Assertion `r->req.aiocb == NULL' failed.\n", 116) = 116 <0.000005>
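
      For triage across many logs, the assertion message above can be split into its parts. This is an illustrative sketch only: the regular expression assumes the glibc-style "program: file:line: function: Assertion `expr' failed." layout seen in the message above, and parse_assert is a hypothetical helper name.

```python
import re

# Matches the assertion layout seen in the strace output above.
ASSERT_RE = re.compile(
    r"^(?P<prog>[^:]+): (?P<file>[^:]+):(?P<line>\d+): (?P<func>.+): "
    r"Assertion `(?P<expr>.+)' failed\.$"
)

# Message text taken from the write(2, ...) call above, without the trailing newline.
MESSAGE = (
    "qemu-kvm: ../hw/scsi/scsi-disk.c:558: void scsi_write_data(SCSIRequest *): "
    "Assertion `r->req.aiocb == NULL' failed."
)


def parse_assert(message):
    """Return the assertion's parts as a dict, or None if the text does not match."""
    match = ASSERT_RE.match(message.strip())
    if match is None:
        return None
    parts = match.groupdict()
    parts["line"] = int(parts["line"])
    return parts


parsed = parse_assert(MESSAGE)
print(parsed)
```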

      A simpler way to reproduce this:

      1. Set up the cluster in the first VM and bring the disk online.
      2. In the second VM, try to bring the disk online (without any cluster setup).

      Please provide the package NVR for which the bug is seen:

      qemu-kvm-core-8.2.0-11.el9_4.3.x86_64

      How reproducible:

      Always

      Expected results

      The VMs stay up, the disk can fail over, and the validation test passes.

      Actual results

      The VMs crash; the cluster cannot be used and the disk cannot fail over.

              kwolf@redhat.com Kevin Wolf
              rhn-support-gveitmic Germano Veit Michel
              Paolo Bonzini
              virt-maint virt-maint
              qing wang qing wang