RHEL-151679

[rhel-9.8] Regression in BLOCK_IO_ERROR event delivery with (w|r)error setting of 'stop' or 'enospc' due to event rate limiting


    • Bug
    • Resolution: Unresolved
    • Major
    • rhel-9.8
    • rhel-9.8, rhel-10.2
    • qemu-kvm / Storage
    • Yes
    • Important
    • 1
    • rhel-virt-storage
    • VirtStorage Planning backlog
    • Proposed Exception

      As of qemu commit 2155d2dd7f733674586119b6b4ee0f52d2032779 :

      commit 2155d2dd7f733674586119b6b4ee0f52d2032779
      Author: Leonid Kaplan <xeor@yandex-team.ru>
      Date:   Wed Oct 2 18:18:06 2024 +0300
      
          block-backend: per-device throttling of BLOCK_IO_ERROR reports
          
          BLOCK_IO_ERROR events comes from guest, so we must throttle them.
          We still want per-device throttling, so let's use device id as a key.
         
      

      https://gitlab.com/qemu-project/qemu/-/commit/2155d2dd7f733674586119b6b4ee0f52d2032779

      QEMU rate-limits delivery of the BLOCK_IO_ERROR event. The general reasoning for the change is that the event can be triggered by the guest OS and could thus spam the logs on the host, which is obviously not good. When QEMU is set up to pause the guest on I/O errors, however, the pause already acts as a natural rate limit, because the error can't be re-triggered while the guest OS is paused. In addition, the rate limit can hide an important notification when the hypervisor side has attempted to rectify the source of the I/O error and let the VM continue: any further event within the same second is suppressed.

      Thus the limiting must apply only when the VM is not paused as a result of the I/O error, since only then can log spamming occur.
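
      The intended behaviour can be illustrated with a small model. This is only a sketch and not QEMU code: the class name, the method name and the fixed 1-second window are assumptions made for illustration, and suppression is modelled as simply dropping the event, which is a simplification. The 'stop' and 'report' strings correspond to the 'action' values of the BLOCK_IO_ERROR event.

```python
from dataclasses import dataclass, field


@dataclass
class IoErrorEventThrottle:
    """Toy model of per-device throttling of BLOCK_IO_ERROR-like events.

    Hypothetical helper for illustration only; not part of QEMU or libvirt.
    """

    interval: float = 1.0  # assumed throttling window in seconds
    _last_sent: dict = field(default_factory=dict)

    def should_emit(self, device_id: str, action: str, now: float) -> bool:
        # An error that stops the VM cannot be re-triggered by the paused
        # guest, so it is always delivered and never throttled.
        if action == "stop":
            return True
        # Other actions ('report', 'ignore') can be repeated by a running
        # guest, so they are limited per device.
        last = self._last_sent.get(device_id)
        if last is not None and now - last < self.interval:
            return False
        self._last_sent[device_id] = now
        return True


throttle = IoErrorEventThrottle()
first_report = throttle.should_emit("ua-pvc-disk", "report", now=0.0)
second_report = throttle.should_emit("ua-pvc-disk", "report", now=0.5)
stop_event = throttle.should_emit("ua-pvc-disk", "stop", now=0.6)
```

      In this model the second 'report' within the window is dropped, while the 'stop' event inside the same window is still delivered.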

      This regression was noticed by CNV's test suite, which uses the 'scsi-debug' driver to inject errors and then attempts to remove the error behaviour and continue the VM. The first attempt to fix the issue failed or was delayed, but the subsequent pause of the VM was not reported as an I/O error, which caused unexpected behaviour from other sources:

      2026-01-26 09:12:51.679+0000: event 'io-error' for domain 'kubevirt-test-default1_testvmi-fb92r': /dev/pvc-disk (ua-pvc-disk) pause
      2026-01-26 09:12:51.679+0000: event 'io-error' for domain 'kubevirt-test-default1_testvmi-fb92r': /dev/pvc-disk (ua-pvc-disk) pause due to message
      2026-01-26 09:12:51.679+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended I/O Error
      2026-01-26 09:12:51.698+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:51.720+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:51.738+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:51.761+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:51.769+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:51.797+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:51.804+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:51.836+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:51.844+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:51.876+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:51.883+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:51.916+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:51.924+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:51.952+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:51.959+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:51.989+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:51.996+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.022+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.029+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.051+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.059+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.081+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.088+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.115+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.123+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.151+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.158+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.189+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.196+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.223+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.230+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.256+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.263+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.290+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.297+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.326+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.333+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.356+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.362+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.391+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.397+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.427+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.437+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.463+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.480+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.506+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.513+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.539+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.545+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.572+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.581+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.611+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.618+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.645+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.653+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Resumed Unpaused
      2026-01-26 09:12:52.683+0000: event 'lifecycle' for domain 'kubevirt-test-default1_testvmi-fb92r': Suspended Paused
      2026-01-26 09:12:52.683+0000: event 'io-error' for domain 'kubevirt-test-default1_testvmi-fb92r': /dev/pvc-disk (ua-pvc-disk) pause
      2026-01-26 09:12:52.683+0000: event 'io-error' for domain 'kubevirt-test-default1_testvmi-fb92r': /dev/pvc-disk (ua-pvc-disk) pause due to message
      

      In the above excerpt, all the other 'Suspended Paused' state transitions were caused by the same I/O error, but without a notification.
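
      The mismatch can be quantified by comparing the number of 'Suspended' lifecycle events with the number of I/O error notifications. The following is a minimal sketch that assumes log lines in exactly the shape of the excerpt above; the function name and the sample data are hypothetical, and treating 'io-error' lines that share a timestamp as one notification is a heuristic.

```python
import re

# Matches lines shaped like the excerpt:
# <date> <time>: event '<kind>' for domain '<name>': <detail>
LINE_RE = re.compile(
    r"^(?P<ts>\S+ \S+): event '(?P<kind>[^']+)' "
    r"for domain '(?P<dom>[^']+)': (?P<detail>.*)$"
)


def summarize(lines):
    """Return (suspended_count, io_error_notifications)."""
    suspended = 0
    io_error_timestamps = set()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not an event line
        if match["kind"] == "lifecycle" and match["detail"].startswith("Suspended"):
            suspended += 1
        elif match["kind"] == "io-error":
            # Heuristic: 'io-error' lines with the same timestamp are
            # counted as a single notification.
            io_error_timestamps.add(match["ts"])
    return suspended, len(io_error_timestamps)


# Shortened sample with a placeholder domain name 'demo'.
SAMPLE = [
    "2026-01-26 09:12:51.679+0000: event 'io-error' for domain 'demo': /dev/pvc-disk (ua-pvc-disk) pause",
    "2026-01-26 09:12:51.679+0000: event 'io-error' for domain 'demo': /dev/pvc-disk (ua-pvc-disk) pause due to message",
    "2026-01-26 09:12:51.679+0000: event 'lifecycle' for domain 'demo': Suspended I/O Error",
    "2026-01-26 09:12:51.698+0000: event 'lifecycle' for domain 'demo': Resumed Unpaused",
    "2026-01-26 09:12:51.720+0000: event 'lifecycle' for domain 'demo': Suspended Paused",
]

suspended_count, io_error_count = summarize(SAMPLE)
```

      A suspended count that exceeds the I/O error count points at pauses that were not accompanied by a notification.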

              kwolf@redhat.com Kevin Wolf
              pkrempa@redhat.com Peter Krempa
              Kevin Wolf
              virt-maint virt-maint
              Tingting Mao Tingting Mao