RHEL-60030

Libvirt domstats returns incorrect iowait_sum value

    • Bug
    • Resolution: Unresolved
    • Undefined
    • rhel-9.6
    • rhel-9.4, rhel-9.4.z
    • libvirt
    • None
    • No
    • Moderate
    • ZStream
    • sst_virtualization
    • ssg_virtualization
    • 2
    • Dev ack
    • False
    • Approved Blocker
    • None
    • None
    • 10.9.0
    • None

      What were you trying to do that didn't work?

      In OpenShift Virtualization, I'm querying the metric 'kubevirt_vmi_vcpu_wait_seconds_total'. All values are 0.

      https://docs.openshift.com/container-platform/4.16/virt/monitoring/virt-prometheus-queries.html#virt-querying-metrics_virt-prometheus-queries

      When querying libvirt directly, I can confirm that the returned value is 0:

      # virsh -r domstats centos9 | grep wait
        vcpu.0.wait=0   <--
        vcpu.0.halt_wait_ns.sum=147337004989
        vcpu.1.wait=0   <--
        vcpu.1.halt_wait_ns.sum=163969618614
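
      To check many VMs for the same symptom, the per-vCPU wait values can be pulled out of the domstats text programmatically. The following is a minimal sketch, assuming the output has already been captured as a string; the function names parse_vcpu_wait and zero_wait_vcpus are hypothetical helpers, not part of libvirt or virsh.

```python
import re

# Matches lines such as "  vcpu.0.wait=0" in captured `virsh domstats` output.
# Lines like "vcpu.0.halt_wait_ns.sum=..." do not match this pattern.
VCPU_WAIT_RE = re.compile(r"^\s*vcpu\.(\d+)\.wait=(\d+)\b")


def parse_vcpu_wait(domstats_text):
    """Return {vcpu_index: wait_value} parsed from domstats output text."""
    waits = {}
    for line in domstats_text.splitlines():
        match = VCPU_WAIT_RE.match(line)
        if match:
            waits[int(match.group(1))] = int(match.group(2))
    return waits


def zero_wait_vcpus(domstats_text):
    """Return the sorted vCPU indexes whose reported wait value is exactly 0."""
    return sorted(i for i, v in parse_vcpu_wait(domstats_text).items() if v == 0)


# Sample taken from the output shown in this report.
SAMPLE = """\
  vcpu.0.wait=0
  vcpu.0.halt_wait_ns.sum=147337004989
  vcpu.1.wait=0
  vcpu.1.halt_wait_ns.sum=163969618614
"""

if __name__ == "__main__":
    print(zero_wait_vcpus(SAMPLE))
```

      A zero value on a freshly started or idle VM is not necessarily a fault, so a check like this is only meaningful on a VM that has been running under load.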

      What is the impact of this issue to you?

      Incorrect metrics for VMs.

      Please provide the package NVR for which the bug is seen:

      libvirt-10.0.0-6.7.el9_4.x86_64

      qemu-kvm-8.2.0-11.el9_4.6.x86_64

      kernel-5.14.0-427.37.1.el9_4.x86_64

      How reproducible is this bug?:

      100%

      Steps to reproduce

      1.  On a RHEL 9.4 hypervisor, add the kernel command-line option 'schedstats=enable' and reboot.
      2.  Run a VM and generate IO load.
      3.  Check the vCPU wait stats: "virsh -r domstats VM_NAME | grep wait"

      Expected results

      IO wait correctly reported.

      Actual results

      IO wait is reported as 0 by libvirt; however, the kernel reports a non-zero value in the 'wait_sum' stat of each vCPU thread:

      # cat /proc/cmdline 
      BOOT_IMAGE=(hd0,gpt3)/vmlinuz-5.14.0-427.37.1.el9_4.x86_64 root=UUID=aec1c1e8-3576-4eb2-ab62-f62984e655a2 console=tty0 console=ttyS0,115200n8 no_timer_check net.ifnames=0 crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M schedstats=enable
      
      # virsh -r  domstats centos9 |grep wait
        vcpu.0.wait=0
        vcpu.0.halt_wait_ns.sum=147337004989
        vcpu.1.wait=0
        vcpu.1.halt_wait_ns.sum=163969618614
      
      # pstree -t -p  4506
      qemu-kvm(4506)─┬─{CPU 0/KVM}(4518)
                     ├─{CPU 1/KVM}(4519)
                     ├─{IO mon_iothread}(4517)
                     ├─{qemu-kvm}(4513)
                     ├─{vnc_worker}(4521)
                     ├─{worker}(4514)
                     ├─{worker}(4573)
                     ├─{worker}(4575)
                     ├─{worker}(4576)
                     ├─{worker}(4577)
                     ├─{worker}(4578)
                     ├─{worker}(4579)
                     ├─{worker}(4580)
                     ├─{worker}(4581)
                     ├─{worker}(4582)
                     └─{worker}(4583)
      
      # grep wait /proc/4506/task/{4518,4519}/sched 
      /proc/4506/task/4518/sched:wait_start                                   :             0.000000
      /proc/4506/task/4518/sched:wait_max                                     :             4.657302
      /proc/4506/task/4518/sched:wait_sum                                     :           951.535557
      /proc/4506/task/4518/sched:wait_count                                   :               162006
      /proc/4506/task/4518/sched:iowait_sum                                   :             0.000000
      /proc/4506/task/4518/sched:iowait_count                                 :                    0
      /proc/4506/task/4519/sched:wait_start                                   :             0.000000
      /proc/4506/task/4519/sched:wait_max                                     :             3.558880
      /proc/4506/task/4519/sched:wait_sum                                     :           865.273765
      /proc/4506/task/4519/sched:wait_count                                   :               154549
      /proc/4506/task/4519/sched:iowait_sum                                   :             0.000000
      /proc/4506/task/4519/sched:iowait_count                                 :                    0
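
      To compare the kernel's figure with the libvirt one, the wait_sum field can be read from the raw contents of /proc/PID/task/TID/sched. The following is a minimal sketch under two assumptions that are not confirmed in this report: that the sched file prints the value in milliseconds with six fractional digits, and that some kernels prefix the field name (for example "se.statistics.wait_sum"). The helper names are hypothetical.

```python
def sched_ms_to_ns(value):
    """Convert a "<ms>.<6 digits>" string to integer nanoseconds.

    Assumption: the sched file prints milliseconds with a six-digit
    fractional part. Integer arithmetic avoids float rounding.
    """
    whole, _, frac = value.strip().partition(".")
    return int(whole) * 1_000_000 + int(frac.ljust(6, "0")[:6])


def wait_sum_ns(sched_text):
    """Return wait_sum in nanoseconds from raw sched file text, or None."""
    for line in sched_text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        # Accept a bare "wait_sum" (as in this report) and a dotted prefix
        # such as "se.statistics.wait_sum" (assumed for other kernels).
        # "iowait_sum" is deliberately not matched.
        if name.strip().rsplit(".", 1)[-1] == "wait_sum":
            return sched_ms_to_ns(value)
    return None


# Sample taken from the vCPU 0 thread (TID 4518) shown in this report,
# without the file-name prefix that grep adds.
SAMPLE = """\
wait_start                                   :             0.000000
wait_max                                     :             4.657302
wait_sum                                     :           951.535557
wait_count                                   :               162006
iowait_sum                                   :             0.000000
iowait_count                                 :                    0
"""

if __name__ == "__main__":
    print(wait_sum_ns(SAMPLE))
```

      If the assumptions hold, the value for TID 4518 corresponds to 951535557 ns, which is what one might expect vcpu.0.wait to approximate rather than 0.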

            mkletzan@redhat.com Martin Kletzander
            rhn-support-jortialc Juan Orti
            virt-maint virt-maint
            Nannan Li Nannan Li