OpenShift Virtualization / CNV-28152

[2187304] kubevirt_migrate_vmi_disk_transfer_rate_bytes is not reporting any value after VM migration

    • High
    • None

      +++ This bug was initially created as a clone of Bug #2168470 +++

      Description of problem:
      VM -> Metrics tab -> Migration: the "KV data transfer rate" graph is empty.

      Version-Release number of selected component (if applicable):

      How reproducible:

      Steps to Reproduce:
      1. Migrate a running VM.

      Actual results:

      Expected results:
      The "KV data transfer rate" graph is loaded with data.

      Additional info:

      — Additional comment from Phillip Bailey on 2023-04-12 13:15:05 UTC —

      @sradco@redhat.com The query[1] used to pull data for this chart isn't returning any data. When I tried looking the metric up in the Observe > Metrics screen, it doesn't show in the list, even though it's in the kubevirt metrics list here: https://github.com/kubevirt/kubevirt/blob/main/docs/metrics.md. Do you have any idea why this metric isn't showing up/returning any data?

      [1] sum(sum_over_time(kubevirt_migrate_vmi_disk_transfer_rate_bytes{name='${name}',namespace='${namespace}'}[${duration}])) BY (name, namespace)
      Code reference: https://github.com/kubevirt-ui/kubevirt-plugin/blob/bb33bb4ee1c90ad23d72776acebabc952860a5c2/src/utils/components/Charts/utils/queries.ts#L57
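
      For anyone reproducing this in Observe > Metrics, the template in [1] can be expanded by hand. Below is a minimal Go sketch that renders it for a single VM. The function name `diskTransferRateQuery` is hypothetical and not part of the plugin, which builds this string in TypeScript.

      ```go
      package main

      import "fmt"

      // diskTransferRateQuery is a hypothetical helper: it fills the
      // ${name}, ${namespace} and ${duration} placeholders of the query
      // quoted in [1] and returns the resulting PromQL expression.
      func diskTransferRateQuery(name, namespace, duration string) string {
      	return fmt.Sprintf(
      		"sum(sum_over_time(kubevirt_migrate_vmi_disk_transfer_rate_bytes{name='%s',namespace='%s'}[%s])) BY (name, namespace)",
      		name, namespace, duration,
      	)
      }
      ```

      Calling `diskTransferRateQuery("vm-a", "default", "30m")` yields an expression that can be pasted directly into the query box. The VM name, namespace and duration here are placeholder values.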

      — Additional comment from Itamar Holder on 2023-04-17 09:24:40 UTC —

      The code says [1]:
      ```
      if jobInfo.DiskBpsSet {
          metrics.pushCommonMetric(
              MigrateVmiDiskTransferRateMetricName,
              "The total VM data processed and migrated.",
              prometheus.GaugeValue,
              float64(jobInfo.DiskBps),
          )
      }
      ```

      It seems possible that jobInfo.DiskBpsSet is set to false.
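
      To illustrate why a false flag would make the series disappear rather than report zero, here is a self-contained sketch of the same guard. The `jobInfo` and `sample` types and the `collect` function are hypothetical stand-ins and not the KubeVirt or libvirt types. Only the two field names and the metric name are taken from this report.

      ```go
      package main

      // jobInfo is a hypothetical stand-in that mirrors only the two
      // fields used by the guard in the quoted snippet.
      type jobInfo struct {
      	DiskBpsSet bool
      	DiskBps    uint64
      }

      // sample is a hypothetical stand-in for one pushed gauge sample.
      type sample struct {
      	Name  string
      	Value float64
      }

      const migrateVmiDiskTransferRateMetricName = "kubevirt_migrate_vmi_disk_transfer_rate_bytes"

      // collect applies the same condition as the quoted code: a sample
      // is produced only when DiskBpsSet is true. Otherwise nothing is
      // emitted at all, so the metric is missing instead of being 0.
      func collect(info jobInfo) []sample {
      	var out []sample
      	if info.DiskBpsSet {
      		out = append(out, sample{
      			Name:  migrateVmiDiskTransferRateMetricName,
      			Value: float64(info.DiskBps),
      		})
      	}
      	return out
      }
      ```

      With `DiskBpsSet` false, `collect` returns an empty slice, which would match the symptom that the metric does not even appear in the Observe > Metrics list.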

      Libvirt docs say [2]:
      ```
      virDomainGetJobStats field: Present only in statistics for a completed job. Optional error message for a failed job.
      ```

      Now, our code calls `stats, err = dom.GetJobStats(0)` [3], that is, it passes 0 as the flags argument. One of the available flags is DOMAIN_JOB_STATS_COMPLETED. That makes me wonder: maybe the stat is empty because we don't pass the completed flag once the migration has completed?

      @bodnopoz@redhat.com, as you are the one who implemented the PR [4], can you confirm that you've tested this metric locally? If so, how did you test it?

      Thanks,
      Itamar.

      [1] https://github.com/kubevirt/kubevirt/blob/v0.60.0-alpha.0/pkg/monitoring/domainstats/prometheus/prometheus.go#L115
      [2] https://libvirt.org/html/libvirt-libvirt-domain.html
      [3] https://github.com/kubevirt/kubevirt/blob/v0.60.0-alpha.0/pkg/virt-launcher/virtwrap/live-migration-source.go#L719
      [4] https://github.com/kubevirt/kubevirt/pull/7946

              dholler@redhat.com Dominik Holler
              sradco Shirly Radco