OpenShift Virtualization / CNV-81061

[test-upgrade-cnv-to-4.16.z] TestUpgradeIUO::test_alerts_fired_during_upgrade - OutdatedVirtualMachineInstanceWorkloads alert fired repeatedly


    • Quality / Stability / Reliability
    • CNV I/U Operators Sprint 285

      The test `TestUpgradeIUO::test_alerts_fired_during_upgrade` is consistently failing in the CNV 4.16 Z-stream upgrade job across multiple builds.

      Failing test

      tests/install_upgrade_operators/product_upgrade/test_upgrade_iuo.py::TestUpgradeIUO::test_alerts_fired_during_upgrade
      

      Error

      AssertionError: Following alerts were fired during upgrade:
      [{'labels': {'alertname': 'OutdatedVirtualMachineInstanceWorkloads', 'severity': 'warning', 'operator_health_impact': 'none', 'namespace': 'openshift-cnv', 'pod': 'virt-controller-...', 'job': 'kubevirt-prometheus-metrics', ...},
        'annotations': {'summary': 'Some running VMIs are still active in outdated pods after KubeVirt control plane update has completed.',
                         'runbook_url': 'https://github.com/openshift/runbooks/blob/master/alerts/openshift-virtualization-operator/OutdatedVirtualMachineInstanceWorkloads.md'},
        'state': 'pending'}]
      

      Root cause (observed)
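      During the CNV Z-stream upgrade (4.16.z -> 4.16.30), the `OutdatedVirtualMachineInstanceWorkloads` alert is raised in state `pending`. Per its summary, the alert means that some running VMIs are still active in outdated pods (virt-launcher pods, where VMI workloads run) after the KubeVirt control plane update has completed. The `pod` label in the payload names the virt-controller pod that reports the metric, not the outdated pod. The test asserts that no such alert is raised during the upgrade, and the failure shows that a `pending` alert is enough to trip it.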
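
To make the `pending` vs. `firing` distinction concrete when triaging, here is a minimal sketch that groups alerts of one name by state. In Prometheus, `pending` means the alert expression is currently true but its `for` duration has not yet elapsed. The helper name `alerts_by_state` and the sample payload are hypothetical: the payload is trimmed from the error above, and this is not the test suite's own implementation.

```python
def alerts_by_state(alerts, alert_name):
    """Group alerts matching alert_name by their 'state' field.

    `alerts` is a list of dicts shaped like the entries in the
    AssertionError above (a 'labels' dict plus a 'state' string).
    """
    grouped = {}
    for alert in alerts:
        # Skip alerts with a different (or missing) alertname label.
        if alert.get("labels", {}).get("alertname") != alert_name:
            continue
        # Alerts without a 'state' key are collected under 'unknown'.
        grouped.setdefault(alert.get("state", "unknown"), []).append(alert)
    return grouped


# Hypothetical sample, trimmed from the error output in this ticket.
sample_alerts = [
    {
        "labels": {
            "alertname": "OutdatedVirtualMachineInstanceWorkloads",
            "severity": "warning",
            "operator_health_impact": "none",
            "namespace": "openshift-cnv",
        },
        "state": "pending",
    },
]

grouped = alerts_by_state(sample_alerts, "OutdatedVirtualMachineInstanceWorkloads")
print({state: len(items) for state, items in grouped.items()})  # {'pending': 1}
```

Splitting by state shows whether the alert ever reached `firing` during the upgrade window or only stayed `pending`, which is relevant to deciding whether the product or the test's expectation needs to change.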

      Affected builds (CNV 4.16 Z-stream, identical failure in both)

      • Build #244 (2026-02-20) - Results: 40 Passed, 1 Failed, 2 Skipped
      • Build #245 (2026-02-27) - Results: 40 Passed, 1 Failed, 2 Skipped

      Environment

      • CNV version: 4.16.30 (hco-bundle-registry-container-v4.16.30.rhel9-42)
      • OCP version: 4.16.57
      • IIB: registry-proxy.engineering.redhat.com/rh-osbs/iib:1092425
      • Cluster: cnv416z-upg.rhos-psi.cnv-qe.rhood.us
      • Storage: ocs-storagecluster-ceph-rbd-virtualization
      • Errata: 157271 (SHIPPED_LIVE)

      Impact
      This is a T2 upgrade test. The same test is the only failure in each of two consecutive weekly CNV 4.16 upgrade runs. The result is consistent across different cluster instances (different cluster IDs), which indicates a reproducible issue rather than a flake.

      Next steps

              rh-ee-orevah Ohad Revah
              rh-ee-msedlak Miroslav Sedlak
              Ohad Revah Ohad Revah