  1. OpenShift Virtualization
  2. CNV-72688

Add pre-test validation: fail fast if cluster is already tainted


    • CNV I/U Operators Sprint 280

      Problem

      Crypto policy and JSON patch tests consistently fail in teardown when the HyperConverged CR carries pre-existing jsonpatch annotations that the tests themselves did not create.

      Affected Tests (9)

      Crypto Policy Tests:

      • test_update_specific_component_crypto_policy - CDI, CNAO, KubeVirt, SSP variants (4 tests)

      JSON Patch Alert Tests:

      • test_cdi_json_patch_alert
      • test_cnao_json_patch_alert
      • test_kubevirt_json_patch_alert
      • test_multiple_json_patch_alert
      • test_ssp_json_patch_alert

      Root Cause

      Tests assume a clean cluster state but do not validate it before execution. When the HyperConverged CR already carries jsonpatch annotations (from previous tests, manual changes, or incomplete cleanup), HCO reports the TaintedConfiguration condition before any test runs.

      Scenario:

      1. Cluster has pre-existing annotation on HyperConverged CR
      2. HCO status shows TaintedConfiguration since before tests started
      3. Test adds its own jsonpatch annotation - cluster remains tainted (expected)
      4. Test removes its own annotation in teardown - cluster STILL tainted (pre-existing annotation remains)
      5. Teardown assertion fails - ERROR is reported
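
      The scenario above can be reproduced with a small model that needs no cluster. This is a sketch only: it assumes the taint is driven by any annotation key ending in `/jsonpatch`, and the annotation keys and values shown are illustrative placeholders, not the ones from the failing job.

```python
JSONPATCH_SUFFIX = "/jsonpatch"  # assumption: keys with this suffix taint the cluster


def is_tainted(annotations):
    """Model of TaintedConfiguration: True if any jsonpatch annotation exists."""
    return any(key.endswith(JSONPATCH_SUFFIX) for key in annotations)


def run_test_with_teardown_check(annotations, test_key, test_patch):
    """Add the test's annotation, remove it in teardown, report if the cluster ends clean."""
    annotations[test_key] = test_patch   # step 3: test adds its own annotation
    assert is_tainted(annotations)       # cluster is tainted, as expected
    del annotations[test_key]            # step 4: teardown removes only its own annotation
    return not is_tainted(annotations)   # step 5: the teardown assertion


# Steps 1-2: a pre-existing annotation (placeholder value) taints the cluster up front.
pre_tainted = {"kubevirt.kubevirt.io/jsonpatch": "<pre-existing patch>"}
clean = {}

teardown_ok_on_pre_tainted = run_test_with_teardown_check(
    pre_tainted, "ssp.kubevirt.io/jsonpatch", "<test patch>"
)
teardown_ok_on_clean = run_test_with_teardown_check(
    clean, "ssp.kubevirt.io/jsonpatch", "<test patch>"
)
print(teardown_ok_on_pre_tainted, teardown_ok_on_clean)  # False True
```

      The test logic is identical in both runs; only the starting state differs, which is why the failure shows up in teardown rather than in the test body.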

      Evidence

      Jenkins Job:
      test-pytest-cnv-4.20-iuo-ocs Build #34

      ReportPortal:
      Launch 146729

      Must-gather Analysis:

      • HyperConverged CR had pre-existing jsonpatch annotation for migrations configuration
      • Annotation content: allowPostCopy set to true via jsonpatch
      • Annotation timestamp: 2025-11-14 06:13:05 UTC
      • Tests started: 2025-11-14 08:15:35 UTC (about 2 hours after the taint was set)
      • All 9 affected tests failed in teardown with identical error
      • Tests themselves PASSED - functionality works, only cleanup verification failed
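
      The gap between the two timestamps above can be checked directly. The snippet uses only the values quoted in the must-gather analysis; the variable names are illustrative.

```python
from datetime import datetime, timezone

FMT = "%Y-%m-%d %H:%M:%S"

# Timestamps copied from the must-gather analysis (both UTC).
annotation_set = datetime.strptime("2025-11-14 06:13:05", FMT).replace(tzinfo=timezone.utc)
tests_started = datetime.strptime("2025-11-14 08:15:35", FMT).replace(tzinfo=timezone.utc)

gap = tests_started - annotation_set
taint_predates_tests = gap.total_seconds() > 0

print(gap)                   # 2:02:30
print(taint_predates_tests)  # True
```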

      Error Pattern:

              AssertionError: assert not [TaintedConfiguration condition]
              lastTransitionTime: 2025-11-14T06:13:05Z
              message: Unsupported feature was activated via an HCO annotation
              reason: UnsupportedFeatureAnnotation
              status: True
              type: TaintedConfiguration
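
      The assertion in this error pattern fails because a list of matching conditions is non-empty. A minimal sketch of that kind of check, assuming the conditions are available as plain dictionaries (the helper name and the extra `Available` entry are illustrative, not taken from the test code):

```python
def tainted_conditions(conditions):
    """Return the conditions that mark the configuration as tainted."""
    return [
        condition
        for condition in conditions
        if condition.get("type") == "TaintedConfiguration"
        and condition.get("status") == "True"  # Kubernetes condition status is a string
    ]


conditions = [
    {"type": "Available", "status": "True"},  # illustrative unrelated condition
    {
        "lastTransitionTime": "2025-11-14T06:13:05Z",
        "message": "Unsupported feature was activated via an HCO annotation",
        "reason": "UnsupportedFeatureAnnotation",
        "status": "True",
        "type": "TaintedConfiguration",
    },
]

found = tainted_conditions(conditions)
print(len(found))                      # 1
print(found[0]["lastTransitionTime"])  # 2025-11-14T06:13:05Z
```

      The `lastTransitionTime` is the useful field here: a value earlier than the test start time indicates the taint was not introduced by the test.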
              

      Proposed Fix

      Alternative Implementation (simpler):

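              import pytest
              from utilities.hco import is_hco_tainted


              @pytest.fixture(scope="session", autouse=True)
              def validate_cluster_not_tainted_before_tests(admin_client, hco_namespace):
                  """Fail fast if cluster is already tainted before jsonpatch tests run."""
                  # Expects is_hco_tainted to return a truthy value when the HyperConverged CR
                  # reports the TaintedConfiguration condition.
                  if is_hco_tainted(admin_client=admin_client, hco_namespace=hco_namespace.name):
                      # pytest.fail inside a fixture surfaces as a setup error, before any test body runs.
                      pytest.fail(
                          "Cluster is already in TaintedConfiguration state. "
                          "Clean jsonpatch annotations from HyperConverged CR before running tests. "
                          # Use the fixture's namespace instead of a hard-coded one.
                          f"Inspect with: oc get hyperconverged -n {hco_namespace.name} kubevirt-hyperconverged -o yaml"
                      )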
              

      Validation Steps

      1. Run affected tests on cluster with pre-existing jsonpatch annotation
      2. Verify tests fail IMMEDIATELY in setup with clear error message
      3. Error should list which annotations need to be cleaned
      4. Clean annotations manually, re-run tests - should pass
      5. Run on clean cluster - should pass as before
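
      Step 3 expects the error to name the annotations that need cleaning. One possible way to build such a message is sketched below. It is not part of the proposed fixture: the helper names are hypothetical, the `/jsonpatch` key suffix is an assumption about how the taint is triggered, and the annotation values are placeholders.

```python
def find_jsonpatch_annotations(annotations):
    """Return annotations whose key ends with '/jsonpatch' (assumed taint source)."""
    return {
        key: value
        for key, value in (annotations or {}).items()
        if key.endswith("/jsonpatch")
    }


def build_taint_failure_message(annotations, namespace, name):
    """Compose a failure message that lists every offending annotation key."""
    offending = find_jsonpatch_annotations(annotations)
    listed = "\n".join(f"  - {key}" for key in sorted(offending))
    if not listed:
        listed = "  (no jsonpatch annotations found)"
    return (
        "Cluster is already in TaintedConfiguration state.\n"
        "Annotations to clean from the HyperConverged CR:\n"
        f"{listed}\n"
        f"Inspect with: oc get hyperconverged -n {namespace} {name} -o yaml"
    )


# Illustrative metadata.annotations content; values are placeholders.
annotations = {
    "kubevirt.kubevirt.io/jsonpatch": "<patch body>",
    "example.com/owner": "qe",
}
message = build_taint_failure_message(
    annotations, namespace="openshift-cnv", name="kubevirt-hyperconverged"
)
print(message)
```

      In a fixture, the annotations would come from the HyperConverged CR's metadata and the resulting string would be passed to `pytest.fail`.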

      Benefits

      • Fail fast - Tests fail at setup (seconds) instead of teardown (after 30+ minutes)
      • Clear error message - Developers know immediately what is wrong and how to fix it
      • Prevents false failures - No more mysterious teardown failures from pre-existing state
      • Better CI hygiene - Forces clean cluster state for jsonpatch tests

      Impact

      • Frequency: Recurring (9 tests failed in build #34 with the same root cause)
      • Jobs affected: test-pytest-cnv-4.20-iuo-ocs (TIER-2)
      • Classification: Test improvement (not product bug)
      • Test execution time saved: 30+ minutes per run when cluster is pre-tainted

              rlobillo Ramón Lobillo