  1. OpenShift Virtualization
  2. CNV-70614

[stabilization] Must-gather on specific worker killed in non-compact clusters

    • Quality / Stability / Reliability

      The test 'test_must_gather_and_vm_same_node' is permafailing when it's run on a non-compact cluster (3 masters + 3 workers).

      The reason is that the must-gather pod cannot be scheduled: it has a nodeAffinity that restricts it to the master nodes:

      spec:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                  - c01-rlobillo-420-qxhkq-master-0
                  - c01-rlobillo-420-qxhkq-master-2
                  - c01-rlobillo-420-qxhkq-master-1
      

      and its nodeName attribute is set to the worker where the VM is running. The two constraints contradict each other, so the pod is never scheduled and is killed with exit code 137:

      2025-10-08T19:13:11.051651 utilities.must_gather ERROR must-gather raised the following error: openshift-must-gather-pmr5w/must-gather-v2xqw unexpectedly terminated: exit code: 137, reason: ContainerStatusUnknown, message: The container could not be located when the pod was terminated
      

      The test passed until https://github.com/RedHatQE/openshift-virtualization-tests/pull/705 fixed it; since that change it has been failing on non-compact clusters.

      Solution to be decided.

              rlobillo Ramón Lobillo