OpenShift Bugs / OCPBUGS-60649

[HyperShift] TestUpgradeControlPlane EnsureNoCrashingPods test failing too often


    • Quality / Stability / Reliability
    • Approved

      (Feel free to update this bug's summary to be more specific.)
      Component Readiness has found a potential regression in the following test:

      TestUpgradeControlPlane

      Test has a 91.88% pass rate, but 95.00% is required.

      Sample (being evaluated) Release: 4.20
      Start Time: 2025-07-22T00:00:00Z
      End Time: 2025-08-19T12:00:00Z
      Success Rate: 91.88%
      Successes: 181
      Failures: 16
      Flakes: 0
      Base (historical) Release: 4.19
      Start Time: 2025-05-18T00:00:00Z
      End Time: 2025-06-17T23:59:59Z
      Success Rate: 0.00%
      Successes: 0
      Failures: 0
      Flakes: 0
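
The pass-rate figures above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the Component Readiness implementation: the function names `passRate` and `meetsThreshold` are hypothetical, and flakes are left out of the calculation because both samples report zero of them.

```go
package main

import "fmt"

// passRate returns the percentage of successful runs.
// It returns 0 when there are no runs at all, which is how an
// empty sample such as the 4.19 base ends up displayed as 0.00%.
func passRate(successes, failures int) float64 {
	total := successes + failures
	if total == 0 {
		return 0
	}
	return 100 * float64(successes) / float64(total)
}

// meetsThreshold reports whether the pass rate is at or above
// the required percentage.
func meetsThreshold(successes, failures int, required float64) bool {
	return passRate(successes, failures) >= required
}

func main() {
	const required = 95.0 // required pass rate quoted in the report

	// Sample release 4.20: 181 successes, 16 failures.
	fmt.Printf("sample: %.2f%% (required %.2f%%), ok=%v\n",
		passRate(181, 16), required, meetsThreshold(181, 16, required))

	// Base release 4.19: no runs recorded.
	fmt.Printf("base: %.2f%%\n", passRate(0, 0))
}
```

With 181 successes out of 197 runs the sample comes to 91.88%, below the 95.00% requirement. The base release has no recorded runs, so its 0.00% reflects missing data rather than failures.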

      View the test details report for additional context.

      Also hits additional tests:
      TestUpgradeControlPlane/ValidateHostedCluster/EnsureNoCrashingPods
      TestUpgradeControlPlane/ValidateHostedCluster

      All three fail together. Perhaps EnsureNoCrashingPods is the root failure, and it bubbles up to the two parent tests?

      Failure message:

      {Failed  === RUN   TestCreateCluster/ValidateHostedCluster/EnsureNoCrashingPods
          util.go:255: Successfully waited for kubeconfig to be published for HostedCluster e2e-clusters-vvlnz/create-cluster-rgdd6 in 25ms
          util.go:272: Successfully waited for kubeconfig secret to have data in 0s
          util.go:717: Container manager in pod capi-provider-6c49899869-8wt2v has a restartCount > 0 (2)
              --- FAIL: TestCreateCluster/ValidateHostedCluster/EnsureNoCrashingPods (0.07s)
      }
      

      Based on the failure message, I'm a little curious whether the test is valid: are we sure the restarts imply a crash?
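
One way to probe that question is to look at how the container last terminated instead of only at its restart count. The sketch below is an assumption-laden illustration, not the check in `util.go`: the struct types are simplified local stand-ins for the `restartCount` and `lastState.terminated` fields of a Kubernetes container status, `looksLikeCrash` is a hypothetical heuristic, and the exit codes in `main` are invented for illustration because the failure message does not report them.

```go
package main

import "fmt"

// terminated is a simplified stand-in for the terminated state
// recorded for a container's previous run.
type terminated struct {
	ExitCode int32
	Reason   string
}

// containerStatus is a simplified stand-in for a container status.
type containerStatus struct {
	Name            string
	RestartCount    int32
	LastTermination *terminated // nil when no previous termination is recorded
}

// looksLikeCrash is a hypothetical heuristic: a restart is treated as a
// crash only when a previous termination is recorded with a non-zero
// exit code. A clean exit (code 0) followed by a restart is not counted.
func looksLikeCrash(cs containerStatus) bool {
	if cs.RestartCount == 0 || cs.LastTermination == nil {
		return false
	}
	return cs.LastTermination.ExitCode != 0
}

func main() {
	// Restart count taken from the failure message; exit codes are invented.
	cleanExit := containerStatus{
		Name:            "manager",
		RestartCount:    2,
		LastTermination: &terminated{ExitCode: 0, Reason: "Completed"},
	}
	errorExit := containerStatus{
		Name:            "manager",
		RestartCount:    2,
		LastTermination: &terminated{ExitCode: 1, Reason: "Error"},
	}

	fmt.Println(looksLikeCrash(cleanExit)) // false
	fmt.Println(looksLikeCrash(errorExit)) // true
}
```

Whether a non-zero exit code is the right dividing line is itself a judgment call. A container stopped by a signal also exits non-zero, so a real check would probably need to consider the termination reason as well.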

      In any case, this is showing as a new test at least for this variant combo, and thus it needs a 95% pass rate.

      This job is blocking on CI payloads, so I don't think we can just omit it the way we did with the recent batch of unmaintained jobs.

      Filed by: dgoodwin@redhat.com

              sjenning Seth Jennings
              openshift-trt OpenShift Technical Release Team
              Jim Ma Jim Ma