OpenShift Bugs / OCPBUGS-66149

[FLAKE] Test "add configurable terminationGracePeriod to liveness and startup probes" fails randomly


    • Bug
    • Resolution: Won't Do
    • Normal
    • 4.19
    • Test Framework
    • Moderate

      This is a clone of issue OCPBUGS-53298. The following is the description of the original issue:

      Description of problem:

          The test "[sig-node] NODE Probe feature Author:minmli-High-44493-add configurable terminationGracePeriod to liveness and startup probes" fails intermittently because it expects Pods to terminate within a fixed window: the terminationGracePeriod in seconds, plus or minus 3 seconds.

      Version-Release number of selected component (if applicable):

          4.19

      How reproducible:

          Reproduced with the Cilium network stack (instead of OVNKubernetes) in this job run: https://qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gs/qe-private-deck/pr-logs/pull/openshift_release/62770/rehearse-62770-periodic-ci-openshift-openshift-tests-private-release-4.19-amd64-nightly-aws-ipi-cilium-hypershift-guest-f7/1901568871366660096, which ran tests from this pull request: https://github.com/openshift/release/pull/62770

      Steps to Reproduce:

      Run this command against an OpenShift cluster with the Cilium network stack:

      > extended-platform-tests run-test "[sig-node] NODE Probe feature Author:minmli-High-44493-add configurable terminationGracePeriod to liveness and startup probes"

      Actual results:

      Test fails with:

          {  fail [github.com/openshift/openshift-tests-private/test/extended/util/assert.go:30]: Unexpected error:
          <*errors.errorString | 0xc001d430c0>: 
          case: [sig-node] NODE Probe feature Author:minmli-High-44493-add configurable terminationGracePeriod to liveness and startup probes
          error: probe terminationGracePeriod is not as expected!
          {
              s: "case: [sig-node] NODE Probe feature Author:minmli-High-44493-add configurable terminationGracePeriod to liveness and startup probes\nerror: probe terminationGracePeriod is not as expected!",
          }
      occurred}

      The logs show the following entries, where timeSec1=90 and timeSec2=25. These are the timestamps for killing the container and starting it again.

          I0317 12:01:05.301792 42297 node_utils.go:1704] time1Min:, timeTemp:, time1Sec:90, time1MinInt:0, time1SecInt:90
          I0317 12:01:05.301806 42297 node_utils.go:1706] timeSec1: 90 
          I0317 12:01:05.301828 42297 node_utils.go:1731] time2Min:, time2Sec:25, time2MinInt:0, time2SecInt:25
          I0317 12:01:05.301843 42297 node_utils.go:1733] timeSec2: 25 
          I0317 12:01:05.301861 42297 node_utils.go:1739] terminationGracePeriod check failed 

      The test expects the Pod to be started again after between 57 and 63 seconds, because it applies a tolerance of ±3 seconds around the default grace period. In this case the Pod was started after 65 seconds, which exceeds the upper bound, so the test fails. I was not able to reproduce this locally; it depends on the environment.
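
      The check described above can be sketched in Go as follows. This is only an illustration of the arithmetic, not the code from node_utils.go: the helper names withinTolerance and absDiff are hypothetical, the 60-second grace period is inferred from the 57 to 63 second window, and taking the absolute difference of timeSec1 and timeSec2 is an assumption made to match the 65 seconds reported here.

```go
package main

import "fmt"

// withinTolerance reports whether elapsed lies in the closed interval
// [expected-tol, expected+tol]. Hypothetical helper, not from the test suite.
func withinTolerance(elapsed, expected, tol int) bool {
	return elapsed >= expected-tol && elapsed <= expected+tol
}

// absDiff returns |a-b|. Assumption: the elapsed time is the absolute
// difference of the two logged values.
func absDiff(a, b int) int {
	if a > b {
		return a - b
	}
	return b - a
}

func main() {
	const gracePeriod = 60 // seconds; inferred from the 57..63 window
	const tolerance = 3    // seconds; the +-3 tolerance described above

	elapsed := absDiff(90, 25) // timeSec1 and timeSec2 from the log
	fmt.Println(elapsed, withinTolerance(elapsed, gracePeriod, tolerance))
	// prints: 65 false
}
```

      With these values the elapsed time is 65 seconds, 2 seconds past the upper bound of 63, which is consistent with the "terminationGracePeriod check failed" log line.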

      Expected results:

          The test passes and is more robust.

      Additional info:

          The source code for the test can be found here: https://github.com/openshift/openshift-tests-private/blob/master/test/extended/node/probe.go#L72

              rhn-engineering-dgoodwin Devan Goodwin
              mgencur@redhat.com Martin Gencur