Bug
Resolution: Won't Do
Normal
None
4.19
None
This is a clone of issue OCPBUGS-53298. The following is the description of the original issue:
—
Description of problem:
The test "[sig-node] NODE Probe feature Author:minmli-High-44493-add configurable terminationGracePeriod to liveness and startup probes" fails intermittently because it expects Pods to terminate within a fixed window: the terminationGracePeriod in seconds, plus or minus 3 seconds.
Version-Release number of selected component (if applicable):
4.19
How reproducible:
Reproduced with Cilium network stack (instead of OVNKubernetes) in this job run: https://qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gs/qe-private-deck/pr-logs/pull/openshift_release/62770/rehearse-62770-periodic-ci-openshift-openshift-tests-private-release-4.19-amd64-nightly-aws-ipi-cilium-hypershift-guest-f7/1901568871366660096 that ran tests from this pull request: https://github.com/openshift/release/pull/62770
Steps to Reproduce:
Run this command against an OpenShift cluster with Cilium network stack:
> extended-platform-tests run-test "[sig-node] NODE Probe feature Author:minmli-High-44493-add configurable terminationGracePeriod to liveness and startup probes"
Actual results:
Test fails with:
{ fail [github.com/openshift/openshift-tests-private/test/extended/util/assert.go:30]: Unexpected error:
<*errors.errorString | 0xc001d430c0>:
case: [sig-node] NODE Probe feature Author:minmli-High-44493-add configurable terminationGracePeriod to liveness and startup probes
error: probe terminationGracePeriod is not as expected!
{
s: "case: [sig-node] NODE Probe feature Author:minmli-High-44493-add configurable terminationGracePeriod to liveness and startup probes\nerror: probe terminationGracePeriod is not as expected!",
}
occurred}
The logs show timeSec1=90 and timeSec2=25. These are the timestamps for killing the container and starting it again:
I0317 12:01:05.301792 42297 node_utils.go:1704] time1Min:, timeTemp:, time1Sec:90, time1MinInt:0, time1SecInt:90
I0317 12:01:05.301806 42297 node_utils.go:1706] timeSec1: 90
I0317 12:01:05.301828 42297 node_utils.go:1731] time2Min:, time2Sec:25, time2MinInt:0, time2SecInt:25
I0317 12:01:05.301843 42297 node_utils.go:1733] timeSec2: 25
I0317 12:01:05.301861 42297 node_utils.go:1739] terminationGracePeriod check failed
The test expects the Pod to be started again between 57 and 63 seconds after the container is killed, because it applies a tolerance of +-3 seconds to the default grace period. In this run the Pod was started after 65 seconds, which exceeds the upper bound, so the test fails. I was not able to reproduce this locally; it appears to depend on the environment.
Expected results:
The test passes and is more robust.
Additional info:
The source code for the test can be found here: https://github.com/openshift/openshift-tests-private/blob/master/test/extended/node/probe.go#L72
- clones: OCPBUGS-53298 [FLAKE] Test "add configurable terminationGracePeriod to liveness and startup probes" fails randomly (Verified)
- is blocked by: OCPBUGS-53298 [FLAKE] Test "add configurable terminationGracePeriod to liveness and startup probes" fails randomly (Verified)