-
Bug
-
Resolution: Done
-
Normal
-
None
-
4.18
-
Quality / Stability / Reliability
-
False
-
-
1
Description of problem:
During upgrade testing with OCL enabled, we discovered that MCD pods were being deleted and recreated pathologically. Originally, I thought the cause might be https://issues.redhat.com/browse/OCPBUGS-42695, since that bug was causing unnecessary restarts of the MCD deployment. However, the issue persists even though the fix for that bug has landed.
Version-Release number of selected component (if applicable):
How reproducible:
Always in the e2e-aws-ovn-upgrade-ocb job that we intend to use to test OCL upgrade flows. Here's a rehearsal run where this failure occurs: https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_release/58241/rehearse-58241-pull-ci-openshift-machine-config-operator-master-e2e-aws-ovn-upgrade-ocb/1851035793171156992
Steps to Reproduce:
Run the e2e-aws-ovn-upgrade-ocb CI job mentioned above.
Actual results:
The test "[sig-arch] events should not repeat pathologically for ns/openshift-machine-config-operator" fails with the following output:
{ 8 events happened too frequently
event happened 32 times, something is wrong: namespace/openshift-machine-config-operator daemonset/machine-config-daemon hmsg/d8663bab95 - reason/SuccessfulDelete (combined from similar events): Deleted pod: machine-config-daemon-hqrq7 (00:44:33Z) result=reject
event happened 74 times, something is wrong: namespace/openshift-machine-config-operator daemonset/machine-config-daemon hmsg/d705389d90 - reason/SuccessfulDelete (combined from similar events): Deleted pod: machine-config-daemon-njl7w (00:49:22Z) result=reject
event happened 116 times, something is wrong: namespace/openshift-machine-config-operator daemonset/machine-config-daemon hmsg/86735cbecb - reason/SuccessfulDelete (combined from similar events): Deleted pod: machine-config-daemon-fbfmv (00:54:27Z) result=reject
event happened 188 times, something is wrong: namespace/openshift-machine-config-operator daemonset/machine-config-daemon hmsg/dc2da581d9 - reason/SuccessfulDelete (combined from similar events): Deleted pod: machine-config-daemon-4sczr (00:59:43Z) result=reject
event happened 32 times, something is wrong: namespace/openshift-machine-config-operator daemonset/machine-config-daemon hmsg/de1673a806 - reason/SuccessfulCreate (combined from similar events): Created pod: machine-config-daemon-klxjf (01:07:37Z) result=reject
event happened 81 times, something is wrong: namespace/openshift-machine-config-operator daemonset/machine-config-daemon hmsg/f9637dcf20 - reason/SuccessfulDelete (combined from similar events): Deleted pod: machine-config-daemon-6vbtg (01:12:37Z) result=reject
event happened 152 times, something is wrong: namespace/openshift-machine-config-operator daemonset/machine-config-daemon hmsg/ad46a2def0 - reason/SuccessfulDelete (combined from similar events): Deleted pod: machine-config-daemon-kbb8f (01:17:50Z) result=reject
event happened 188 times, something is wrong: namespace/openshift-machine-config-operator daemonset/machine-config-daemon hmsg/9ca5e98d9c - reason/SuccessfulDelete (combined from similar events): Deleted pod: machine-config-daemon-cp2xl (01:22:40Z) result=reject }
Expected results:
The aforementioned test should pass.
Additional info:
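For triage, the failure output under "Actual results" can be tallied with a short script. The following is a minimal sketch, not part of the test suite: the regular expression is inferred only from the lines quoted in this report, and the names `EVENT_RE`, `parse_events`, and `summarize` are hypothetical helpers introduced for illustration.

```python
import re

# Assumption: this pattern is inferred from the failure lines quoted in this
# report; the real test output may contain other event shapes.
EVENT_RE = re.compile(
    r"event happened (?P<count>\d+) times, something is wrong: "
    r"namespace/(?P<namespace>\S+) daemonset/(?P<daemonset>\S+) hmsg/\S+ - "
    r"reason/(?P<reason>\w+) \(combined from similar events\): "
    r"(?:Deleted|Created) pod: (?P<pod>\S+) \((?P<time>[\d:]+Z)\)"
)


def parse_events(text):
    """Return one dict per event found in the text, with count as an int."""
    return [
        {**m.groupdict(), "count": int(m.group("count"))}
        for m in EVENT_RE.finditer(text)
    ]


def summarize(events):
    """Return the highest repeat count seen for each event reason."""
    summary = {}
    for event in events:
        reason = event["reason"]
        summary[reason] = max(summary.get(reason, 0), event["count"])
    return summary


# Three lines copied from the failure output in this report.
SAMPLE = """\
event happened 32 times, something is wrong: namespace/openshift-machine-config-operator daemonset/machine-config-daemon hmsg/d8663bab95 - reason/SuccessfulDelete (combined from similar events): Deleted pod: machine-config-daemon-hqrq7 (00:44:33Z) result=reject
event happened 188 times, something is wrong: namespace/openshift-machine-config-operator daemonset/machine-config-daemon hmsg/dc2da581d9 - reason/SuccessfulDelete (combined from similar events): Deleted pod: machine-config-daemon-4sczr (00:59:43Z) result=reject
event happened 32 times, something is wrong: namespace/openshift-machine-config-operator daemonset/machine-config-daemon hmsg/de1673a806 - reason/SuccessfulCreate (combined from similar events): Created pod: machine-config-daemon-klxjf (01:07:37Z) result=reject
"""

if __name__ == "__main__":
    for event in parse_events(SAMPLE):
        print(event["time"], event["reason"], event["pod"], event["count"])
    print(summarize(parse_events(SAMPLE)))
```

Because the pattern is applied with `finditer`, it also finds events in output where line breaks were lost, as long as each event keeps the field order shown above.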