Description of problem:
Sometimes, when running the etcd recovery test, the following invariant fails:
[sig-node] static pods should start after being created { static pod lifecycle failure - static pod: "etcd" in namespace: "openshift-etcd" for revision: 13 on node: "ip-10-0-72-9.ec2.internal" didn't show up, waited: 3m0s}
This is not a problem in itself, because the next revision replaced it soon after. We should still investigate why the etcd pod did not come up within the 3-minute window.
Example run: https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_cluster-etcd-operator/1292/pull-ci-openshift-cluster-etcd-operator-master-e2e-aws-etcd-recovery/1808773983227613184
Search: https://search.dptools.openshift.org/?search=static+pod+lifecycle+failure&maxAge=48h&context=1&type=junit&name=.*aws-etcd-recovery.*&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job
Version-Release number of selected component (if applicable):
4.17
How reproducible:
Sometimes (intermittent)
Steps to Reproduce:
1. Run the origin test suite with: openshift-tests run "openshift/etcd/recovery"
2. ???
3. See the invariant result failing.
Actual results:
The etcd static pod does not come up within the 3-minute window.
Expected results:
Static pods come up healthy within the 3-minute window.
Additional info:
- is duplicated by: OCPBUGS-36867 Static pod controller pods sometimes fail to start [etcd] (ASSIGNED)
- relates to: OCPBUGS-36867 Static pod controller pods sometimes fail to start [etcd] (ASSIGNED)