-
Bug
-
Resolution: Not a Bug
-
Normal
-
None
-
4.16
-
None
-
No
-
False
-
Description of problem:
While working on etcd cert rotation, I'm occasionally tripping over the thresholds for pathological event failures ("pathological event should not see excessive RequiredInstallerResourcesMissing secrets"). Besides the ones we have to fix in cluster-etcd-operator, I've also seen plenty of them in kube-api-server-operator. A similar pattern can be seen in kube-controller-manager-operator and kube-scheduler-operator, albeit less often.

From the logs (attached), it seems to me that this is caused by a secrets informer not being synchronized before the controller actually runs.

Some test runs from our payload testing:
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/openshift-cluster-etcd-operator-1177-nightly-4.16-e2e-aws-sdn-upgrade/1750540617936539648
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/openshift-cluster-etcd-operator-1177-nightly-4.16-e2e-aws-sdn-upgrade/1750540619433906176
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/openshift-cluster-etcd-operator-1177-nightly-4.16-e2e-aws-sdn-upgrade/1750540619891085312
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/openshift-cluster-etcd-operator-1177-nightly-4.16-e2e-aws-sdn-upgrade/1750540620352458752
Version-Release number of selected component (if applicable):
4.16, but I had bugs for a flaky test with this as early as 4.7 and 4.8 (OCPBUGS-1128 and https://bugzilla.redhat.com/show_bug.cgi?id=2031564)
How reproducible:
2/8 runs on average. It's a race condition that is becoming more prevalent after merging https://github.com/openshift/cluster-etcd-operator/pull/1177
Steps to Reproduce:
1. Check out https://github.com/openshift/cluster-etcd-operator/pull/1177
2. Run payload tests a few times
3. Observe failures
Actual results:
flaky/failing test runs due to RequiredInstallerResourcesMissing
Expected results:
no flaky RequiredInstallerResourcesMissing anymore :)
Additional info:
In case we don't want to tackle it, I've prepped an increase of the threshold for the time being in https://github.com/openshift/origin/pull/28557. I can also add the other components, if necessary.