This is a clone of issue OCPBUGS-36462. The following is the description of the original issue:
—
Description of problem
Similar to OCPBUGS-20061, but for a different situation:
$ w3m -dump -cols 200 'https://search.dptools.openshift.org/?maxAge=48h&name=pull-ci-openshift-cluster-etcd-operator-master-e2e-aws-ovn-etcd-scaling&type=junit&search=clusteroperator/control-plane-machine-set+should+not+change+condition/Available' | grep 'failures match' | sort
pull-ci-openshift-cluster-etcd-operator-master-e2e-aws-ovn-etcd-scaling (all) - 15 runs, 60% failed, 33% of failures match = 20% impact
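As a rough check on how the figures in the CI Search summary line above fit together, here is a minimal sketch that parses the line and recomputes the impact. The regular expression and the helper name `parse_impact` are assumptions made for illustration and are not part of CI Search. Reading impact as roughly the failure rate times the match rate is inferred from the numbers in this one line.

```python
import re

# Hypothetical pattern, inferred from the single summary line quoted above.
SUMMARY = re.compile(
    r"(?P<job>\S+) \(all\) - (?P<runs>\d+) runs, (?P<failed>\d+)% failed, "
    r"(?P<match>\d+)% of failures match = (?P<impact>\d+)% impact"
)


def parse_impact(line):
    """Extract the counts from a summary line and recompute the impact."""
    m = SUMMARY.search(line)
    if m is None:
        raise ValueError("unrecognized summary line")
    failed_pct = int(m["failed"])
    match_pct = int(m["match"])
    return {
        "job": m["job"],
        "runs": int(m["runs"]),
        "reported_impact_pct": int(m["impact"]),
        # Assumption: impact is about (failure rate) x (share of failures matching).
        "recomputed_impact_pct": failed_pct * match_pct / 100,
    }


line = (
    "pull-ci-openshift-cluster-etcd-operator-master-e2e-aws-ovn-etcd-scaling (all) - "
    "15 runs, 60% failed, 33% of failures match = 20% impact"
)
result = parse_impact(line)
print(result["reported_impact_pct"], result["recomputed_impact_pct"])
```

The recomputed value lands within rounding of the reported 20%. That is consistent with 9 of the 15 runs failing and 3 of those failures matching the search, or 3 of 15 runs overall.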
Since ETCD-329, that test suite deletes a control-plane Machine and waits for the ControlPlaneMachineSet controller to scale in a replacement. But in runs like this, the outgoing Node goes Ready=Unknown for not-yet-diagnosed reasons. That somehow slips past cpmso#294's inertia (maybe the running guard should be dropped?), and the ClusterOperator goes Available=False, complaining about "Missing 1 available replica(s)".
It's not clear from the message which replica it's worried about (that would be helpful information to include in the message), but I suspect it's the Machine/Node that's in the deletion process. But regardless of the message, this does not seem like a situation worth a cluster-admin-midnight-page Available=False alarm.
Version-Release number of selected component
Seen in dev-branch CI. I haven't gone back to check older 4.y.
How reproducible
CI Search shows 20% impact; see the query in the description above.
Steps to Reproduce
Run a bunch of pull-ci-openshift-cluster-etcd-operator-master-e2e-aws-ovn-etcd-scaling jobs and check the CI Search results.
Actual results
20% impact
Expected results
No hits.
- clones: OCPBUGS-36462 control-plane-machine-set goes Available=False with UnavailableReplicas during etcd scale testing (Closed)
- is blocked by: OCPBUGS-36462 control-plane-machine-set goes Available=False with UnavailableReplicas during etcd scale testing (Closed)
- links to: RHBA-2024:5757 OpenShift Container Platform 4.16.z bug fix update