-
Bug
-
Resolution: Unresolved
-
Normal
-
4.21
-
Quality / Stability / Reliability
-
False
-
Rejected
Description of problem:
During a node replacement test running in CI, we run pcs resource debug-stop and pcs resource debug-start to force the resource agent to start without needing to fence the node. For some reason, in CI this hits an error on startup: the dead node is already registered as a learner member, so when the survivor starts up and tries to add the failed node as a learner, the add fails because the member already exists.
Steps to Reproduce:
I was able to reproduce this in CI only by using debug-stop and debug-start
Actual results:
etcd fails to start on the survivor (cannot add learner member)
Expected results:
etcd starts on the survivor
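One possible way to reach the expected result is to make the learner add idempotent. The sketch below is illustrative only and is not the resource agent's actual logic. It assumes the member list is available as JSON in the shape printed by etcdctl member list -w json (a "members" array whose entries carry "peerURLs" and, for learners, "isLearner": true). The function name learner_action, the peer URLs, and the sample data are hypothetical.

```python
import json


def learner_action(member_list_json: str, peer_url: str) -> str:
    """Decide what to do for peer_url given the current member list.

    Returns:
      "add"   - peer is not a member yet, so adding it as a learner is safe
      "skip"  - peer is already registered as a learner, so do not add again
      "voter" - peer is already a full voting member
    """
    members = json.loads(member_list_json).get("members", [])
    for member in members:
        if peer_url in member.get("peerURLs", []):
            # Assumption: "isLearner" is omitted from the JSON when false.
            return "skip" if member.get("isLearner", False) else "voter"
    return "add"


# Hypothetical member list: the survivor is a voter and the failed
# node is still registered as a learner from an earlier attempt.
sample = json.dumps({
    "members": [
        {"ID": 1, "name": "survivor", "peerURLs": ["https://10.0.0.1:2380"]},
        {"ID": 2, "peerURLs": ["https://10.0.0.2:2380"], "isLearner": True},
    ]
})

print(learner_action(sample, "https://10.0.0.2:2380"))
```

With a check like this, the survivor would skip the add when the failed node is already listed as a learner and continue with startup instead of failing.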
Additional info:
- clones: OCPBUGS-61117 Fencing recovery race causes node to remain learner, etcd never starts (ASSIGNED)