Type: Bug
Resolution: Unresolved
Priority: Critical
Affects Version/s: 4.20.z, 4.21.z, 4.22
Status: Proposed
Description of problem:
It is possible for podman-etcd to attempt recovery in a state where the etcd revision on a learner is higher than the revision on a voter. This seems especially prevalent in environments with slow disks, where the voter is starved for I/O bandwidth, times out, and is then fenced.
Version-Release number of selected component (if applicable):
Current podman-etcd in 9.6 and 9.8
How reproducible:
The easiest way to reproduce is to spoof the revision numbers and then fence the voter.
Steps to Reproduce:
1. Start one of your nodes as a learner
2. Spoof the learner's revision so it is higher than the voting member's
3. Fence the voting member node
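The deadlocked state produced by the steps above can be illustrated with a small model (node names and revision numbers are made up for illustration; this is not the actual agent code). A naive "recover from the highest revision" rule selects the learner, which cannot serve on its own:

```python
# Minimal model of the reproduction state (illustrative names/values only).
nodes = [
    {"name": "voter-1",   "is_learner": False, "revision": 1000},  # fenced voter
    {"name": "learner-1", "is_learner": True,  "revision": 1500},  # spoofed higher revision
]

# Naive rule: recover from whichever node has the highest revision.
best = max(nodes, key=lambda n: n["revision"])
print(best["name"])  # learner-1

# The voter waits for the higher-revision node to come up first, but a
# learner cannot form a quorum by itself, so neither node makes progress.
can_serve = not best["is_learner"]
print(can_serve)  # False -> recovery deadlock
```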
Actual results:
Recovery deadlocks: no node can recover, because the voter waits for the higher-revision node (the learner) to start first, while the learner crashes because it has no voting peers.
Expected results:
Learners are not full members, so their revision numbers should not matter. When deciding which node to start from, only the revisions of voting members should be compared.
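The fix could be sketched as follows. This is an illustrative assumption, not the actual podman-etcd agent code: `pick_start_node` is a hypothetical helper, and the input shapes mirror `etcdctl member list -w json` (where `isLearner` is omitted when false) and revisions taken from `etcdctl endpoint status`.

```python
def pick_start_node(member_list, revisions):
    """Pick the node to recover from: highest revision among *voting* members.

    member_list: dict shaped like `etcdctl member list -w json` output
    revisions:   mapping of member name -> etcd revision
    """
    # protobuf-style JSON omits false fields, so a missing "isLearner" means voter.
    voters = {m["name"] for m in member_list["members"]
              if not m.get("isLearner", False)}
    candidates = {name: rev for name, rev in revisions.items() if name in voters}
    if not candidates:
        raise RuntimeError("no voting member available to recover from")
    return max(candidates, key=candidates.get)

members = {"members": [
    {"ID": 1, "name": "node-a"},                     # voter
    {"ID": 2, "name": "node-b", "isLearner": True},  # learner
]}
revisions = {"node-a": 1000, "node-b": 1500}  # learner ahead, as in this bug

print(pick_start_node(members, revisions))  # node-a: the learner's higher revision is ignored
```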
Additional info:
We observed a similar case where the revision numbers were equal, so the cluster was able to start, but one of the nodes (the learner) crashed immediately, and Pacemaker was left believing that node was healthy even though etcd was not running there. I would have expected the monitor code to catch this, but I did not watch long enough to confirm whether it did.