-
Bug
-
Resolution: Unresolved
-
Normal
-
None
-
4.20
-
None
-
Quality / Stability / Reliability
-
False
-
-
None
-
Important
-
Rejected
Description of problem:
On master-1, podman-etcd restarts and decides to start etcd as a learner (good). It then sees that master-1 is already in the member list (CIB learner_node=master-1) and starts (bad). At the same time, etcd on master-0 is shutting down, so etcd discovery on master-1 fails.
Version-Release number of selected component (if applicable):
How reproducible:
75%
Steps to Reproduce:
1. Deploy a TNF cluster.
2. Reboot master-0; wait for it to reboot, but not for etcd to start completely.
3. Poll podman etcd to determine what state it is in.
4. Reboot master-1 while etcd on master-0 is still coming up.
5. Poll podman etcd on master-1 to view the failure.
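The polling in the steps above could be scripted roughly as follows. This is only a sketch: `poll_cmd` is a hypothetical helper name, and the attempt count and interval are arbitrary values that would need tuning for the cluster under test.

```shell
#!/bin/sh
# Hypothetical helper (not part of podman-etcd): run a command repeatedly
# and print its output, so state changes are visible while a node reboots.
# Usage: poll_cmd ATTEMPTS INTERVAL_SECONDS CMD [ARGS...]
poll_cmd() (
  attempts="$1"
  interval="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    echo "--- attempt ${i}/${attempts} ---"
    # Keep polling even if the command fails (for example, while the
    # etcd container is not running yet).
    "$@" 2>&1 || echo "(command failed with exit code $?)"
    if [ "$i" -lt "$attempts" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
)
```

For example, run on the node being observed: `poll_cmd 60 5 sudo podman exec etcd etcdctl member list -w table`.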
Actual results:
[core@master-0 ~]$ sudo podman exec etcd etcdctl member list -w table
+------------------+-----------+----------+-----------------------------+-----------------------------+------------+
| ID               | STATUS    | NAME     | PEER ADDRS                  | CLIENT ADDRS                | IS LEARNER |
+------------------+-----------+----------+-----------------------------+-----------------------------+------------+
| 1437c5d88d1a66e3 | unstarted |          | https://192.168.111.21:2380 |                             | true       |
| e42e3c3a55c27ed6 | started   | master-0 | https://192.168.111.20:2380 | https://192.168.111.20:2379 | false      |
+------------------+-----------+----------+-----------------------------+-----------------------------+------------+
Expected results:
[core@master-1 ~]$ sudo podman exec etcd etcdctl member list -w table
+------------------+---------+----------+-----------------------------+-----------------------------+------------+
| ID               | STATUS  | NAME     | PEER ADDRS                  | CLIENT ADDRS                | IS LEARNER |
+------------------+---------+----------+-----------------------------+-----------------------------+------------+
| 3d9d9acb04427a83 | started | master-0 | https://192.168.111.20:2380 | https://192.168.111.20:2379 | false      |
| caedfca0719c8594 | started | master-1 | https://192.168.111.21:2380 | https://192.168.111.21:2379 | false      |
+------------------+---------+----------+-----------------------------+-----------------------------+------------+
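To tell the actual and expected states apart without reading the table by eye, a small filter along these lines might help. It is a sketch only: `unhealthy_members` is a hypothetical name, and it assumes the column order shown in the tables in this report (ID, STATUS, NAME, PEER ADDRS, CLIENT ADDRS, IS LEARNER).

```shell
#!/bin/sh
# Hypothetical filter: read `etcdctl member list -w table` output on stdin
# and print one line per member that is not a started voting member
# (that is, STATUS is not "started" or IS LEARNER is "true").
unhealthy_members() {
  awk -F'|' '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    /^[ \t]*\|/ {                       # data and header rows start with "|"
      id = trim($2); status = trim($3); name = trim($4)
      peer = trim($5); learner = trim($7)
      if (id == "ID") next              # skip the header row
      if (status != "started" || learner == "true") {
        if (name == "") name = "<unnamed>"
        printf "%s name=%s peer=%s status=%s learner=%s\n", id, name, peer, status, learner
      }
    }
  '
}
```

Usage: `sudo podman exec etcd etcdctl member list -w table | unhealthy_members`. Empty output corresponds to the expected result; any printed line points at a member stuck as unstarted or as a learner.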
Additional info:
https://redhat-internal.slack.com/archives/C07ABRBBDK3/p1759864654684679