Bug
Resolution: Not a Bug
Normal
4.14.z, 4.15.z, 4.17.z, 4.16.z, 4.18.0
Quality / Stability / Reliability
Moderate
Rejected
In Progress
Release Note Not Required
Description of problem:
We have an automation case that runs in a Prow CI cluster on OpenStack.
After the KCM leader master node is stopped and started again, the etcd pod goes into CrashLoopBackOff status.
Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.18.0-0.nightly-2024-11-14-090045 True False 5h37m Cluster version is 4.18.0-0.nightly-2024-11-14-090045
How reproducible:
Always
Steps to Reproduce:
1. Shut down the KCM leader master node:
$ oc debug node/<master node name>
sh-4.2# chroot /host
sh-4.2# poweroff
Make sure the master node has been powered off:
$ oc get node
2. To verify the cluster works fine with two master nodes, run:
oc login
oc new-project test-project-1
oc new-app aosqe/hello-openshift -n test-project-1
3. Start the powered-off master node again, then check the cluster status:
oc get no
oc get co
oc get po -A
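The status checks in the steps above can be scripted rather than read by eye. The following is a minimal sketch, not part of the original test case: the function names `not_ready_nodes` and `degraded_operators` are hypothetical, and it assumes the default column layout of `oc get node` (NAME STATUS ...) and `oc get co` (NAME VERSION AVAILABLE PROGRESSING DEGRADED ...), as seen in the output in this report.

```shell
#!/bin/sh
# Hypothetical helpers: both read `oc get ...` table output on stdin.

# Print names of nodes whose STATUS column starts with "NotReady".
# Assumes the default `oc get node` layout: NAME STATUS ROLES AGE VERSION.
not_ready_nodes() {
    awk '$2 ~ /^NotReady/ { print $1 }'
}

# Print names of cluster operators whose DEGRADED column is "True".
# Assumes the default `oc get co` layout, where DEGRADED is column 5.
# A row with an empty VERSION column would shift the fields and be missed.
degraded_operators() {
    awk '$5 == "True" { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get no | not_ready_nodes      # after step 1: expect the powered-off master
#   oc get co | degraded_operators   # after step 3: expect no output on a healthy cluster
```

Because the helpers only filter text, they can be tried against saved command output without cluster access.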
Actual results:
1. The etcd operator is degraded:
$ oc get co
...
etcd 4.18.0-0.nightly-2024-11-14-090045 True False True 4h23m EtcdMembersDegraded: 2 of 3 members are available, qizv777c-4bb48-4vjc4-master-0 is unhealthy...
$ oc describe co/etcd
...
Status:
Conditions:
Last Transition Time: 2024-11-15T09:07:28Z
Message: EtcdMembersDegraded: 2 of 3 members are available, qizv777c-4bb48-4vjc4-master-0 is unhealthy
StaticPodsDegraded: pod/etcd-qizv777c-4bb48-4vjc4-master-0 container "etcd" is waiting: CrashLoopBackOff: back-off 5m0s restarting failed container=etcd pod=etcd-qizv777c-4bb48-4vjc4-master-0_openshift-etcd(ab2c484f207208c9eb9adb8c2f9b65e4)
Reason: EtcdMembers_UnhealthyMembers::StaticPods_Error
Status: True
Type: Degraded
Last Transition Time: 2024-11-15T05:04:14Z
Message: NodeInstallerProgressing: 3 nodes are at revision 8
EtcdMembersProgressing: No unstarted etcd members found
Reason: AsExpected
Status: False
Type: Progressing
Last Transition Time: 2024-11-15T04:47:13Z
Message: StaticPodsAvailable: 3 nodes are active; 3 nodes are at revision 8
EtcdMembersAvailable: 2 of 3 members are available, qizv777c-4bb48-4vjc4-master-0 is unhealthy
Reason: AsExpected
Status: True
Type: Available
Last Transition Time: 2024-11-15T04:43:44Z
Message: All is well
Reason: AsExpected
Status: True
Type: Upgradeable
Last Transition Time: 2024-11-15T04:43:44Z
Reason: NoData
Status: Unknown
Type: EvaluationConditionsDetected
$ oc get pod -n openshift-etcd
NAME READY STATUS RESTARTS AGE
...
etcd-qizv777c-4bb48-4vjc4-master-0 4/5 CrashLoopBackOff 11 (119s ago) 4h12m
...
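To pick out the failing pod from a listing like the one above, a small filter can help. This is a sketch under stated assumptions: `crashlooping_pods` is a hypothetical helper name, and it relies on the default `oc get pod -n <namespace>` column layout (NAME READY STATUS RESTARTS AGE), so it would need adjusting for `-A` output, where NAMESPACE comes first.

```shell
#!/bin/sh
# Hypothetical helper: print names of pods whose STATUS column is
# "CrashLoopBackOff". Reads `oc get pod -n <namespace>` output on stdin.
crashlooping_pods() {
    awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get pod -n openshift-etcd | crashlooping_pods
#
# The previous run of the failing container can then be inspected, e.g.:
#   oc logs -n openshift-etcd <pod name> -c etcd --previous
```

The `--previous` flag shows the log of the last terminated container instance, which is usually where the reason for the crash loop appears.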
Expected results:
The etcd operator should not be degraded after the master node is restarted.
Additional info:
Workaround: https://access.redhat.com/solutions/6962106