- Bug
- Resolution: Won't Do
- Major
- None
- 4.10
- Moderate
- None
- Rejected
- Unspecified
- If docs needed, set a value
Description of problem:
- When a node goes down uncleanly, StatefulSet pods running on that node are stuck in the Terminating state.
- This is affecting a customer's cluster.
Version-Release number of selected component (if applicable):
All RHOCP 3.x and 4.x versions.
How reproducible:
- Consider the setup below:
Datacenter1: node1: StatefulSet-pod1
Datacenter2: node2: StatefulSet-pod2
Datacenter3: node3: StatefulSet-pod3
- When node1 in Datacenter1 fails uncleanly, the pod is stuck in the Terminating state and is not rescheduled to another node (see the reproducer sketch below).
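A minimal reproducer sketch is shown below. All names, labels, and the zone-spread constraint are illustrative assumptions, not taken from the customer environment; the key point is that the node failure must be unclean (power off, kernel panic, network partition), since a graceful drain does not trigger the problem.

```shell
# Illustrative reproducer (hypothetical names/labels, not the customer's objects).

# 1. Create a headless Service and a 3-replica StatefulSet spread across zones.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  clusterIP: None
  selector:
    app: web
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: web
      containers:
      - name: web
        image: registry.access.redhat.com/ubi8/ubi
        command: ["sleep", "infinity"]
EOF

# 2. Power off (or otherwise uncleanly fail) the node running web-0.
#    A graceful `kubectl drain` will NOT reproduce the issue.

# 3. Watch the pod: it stays in Terminating and is never rescheduled.
kubectl get pods -l app=web -o wide -w
```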
Actual results:
- Pod `StatefulSet-pod1` is stuck in the Terminating state.
Expected results:
- Pod `StatefulSet-pod1` must be deleted and recreated on another healthy node.
Additional info:
- This is a known bug/limitation of StatefulSet, but the customer is asking whether a fix is planned.
- Upstream issues:
- `https://github.com/kubernetes/kubernetes/issues/67250`
- `https://github.com/kubernetes/kubernetes/issues/54368`
- Related KCS articles:
- `https://access.redhat.com/solutions/5680591`
- `https://access.redhat.com/solutions/5012251`
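For completeness, the usual manual workaround in this situation (and, as far as I understand, what the KCS articles above describe) is to force-delete the stuck pod or delete the Node object once the node is confirmed to be permanently down, so that the StatefulSet controller can recreate the pod on a healthy node. The commands below are an illustrative sketch using the placeholder names from this report:

```shell
# Only run after confirming node1 is really down (otherwise two pods with the
# same StatefulSet identity could end up running at once).

# Option A: force-delete the stuck pod; the controller recreates it elsewhere.
kubectl delete pod StatefulSet-pod1 --grace-period=0 --force

# Option B: delete the Node object for the failed node; all pods bound to it,
# including the stuck StatefulSet pod, are then garbage-collected.
kubectl delete node node1
```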