Story
Resolution: Unresolved
Major
openshift-4.12, openshift-4.13, openshift-4.14
BU Product Work
Description of problem:
This is a follow-up to OCPBUGS-16008. That bug has a working fix, but a more efficient implementation remains to be done as a separate task. The downside of the current implementation is that when many pods are pending at the same time, the reconcile cycle takes a long time. The current implementation still resolves pods stuck in the Pending state, so it is better than having no fix. Doing this the "proper" way requires a non-trivial amount of work, and this bug tracks that effort. There are two approaches:
1. Provide a ConfigMap that lets users set the reconciler cron schedule themselves.
2. Keep a list of the pending pods in the reconcile looper struct and retry them. This would also need to be integrated with the ip-control-loop to synchronize retries. This is the most correct approach, but it is probably not doable by the November deadlines.
Version-Release number of selected component (if applicable):
How reproducible:
Forcefully reboot a node, then force-delete a pod in a StatefulSet that was created on the same node. The pod is recreated and remains in the Pending state indefinitely. (This will no longer be reproducible once OCPBUGS-16008 is CLOSED and the associated PR merges, but it is still useful information: if the issue reappears, we can safely say that something has been broken.)
- is related to: OCPBUGS-26986 whereabouts reconciler schedule is not configurable (Closed)
- relates to: OCPBUGS-16008 pods assigned with Multus whereabouts IP get stuck in ContainerCreating state after OCP upgrading [Backport 4.12] (Closed)