Type: Story
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: openshift-4.12, openshift-4.13, openshift-4.14
Component/s: None
Labels:
- telco-4.12.z
- telco-5g

Work Type:
BU Product Work
Blocked:
False
Blocked Reason:

None

Show
None
Ready:
False
Epic Link:
Whereabouts: Reconciliation Hardening
Intelligence Requested:
Market:

SFDC Cases Links:
SFDC Cases Counter:
SFDC Cases Open:

Description of problem:

This is a follow-up to OCPBUGS-16008. We have a fix for that bug, but a more efficient implementation remains to be done as a separate task. The downside of the current implementation is that when many pods are pending at the same time, the reconcile cycle takes a long time.

However, the current implementation still resolves pods stuck in the Pending state, and is overall better than having no fix. Doing things the "proper" way requires a non-trivial amount of work, so this issue tracks that effort. There are two approaches.

1. Provide a ConfigMap that the user can edit to set the reconciler's cron schedule themselves.

2. Keep a list of the pending pods in the reconcile looper struct and retry them. This would also need to be integrated with the ip-control-loop to synchronize retries. This is the most correct approach, but it is probably not doable by the November deadlines.

Version-Release number of selected component (if applicable):

How reproducible:

Forcefully reboot a node, then force-delete a pod belonging to a StatefulSet that was running on that node.

This causes the pod to be recreated and to remain in the Pending state indefinitely.

(This will no longer be reproducible once OCPBUGS-16008 is CLOSED and the associated PR merges, but it is still useful information: if the issue reappears, we can safely say that we have broken something.)

is related to

OCPBUGS-26986 whereabouts reconciler schedule is not configurable

Closed

relates to

OCPBUGS-16008 pods assigned with Multus whereabouts IP get stuck in ContainerCreating state after OCP upgrading [Backport 4.12]

Closed

Assignee: Peng Liu

Reporter: Nikhil Simha (Inactive)

QA Contact: Weibin Liang

Votes: 0

Watchers: 3

Created: 2023/08/31 6:19 PM

Updated: 2024/10/29 9:07 PM
