OpenShift Container Platform (OCP) Strategy
OCPSTRAT-2365

Improve Node Reliability and Control Plane Load by Managing Terminated Pods Proactively (upstream work)



Feature Overview (aka Goal Summary)

      OpenShift clusters running high-churn workloads (e.g., OpenShift Pipelines, GitOps, CI/CD jobs) often accumulate thousands of terminated pods and exited containers on worker nodes. This leads to:

      • kubelet failures due to gRPC message size limits (e.g., ListPodSandbox errors),
      • NotReady node states caused by "PLEG is not healthy" errors,
      • Excessive memory usage by CRI-O,
      • Slow API server performance and increased etcd load,
      • Operational risk and service degradation in production clusters.
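
      As a rough diagnostic (illustrative only, not part of this proposal), an administrator can gauge the scale of accumulation by counting terminated pods cluster-wide with standard field selectors:

```shell
# Count terminated pods across all namespaces (diagnostic sketch).
# Succeeded = completed pods (e.g., finished pipeline tasks/jobs);
# Failed    = pods that exited with an error.
oc get pods -A --field-selector=status.phase==Succeeded --no-headers | wc -l
oc get pods -A --field-selector=status.phase==Failed --no-headers | wc -l
```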

      The current threshold for terminated pod garbage collection (terminated-pod-gc-threshold) is set to a high default (12,500), and the only way to modify it is through unsupported overrides. This proposal introduces a supported, configurable mechanism to automatically manage terminated pods per node and at the cluster level.
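
      For context, the unsupported override in question is a passthrough on the kube-controller-manager operator. A sketch of what customers do today (this taints the cluster as unsupported and is shown only to illustrate the gap this proposal closes):

```shell
# UNSUPPORTED workaround: lower terminated-pod-gc-threshold via the
# kube-controller-manager operator's unsupportedConfigOverrides field.
# Not a supported API; the value "1000" here is illustrative.
oc patch kubecontrollermanager cluster --type=merge -p '
{"spec":{"unsupportedConfigOverrides":{"extendedArguments":{
  "terminated-pod-gc-threshold":["1000"]}}}}'
```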


      ✅ Key Use Cases

      1. Prevent Node Failures Due to Excessive Exited Pods
        • Nodes become NotReady when gRPC message sizes exceed limits due to too many exited containers (e.g., pipelines with long annotations or massive pod counts).
      2. Avoid Kubelet PLEG Failures and Container Runtime Crashes
        • Accumulated terminated pods lead to frequent "PLEG is not healthy" and "container runtime is down" errors, degrading node health and triggering service disruption.
      3. Control etcd and API Server Load from Excess Pod Metadata
        • High numbers of terminated pods increase object counts in etcd and slow down the control plane, complicating troubleshooting and inflating cluster load.
      4. Replace Fragile Workarounds like CronJobs for Pod Cleanup
        • Customers currently rely on manual cleanup scripts or jobs to delete completed pods, which are error-prone and do not scale.
      5. Protect Multi-Tenant and High-Scale Environments
        • In large clusters or shared environments, a single workload (e.g., a misconfigured Tekton pipeline) can produce thousands of terminated pods, risking node and cluster health.
      6. Support a Configurable Policy for the Terminated Pod Lifecycle
        • Customers request a tunable threshold via supported APIs (not via unsupportedConfigOverrides), e.g., with values based on cluster topology:
          0.1 * node_count * maxPodsPerNode (min 1,000; max 19,000)
          with changes applied after 24 hours or if the delta exceeds 25%.
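
      The proposed sizing formula from use case 6 can be sketched as a small function (the function name and integer rounding are illustrative; the clamp bounds are from the proposal):

```shell
# threshold = 0.1 * node_count * maxPodsPerNode, clamped to [1000, 19000].
# Uses integer arithmetic (divide by 10) for the 0.1 factor.
terminated_pod_gc_threshold() {
  local node_count=$1 max_pods_per_node=$2
  local t=$(( node_count * max_pods_per_node / 10 ))
  (( t < 1000 ))  && t=1000    # floor: small clusters still keep some history
  (( t > 19000 )) && t=19000   # ceiling: cap etcd object growth on large clusters
  echo "$t"
}

terminated_pod_gc_threshold 3 250    # small cluster -> clamped up to 1000
terminated_pod_gc_threshold 100 250  # 100 nodes x 250 pods -> 2500
```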

              Gaurav Singh (gausingh@redhat.com)
              Ayato Tokubi
              Aruna Naik