  1. OpenShift Request For Enhancement
  2. RFE-7771

Make terminated-pod-gc-threshold a Supported Configuration in OpenShift


    • Feature Request
    • Resolution: Unresolved
    • kube-apiserver
    • Future Sustainability

      Proposed title of this feature request:
      Make terminated-pod-gc-threshold a Supported Configuration in OpenShift

      What is the nature and description of the request?
      Please support terminated-pod-gc-threshold as a first-class, upgrade-safe configuration in OpenShift.
      This will let users tune pod garbage collection to prevent resource exhaustion and maintain cluster health, especially during node shutdown events. See https://access.redhat.com/solutions/6996490

      Why does the customer need this?
      The network-node-identity pod has a broad toleration (it matches all taints), so while a node is performing a graceful shutdown, this pod keeps getting scheduled on the shutting-down node. The node correctly rejects it each time, and the failed pods accumulate in `ContainerStatusUnknown`. They are not garbage collected soon enough, and as a result some services are running out of memory (OOM).
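
      To illustrate why a high threshold lets failed pods pile up, here is a minimal Python sketch of threshold-based collection. It is a simplified model, not the actual controller code. It assumes the accumulating pods are in the `Failed` phase, and it assumes the upstream kube-controller-manager behaviour: a default of 12500 for terminated-pod-gc-threshold, with only the excess above the threshold deleted, oldest first. The function name, the dict keys and the sample pod names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Phases counted as terminated in this sketch (assumption: pods that are
# Pending, Running or Unknown are never collected by this mechanism).
TERMINATED_PHASES = {"Succeeded", "Failed"}


def pods_to_collect(pods, threshold):
    """Return the names of the oldest terminated pods beyond `threshold`.

    `pods` is a list of dicts with the hypothetical keys "name", "phase"
    and "created" (a datetime). A threshold <= 0 disables collection.
    """
    if threshold <= 0:
        return []
    terminated = [p for p in pods if p["phase"] in TERMINATED_PHASES]
    excess = len(terminated) - threshold
    if excess <= 0:
        # Nothing is deleted until the count exceeds the threshold.
        return []
    terminated.sort(key=lambda p: p["created"])
    return [p["name"] for p in terminated[:excess]]


def example_pods():
    """Build five failed pods and one running pod (made-up sample data)."""
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    pods = [
        {"name": f"failed-{i}", "phase": "Failed",
         "created": start + timedelta(minutes=i)}
        for i in range(5)
    ]
    pods.append({"name": "running-0", "phase": "Running", "created": start})
    return pods


if __name__ == "__main__":
    # With a threshold of 3, the two oldest failed pods are selected.
    print(pods_to_collect(example_pods(), 3))
    # With the assumed default of 12500, nothing is selected.
    print(pods_to_collect(example_pods(), 12500))
```

      In this model, a cluster can hold up to the threshold number of terminated pods before any are removed, which is why a lower, supported setting would bound the accumulation described above.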

      List any affected packages or components.

      What is the business impact?
      We encountered an outage caused by accumulated pods in `ContainerStatusUnknown` that were not garbage collected soon enough. We had to disable graceful shutdown to mitigate this, but we then lose the graceful shutdown feature, which makes our shutdowns take longer than they need to.

              racedoro@redhat.com Ramon Acedo
              rhn-support-nchoudhu Novonil Choudhuri