OpenShift Workloads / WRKLDS-1277

Consider terminating pods in job controller


    • Type: Epic
    • Resolution: Done
    • Priority: Normal
    • Status: In Progress
    • Parent: OCPSTRAT-1115 - Consider terminating pods in job controller
    • Progress: 0% To Do, 0% In Progress, 100% Done
    • Size: S

      As an OpenShift administrator

      1. I want to implement a pod failure policy in the job controller so that terminating pods (those with a deletionTimestamp) are not immediately replaced and do not count as failed until they reach a terminal phase (Failed or Succeeded). This ensures more efficient and accurate handling of pod failures.
      2. I want to avoid creating replacement pods for pods that are still terminating but have not yet reached the Failed or Succeeded phase, so that I can prevent unnecessary resource allocation and align replacement-pod creation with the pod failure policy (see the Job manifest sketch after this list).
      3. I want to extend the kubelet to mark terminating pods that are still Pending as Failed, so that their transition into the Failed phase is clearer and more consistent, improving pod lifecycle management overall.
      4. I want to add a DisruptionTarget condition to pods preempted by the kubelet to make room for critical pods, so that pods disrupted for critical-workload prioritization are easier to observe and manage (see the status excerpts after the KEP links).
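
      The first two stories map onto the podReplacementPolicy field that KEP-3939 adds to the batch/v1 Job API. Below is a minimal sketch of a Job using it, assuming the JobPodReplacementPolicy feature gate is enabled (beta and on by default since Kubernetes 1.29); the job name, image, and command are illustrative:

          apiVersion: batch/v1
          kind: Job
          metadata:
            name: example-job            # illustrative name
          spec:
            backoffLimit: 3
            # Create a replacement only once the previous pod is fully
            # terminated (Failed), not while it is merely terminating
            # (deletionTimestamp set).
            podReplacementPolicy: Failed
            # When a pod failure policy is set, Failed is the only allowed
            # (and default) replacement policy; this rule keeps
            # kubelet-initiated disruptions from counting against backoffLimit.
            podFailurePolicy:
              rules:
              - action: Ignore
                onPodConditions:
                - type: DisruptionTarget
            template:
              spec:
                restartPolicy: Never     # required when podFailurePolicy is set
                containers:
                - name: main
                  image: busybox:1.36    # illustrative image
                  command: ["sh", "-c", "sleep 30"]

      Without podReplacementPolicy: Failed, the controller's default behavior (TerminatingOrFailed) recreates pods as soon as they begin terminating, which is exactly what the first two stories aim to avoid when a pod failure policy is in play.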

       

      KEP: https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3939-allow-replacement-when-fully-terminated
      KEP issue: https://github.com/kubernetes/enhancements/issues/3939
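
      For the kubelet-side stories, KEP-3939 also surfaces what the controller sees: the Job status gains a terminating counter, and disrupted pods carry a DisruptionTarget condition. The excerpts below are illustrative sketches; all counts, the reason, and the message are hypothetical examples, with the reason following the TerminationByKubelet value the kubelet uses for disruptions it initiates:

          # Job status excerpt: pods that are shutting down but not yet
          # terminal are counted separately from active/failed pods
          # (values illustrative).
          status:
            active: 1
            terminating: 1
            failed: 0
            succeeded: 0

          # Pod status excerpt after the kubelet preempts the pod to admit
          # a critical pod (reason and message illustrative).
          status:
            phase: Failed
            conditions:
            - type: DisruptionTarget
              status: "True"
              reason: TerminationByKubelet
              message: Pod was preempted to accommodate a critical pod.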

            People:
              Filip Krepinsky (fkrepins@redhat.com)
              Jan Chaloupka (jchaloup@redhat.com)
              Rama Kasturi Narra
            Votes: 0
            Watchers: 1
