Knative Serving / SRVKS-1256

Support for configuring the progress-deadline of a Service


    • Type: Task
    • Resolution: Done
    • Priority: Major
    • 1.34.0

      According to the upstream Knative docs, the default progress-deadline of a Knative service is 10 minutes, and Serverless appears to use the same default. However, some workloads take longer than 10 minutes to become ready. In such cases, Knative marks the deployment as Failed and scales it down.

      In OpenShift AI, as described in RHOAIENG-7609 (a bug), this issue was hit when deploying AI models with node autoscaling enabled. Deploying an AI model that required a GPU triggered provisioning of a new node, and it took more than 10 minutes for the node to become usable and for pods to be scheduled on it. This is just one instance; there could be other reasons for a pod to take a long time to start.

      Since AI models may have different resource requirements, it is not possible to use the same progress-deadline for all of them. Thus, proper support is needed for configuring the progress-deadline on a per-service basis.
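
      As an illustration of what per-service configuration could look like, the sketch below builds a Knative Service manifest that carries the upstream per-revision annotation serving.knative.dev/progress-deadline on its revision template. Whether the installed Serverless version honors this annotation is an assumption to verify; the service name, image, and the 1800s value are hypothetical.

```python
import json

# Annotation key used by upstream Knative Serving for a per-revision
# progress deadline. Assumption: the installed Serverless version honors it.
PROGRESS_DEADLINE_ANNOTATION = "serving.knative.dev/progress-deadline"


def knative_service(name: str, image: str, progress_deadline: str) -> dict:
    """Build a Knative Service manifest with a per-service progress deadline."""
    return {
        "apiVersion": "serving.knative.dev/v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "metadata": {
                    # The annotation is set on the revision template,
                    # not on the Service's own metadata.
                    "annotations": {PROGRESS_DEADLINE_ANNOTATION: progress_deadline}
                },
                "spec": {"containers": [{"image": image}]},
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical model server that may wait on GPU node provisioning,
    # so it gets 30 minutes instead of the 10-minute default.
    manifest = knative_service("gpu-model", "example.com/model-server:latest", "1800s")
    print(json.dumps(manifest, indent=2))
```

      The printed JSON can be piped to kubectl apply -f - since kubectl accepts JSON manifests as well as YAML. Each model can then be given a deadline that matches its own startup profile.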

              Assignee: Unassigned
              Reporter: Edgar Hernández (edgar.hernandez)