  1. OpenShift Pipelines
  2. SRVKP-2434

Guaranteed QoS for TaskRun pods

    • Type: Story
    • Resolution: Unresolved
    • Priority: Major
    • Operator
    • 2
    • Pipelines Sprint 227, Pipelines Sprint 228, Pipelines Sprint 229, Pipelines Sprint 230, Pipelines Sprint 231

      According to the upstream Tekton documentation:

      To get a "Guaranteed" QoS, a TaskRun pod must have compute resources set for all of its containers, including init containers which are injected by Tekton, and all containers must have their requests equal to their limits. This can be achieved by using LimitRanges to apply default requests and limits.

      In our tests, however, this is not what happens.

      The LimitRange we created is shown below:

      $ oc get limitranges -n sre-ci-test project-limits -oyaml
      apiVersion: v1
      kind: LimitRange
      metadata:
        annotations:
          app.kubernetes.io/created-by: argocd
          app.kubernetes.io/managed-by: helm
          kubectl.kubernetes.io/last-applied-configuration: |
            {"apiVersion":"v1","kind":"LimitRange","metadata":{"annotations":{"app.kubernetes.io/created-by":"argocd","app.kubernetes.io/managed-by":"helm"},"labels":{"app.kubernetes.io/instance":"sre-ci-test"},"name":"project-limits","namespace":"sre-ci-test"},"spec":{"limits":[{"max":{"cpu":"14","memory":"14Gi"},"type":"Pod"},{"default":{"cpu":"100m","ephemeral-storage":"20Gi","memory":"64Mi"},"defaultRequest":{"cpu":"100m","ephemeral-storage":"1Gi","memory":"64Mi"},"max":{"cpu":"14","memory":"14Gi"},"type":"Container"}]}}
        creationTimestamp: "2021-12-14T12:18:46Z"
        labels:
          app.kubernetes.io/instance: sre-ci-test
        managedFields:
        - apiVersion: v1
          fieldsType: FieldsV1
          fieldsV1:
            f:metadata:
              f:annotations:
                .: {}
                f:app.kubernetes.io/created-by: {}
                f:app.kubernetes.io/managed-by: {}
                f:kubectl.kubernetes.io/last-applied-configuration: {}
              f:labels:
                .: {}
                f:app.kubernetes.io/instance: {}
            f:spec:
              f:limits: {}
          manager: argocd-application-controller
          operation: Update
          time: "2021-12-20T21:27:54Z"
        name: project-limits
        namespace: sre-ci-test
        resourceVersion: "1466633760"
        uid: 117ae94b-9228-41fb-9204-1adb4d3c1a9a
      spec:
        limits:
        - max:
            cpu: "14"
            memory: 14Gi
          type: Pod
        - default:
            cpu: 100m
            ephemeral-storage: 20Gi
            memory: 64Mi
          defaultRequest:
            cpu: 100m
            ephemeral-storage: 1Gi
            memory: 64Mi
          max:
            cpu: "14"
            memory: 14Gi
          type: Container
      

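      The default requests and limits for CPU and memory are identical. However, many containers end up with requests that differ from their limits, which puts the pods in the Burstable QoS class. In the following example, the CPU request (50m) is lower than the CPU limit (100m):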

       

            limits:
              cpu: 100m
              ephemeral-storage: 20Gi
              memory: 1Gi
            requests:
              cpu: 50m
              ephemeral-storage: 512Mi
              memory: 1Gi
      [...]
      qosClass: Burstable
      

       

      The pod used for this example is sre-casa-0fd04ea-lane-0-82v65-pod-dzbxr; its details will be attached privately.
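
      To make the classification in the example above concrete, here is a minimal Python sketch of the Kubernetes QoS rule as we understand it: only cpu and memory are considered, every container (init containers included) needs a limit for both, and any explicit request must equal its limit. The names parse_quantity and qos_class are hypothetical helpers written for illustration, not part of Tekton or Kubernetes, and the quantity parser is an assumption that covers only the common suffixes.

```python
from fractions import Fraction

# Simplified suffix table (assumption: only the common suffixes are needed).
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "m": Fraction(1, 1000),
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_quantity(value):
    """Convert a quantity string such as '100m' or '1Gi' to an exact number."""
    text = str(value)
    # Try two-letter suffixes before one-letter ones so 'Mi' wins over 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Fraction(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Fraction(text)


def qos_class(containers):
    """Classify a pod from a list of per-container resource dicts.

    Each item may hold 'requests' and 'limits' dicts. Pass init containers
    in the same list, since they count towards the QoS class too.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests") or {}
        limits = container.get("limits") or {}
        # Other resources, such as ephemeral-storage, are ignored here.
        for resource in ("cpu", "memory"):
            req = requests.get(resource)
            lim = limits.get(resource)
            if req is not None or lim is not None:
                any_set = True
            if lim is None:
                # No limit for cpu or memory rules out Guaranteed.
                guaranteed = False
            elif req is not None and parse_quantity(req) != parse_quantity(lim):
                # An explicit request that differs from the limit also does.
                guaranteed = False
            # A missing request with a limit set defaults to the limit.
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# The container from the example above: the cpu request is below the limit.
example = {
    "limits": {"cpu": "100m", "ephemeral-storage": "20Gi", "memory": "1Gi"},
    "requests": {"cpu": "50m", "ephemeral-storage": "512Mi", "memory": "1Gi"},
}
print(qos_class([example]))  # prints "Burstable"
```

      Under these assumptions, a single container with mismatched cpu values is enough to make the whole pod Burstable, even when every other container uses the LimitRange defaults.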

      We would like to set the QoS of all the task pods to Guaranteed, but that cannot be achieved by setting LimitRanges. Is there any workaround?

      Edit: this issue has been moved to the "Feature" type because the requested functionality is available in Tekton 0.38 and later versions.

              vdemeest Vincent Demeester
              rhn-support-llopezmo Lucas López Montero