OpenShift Pipelines / SRVKP-1783

Pipelines operator scaling deployment in conflict with HPA


    • The Horizontal Pod Autoscaler (HPA) can manage the replica count of deployments controlled by the {pipelines-title} Operator. From this release onward, the {pipelines-title} Operator will not reset the replica count of deployments that it manages if the count is changed by an end user or an on-cluster agent. However, the replicas are reset when you upgrade the {pipelines-title} Operator.
    • Pipelines Sprint 211

      We are encountering a strange issue where the HPA for tekton-pipelines-webhook constantly scales the tekton-pipelines-webhook deployment from 1 to 3+ replicas and then immediately back to 1. This results in many pods being started and then terminated right away. Some of these pods get stuck in the Terminating state and start stacking up, resulting in something like the attached image.
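      The churn is visible with standard oc commands; the resource names, label, and the openshift-pipelines namespace below are assumptions based on a default installation:

        # Watch the HPA's desired replica count flapping (name/namespace assumed)
        oc get hpa tekton-pipelines-webhook -n openshift-pipelines -w

        # Watch webhook pods being created and immediately terminated (label assumed)
        oc get pods -n openshift-pipelines -l app=tekton-pipelines-webhook -w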
      We've tried changing the HPA so that it scales to at most one pod, but the operator eventually overwrites this change anyway. We're wondering whether this is a known issue, or whether there is a resolution or workaround we can apply.
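      A sketch of that change, assuming the default HPA name and namespace (the exact edit may also be made through the console):

        # Cap the HPA at a single replica; the operator reconciles this back shortly afterwards
        oc patch hpa tekton-pipelines-webhook -n openshift-pipelines --type merge -p '{"spec":{"maxReplicas":1}}'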

      Scaling down the operator seems to allow the HPA to work uninterrupted, which suggests that the operator and the HPA are fighting over the replica count.
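      As a stopgap, scaling the operator down looks roughly like this; the operator deployment name and namespace are assumptions for a default OLM-based install:

        # Stop the operator so it no longer resets the HPA or the webhook deployment
        oc scale deployment openshift-pipelines-operator -n openshift-operators --replicas=0

        # Bring the operator back once the scaling conflict is resolved
        oc scale deployment openshift-pipelines-operator -n openshift-operators --replicas=1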

      Additional Slack chat for context: https://coreos.slack.com/archives/CSPS1077U/p1635355148029600

              smukhade Shivam Mukhade (Inactive)
              humairkhan Humair Khan