  1. OpenShift Pipelines
  2. SRVKP-8023

Tekton Results does not recover from extreme load spikes

    • Release Note Not Required

      Description of problem:

      When Tekton Results becomes overloaded, for example when a surge of 5-6x the normal volume of PipelineRuns (PLRs) completes, its workqueue can grow so large that it cannot process newly created objects before they are deleted: it fails to add finalizers to TaskRuns and PipelineRuns before they have completed and been pruned. Because almost every queued event then refers to an object that can no longer be processed, Results appears to enter a state where it tries to reconcile every object, fails permanently, but still retries the reconciliation after some time. As a result, the workqueue depth and reconciliation latency both look "low", while the reconciliation success rate is extremely poor.
      All of these thousands of stale reconciliations are now invisibly stored in the retry queue, even though their k8s objects have long since been deleted.

      Recovery is straightforward but manual: restart the pod. However, Results needs to be able to recover from this state properly. If an object no longer exists in the cluster, we shouldn't keep retrying to reconcile it.

      Prerequisites (if any, like setup, operators/versions):

      Steps to Reproduce

       # <steps>

       

      Actual results:

      Expected results:

      Reproducibility (Always/Intermittent/Only Once):

      Acceptance criteria: 

       

      Definition of Done:

      Build Details:

      Additional info (Such as Logs, Screenshots, etc):

       

        1. image-2025-06-27-14-56-51-366.png
          26 kB
          Andrew Thorp
        2. image-2025-06-27-14-59-56-963.png
          192 kB
          Andrew Thorp
        3. image-2025-06-27-15-03-25-125.png
          103 kB
          Andrew Thorp

              Assignee: Unassigned
              Reporter: Andrew Thorp (rh-ee-athorp)