OpenShift Pipelines / SRVKP-9228

PaC: Cache list of changed files to reduce VCS API load


    • Type: Epic
    • Resolution: Unresolved
    • Priority: Major
    • None
    • Pipelines 1.20.0, Pipelines 1.19.3
    • Pipelines as Code
    • None

      Cache the list of files changed by an event to reduce API requests to the VCS.

      Goals

      Both external and internal customers have reported extremely high volumes of VCS API requests from PaC in the past few versions. Some customers have seen on the order of tens of thousands of API requests per day and experienced severe API throttling, causing PaC to behave unreliably. In one instance, a single push event resulted in over 500 GitLab API requests.

      Caching highly redundant API requests can reduce this load significantly, and I have identified listing changed files as a bad hotspot: if a repository has multiple pipelineruns in {.tekton/}, and each of them references the list of changed files, there will be at least three API requests for the changed files per pipelinerun file. The number grows further depending on the type of event and the number of files changed by the event, and all of that is then multiplied by the number of {filesChanged()} references per pipelinerun. For example: in a monorepo with 20 pipelineruns, each of which checks whether two globs have changed, every pushed commit results in 120 API requests (20 pipelineruns × 2 references × 3 requests, assuming the list of files is not large enough to require paging). Since the files changed by an event are immutable, every call made after the full list has been retrieved is redundant.
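      The arithmetic behind the monorepo example can be written out as a rough sketch. The function and parameter names below are hypothetical, and the figure of three requests per reference is this issue's stated lower bound read as a per-reference cost, not a measured value.

```go
package main

// requestsPerEvent estimates how many changed-files API requests a single
// event triggers when nothing is cached. All inputs are assumptions taken
// from the example in the text.
func requestsPerEvent(pipelineRuns, refsPerRun, requestsPerRef int) int {
	return pipelineRuns * refsPerRun * requestsPerRef
}

// requestsPerEventCached estimates the same figure when the first complete
// listing is cached for the event: only the first reference pays the cost,
// every later reference is served from memory.
func requestsPerEventCached(requestsPerRef int) int {
	return requestsPerRef
}
```

      With the numbers from the example, requestsPerEvent(20, 2, 3) gives 120, while requestsPerEventCached(3) gives 3, independent of how many pipelineruns or references the repository has.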

      Requirements

      Requirement | Notes | Is MVP
      For any given event, no call to {vcs.GetFiles()} after the first successful response results in a VCS API request. | | yes

      Out of scope

      • Refactoring when calls to {vcs.GetFiles()} are made. There is room for improvement there, but that should be treated as a separate issue.

      Assumptions

      • The list of files changed for a given event is immutable as far as PaC is concerned.

      Done Checklist

      • Acceptance criteria are met
      • Non-functional properties of the Feature have been validated (such as performance, resource, UX, security or privacy aspects)
      • User Journey automation is delivered
      • Support and SRE teams are provided with enough skills to support the feature in a production environment

              rh-ee-athorp Andrew Thorp