  1. OpenShift Pipelines
  2. SRVKP-4525

Build or expose metrics to determine if the results watcher/api are deadlocked or their performance is severely degraded


    • Pipelines Sprint Pioneers 10

      Story (Required)

      As a maintainer of Konflux trying to monitor Tekton health, I want to know when Tekton Results is deadlocked or suffering from severe performance degradation.

      <Describes high level purpose and goal for this story. Answers the questions: Who is impacted, what is it and why do we need it? How does it improve the customer’s experience?>

      Background (Required)

      <Describes the context or background related to this story>

      Approximate what has been done so far for the core Tekton pipeline controller.

      Out of scope

      <Defines what is not included in this story>

      Tracking completion times of List and Get calls against the DB from the histogram will be a separate exercise, because DB tuning, collaboration with Quay.io, and known-needed UI optimization must occur first.

      Approach (Required)

      <Description of the general technical path on how to achieve the goal of the story. Include details like json schema, class definitions>

      Once the memory leak is fixed and sufficient performance tuning is vetted, we establish baselines, excluding the remaining known log storage bugs, around:

      • The api success rate metric we already expose
      • watcher work queue depth
      • watcher latency (though this will differ considerably from the pipeline or chains controllers, since log storage has to happen on thread)
      • Success percentage (hopefully at least 95%) for the CreateRecord, CreateResult, UpdateRecord, UpdateResult, and UpdateLog gRPC calls
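
      The baseline checks above could be evaluated along the following lines. This is only a minimal sketch under stated assumptions: the counter names, the `WatcherSample` type, and the "backlog with no progress" deadlock heuristic are illustrative and not taken from the Tekton Results code base; only the 95% target and the gRPC method names come from this story.

```python
from dataclasses import dataclass

# Target taken from the "at least 95%" goal in this story.
MIN_SUCCESS_PERCENT = 95.0


@dataclass
class WatcherSample:
    """One scrape of hypothetical watcher counters (names are illustrative)."""
    queue_depth: int       # items waiting in the watcher work queue
    processed_total: int   # cumulative number of items the watcher finished


def success_percent(ok: int, total: int) -> float:
    """Percentage of successful calls; no calls at all counts as 100%."""
    if total == 0:
        return 100.0
    return 100.0 * ok / total


def failing_methods(calls: dict[str, tuple[int, int]]) -> list[str]:
    """Return the gRPC methods whose success percentage is below the target.

    `calls` maps a method name to (successful calls, total calls).
    """
    return sorted(
        name
        for name, (ok, total) in calls.items()
        if success_percent(ok, total) < MIN_SUCCESS_PERCENT
    )


def looks_deadlocked(samples: list[WatcherSample]) -> bool:
    """Assumed heuristic: every sample in the window shows queued work,
    yet the processed counter did not move from first to last sample."""
    if len(samples) < 2:
        return False
    backlog = all(s.queue_depth > 0 for s in samples)
    no_progress = samples[-1].processed_total == samples[0].processed_total
    return backlog and no_progress


if __name__ == "__main__":
    calls = {"CreateRecord": (990, 1000), "UpdateLog": (900, 1000)}
    print(failing_methods(calls))  # ['UpdateLog']
    window = [WatcherSample(5, 10), WatcherSample(7, 10)]
    print(looks_deadlocked(window))  # True
```

      In practice the same conditions would more likely be expressed as alerting rules over the exported metrics; the sketch only makes the intended pass/fail logic explicit.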

      Extra credit: metrics that confirm necessary labels, annotations, finalizers are set.

      Dependencies

      <Describes what this story depends on. Dependent Stories and EPICs should be linked to the story.>

       

      Acceptance Criteria  (Mandatory)

      <Describe edge cases to consider when implementing the story and defining tests>

      <Provides a required and minimum list of acceptance tests for this story. More is expected as the engineer implements this story>

       

      Done Checklist

      • Code is completed, reviewed, documented and checked in
      • Unit and integration test automation has been delivered and is running cleanly in the continuous integration/staging/canary environment
      • Continuous Delivery pipeline(s) is able to proceed with new code included
      • Customer facing documentation, API docs etc. are produced/updated, reviewed and published
      • Acceptance criteria are met

              gmontero@redhat.com Gabe Montero