Type: Story
Resolution: Done
Priority: Major
Story Points: 5
Release Note Status: Release Note Not Required
Status: Done
Sprints: Pipelines Sprint Crookshank 32, Pipelines Sprint Crookshank 33, Pipelines Sprint Crookshank 34
Story (Required)
As a user of Tekton Results trying to monitor how quickly and reliably completed runs are archived, I want metrics on the latency between a Run finishing and Results marking it as Stored, so that storage performance can be observed per namespace, run type, and storage outcome.
Background (Required)
A few details:
- stored_latency_seconds: distribution; facets: {type (TaskRun/PipelineRun), namespace (maybe optional), success: boolean}. Records the time between when the Run finished and when Results was able to mark it as Stored. Since this is a histogram we can derive a lot from it: `sum/count` gives the average latency, and `count` gives the number of runs stored, faceted on storage success, namespace, type, etc.
> In all of the above, the metrics should be per unique Run. That is to say, if a PipelineRun is upserted 12 times over its lifetime, it may be useful to know it was stored 12 times, but that isn't the purpose of these metrics; the discrete number of PipelineRuns and TaskRuns is what matters. All of these are helpful to understand per namespace as well.
To clarify the above: most of what these metrics can tell us about Results will be inaccurate if they are emitted on every reconciliation. For example, for "latency between a PipelineRun being Done and being Stored", I want to know how long after completion a PLR was stored in the database. If we emit the metric on every reconciliation and a PLR is reconciled 5 times after completion (say 0.1s, 0.4s, 0.5s, 1s, and 5m after completion), I want the metric to report `0.1`, but it will instead include the other data points and skew the actual values. This can be solved by emitting the metric(s) only when we detect a transition from one state to another: instead of emitting the metric if plr.IsDone(), emit it if plr.CompletionTime > plr.LastStoredTime, as in the sketch below.
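For illustration, here is a minimal sketch of this in Go using Prometheus client_golang. The actual Results controller may use a different metrics library, and `recordStoredLatency`, its parameters, and `lastStoredTime` are hypothetical names, not the real implementation:

```go
// Illustrative sketch only: recordStoredLatency and its parameters are
// hypothetical, not the actual Results code.
package metrics

import (
	"strconv"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// stored_latency_seconds: time between a Run completing and Results marking
// it Stored, faceted by run type, namespace, and storage success.
var storedLatency = promauto.NewHistogramVec(prometheus.HistogramOpts{
	Name:    "stored_latency_seconds",
	Help:    "Seconds between Run completion and the Run being marked Stored.",
	Buckets: prometheus.DefBuckets,
}, []string{"type", "namespace", "success"})

// recordStoredLatency records the latency exactly once per unique Run
// completion. The guard mirrors the transition check described above:
// only a completion newer than the last stored time has not been counted
// yet, so later reconciliations of the same Run are skipped.
func recordStoredLatency(runType, namespace string, success bool,
	completionTime, lastStoredTime, storedTime time.Time) {
	if !completionTime.After(lastStoredTime) {
		// Already recorded for this completion; counting re-reconciles would skew the data.
		return
	}
	storedLatency.
		WithLabelValues(runType, namespace, strconv.FormatBool(success)).
		Observe(storedTime.Sub(completionTime).Seconds())
}
```

With this shape, `sum(rate(stored_latency_seconds_sum[5m])) / sum(rate(stored_latency_seconds_count[5m]))` gives the average latency in PromQL, and `stored_latency_seconds_count` gives per-facet counts of stored runs.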
Out of scope
<Defines what is not included in this story>
Approach (Required)
Emit the metrics on the transition into the Stored state rather than on every reconciliation: record stored_latency_seconds when plr.CompletionTime > plr.LastStoredTime (see the sketch in Background), labeling each observation with run type, namespace, and storage success so that averages and counts can be derived per facet.
Dependencies
<Describes what this story depends on. Dependent Stories and EPICs should be linked to the story.>
Acceptance Criteria (Mandatory)
- stored_latency_seconds is emitted as a histogram faceted by type (TaskRun/PipelineRun), namespace, and success.
- The metric is recorded exactly once per unique Run completion; repeated reconciliations or upserts of an already-stored Run do not add data points.
- Average latency and the number of stored runs can be derived from the histogram's sum and count, per facet.
INVEST Checklist
- Dependencies identified
- Blockers noted and expected delivery timelines set
- Design is implementable
- Acceptance criteria agreed upon
- Story estimated
Done Checklist
- Code is completed, reviewed, documented and checked in
- Unit and integration test automation have been delivered and are running cleanly in the continuous integration/staging/canary environment
- Continuous Delivery pipeline(s) can proceed with the new code included
- Customer facing documentation, API docs etc. are produced/updated, reviewed and published
- Acceptance criteria are met