Story
Resolution: Done
TRT-347 outlines the background behind this card.
The original work ran a batch of jobs on many platforms, measured the number of watches per operator, doubled each count, allowed a further 10%, and recorded the resulting limits in the code. We then test that no operator exceeds its limit.
The stated goal was to watch for exponential growth.
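The static limit scheme described above can be sketched roughly as follows. This is only an illustration, not the code in origin: the operator names and counts are made up, and reading "doubled and 10%" as "twice the measured count plus a 10% margin" is an assumption.

```python
# Hypothetical measured watch counts per operator (made-up numbers).
MEASURED = {"operator-a": 100, "operator-b": 45}


def build_limits(measured):
    """Double each measured count, then add a 10% margin (assumed reading).

    Integer arithmetic keeps the limits whole numbers.
    """
    return {name: count * 2 * 110 // 100 for name, count in measured.items()}


def find_violations(observed, limits):
    """Return {operator: (observed, limit)} for operators over their limit."""
    return {
        name: (count, limits[name])
        for name, count in observed.items()
        if name in limits and count > limits[name]
    }


LIMITS = build_limits(MEASURED)
```

With these numbers, `find_violations({"operator-a": 221}, LIMITS)` flags `operator-a` for exceeding its limit by a single watch, which shows how a hard limit reports even a trivial overshoot as a failure.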
This leaves us in an unfortunate position: we violate the rule by a little, ignore it for some time, and then someone has to hunt for violations by hand and manually update the values. The test may have caught real problems, but that is likely exceedingly rare. Violations also vary wildly: sometimes 1-2%, sometimes 100% or more, and then back to normal. This causes test failures that are meaningless unless placed in the context of a trend.
After discussing this with jchaloup@redhat.com in our team meeting today, we reached some conclusions; see the attached tasks.
- is related to: TRT-347 Investigate: [sig-arch][Late] operators should not create watch channels very often (Closed)
1. Provide TRT with script and instructions on how to process audit log data and calculate percentiles | Closed | Jan Chaloupka
2. Upload data to sippy's database in must-gather CI step | Closed | Unassigned
3. Add sippy API to fetch percentiles for each operator's watch requests | Closed | Unassigned
4. Add sippy UI to graph percentile growth over time for each operator | Closed | Unassigned
5. Automate updating the limits in origin from sippy db data periodically | Closed | Unassigned
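
As a rough illustration of the percentile calculation the tasks above refer to, here is a minimal nearest-rank sketch. It is not the script from task 1: the function name and the sample counts are hypothetical, and nearest-rank is just one of several common percentile definitions.

```python
def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of
    the samples at or below it. p is an integer from 1 to 100."""
    if not values:
        raise ValueError("values must not be empty")
    if not 1 <= p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(values)
    # Ceiling of p * n / 100, computed with integers to avoid float rounding.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical watch counts for one operator across several job runs,
# including a single large spike.
samples = [100, 102, 98, 101, 240, 99, 103]
median = percentile(samples, 50)
```

Here the median stays close to the typical count even though one run spiked, which is the kind of context a single hard limit cannot provide.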