Story
Resolution: Done
Major
Monitoring - Sprint 213
RHOBS queries vary widely in their resource requirements, and the Thanos Query components need to evolve to handle concurrent queries.
Because resource requirements can differ greatly from one incoming query to the next, the goal is to build a knowledge system that captures the resource requirements of different queries and uses it to schedule queries so that errors caused by resource limits are minimized.
Why is this an important problem to solve?
- Handling all kinds of queries with fewer failures helps to define and improve SLOs for the query (read) path.
Approach
One approach (project name: prom-router) is to route incoming queries to Thanos Query pods of an appropriate resource size.
Proposal for how prom-router can reduce query failures is documented here
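As a rough illustration of the routing idea only, and not taken from the prom-router proposal, the sketch below estimates a query's cost and picks a pool of Thanos Query pods sized for it. The pool names, cost thresholds, and the cost formula (evaluation steps multiplied by an estimated series count) are all assumptions made for this example; a real knowledge system would presumably learn such estimates from observed query behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    name: str        # hypothetical pool label, not a real prom-router name
    max_cost: float  # largest estimated cost this pool should accept


# Hypothetical pools, ordered from smallest to largest resource size.
POOLS = [
    Pool("small", 1_000),
    Pool("medium", 100_000),
    Pool("large", float("inf")),
]


def estimate_cost(range_seconds: float, step_seconds: float, est_series: int) -> float:
    """Assumed cost proxy: evaluation steps times estimated series touched."""
    if step_seconds <= 0:
        raise ValueError("step_seconds must be positive")
    steps = range_seconds / step_seconds + 1
    return steps * est_series


def pick_pool(cost: float, pools=POOLS) -> Pool:
    """Return the smallest pool whose threshold covers the estimated cost."""
    for pool in pools:
        if cost <= pool.max_cost:
            return pool
    # Fall back to the largest pool if no threshold matches.
    return pools[-1]
```

For example, a one-hour query at a 60-second step over roughly 10 series would go to the small pool under these assumed thresholds, while a one-day query at a 15-second step over roughly 100 series would go to the large pool.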