Details
Task
Resolution: Done
Major
Description
Ops have asked us to review some of Zync's new Prometheus metrics so that Que jobs are not counted twice under different types of que_jobs_scheduled_total.
See: https://github.com/3scale/platform/issues/230
Summary of the improvements requested:
- Make "scheduled" exclude failed jobs that are scheduled for retry
- Make "failed" count failed jobs that are scheduled for retry, but not expired ones
- Introduce a new type "expired" for failed jobs that have run out of retry attempts and therefore won't be retried again
- Remove the type "retried", since it does not seem to be updated by Que, which handles retries based only on the value of the error_count column.
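
The intent of the list above is that each job falls under at most one type. A minimal Python sketch of that classification follows. It is not Zync's actual implementation: the `Job` class, the `run_at` and `expired_at` fields, and the `classify` and `count_by_type` helpers are hypothetical names chosen for illustration. Only `error_count` is taken from this ticket.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Job:
    # Hypothetical stand-in for a row of the jobs table.
    # Only error_count is named in the ticket; the other fields are assumptions.
    run_at: datetime
    error_count: int = 0
    expired_at: Optional[datetime] = None


def classify(job: Job, now: datetime) -> Optional[str]:
    """Return exactly one type per job, so no job is counted twice."""
    if job.expired_at is not None:
        # Failed and out of retry attempts: will not run again.
        return "expired"
    if job.error_count > 0:
        # Failed at least once and still waiting for a retry.
        return "failed"
    if job.run_at > now:
        # Never failed and due to run in the future.
        return "scheduled"
    # Any other state is outside the scope of this sketch.
    return None


def count_by_type(jobs, now: datetime) -> Counter:
    """Count jobs per type, skipping jobs that match none of the three types."""
    types = (classify(job, now) for job in jobs)
    return Counter(t for t in types if t is not None)
```

Because the checks run in a fixed order and return on the first match, a failed job that is waiting for a retry is reported only as "failed", even though its next run time lies in the future.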