Task
Resolution: Unresolved
Normal
None
None
Context:
We can't detect and alert when the Prometheus agent fails to federate metrics from the in-cluster Prometheus server because:
- The federation client (on the agent side) is not emitting the metrics we need.
- The federation server is emitting them, but not split by client. We can't reliably alert on them because the failing federation client might be one other than MCOA.
- I didn't find any other proxy metric that we could reliably alert on.
Acceptance:
Prometheus emits the usual metrics for the federation client, such as HTTP errors, so that we can create the needed alerts for MCOA.
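To illustrate the kind of client-side signal this asks for, here is a minimal sketch, not Prometheus code. It counts federation request outcomes by HTTP status code and renders them in the Prometheus text exposition format. The class name, the metric name `federation_client_requests_total`, and the `code` label are all assumptions made for illustration; the actual metric names would be whatever Prometheus ends up exposing.

```python
from collections import Counter


class FederationRequestCounter:
    """Counts federation request outcomes per HTTP status code (hypothetical helper)."""

    # Hypothetical metric name, not an existing Prometheus metric.
    METRIC = "federation_client_requests_total"

    def __init__(self):
        self._counts = Counter()

    def observe(self, status_code):
        """Record one federation request that ended with the given HTTP status code."""
        self._counts[str(status_code)] += 1

    def error_ratio(self):
        """Share of recorded requests with a 4xx or 5xx status (0.0 if none recorded)."""
        total = sum(self._counts.values())
        if total == 0:
            return 0.0
        errors = sum(n for code, n in self._counts.items() if code[0] in "45")
        return errors / total

    def render(self):
        """Render the counters in the Prometheus text exposition format."""
        lines = [f"# TYPE {self.METRIC} counter"]
        for code in sorted(self._counts):
            lines.append(f'{self.METRIC}{{code="{code}"}} {self._counts[code]}')
        return "\n".join(lines) + "\n"


# Usage: three successful federation requests and one 503 from the server.
counter = FederationRequestCounter()
for status in (200, 200, 200, 503):
    counter.observe(status)
print(counter.render())
```

Because the counter lives on the client side, an alert built on its error ratio would only fire for this client's failures, which is what the server-side metrics cannot guarantee today.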