- Bug
- Resolution: Done
- Major
- ACM 2.8.Z, ACM 2.11.0, ACM 2.10.Z, ACM 2.9.Z
- 1
- False
- None
- False
- MCO Sprint 23, MCO Sprint 24
- Critical
- No
Description of problem:
The metrics collector's remote write client is supposed to retry metrics forwarding on non-2xx responses, as per https://github.com/stolostron/multicluster-observability-operator/blob/main/collectors/metrics/pkg/metricsclient/metricsclient.go#L524-L563
However, the ACM fork of Observatorium uses a customised forwarder behind the gateway to forward requests to multiple remote write sinks.
As per https://github.com/stolostron/observatorium/pull/87, it appears that with this forwarder in place the client can never get a non-200 response in any case, so the client-side retry is never triggered.
Steps to Reproduce:
- Take the backend down
- Send remote write requests
Actual results:
Data is dropped
Expected results:
Requests are retried
- clones ACM-12087 Observatorium API is susceptible to data loss on metrics remote write path
- Testing