  1. Red Hat Advanced Cluster Management
  2. ACM-12088

Observatorium API is susceptible to data loss on metrics remote write path

    • MCO Sprint 23, MCO Sprint 24
    • Critical
    • No

      Description of problem:

       

      Metrics forwarding should be retried by the metrics collector's remote write client on non-2xx responses, as per https://github.com/stolostron/multicluster-observability-operator/blob/main/collectors/metrics/pkg/metricsclient/metricsclient.go#L524-L563

       

      However, the ACM fork of Observatorium uses a customised forwarder behind the gateway to forward requests to multiple remote write sinks.

       

      As per https://github.com/stolostron/observatorium/pull/87, it appears the client can never get a non-2xx response in any case, so its retry logic is never triggered.

      Steps to Reproduce:

      1. Take the backend down
      2. Send remote write requests

      Actual results:

      Data is dropped: the client never sees a failure response, so it does not retry.

      Expected results:

      The client receives a non-2xx response and the requests are retried.

            pgough@redhat.com Philip Gough
            pgough@redhat.com Philip Gough
            Xiang Yin Xiang Yin
