  1. Red Hat Advanced Cluster Management
  2. ACM-4820

ACMRemoteWriteError fires on 409 unrecoverable errors


    • Observability Sprint 2023-11, Observability Sprint 2023-15
    • Low
    • No

      Description of problem:

      After the upgrade to 2.7, customers are noticing more errors with metrics forwarding. In this case, the following errors are logged:

       

      2023-03-31T23:00:59.964080253Z level=error name=observatorium ts=2023-03-31T23:00:59.9640122Z caller=logchannel.go:133 msg="failed to forward metrics" returncode="409 Conflict" response="3 errors: replicate write request for endpoint observability-thanos-receive-default-1.observability-thanos-receive-default.open-cluster-management-observability.svc.cluster.local:10901: quorum not reached: conflict; replicate write request for endpoint observability-thanos-receive-default-2.observability-thanos-receive-default.open-cluster-management-observability.svc.cluster.local:10901: quorum not reached: conflict; replicate write request for endpoint observability-thanos-receive-default-0.observability-thanos-receive-default.open-cluster-management-observability.svc.cluster.local:10901: quorum not reached: conflict\n" url=http://observability-thanos-receive.open-cluster-management-observability.svc.cluster.local:19291/api/v1/receive

      2023-03-31T23:01:01.808996469Z level=error name=observatorium ts=2023-03-31T23:01:01.808943137Z caller=logchannel.go:133 msg="failed to forward metrics" returncode="409 Conflict" response="2 errors: replicate write request for endpoint observability-thanos-receive-default-1.observability-thanos-receive-default.open-cluster-management-observability.svc.cluster.local:10901: quorum not reached: conflict; replicate write request for endpoint observability-thanos-receive-default-0.observability-thanos-receive-default.open-cluster-management-observability.svc.cluster.local:10901: quorum not reached: conflict\n" url=http://observability-thanos-receive.open-cluster-management-observability.svc.cluster.local:19291/api/v1/receive
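
For triage, it may help to pull the return code and the affected receive endpoints out of entries like these. The following is a minimal sketch, assuming the entries are logfmt-style `key=value` pairs as they appear here; the function names and the shortened host names in the sample are hypothetical and are not part of the product.

```python
import re
import shlex

# Matches one per-endpoint failure inside the "response" field.
CONFLICT_RE = re.compile(r"endpoint (\S+): quorum not reached: conflict")


def parse_logfmt(line):
    """Return the key=value pairs of a logfmt-style line as a dict."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip bare tokens such as the leading timestamp
            fields[key] = value
    return fields


def summarize_forward_error(line):
    """Return (returncode, endpoints that reported a quorum conflict)."""
    fields = parse_logfmt(line)
    endpoints = CONFLICT_RE.findall(fields.get("response", ""))
    return fields.get("returncode"), endpoints


# Abbreviated sample in the same shape as the logged entries;
# "receive-0" and "receive-1" are shortened, hypothetical host names.
sample = (
    '2023-03-31T23:01:01Z level=error name=observatorium '
    'caller=logchannel.go:133 msg="failed to forward metrics" '
    'returncode="409 Conflict" response="2 errors: '
    'replicate write request for endpoint receive-1:10901: '
    'quorum not reached: conflict; '
    'replicate write request for endpoint receive-0:10901: '
    'quorum not reached: conflict\\n" '
    'url=http://receive:19291/api/v1/receive'
)
code, endpoints = summarize_forward_error(sample)
print(code, endpoints)
```

Counting the returned endpoints per entry shows how many receive replicas rejected each write.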
       

       

       

      Version-Release number of selected component (if applicable):

      2.7

      How reproducible:

      Observed in a customer environment.

      Steps to Reproduce:

      1. Set up observability for multiple clusters.

      Actual results:

      Forwarding metrics fails with 409 Conflict errors because quorum is not reached.
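
To illustrate why two conflicting replicas are enough to fail a write, here is a small sketch of majority-quorum arithmetic. It assumes a replication factor of 3 (three receive endpoints appear in the logged errors) and a majority quorum of `replication_factor // 2 + 1`; the actual replication settings of the customer environment are not stated in this report.

```python
def write_quorum(replication_factor):
    """Majority quorum: more than half of the replicas must accept the write."""
    return replication_factor // 2 + 1


def quorum_reached(replication_factor, failed_replicas):
    """True if enough replicas succeeded to satisfy the majority quorum."""
    succeeded = replication_factor - failed_replicas
    return succeeded >= write_quorum(replication_factor)


# Assumed replication factor of 3, matching the three endpoints in the logs.
for failed in range(4):
    print(failed, "failed ->", "ok" if quorum_reached(3, failed) else "quorum not reached")
```

Under these assumptions, one failing replica is tolerated, while two or three failures (as in the "2 errors" and "3 errors" responses) leave the write without quorum.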


      Expected results:

      Better handling of duplicate results.
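
One possible shape for such handling is to separate responses that are worth retrying from those that are not. The sketch below is an assumption and not the component's actual logic: it follows the common remote-write convention of retrying 5xx (and 429) responses while treating other 4xx responses, including 409 Conflict, as final. The function name and the returned labels are hypothetical.

```python
from http import HTTPStatus


def classify_forward_result(status_code):
    """Classify a forward response as 'ok', 'retry', or 'drop' (assumed policy)."""
    if 200 <= status_code <= 299:
        return "ok"
    if status_code == HTTPStatus.TOO_MANY_REQUESTS or 500 <= status_code <= 599:
        return "retry"  # transient: the same request may succeed later
    return "drop"  # e.g. 409 Conflict: resending the same data will not help


for code in (200, 409, 429, 503):
    print(code, classify_forward_result(code))
```

With a split like this, alerting could track the "drop" class separately from transient failures, so that unrecoverable conflicts are not reported the same way as an unreachable receiver.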

      Additional info:

      We need help investigating these errors.

              pgough@redhat.com Philip Gough
              rhn-support-fdewaley Felix Dewaleyne