  1. Red Hat Advanced Cluster Management
  2. ACM-12250

Prepare for Thanos v0.35.0 in the MCO

      What

      We rolled out Thanos v0.35.1 to our production environment and hit ingest errors and CPU usage issues on the Thanos Receive component.

      Why

      https://github.com/thanos-io/thanos/pull/7045 introduced the "receive.forward.async-workers" flag with a default value of 5, which seems to be insufficient for high-scale environments. The default was rolled out to our MST instance without issues.

      How

      Add a param for this flag into the template in https://github.com/rhobs/configuration
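
A minimal sketch of what exposing the flag as a template parameter could look like, written in Python for illustration only. The parameter name `THANOS_RECEIVE_FORWARD_ASYNC_WORKERS` and the helper `receive_args` are hypothetical; the actual template in rhobs/configuration may name and wire the parameter differently. The only details taken from this issue are the flag name and its default of 5.

```python
# Hypothetical parameter name; the real template may use a different one.
PARAM = "THANOS_RECEIVE_FORWARD_ASYNC_WORKERS"

# Default for receive.forward.async-workers as stated in this issue.
UPSTREAM_DEFAULT = 5


def receive_args(params: dict[str, str]) -> list[str]:
    """Build the Thanos Receive argument list from template parameters.

    Falls back to the upstream default when the parameter is not set.
    """
    workers = int(params.get(PARAM, UPSTREAM_DEFAULT))
    if workers < 1:
        raise ValueError(f"{PARAM} must be a positive integer, got {workers}")
    return ["receive", f"--receive.forward.async-workers={workers}"]


if __name__ == "__main__":
    # Environment that keeps the default (for example, one where 5 is enough).
    print(receive_args({}))
    # Environment that overrides the value; 50 is an arbitrary illustration.
    print(receive_args({PARAM: "50"}))
```

Keeping the default in the template means environments where 5 workers are enough need no change, while high-scale environments can override the value per deployment.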

      Use existing metrics and traces to determine a suitable value for that flag in both the telemeter prod and hypershift prod instances.
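
One possible starting point for sizing, sketched in Python: treat the workers as a pool that must cover the forward requests in flight, and estimate that concurrency with Little's law (request rate times mean latency). This is an assumption about how to approach the sizing, not a method taken from the issue or from Thanos itself. The inputs, the headroom factor, and the function name `estimate_async_workers` are hypothetical, and the rate and latency would have to be read from the existing metrics and traces.

```python
import math


def estimate_async_workers(
    forward_rps: float,
    mean_forward_latency_s: float,
    headroom: float = 2.0,
    floor: int = 5,
) -> int:
    """Estimate a worker count from observed forwarding load.

    forward_rps: forward requests per second on one Receive instance (assumed input).
    mean_forward_latency_s: mean duration of one forward request in seconds (assumed input).
    headroom: multiplier to absorb bursts; 2.0 is an arbitrary choice.
    floor: never go below the upstream default of 5 stated in this issue.
    """
    if forward_rps < 0 or mean_forward_latency_s < 0 or headroom <= 0:
        raise ValueError("rate and latency must be non-negative, headroom positive")
    # Little's law: mean requests in flight = arrival rate * mean time in system.
    in_flight = forward_rps * mean_forward_latency_s
    return max(floor, math.ceil(in_flight * headroom))


if __name__ == "__main__":
    # Hypothetical numbers, not measurements from any environment.
    print(estimate_async_workers(forward_rps=400, mean_forward_latency_s=0.25))
```

The result is only a first guess to validate during rollout; the value that actually keeps ingest reliable has to be confirmed against the prod instances.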

      Acceptance criteria

      1. The flag is added as a template param
      2. Thanos v0.35.0 is rolled out to all prod envs and ingest is reliable with the chosen value

            rh-ee-rfloren Roger Florén
            pgough@redhat.com Philip Gough
            Xiang Yin Xiang Yin