OpenShift Request For Enhancement / RFE-5071

Support Prometheus AgentMode for user-workload monitoring

Details

    • Feature Request
    • Resolution: Unresolved
    • Normal
    • None
    • openshift-4.13, openshift-4.14.z
    • Monitoring

    Description

      Proposed title of this feature request: 

      Support Prometheus AgentMode for user-workload monitoring

       

      Who is the end customer behind the request? 

      Lufthansa Technik AG

       

      What is the nature and description of the request?

      Currently, both Prometheus installations are managed by the cluster-monitoring-operator (ref: https://github.com/openshift/cluster-monitoring-operator), which allows remote_write targets to be configured, a capability that Prometheus itself supports.
      The customer has implemented an observability solution (based on regionally distributed, multi-tenant Mimir clusters that use cost-effective, reliable S3 storage for long-term data) that combines the data from all of their OpenShift clusters, and has therefore set up remote_write.
      As a result, the customer does not need any persistence within Prometheus and would like to reduce the compute footprint of all Prometheus containers. Achieving long-term storage with Prometheus's "hunger for compute" (think of the memory required to store millions of series for, e.g., 6 months or longer) is not cost-effective.
      For that reason, it would be great if agent mode (--enable-feature=agent) were supported or, more generally, if the cluster-monitoring-operator were extended to support command and args overrides for user-workload monitoring. A Prometheus running with --enable-feature=agent is limited to discovery, scraping, and remote write. That is exactly what the customer needs, as they no longer query the data via Thanos.
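
For context, the remote_write setup described above is normally expressed in the user-workload monitoring ConfigMap. The following is a minimal sketch, not the customer's actual configuration; the URL is a placeholder assumption (Mimir conventionally accepts remote write on /api/v1/push), and authentication, TLS, and tenant settings are omitted.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      remoteWrite:
        # Placeholder endpoint; replace with the real remote write target.
        - url: "https://mimir.example.com/api/v1/push"
```

With this in place Prometheus still runs as a full server with local storage; only the forwarding part matches what the customer needs, which is the gap this request addresses.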

       

      Why does the end customer need this? (List the detailed business requirement here)
      Reduce the compute footprint and use Prometheus only in the way they need: to collect and forward data. Ultimately, this is also more cost-effective for the cluster in general, in addition to reducing complexity when only the remote_write config is used.

      Does the customer have any specific timeline dependencies and which release would they like to target?
      Yes, the customer would like to see agent mode supported by the end of 2024.

      List any affected packages or components.
      Prometheus, as part of the cluster-monitoring-operator, needs to support command/args overrides; ref: https://github.com/openshift/cluster-monitoring-operator/blob/master/Documentation/api.md#prometheusrestrictedconfig
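
To illustrate what such an override would need to produce, here is a hedged sketch of the resulting Prometheus container arguments in agent mode. It is purely illustrative: the file paths are assumptions, and nothing here implies that the cluster-monitoring-operator exposes a field for this today. The two agent-related flags are the ones Prometheus 2.x provides.

```yaml
# Hypothetical container args fragment; paths are illustrative assumptions.
args:
  - --enable-feature=agent                        # run Prometheus in agent mode
  - --config.file=/etc/prometheus/prometheus.yml  # scrape and remote_write config
  - --storage.agent.path=/prometheus              # WAL directory used by agent mode
```

In agent mode the server-side TSDB flags (--storage.tsdb.*) do not apply, so an override mechanism would have to drop them rather than only append new arguments.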

      Would the end customer be able to assist in testing this functionality if implemented?
      Always

            People

              rh-ee-rfloren Roger Florén
              nikijain@redhat.com Nikita Jain
              Votes: 0
              Watchers: 2
