
MON-2693: Validate potential implementation of scrape profiles


    • Type: Story
    • Resolution: Done
    • Priority: Critical
    • Sprints: Sprint 224, MON Sprint 225, MON Sprint 226

      The user interface would give users the option to specify which profile CMO should scrape (see the sketch below). The set of possible profiles will be pre-defined by us.
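
      As a rough sketch of what this option could look like (the field name is not decided; "scrapeProfile" below is purely hypothetical), the cluster admin would set it in the cluster-monitoring-config ConfigMap that already configures CMO:

      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: cluster-monitoring-config
        namespace: openshift-monitoring
      data:
        config.yaml: |
          prometheusK8s:
            # Hypothetical field name; the actual CMO option is still TBD.
            scrapeProfile: operational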

      If this new option is used, CMO populates the [pod|service]MonitorSelector to select resources that carry the requested profile, probably as a label with the respective value (label name TBD; let's call it the profile label for now), as well as monitors that do not have the label set at all. Monitors are therefore picked from two sets: monitors carrying the profile label with the requested value, and all monitors without the profile label present (in addition to the current namespace selector).

      After this, it is up to the ServiceMonitors to implement the scrape profiles. Without any change to the ServiceMonitors, things should work as they did before, even after setting a profile in the CMO config. When ServiceMonitor owners want to implement scrape profiles, they need to provide one ServiceMonitor per profile and no unlabeled ServiceMonitor: a ServiceMonitor labeled with a profile other than the selected one will not be scraped at all.

      Let's say that we support 3 scrape profiles:

      • "full" (same as today)
      • "operational" (only collect metrics for recording rules and dashboards)
      • "uponly" (collect the up metric only and none of the exposed metrics)

      When the cluster admin enables the "operational" profile, the k8s Prometheus resource would be:

      apiVersion: monitoring.coreos.com/v1
      kind: Prometheus
      metadata:
        name: k8s
        namespace: openshift-monitoring
      spec:
        serviceMonitorSelector:
          matchExpressions:
          - key: monitoring.openshift.io/scrape-profile
            operator: NotIn
            values:
            - "full"
            - "uponly"
      

      A hypothetical component that wants to support scrape profiles would need to provision 3 ServiceMonitors for each service (1 ServiceMonitor per profile).

      ---
      apiVersion: monitoring.coreos.com/v1
      kind: ServiceMonitor
      metadata:
        labels:
          monitoring.openshift.io/scrape-profile: full
        name: foo-full
        namespace: openshift-bar
      spec:
        endpoints:
        - port: metrics
        selector:
          matchLabels:
            app.kubernetes.io/name: foo
      ---
      apiVersion: monitoring.coreos.com/v1
      kind: ServiceMonitor
      metadata:
        labels:
          monitoring.openshift.io/scrape-profile: operational
        name: foo-operational
        namespace: openshift-bar
      spec:
        endpoints:
        - port: metrics
          # Keep only the series needed for recording rules and dashboards.
          metricRelabelings:
          - sourceLabels: [__name__]
            action: keep
            regex: "requests_total|requests_failed_total"
        selector:
          matchLabels:
            app.kubernetes.io/name: foo
      ---
      apiVersion: monitoring.coreos.com/v1
      kind: ServiceMonitor
      metadata:
        labels:
          monitoring.openshift.io/scrape-profile: uponly
        name: foo-uponly
        namespace: openshift-bar
      spec:
        endpoints:
        - port: metrics
          # Drop every scraped series; only synthetic series like "up" remain.
          metricRelabelings:
          - sourceLabels: [__name__]
            action: drop
            regex: ".+"
        selector:
          matchLabels:
            app.kubernetes.io/name: foo
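
      One detail worth noting for the "uponly" profile: the up series is generated by Prometheus itself after the scrape and is not subject to metricRelabelings, so dropping every scraped series with regex ".+" still leaves up intact.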
      

       

      A component that doesn't need or want to adopt scrape profiles should be scraped as before, irrespective of the configured scrape profile, as in the sketch below.
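
      For illustration (the "baz" names are made up), such a component keeps a single ServiceMonitor without the profile label, and the NotIn selector shown above still matches it:

      apiVersion: monitoring.coreos.com/v1
      kind: ServiceMonitor
      metadata:
        name: baz
        namespace: openshift-bar
      spec:
        endpoints:
        - port: metrics
        selector:
          matchLabels:
            app.kubernetes.io/name: baz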

       

      Action items:

      • Demonstrate that the proposed implementation actually works.

              jmarcal@redhat.com Joao Marcal
              spasquie@redhat.com Simon Pasquier