Managed Service - API / MGDAPI-4919

Rely on OpenShift user-defined workload monitoring

    • Type: Enhancement
    • Resolution: Unresolved
    • Priority: Undefined

      Specifically for Prometheus and metric scraping: instead of deploying a custom, dedicated Prometheus instance using the observability operator, leverage the OCP feature for monitoring user-defined projects.

      OCP 4.11 https://docs.openshift.com/container-platform/4.11/monitoring/enabling-monitoring-for-user-defined-projects.html
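
As a minimal sketch of what enabling the feature involves: according to the OCP documentation linked above, user workload monitoring is switched on by setting `enableUserWorkload: true` in the `cluster-monitoring-config` ConfigMap in the `openshift-monitoring` namespace. The helper name `cluster_monitoring_config` is an assumption made for illustration; it only builds the manifest and does not talk to a cluster.

```python
import json


def cluster_monitoring_config() -> dict:
    """Build the ConfigMap manifest that enables user workload monitoring."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": "cluster-monitoring-config",
            "namespace": "openshift-monitoring",
        },
        # The cluster monitoring operator reads its settings from this YAML string.
        "data": {"config.yaml": "enableUserWorkload: true\n"},
    }


def render(manifest: dict) -> str:
    """Serialize a manifest as JSON, which `oc apply -f -` accepts like YAML."""
    return json.dumps(manifest, indent=2)
```

The output of `render(cluster_monitoring_config())` could be piped to `oc apply -f -`. If the cluster already has a `cluster-monitoring-config` ConfigMap, the setting should be merged into the existing `config.yaml` rather than applied as-is, since this sketch would replace its contents.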

       

      The Thanos Querier should allow Grafana dashboards and Prometheus rules (alerts) to work out of the box with both user workload metrics and platform metrics (CPU, memory usage, and so on).
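
To illustrate the single query endpoint idea, here is a hedged sketch that prepares a Prometheus-compatible instant query (`/api/v1/query`) against a Thanos Querier URL. The host name, the token, and the namespace in the usage note are placeholders, and the sketch assumes the token is allowed to query the endpoint. The request is only built, not sent.

```python
import urllib.parse
import urllib.request


def build_instant_query(base_url: str, token: str, promql: str) -> urllib.request.Request:
    """Prepare (but do not send) an instant query with bearer-token auth."""
    query_string = urllib.parse.urlencode({"query": promql})
    url = f"{base_url.rstrip('/')}/api/v1/query?{query_string}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
```

For example, `build_instant_query("https://thanos-querier.example.com", token, "container_cpu_usage_seconds_total")` would target a platform metric, while `build_instant_query(..., 'up{namespace="example-3scale"}')` would target a user workload metric through the same endpoint. Sending the request is left to `urllib.request.urlopen` or whichever HTTP client is already in use.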

       

      Relying on OpenShift user-defined workload monitoring brings three direct benefits:

      • No federation is needed between the dedicated Prometheus and the OCP Prometheus (for platform metrics).
      • No need to copy monitoring resources from the 3scale or Keycloak namespaces to the observability namespace.
      • No need to manage a dedicated Prometheus instance.
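
The second benefit can be sketched as follows: with user workload monitoring, a `PrometheusRule` (API group `monitoring.coreos.com/v1`) can live in the product's own namespace instead of being copied elsewhere. The namespace, rule name, alert name, and expression in the usage note are hypothetical and are not taken from RHOAM.

```python
def prometheus_rule(namespace: str, name: str, alert: str, expr: str) -> dict:
    """Build a PrometheusRule manifest that stays in the workload's namespace."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "groups": [
                {
                    "name": f"{name}.rules",
                    "rules": [
                        {
                            "alert": alert,
                            "expr": expr,
                            # Illustrative defaults, not values from the ticket.
                            "for": "5m",
                            "labels": {"severity": "warning"},
                        }
                    ],
                }
            ]
        },
    }
```

A call such as `prometheus_rule("example-3scale", "example-alerts", "ExampleTargetDown", 'up{namespace="example-3scale"} == 0')` yields a manifest that is created directly in `example-3scale`, with no copy into an observability namespace.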

      A Grafana instance is still needed for dashboards.

       

      Outstanding Questions

      1. Does our Alertmanager still work, i.e. does it still send alerts to SRE? Can we also leverage the user workload Alertmanager?
      2. Are firing RHOAM alerts visible within the cluster / OCM when using the user workload Prometheus? We need to check whether this is OK with the BU.
      3. Is this bug still an issue? 

       

      Initial scope of this Jira

      First determine whether point 3 above is still an issue. If yes, there is no point in continuing. If no, get an answer from the BU.

       

              Assignee: Unassigned
              Reporter: Eguzki Astiz Lezaun (eguzki)