  1. OpenShift Monitoring
  2. MON-1988

Enable audit and query logging for all Prometheus read paths

    • audit and query logging
    • False
    • False
    • NEW
    • Done
    • NEW
    • 0% To Do, 0% In Progress, 100% Done

      OCP/Telco Definition of Done
      Epic Template descriptions and documentation.

      Epic Goal

      • As the CFE team, we would like to enable query logging for all Prometheus read paths.
      • As part of this, we would like to enable audit and query logging for Prometheus Adapter (aggregated server audit log), Prometheus (query log), and ThanosQuerier (query log).

      Why is this important?

      • This would help all parties (customers, app-sres, CCX, the monitoring team, and others) debug an overloaded Prometheus instance.

      Scenarios

      1. When a customer sees high CPU consumption in any Prometheus instance, they can enable audit logging in Prometheus Adapter to see which component is calling the metrics API.
      2. When a customer sees high CPU consumption in any Prometheus instance, they can enable query logging in all Prometheus instances (PM & UWM) and ThanosQuerier to see which queries are executed most frequently.
      3. https://bugzilla.redhat.com/show_bug.cgi?id=1982302
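
      Once such logs exist, the triage described in scenarios 1 and 2 could be sketched roughly as follows. This is an illustrative sketch, not part of the epic: it assumes the query log is JSON lines carrying the expression under params.query (the layout Prometheus uses for its query log) and that the audit log is JSON lines in the Kubernetes audit event layout (user.username, userAgent). ThanosQuerier request logs may use a different layout. The function names are hypothetical.

```python
import json
from collections import Counter


def _json_entries(lines):
    """Yield each line that parses as a JSON object; skip everything else."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # partially written or non-JSON line
        if isinstance(entry, dict):
            yield entry


def top_queries(log_lines, n=5):
    """Return the n most frequent expressions in a query log.

    Assumption: each entry stores the expression under params.query.
    """
    counts = Counter()
    for entry in _json_entries(log_lines):
        params = entry.get("params")
        query = params.get("query") if isinstance(params, dict) else None
        if query:
            counts[query] += 1
    return counts.most_common(n)


def top_callers(audit_lines, n=5):
    """Return the n most frequent (username, userAgent) pairs in an audit log.

    Assumption: entries follow the Kubernetes audit event layout.
    """
    counts = Counter()
    for entry in _json_entries(audit_lines):
        user = entry.get("user")
        username = user.get("username", "unknown") if isinstance(user, dict) else "unknown"
        agent = entry.get("userAgent", "unknown")
        counts[(username, agent)] += 1
    return counts.most_common(n)
```

      Both functions accept any iterable of lines, so an open file handle for a collected log can be passed in directly.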

      Acceptance Criteria

      • CI - MUST be running successfully with tests automated
      • Release Technical Enablement - Provide necessary release enablement details and documents.
      • Prometheus Adapter audit logs must be enabled by default
      • Prometheus Adapter audit logs must be preserved after each CI run

      Open questions

      1. Should we enable ThanosRuler query logs?

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - Downstream build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - Downstream documentation merged: <link to meaningful PR>

              arajkumar Arunprasad Rajkumar (Inactive)
              arajkumar Arunprasad Rajkumar (Inactive)
              Junqi Zhao Junqi Zhao
              Votes: 0
              Watchers: 3
