Red Hat Advanced Cluster Management / ACM-4975

Create a recording rule for acm_managed_cluster_info metric


    • Type: Feature
    • Resolution: Unresolved
    • Priority: Normal
    • Future
    • MCE 2.3.0
    • Server Foundation

      OCP/Telco Definition of Done
      <https://docs.google.com/document/d/1TP2Av7zHXz4_fmeX4q9HB0m9cqSZ4F6Jd4AiVoaF_2s/edit#heading=h.gaa58bzbvwde>
      Epic Template descriptions and documentation.
      <https://docs.google.com/document/d/14CUCEg6hQ_jpsFzJtWo29GfFVWmun2Uivrxq3_Fkgdg/edit>
      ACM-wide Product Requirements (Top-level Epics)
      <https://docs.google.com/document/d/1uIp6nS2QZ766UFuZBaC9USs8dW_I5wVdtYF9sUObYKg/edit>


      Epic Goal

      While we were reviewing changes to the acm_managed_cluster_info metric with the monitoring team, they suggested that we create a recording rule for this metric. With this rule in place, we no longer have to review each change to the metric with them.

      Our current instructions for sending data to telemetry include an (as yet optional) step to create a recording rule before adding a metric to the telemetry pipeline. Maybe this is the right time to add one for acm_managed_cluster_info as well?

      If we had this in place, you could easily add labels to your metric without worrying about blowing up telemetry, and we'd have explicit control over what gets sent.

      Guidance from the monitoring team on how to create the recording rule:

      There is probably an operator that deploys the related components (like https://github.com/stolostron/clusterlifecycle-state-metrics). This operator should also create a PrometheusRule object with the respective recording rule.
      For an example of such a rule see https://github.com/openshift/cluster-monitoring-operator/pull/1925#discussion_r1150597974
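
      As a rough illustration of the PrometheusRule object described above, here is a minimal Python sketch that assembles such a manifest. The object name, the namespace, and the label names in the expression are hypothetical placeholders, not values taken from this issue; only the metric name and the proposed rule name come from the text. The choice of `max` as the aggregation is also an assumption.

```python
import json


def build_prometheus_rule(name, namespace, record, expr):
    """Return a PrometheusRule manifest (as a dict) with one recording rule."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "groups": [
                {
                    "name": f"{name}.rules",
                    "rules": [{"record": record, "expr": expr}],
                }
            ]
        },
    }


# Hypothetical values: the object name, namespace, and label names
# (label_a, label_b) are placeholders for illustration only.
manifest = build_prometheus_rule(
    name="acm-managed-cluster-info",
    namespace="example-namespace",
    record="cluster:acm_managed_cluster_info",
    expr="max by (label_a, label_b) (acm_managed_cluster_info)",
)

# Printed as JSON, which Kubernetes tooling accepts alongside YAML.
print(json.dumps(manifest, indent=2))
```

      In practice the operator would create this object through its own client code; the dict above only shows the shape of the resource.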

       
      As for the expression the recording rule should implement, here is a recent example of what we are after: https://github.com/openshift/cluster-monitoring-operator/blob/2dd8796cca1459b97cb8799df82df163d384c5d7/jsonnet/rules.libsonnet#L613

      This rule ensures that even if a new label value is added to the vsphere_csi_migration metric, we only forward the approved label set that is listed in the recording rule. This way, new label values aren't automatically sent to telemetry.
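
      To make the allow-list idea concrete, the following Python sketch builds an aggregation expression from a list of approved labels and mimics what the aggregation does to the raw series. The label names (label_a, label_b, new_label) and the use of `max` are assumptions for illustration; the actual approved label set for acm_managed_cluster_info is not specified in this issue.

```python
from collections import defaultdict


def recording_rule_expr(metric, approved_labels, agg="max"):
    """Build a PromQL expression that keeps only the approved labels."""
    return f"{agg} by ({', '.join(approved_labels)}) ({metric})"


def aggregate_by(series, approved_labels):
    """Mimic `max by (...)`: group samples on the approved labels only."""
    grouped = defaultdict(list)
    for labels, value in series:
        # A label missing from a series is treated as the empty string.
        key = tuple((name, labels.get(name, "")) for name in approved_labels)
        grouped[key].append(value)
    return {key: max(values) for key, values in grouped.items()}


# Hypothetical label names, used only for illustration.
APPROVED = ["label_a", "label_b"]

# Two raw series; the second carries a label that was added later.
raw = [
    ({"label_a": "x", "label_b": "y"}, 1),
    ({"label_a": "x", "label_b": "y", "new_label": "z"}, 1),
]

expr = recording_rule_expr("acm_managed_cluster_info", APPROVED)
recorded = aggregate_by(raw, APPROVED)
print(expr)
print(recorded)
```

      Both raw series collapse into a single recorded series that carries only the approved labels, so the later-added label never reaches the recorded metric. This sketch covers label names only; restricting label values would additionally need a matcher in the metric selector.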

      For the ACM rule, maybe a name like cluster:acm_managed_cluster_info makes sense?
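
      For reference, Prometheus's naming guidance for recording rules uses the form level:metric:operations, and metric names may contain colons. The short Python sketch below checks the proposed name against the Prometheus metric-name pattern and splits it into its colon-separated parts. The helper name split_rule_name is hypothetical.

```python
import re

# Prometheus metric names must match this pattern; colons are meant
# for recording rules.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def split_rule_name(name):
    """Validate a recording rule name and return its colon-separated parts."""
    if not METRIC_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid metric name: {name!r}")
    return name.split(":")


parts = split_rule_name("cluster:acm_managed_cluster_info")
print(parts)
```

      The proposed name yields a level part (cluster) and a metric part (acm_managed_cluster_info); whether to append an operations suffix is left open here, as it is in the suggestion above.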

      The original Jira issue: https://issues.redhat.com/browse/MON-3115

      Why is this important?

      ...

      Scenarios

      ...

      Acceptance Criteria

      ...

      Dependencies (internal and external)

      1. ...

      Previous Work (Optional):

      1. ...

      Open questions:

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub
        Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub
        Issue>
      • DEV - Downstream build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - Downstream documentation merged: <link to meaningful PR>

              leyan@redhat.com Le Yang