- Feature
- Resolution: Unresolved
- Normal
- MCE 2.3.0
OCP/Telco Definition of Done
<https://docs.google.com/document/d/1TP2Av7zHXz4_fmeX4q9HB0m9cqSZ4F6Jd4AiVoaF_2s/edit#heading=h.gaa58bzbvwde>
Epic Template descriptions and documentation.
<https://docs.google.com/document/d/14CUCEg6hQ_jpsFzJtWo29GfFVWmun2Uivrxq3_Fkgdg/edit>
ACM-wide Product Requirements (Top-level Epics)
<https://docs.google.com/document/d/1uIp6nS2QZ766UFuZBaC9USs8dW_I5wVdtYF9sUObYKg/edit>
Epic Goal
When we reviewed changes to the acm_managed_cluster_info metric with the monitoring team, they suggested that we create a recording rule for this metric. With this rule in place, we don't have to review each change to the metric with them.
In our current iteration of the instructions for sending data to telemetry, we have an (as yet) optional step to create a recording rule before adding a metric to the telemetry pipeline. Maybe now is the right time to add one for acm_managed_cluster_info as well?
If we had this in place, you could easily add labels to your metric without worrying about blowing up telemetry, and we'd have explicit control over what gets sent.
Guidance from the monitoring team about how to create the recording rule.
There is probably an operator that deploys the related components (like https://github.com/stolostron/clusterlifecycle-state-metrics). This operator should also create a PrometheusRule object with the respective recording rule.
For an example of such a rule see https://github.com/openshift/cluster-monitoring-operator/pull/1925#discussion_r1150597974
As for the expression the recording rule should implement, here is a recent example of what we are after: https://github.com/openshift/cluster-monitoring-operator/blob/2dd8796cca1459b97cb8799df82df163d384c5d7/jsonnet/rules.libsonnet#L613
This rule ensures that even if a new label value is added to the vsphere_csi_migration metric, we only forward the approved label set that is listed in the recording rule. This way, new label values aren't automatically sent to telemetry.
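To make the allowlist idea concrete, here is a minimal Python sketch that mimics what an aggregation such as `max by (<approved labels>) (metric)` does to a set of samples. It is an illustration only: the label names in APPROVED_LABELS and the sample data are hypothetical, and whether the ACM rule should restrict label names, label values, or both is not settled in this epic.

```python
# Hypothetical allowlist; the real approved label set is not given in this epic.
APPROVED_LABELS = ("hub_cluster_id", "managed_cluster_id", "vendor")


def aggregate_by(samples, labels):
    """Mimic PromQL `max by (<labels>) (metric)`.

    Keeps only the listed labels and takes the max value per resulting group.
    """
    groups = {}
    for sample_labels, value in samples:
        key = tuple((name, sample_labels.get(name, "")) for name in labels)
        groups[key] = max(groups.get(key, value), value)
    # Labels with empty values are dropped from the output series.
    return [({k: v for k, v in key if v}, value) for key, value in groups.items()]


# Raw samples carry an extra, unapproved label ("new_label", made up here).
raw = [
    ({"hub_cluster_id": "hub-1", "managed_cluster_id": "mc-1",
      "vendor": "OpenShift", "new_label": "x"}, 1.0),
    ({"hub_cluster_id": "hub-1", "managed_cluster_id": "mc-2",
      "vendor": "OpenShift", "new_label": "y"}, 1.0),
]

recorded = aggregate_by(raw, APPROVED_LABELS)
for labels, value in recorded:
    print(labels, value)
```

The unapproved label never appears in the recorded series, so adding it to the source metric does not change what would be forwarded.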
For the ACM rule, maybe a name like cluster:acm_managed_cluster_info makes sense?
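As a rough sketch of what the operator could create, the following Python snippet builds a PrometheusRule manifest with a recording rule named cluster:acm_managed_cluster_info and prints it as JSON (which is also valid YAML). The object name, namespace, group name, label list, and the choice of `max by` are all assumptions for illustration; the actual expression and approved labels would need to be agreed with the monitoring team.

```python
import json

# Hypothetical allowlist; replace with the label set approved for telemetry.
APPROVED_LABELS = ["hub_cluster_id", "managed_cluster_id", "vendor"]


def build_prometheus_rule(name, namespace, labels):
    """Return a PrometheusRule manifest (as a dict) with one recording rule."""
    # Only the listed labels survive the aggregation (assumed expression).
    expr = f"max by ({', '.join(labels)}) (acm_managed_cluster_info)"
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "groups": [
                {
                    "name": "acm-managed-cluster-info.rules",  # assumed group name
                    "rules": [
                        {"record": "cluster:acm_managed_cluster_info", "expr": expr}
                    ],
                }
            ]
        },
    }


# Name and namespace are placeholders, not taken from the epic.
rule = build_prometheus_rule(
    "acm-managed-cluster-info", "open-cluster-management", APPROVED_LABELS
)
print(json.dumps(rule, indent=2))
```

Keeping the label list in one place means that a change to what is sent to telemetry is an explicit edit to the rule rather than a side effect of changing the metric.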
The original jira: https://issues.redhat.com/browse/MON-3115
Why is this important?
...
Scenarios
...
Acceptance Criteria
...
Dependencies (internal and external)
- ...
Previous Work (Optional):
- ...
Open questions:
- …
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - Downstream build attached to advisory: <link to errata>
- QE - Test plans in Polarion: <link or reference to Polarion>
- QE - Automated tests merged: <link or reference to automated tests>
- DOC - Downstream documentation merged: <link to meaningful PR>
- is triggered by: MON-3115 Request approval to add a new label to acm_managed_cluster_info metric. (Closed)