  1. Red Hat Advanced Cluster Management
  2. ACM-2879

Enhance foundation metrics to facilitate the creation of SD SLI/SLO


      Provide the required acceptance criteria using this template.
      * Expose acm_managed_cluster_addon_status_condition metrics for add-ons per managed cluster;
      * Report continuous data of acm_managed_cluster_status_condition / acm_managed_cluster_addon_status_condition / acm_manifestwork_status_condition metrics;

      Value Statement

      In order to simplify the creation of SLI/SLO for SD, the following enhancements on foundation metrics are required:

      1. Currently, the acm_managed_cluster_addon_status_condition metrics are generated only for the work-manager add-on. They are also needed for other add-ons, such as the hypershift and policy add-ons.

      2. Report continuous data for the acm_managed_cluster_status_condition / acm_managed_cluster_addon_status_condition / acm_manifestwork_status_condition metrics, covering every possible status value of a given condition. For example, only one metric item is currently reported for a ManagedCluster whose ManagedClusterConditionAvailable condition has status True:
      acm_managed_cluster_status_condition{managed_cluster_name="local-cluster",condition="ManagedClusterConditionAvailable",status="true"} 1

      Usually, three data items are desired, two of them with value 0, which makes it easy to create an SLO based on this condition:
      acm_managed_cluster_status_condition{managed_cluster_name="local-cluster",condition="ManagedClusterConditionAvailable",status="true"} 1
      acm_managed_cluster_status_condition{managed_cluster_name="local-cluster",condition="ManagedClusterConditionAvailable",status="false"} 0
      acm_managed_cluster_status_condition{managed_cluster_name="local-cluster",condition="ManagedClusterConditionAvailable",status="unknown"} 0
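
One way to see why the zero-valued items help, as a minimal sketch rather than the actual SLO tooling: an availability SLI is often computed as the mean of the status="true" series over a time window, and such an average skips missing samples. The function name and the sample windows below are hypothetical, and the sketch assumes that, without this enhancement, the series simply disappears while its status does not apply.

```python
from typing import Optional, Sequence


def availability_ratio(samples: Sequence[Optional[int]]) -> Optional[float]:
    """Mean of the status="true" series over a window.

    A value of None stands for a scrape where the series was absent.
    Absent samples are skipped, which is the assumption this sketch makes
    about how the SLI is averaged.
    """
    present = [s for s in samples if s is not None]
    if not present:
        return None  # no data at all in the window
    return sum(present) / len(present)


# Hypothetical five-scrape window in which the cluster was unavailable twice.
# Sparse reporting: the series exists only while the status is true.
sparse = [1, 1, None, None, 1]
# Continuous reporting: the same series reports 0 while the status is not true.
continuous = [1, 1, 0, 0, 1]

print(availability_ratio(sparse))      # 1.0, the outage is invisible
print(availability_ratio(continuous))  # 0.6, the outage is counted
```

With sparse reporting, the outage can only be detected by checking for an absent series, whereas continuous reporting lets a plain average express the SLI.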

      Some conditions (listed below) support only two status values, true and false, so no metric is generated for status unknown on those conditions.

      • managed cluster
        • HubAcceptedManagedCluster
        • HubDeniedManagedCluster
        • ManagedClusterJoined
        • ManagedClusterImportSucceeded
        • ExternalManagedKubeconfigCreatedSucceeded
      • add-on
        • RegistrationApplied
        • ManifestApplied
        • ClusterCertificateRotated
        • UnsupportedConfiguration
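
The reporting rule described above can be sketched as follows. This is an illustration under stated assumptions, not the foundation implementation: the function name `condition_series` and the constant `TWO_VALUED_CONDITIONS` are hypothetical, identifying labels are passed in by the caller because only `managed_cluster_name`, `condition`, and `status` appear in the examples, and the output is formatted to match the sample lines in this issue.

```python
# Conditions listed in this issue as supporting only "true" and "false".
TWO_VALUED_CONDITIONS = frozenset({
    # managed cluster
    "HubAcceptedManagedCluster",
    "HubDeniedManagedCluster",
    "ManagedClusterJoined",
    "ManagedClusterImportSucceeded",
    "ExternalManagedKubeconfigCreatedSucceeded",
    # add-on
    "RegistrationApplied",
    "ManifestApplied",
    "ClusterCertificateRotated",
    "UnsupportedConfiguration",
})


def condition_series(metric: str, labels: dict, condition: str,
                     current_status: str) -> list:
    """Return one exposition line per possible status of a condition.

    The line whose status matches current_status gets value 1 and every
    other line gets value 0. Two-valued conditions get no "unknown" line.
    """
    if condition in TWO_VALUED_CONDITIONS:
        statuses = ("true", "false")
    else:
        statuses = ("true", "false", "unknown")

    lines = []
    for status in statuses:
        value = 1 if status == current_status.lower() else 0
        all_labels = {**labels, "condition": condition, "status": status}
        label_text = ",".join(f'{k}="{v}"' for k, v in all_labels.items())
        lines.append(f"{metric}{{{label_text}}} {value}")
    return lines


for line in condition_series(
    "acm_managed_cluster_status_condition",
    {"managed_cluster_name": "local-cluster"},
    "ManagedClusterConditionAvailable",
    "True",
):
    print(line)
```

How a two-valued condition should be reported if its status is ever observed as Unknown is not specified here; in this sketch both of its lines would carry value 0.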

      Definition of Done for Engineering Story Owner (Checklist)

      • ...

      Development Complete

      • The code is complete.
      • Functionality is working.
      • Any required downstream Docker file changes are made.

      Tests Automated

      • [ ] Unit/function tests have been automated and incorporated into the
        build.
      • [ ] 100% automated unit/function test coverage for new or changed APIs.

      Secure Design

      • [ ] Security has been assessed and incorporated into your threat model.

      Multidisciplinary Teams Readiness

      Support Readiness

      • [ ] The must-gather script has been updated.

              leyan@redhat.com Le Yang
              leyan@redhat.com Le Yang
              David Huynh David Huynh