  1. Red Hat Advanced Cluster Management
  2. ACM-22002

Doc how to enable Rightsizing for new MCOA metrics addon


    • Type: Story
    • Resolution: Unresolved
    • Priority: Major

      The new MCOA addon in ACM 2.15 will bring metrics to GA, so we need to
      create as much content for it as possible.

      This doc (it could also be a blog post, etc.) is about how to enable rightsizing.

      1. For custom metrics, create a new ScrapeConfig:

      apiVersion: monitoring.coreos.com/v1alpha1
      kind: ScrapeConfig
      metadata:
        name: acm-virtualization-metrics # A unique name for your ScrapeConfig
        namespace: open-cluster-management-observability # As specified in the documentation
        labels:
          app.kubernetes.io/component: platform-metrics-collector # For platform metrics
          app: metrics # Common label from your existing config
          app.kubernetes.io/managed-by: multicluster-observability-operator # From existing config
          app.kubernetes.io/part-of: multicluster-observability-addon # From existing config
          app.kubernetes.io/version: 1.0.0 # From existing config
          chart: metrics-1.0.0 # From existing config
          release: multicluster-observability-addon # From existing config
      spec:
        jobName: acm-virtualization # A descriptive job name
        metricsPath: /federate # As per the existing configuration for federated metrics
        params:
          match[]:
            # ACM rightsizing (acm_rs) metrics
            - '{__name__="acm_rs:namespace:cpu_request_hard"}'
            - '{__name__="acm_rs:namespace:cpu_request"}'
            - '{__name__="acm_rs:namespace:cpu_usage"}'
            - '{__name__="acm_rs:namespace:cpu_recommendation"}'
            - '{__name__="acm_rs:namespace:memory_request_hard"}'
            - '{__name__="acm_rs:namespace:memory_request"}'
            - '{__name__="acm_rs:namespace:memory_usage"}'
            - '{__name__="acm_rs:namespace:memory_recommendation"}'
            - '{__name__="acm_rs:cluster:cpu_request_hard"}'
            - '{__name__="acm_rs:cluster:cpu_request"}'
            - '{__name__="acm_rs:cluster:cpu_usage"}'
            - '{__name__="acm_rs:cluster:cpu_recommendation"}'
            - '{__name__="acm_rs:cluster:memory_request_hard"}'
            - '{__name__="acm_rs:cluster:memory_request"}'
            - '{__name__="acm_rs:cluster:memory_usage"}'
            - '{__name__="acm_rs:cluster:memory_recommendation"}'
            # ACM rightsizing metrics for virtual machines (acm_rs_vm)
            - '{__name__="acm_rs_vm:namespace:cpu_request"}'
            - '{__name__="acm_rs_vm:namespace:cpu_usage"}'
            - '{__name__="acm_rs_vm:namespace:memory_request"}'
            - '{__name__="acm_rs_vm:namespace:memory_usage"}'
            - '{__name__="acm_rs_vm:namespace:cpu_recommendation"}'
            - '{__name__="acm_rs_vm:namespace:memory_recommendation"}'
            - '{__name__="acm_rs_vm:cluster:cpu_request"}'
            - '{__name__="acm_rs_vm:cluster:cpu_usage"}'
            - '{__name__="acm_rs_vm:cluster:memory_request"}'
            - '{__name__="acm_rs_vm:cluster:memory_usage"}'
            - '{__name__="acm_rs_vm:cluster:cpu_recommendation"}'
            - '{__name__="acm_rs_vm:cluster:memory_recommendation"}'
        scheme: HTTPS # From existing configuration
        scrapeClass: ocp-monitoring # From existing configuration
        staticConfigs:
          - targets:
              - prometheus-k8s.openshift-monitoring.svc:9091 # From existing configuration
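
The match[] list above follows a regular pattern: prefix, then scope (namespace or cluster), then metric name. As a minimal sketch for regenerating or cross-checking the list, the following shell function prints the selectors for one prefix. The function name print_selectors is hypothetical and only used for illustration; the metric names are copied from the ScrapeConfig above.

```shell
# Hypothetical helper: print the match[] entries for one metric prefix.
# Usage: print_selectors <prefix> <metric>...
print_selectors() {
  prefix=$1
  shift
  for scope in namespace cluster; do
    for metric in "$@"; do
      # Emits one YAML list item per selector, quoted as in the ScrapeConfig.
      printf '%s\n' "- '{__name__=\"${prefix}:${scope}:${metric}\"}'"
    done
  done
}

# Same metric names and order as in the ScrapeConfig above.
print_selectors acm_rs \
  cpu_request_hard cpu_request cpu_usage cpu_recommendation \
  memory_request_hard memory_request memory_usage memory_recommendation
print_selectors acm_rs_vm \
  cpu_request cpu_usage memory_request memory_usage \
  cpu_recommendation memory_recommendation
```

The output can be compared against the match[] block, which helps catch a missing or misspelled selector before the ScrapeConfig is applied.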
      

      2. Reference the ScrapeConfig in the ClusterManagementAddOn:

      # Add this config entry to the 'multicluster-observability-addon'
      # ClusterManagementAddOn on your HUB CLUSTER.
      # ClusterManagementAddOn is cluster-scoped, so metadata.namespace is omitted.
      # Custom resources do not support strategic merge patches, so no "$patch"
      # directive is used: add the entry to the existing configs list (for example
      # with `oc edit`) and keep the entries that are already there.
      apiVersion: addon.open-cluster-management.io/v1alpha1
      kind: ClusterManagementAddOn
      metadata:
        name: multicluster-observability-addon
      spec:
        installStrategy:
          placements:
            - name: global # Name of the existing placement
              namespace: open-cluster-management-global-set # Namespace of the existing placement
              configs: # Existing entries in this list must be kept
                - group: monitoring.coreos.com
                  resource: scrapeconfigs
                  name: acm-virtualization-metrics # Must match your ScrapeConfig's name
                  namespace: open-cluster-management-observability # Must match your ScrapeConfig's namespace
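
One way to add the reference without touching the other entries is a JSON Patch, which custom resources accept. The sketch below is an illustration, not a tested procedure: it assumes the "global" placement is the first entry in spec.installStrategy.placements (index 0) and that it already has a configs list. The variable PATCH and the function append_scrapeconfig_ref are hypothetical names.

```shell
# JSON Patch (RFC 6902): an "add" whose path ends in "/-" appends to an existing list.
# Assumption: placement index 0 is the "global" placement; check your resource first.
PATCH='[{"op":"add","path":"/spec/installStrategy/placements/0/configs/-","value":{"group":"monitoring.coreos.com","resource":"scrapeconfigs","name":"acm-virtualization-metrics","namespace":"open-cluster-management-observability"}}]'

# Run on the hub cluster with a logged-in oc session.
append_scrapeconfig_ref() {
  oc patch clustermanagementaddon multicluster-observability-addon \
    --type=json -p "$PATCH"
}
```

Call append_scrapeconfig_ref once on the hub. If the placement index differs in your environment, adjust the path before running it.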
      

      3. Verification

      The ScrapeConfig is now found and listed as a desiredConfig:

      desiredConfig:
        name: acm-virtualization-metrics
        namespace: open-cluster-management-observability
        specHash: 71c5b4244e7c13e1cd715901c811baf74359e8dc1060032d049fca17e727ef7f

      It is also deployed in the open-cluster-management-agent-addon namespace.
      From a terminal in the prom-agent-platform-metrics-collector-0 pod in that
      namespace, the generated Prometheus configuration shows that it is picked up:
      $ less /etc/prometheus/config_out/prometheus.env.yaml
      
      global:
        scrape_interval: 120s
        scrape_timeout: 30s
        external_labels:
          prometheus: open-cluster-management-agent-addon/platform-metrics-collector
          prometheus_replica: prom-agent-platform-metrics-collector-0
      scrape_configs:
      - job_name: scrapeConfig/open-cluster-management-agent-addon/acm-virtualization-metrics
        metrics_path: /federate
        params:
          match[]:
          - '{__name__="acm_rs:namespace:cpu_request_hard"}'
          - '{__name__="acm_rs:namespace:cpu_request"}'
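
The verification above can also be run from a workstation. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the ManagedClusterAddOn is assumed to live in the namespace named after the managed cluster, and the pod and file path are the ones shown above.

```shell
# Hub cluster: print the addon status for one managed cluster and look for
# the desiredConfig entry. $1 is the managed cluster's namespace (assumption).
show_addon_status() {
  oc get managedclusteraddon multicluster-observability-addon -n "$1" -o yaml
}

# Managed cluster: print the Prometheus configuration generated for the collector.
show_collector_config() {
  oc exec -n open-cluster-management-agent-addon \
    prom-agent-platform-metrics-collector-0 -- \
    cat /etc/prometheus/config_out/prometheus.env.yaml
}

# Count the acm_rs / acm_rs_vm selectors in a configuration read from stdin.
# grep -c exits non-zero on zero matches, so "|| true" keeps the pipeline usable.
count_rs_selectors() {
  grep -c '__name__="acm_rs' || true
}
```

For example, `show_collector_config | count_rs_selectors` prints how many rightsizing selectors were rendered; compare that number with the match[] entries in your ScrapeConfig.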
      

      1. - [ ] Mandatory: Add the required version to the Fix version/s field.

      2. - [ ] Mandatory: Choose the type of documentation change or review.

      • [ ] We need to update an existing topic
      • [ ] We need to add a new document to an existing section
      • [ ] We need a whole new section; this is a function not
        documented before and doesn't belong in any current section
      • [ ] We need an Operator Advisory review and approval
      • [ ] We need a z-Stream (Errata) Advisory and Release note for
        MCE and/or ACM

      3. - [ ] Mandatory: Find the link to where the documentation update
      should go and add it to the recommended changes. You can either use the
      published doc or the staged repo for this step:

      Note: As the feature and doc are better understood, this recommendation may
      change. If this is new documentation, link to the section where you think
      it should be placed.

      Customer Portal published version

      https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.12

      Doc staged repo within the ACM Workspace:
      https://github.com/stolostron/rhacm-docs

      4. - [ ] Mandatory for GA content:

      • [ ] Add steps, the diff, known issue, and/or other important
        conceptual information in the following space:
      • [ ] Add Required access level (example, Cluster
        Administrator) for the user to complete the task:
      • [ ] Add verification at the end of the task, how does the user
        verify success (a command to run or a result to see?)
      • [ ] Add link to dev story here:

      5. - [ ] Mandatory for bugs: What is the diff? Clearly define what the
      problem is, what the change is, and link to the current documentation. Only
      use this for a documentation bug.

              Assignee: Unassigned
              Reporter: Christian Stark (rhn-support-cstark)