Type: Story
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: None
Component/s: Documentation, Obs-analytics
Labels:
None

Blocked:
False
Blocked Reason:

None
Ready:
False
Acceptance Criteria:
Provide the required acceptance criteria using this template. ...
Intelligence Requested:
Market:

Regression:
None

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

PX Impact Score:

For the new ACM 2.15 MCOA add-on (metrics will be GA), we need to create as much content as possible.

This doc (it could also be a blog post, etc.) describes how to enable right-sizing.

1. For custom metrics, create a new ScrapeConfig:

apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: acm-virtualization-metrics # A unique name for your ScrapeConfig
  namespace: open-cluster-management-observability # As specified in the documentation
  labels:
    app.kubernetes.io/component: platform-metrics-collector # For platform metrics
    app: metrics # Common label from your existing config
    app.kubernetes.io/managed-by: multicluster-observability-operator # From existing config
    app.kubernetes.io/part-of: multicluster-observability-addon # From existing config
    app.kubernetes.io/version: 1.0.0 # From existing config
    chart: metrics-1.0.0 # From existing config
    release: multicluster-observability-addon # From existing config
spec:
  jobName: acm-virtualization # A descriptive job name
  metricsPath: /federate # As per the existing configuration for federated metrics
  params:
    match[]:
      # ACM Resource Claims (acm_rs) metrics
      - '{__name__="acm_rs:namespace:cpu_request_hard"}'
      - '{__name__="acm_rs:namespace:cpu_request"}'
      - '{__name__="acm_rs:namespace:cpu_usage"}'
      - '{__name__="acm_rs:namespace:cpu_recommendation"}'
      - '{__name__="acm_rs:namespace:memory_request_hard"}'
      - '{__name__="acm_rs:namespace:memory_request"}'
      - '{__name__="acm_rs:namespace:memory_usage"}'
      - '{__name__="acm_rs:namespace:memory_recommendation"}'
      - '{__name__="acm_rs:cluster:cpu_request_hard"}'
      - '{__name__="acm_rs:cluster:cpu_request"}'
      - '{__name__="acm_rs:cluster:cpu_usage"}'
      - '{__name__="acm_rs:cluster:cpu_recommendation"}'
      - '{__name__="acm_rs:cluster:memory_request_hard"}'
      - '{__name__="acm_rs:cluster:memory_request"}'
      - '{__name__="acm_rs:cluster:memory_usage"}'
      - '{__name__="acm_rs:cluster:memory_recommendation"}'
      # ACM Resource Claims VM (acm_rs_vm) metrics
      - '{__name__="acm_rs_vm:namespace:cpu_request"}'
      - '{__name__="acm_rs_vm:namespace:cpu_usage"}'
      - '{__name__="acm_rs_vm:namespace:memory_request"}'
      - '{__name__="acm_rs_vm:namespace:memory_usage"}'
      - '{__name__="acm_rs_vm:namespace:cpu_recommendation"}'
      - '{__name__="acm_rs_vm:namespace:memory_recommendation"}'
      - '{__name__="acm_rs_vm:cluster:cpu_request"}'
      - '{__name__="acm_rs_vm:cluster:cpu_usage"}'
      - '{__name__="acm_rs_vm:cluster:memory_request"}'
      - '{__name__="acm_rs_vm:cluster:memory_usage"}'
      - '{__name__="acm_rs_vm:cluster:cpu_recommendation"}'
      - '{__name__="acm_rs_vm:cluster:memory_recommendation"}'
  scheme: HTTPS # From existing configuration
  scrapeClass: ocp-monitoring # From existing configuration
  staticConfigs:
    - targets:
        - prometheus-k8s.openshift-monitoring.svc:9091 # From existing configuration
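
The 28 selectors under match[] follow a regular naming pattern (prefix, scope, resource, kind). As a rough sketch, the list can be regenerated and compared against the ScrapeConfig to catch a missing or misspelled entry. The function name build_match_selectors is hypothetical and only for illustration; the metric names themselves come from the YAML above.

```python
from itertools import product


def build_match_selectors():
    """Return the federate match[] selectors listed in the ScrapeConfig above."""
    scopes = ("namespace", "cluster")
    resources = ("cpu", "memory")
    # Per the list above, only the acm_rs family has *_request_hard series.
    kinds_by_prefix = {
        "acm_rs": ("request_hard", "request", "usage", "recommendation"),
        "acm_rs_vm": ("request", "usage", "recommendation"),
    }
    selectors = []
    for prefix, kinds in kinds_by_prefix.items():
        for scope, resource, kind in product(scopes, resources, kinds):
            selectors.append(f'{{__name__="{prefix}:{scope}:{resource}_{kind}"}}')
    return selectors


if __name__ == "__main__":
    # Print the entries in the same form as the match[] list items.
    for selector in build_match_selectors():
        print(f"      - '{selector}'")
```

The order of the generated entries differs slightly from the YAML, which should not matter because match[] is an unordered set of selectors.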

2. The ScrapeConfig must be referenced in the ClusterManagementAddOn:

---

# 2. Strategic Merge Patch for ClusterManagementAddOn
# Apply this YAML to your HUB CLUSTER.
# This will SAFELY add the new ScrapeConfig reference to your existing
# 'multicluster-observability-addon'.
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: multicluster-observability-addon
  namespace: open-cluster-management-observability # Ensure this matches your ClusterManagementAddOn's namespace
spec:
  installStrategy:
    placements:
      - name: global # Name of the existing placement
        namespace: open-cluster-management-global-set # <<< ADDED: Required namespace from your ClusterManagementAddOn
        configs: # Path to the list of configs
          - $patch: append # Directive to add to the list
            group: monitoring.coreos.com
            resource: scrapeconfigs
            name: acm-virtualization-metrics # Must match your ScrapeConfig's name
            namespace: open-cluster-management-observability # Must match your ScrapeConfig's namespace
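
Custom resources generally accept JSON Patch and JSON merge patch requests, so if the patch above is not accepted as written, the same config reference could be appended with a JSON Patch instead. The sketch below only builds the patch document. It assumes that the `global` placement is the first entry in spec.installStrategy.placements and that its configs list already exists; both are assumptions to check against your ClusterManagementAddOn. The helper name build_config_patch is hypothetical.

```python
import json


def build_config_patch(name, namespace, placement_index=0):
    """Build a JSON Patch (RFC 6902) that appends a ScrapeConfig reference."""
    return [
        {
            "op": "add",
            # A trailing "-" appends to the end of the existing configs array.
            "path": f"/spec/installStrategy/placements/{placement_index}/configs/-",
            "value": {
                "group": "monitoring.coreos.com",
                "resource": "scrapeconfigs",
                "name": name,
                "namespace": namespace,
            },
        }
    ]


if __name__ == "__main__":
    patch = build_config_patch(
        "acm-virtualization-metrics",
        "open-cluster-management-observability",
    )
    # Print compact JSON suitable for passing to a patch command.
    print(json.dumps(patch))
```

The printed JSON could then be passed on the hub cluster to `oc patch clustermanagementaddon multicluster-observability-addon --type=json -p '<output>'`.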

3. Verification

After the patch is applied, I can see that the ScrapeConfig is found and listed as a desired config:

desiredConfig:
  name: acm-virtualization-metrics
  namespace: open-cluster-management-observability
  specHash: 71c5b4244e7c13e1cd715901c811baf74359e8dc1060032d049fca17e727ef7f

The ScrapeConfig is also deployed in the open-cluster-management-agent-addon namespace.
From a terminal in the prom-agent-platform-metrics-collector-0 pod in that namespace, I can see that it is picked up in the rendered Prometheus configuration:

$ less /etc/prometheus/config_out/prometheus.env.yaml

global:
  scrape_interval: 120s
  scrape_timeout: 30s
  external_labels:
    prometheus: open-cluster-management-agent-addon/platform-metrics-collector
    prometheus_replica: prom-agent-platform-metrics-collector-0
scrape_configs:
- job_name: scrapeConfig/open-cluster-management-agent-addon/acm-virtualization-metrics
  metrics_path: /federate
  params:
    match[]:
    - '{__name__="acm_rs:namespace:cpu_request_hard"}'
    - '{__name__="acm_rs:namespace:cpu_request"}'
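
The check above can also be scripted. In the excerpt, the rendered job name has the form scrapeConfig/<namespace>/<name>, where the namespace is the one the ScrapeConfig was deployed to (open-cluster-management-agent-addon), not the hub namespace. The following is a minimal sketch that searches the text of prometheus.env.yaml for that job name. The function name has_scrape_job is hypothetical, and the sketch uses plain text matching rather than a YAML parser.

```python
import re


def has_scrape_job(config_text, namespace, name):
    """Return True if the rendered Prometheus config contains the ScrapeConfig job."""
    expected = f"scrapeConfig/{namespace}/{name}"
    # Match lines such as "- job_name: scrapeConfig/<namespace>/<name>".
    pattern = re.compile(r"^\s*-?\s*job_name:\s*(\S+)\s*$", re.MULTILINE)
    return expected in pattern.findall(config_text)


if __name__ == "__main__":
    # Sample text copied from the excerpt above; in practice, read the file
    # contents from the collector pod instead.
    sample = """\
scrape_configs:
- job_name: scrapeConfig/open-cluster-management-agent-addon/acm-virtualization-metrics
  metrics_path: /federate
"""
    print(
        has_scrape_job(
            sample,
            "open-cluster-management-agent-addon",
            "acm-virtualization-metrics",
        )
    )
```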

1. - [ ] Mandatory: Add the required version to the Fix version/s field.

2. - [ ] Mandatory: Choose the type of documentation change or review.

[ ] We need to update an existing topic

[ ] We need to add a new document to an existing section

[ ] We need a whole new section; this is a function not
documented before and doesn't belong in any current section

[ ] We need an Operator Advisory review and approval

[ ] We need a z-Stream (Errata) Advisory and Release note for
MCE and/or ACM

3. - [ ] Mandatory: Find the link to where the documentation update
should go and add it to the recommended changes. You can either use the
published doc or the staged repo for this step:

Note: As the feature and doc become better understood, this recommendation may
change. If this is new documentation, link to the section where you think
it should be placed.

Customer Portal published version

https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.12

Doc staged repo within the ACM Workspace:
https://github.com/stolostron/rhacm-docs

4. - [ ] Mandatory for GA content:

[ ] Add steps, the diff, known issue, and/or other important
conceptual information in the following space:

[ ] Add the required access level (for example, Cluster
Administrator) for the user to complete the task:

[ ] Add verification at the end of the task, how does the user
verify success (a command to run or a result to see?)

[ ] Add link to dev story here:

5. - [ ] Mandatory for bugs: What is the diff? Clearly define what the
problem is, what the change is, and link to the current documentation. Only
use this for a documentation bug.
