-
Bug
-
Resolution: Unresolved
-
Normal
-
ACM 2.10.0, ACM 2.9.0, ACM 2.11.0, ACM 2.12.0
-
False
-
None
-
False
-
-
-
None
Note: The doc team updates the current version of the documentation and the
two previous versions (n-2), but we address *only high-priority or
customer-reported issues* for n-2 releases in support.
Describe the changes in the doc and link to your dev story:
1. - [X] Mandatory: Add the required version to the Fix version/s field.
2. - [X] Mandatory: Choose the type of documentation change or review.
- [X] We need to update to an existing topic
- [ ] We need to add a new document to an existing section
- [ ] We need a whole new section; this is a function not
documented before and doesn't belong in any current section
- [ ] We need an Operator Advisory review and approval
- [ ] We need a z-Stream (Errata) Advisory and Release note
for MCE and/or ACM
3. - [X] *Mandatory:* Use the following link to open the doc and find where the
documentation update should go. Note: As the feature and doc are
understood and developed, this placement decision may change:
- Published doc: https://access.redhat.com/documentation/en-us/red_hat_advanced_cluster_management_for_kubernetes/2.10
- Source: https://github.com/stolostron/rhacm-docs
The update should go into all versions, under the Observability documentation, in "Customize observability configuration" > "Adding custom metrics", that is, in observability/customize_observability.adoc.
4. - [ ] Mandatory for GA content:
- [ ] Add steps, the diff, known issue, and/or other important
conceptual information in the following space:
- [ ] *Add Required access level* (example, *Cluster
Administrator*) for the user to complete the task:
- [ ] Add verification at the end of the task, how does the user
verify success (a command to run or a result to see?)
- [ ] Add link to dev story here:
5. - [X] Mandatory for bugs: What is the diff? Clearly define what the
problem is, what the change is, and link to the current documentation. Only
use this for a documentation bug.
There are several problems with the documentation. In agreement with engineering, here is my proposal for replacing the chapter entirely. I tried to keep the format consistent with what was in observability/customize_observability.adoc, while using the data I published in the article https://access.redhat.com/solutions/7099641
[#adding-custom-metrics]
== Adding custom metrics
To monitor metrics from a remote cluster by using RHACM, you first need to know whether the metric is exported as a `platform` or a `user workload` metric. This should be documented for the solution that you want to monitor, or be something that support for that product can tell you.
+
If information on how to monitor your solution is not available, you can identify the type of metric from the console of the cluster: under `Observe > Metrics`, the `prometheus` column shows where the metric originates from; `user workload` metrics are identified as `openshift-user-workload-monitoring`, while `platform` metrics are listed under the component that provides them.
You can also look at the `ServiceMonitor` for the observed resource and check which annotation it uses:
- `operator.prometheus.io/controller-id: openshift-user-workload-monitoring/prometheus-operator` means this is `user workload`
- `operator.prometheus.io/controller-id: openshift-platform-monitoring/prometheus-operator` means this is `platform`
+
After you know what type of metric you need to set up RHACM to monitor, follow the steps from the appropriate documentation.
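+
For example, a quick way to check the annotation (a sketch; `<namespace>` and `<servicemonitor-name>` are placeholders for your workload) is:
+
[source,bash]
----
# Print the controller-id annotation of the ServiceMonitor; the value shows
# whether user workload monitoring or platform monitoring scrapes this resource.
oc -n <namespace> get servicemonitor <servicemonitor-name> \
  -o jsonpath='{.metadata.annotations.operator\.prometheus\.io/controller-id}'
----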
[#adding-platform-metrics]
=== Adding Platform metrics
Platform metrics can be monitored by creating a `ConfigMap` named `observability-metrics-custom-allowlist` in the `open-cluster-management-observability` namespace on the hub cluster. It needs to be formatted as in this example:
+
[source,yaml]
----
kind: ConfigMap
apiVersion: v1
metadata:
  name: observability-metrics-custom-allowlist
  namespace: open-cluster-management-observability
data:
  metrics_list.yaml: |
    names: <1>
      - node_memory_MemTotal_bytes
    rules: <2>
      - record: apiserver_request_duration_seconds:histogram_quantile_90
        expr: histogram_quantile(0.90,sum(rate(apiserver_request_duration_seconds_bucket{job="apiserver", verb!="WATCH"}[5m])) by (verb,le))
----
+
<1> Optional: Add the names of the custom metrics that are to be collected from the managed cluster.
<2> Optional: Enter only one value for the `expr` and `record` parameter pair to define the query expression. The metrics are collected under the name that is defined in the `record` parameter from your managed cluster. The metric values that are returned are the results after you run the query expression.
+
You can use either one or both of the sections.
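+
As a minimal sketch (assuming you saved the example above as `observability-metrics-custom-allowlist.yaml`, a hypothetical file name), you can apply the config map and confirm it exists on the hub cluster:
+
[source,bash]
----
# Create or update the allowlist on the hub cluster
oc apply -f observability-metrics-custom-allowlist.yaml

# Confirm the config map exists in the expected namespace
oc -n open-cluster-management-observability get configmap observability-metrics-custom-allowlist -o yaml
----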
+
This applies to every managed cluster where observability is enabled. If you want to use this configuration for only one cluster, you can instead create a similar configuration directly on the spoke cluster, in the same namespace where the `endpoint-observability-operator` is deployed, `open-cluster-management-addon-observability`:
+
[source,yaml]
----
kind: ConfigMap
apiVersion: v1
metadata:
  name: observability-metrics-custom-allowlist
  namespace: open-cluster-management-addon-observability
data:
  metrics_list.yaml: |
    names: <1>
      - node_memory_MemTotal_bytes
    rules: <2>
      - record: apiserver_request_duration_seconds:histogram_quantile_90
        expr: histogram_quantile(0.90,sum(rate(apiserver_request_duration_seconds_bucket{job="apiserver", verb!="WATCH"}[5m])) by (verb,le))
----
+
<1> Optional: Add the names of the custom metrics that are to be collected from the managed cluster.
<2> Optional: Enter only one value for the `expr` and `record` parameter pair to define the query expression. The metrics are collected under the name that is defined in the `record` parameter from your managed cluster. The metric values that are returned are the results after you run the query expression.
+
You can use either one or both of the sections.
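+
Similarly, as a sketch (assuming the spoke-local example above is saved as `allowlist-spoke.yaml`, a hypothetical file name), apply it while logged in to the managed cluster rather than the hub:
+
[source,bash]
----
# Run these commands against the managed (spoke) cluster, not the hub
oc apply -f allowlist-spoke.yaml
oc -n open-cluster-management-addon-observability get configmap observability-metrics-custom-allowlist
----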
[#adding-user-workload-metrics]
=== Adding user workload metrics
For this type of metric, collection is handled by a different collector. You need to create the configuration on the spoke cluster itself, in the namespace where the metric has to be captured. It needs to be named `observability-metrics-custom-allowlist` and can be formatted as follows:
+
[source,yaml]
----
kind: ConfigMap
apiVersion: v1
metadata:
  name: observability-metrics-custom-allowlist
  namespace: monitored_namespace <1>
data:
  uwl_metrics_list.yaml: | <2>
    names: <3>
      - sample_metrics
----
+
<1> Enter the namespace where the metrics are captured from.
<2> Enter the key for the config map data.
<3> Enter the value of the config map data in YAML format. The `names` section includes the list of metric names that you want to collect from that namespace. After you create the config map, the observability collector collects and pushes the metrics from the target namespace to the hub cluster.
+
This example monitors the user workload metric `sample_metrics` from the namespace `monitored_namespace`. If this configuration is instead created in the `open-cluster-management-addon-observability` namespace, the metric is collected from *all* the namespaces of the spoke cluster.
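+
As a minimal verification sketch (using the example names above), you can confirm that the config map landed in the monitored namespace on the spoke cluster before looking for the metric on the hub:
+
[source,bash]
----
# Confirm the allowlist config map exists in the monitored namespace
oc -n monitored_namespace get configmap observability-metrics-custom-allowlist -o yaml
----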