-
Bug
-
Resolution: Unresolved
-
Normal
-
4.18.0
-
None
-
None
-
False
-
-
Release Note Not Required
-
In Progress
Description of problem:
Checked in 4.18.0-0.nightly-2024-11-15-113437:
$ oc -n openshift-monitoring get deploy telemeter-client -ojsonpath='{.spec.template.spec.containers[?(@.name=="telemeter-client")].command}' | jq
...
"--match={__name__=\"cluster:log_logged_bytes_total:sum\"}",
"--match=openshift_logging:log_forwarder_pipelines:sum",
"--match=openshift_logging:log_forwarders:sum",
"--match=openshift_logging:log_forwarder_input_type:sum",
"--match=openshift_logging:log_forwarder_output_type:sum",
"--match=openshift_logging:vector_component_received_bytes_total:rate5m",
"--match={__name__=\"cluster:kata_monitor_running_shim_count:sum\"}",
...
The following metrics use the wrong format in the telemeter-client deployment:
openshift_logging:log_forwarder_pipelines:sum
openshift_logging:log_forwarders:sum
openshift_logging:log_forwarder_input_type:sum
openshift_logging:log_forwarder_output_type:sum
openshift_logging:vector_component_received_bytes_total:rate5m
Take "--match=openshift_logging:log_forwarder_pipelines:sum" as an example. It should look like this:
"--match={__name__=\"openshift_logging:log_forwarder_pipelines:sum\"}"
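The expected form can be illustrated with a small Python sketch. The helper names `to_name_selector` and `fix_match_arg` are hypothetical and not part of telemeter-client or cluster-monitoring-operator. The sketch assumes that a bare metric name only needs to be wrapped in a `__name__` matcher and that arguments already written as selectors should be left unchanged.

```python
def to_name_selector(rule: str) -> str:
    """Wrap a bare metric name in a {__name__="..."} selector.

    Assumption: a rule that already starts with '{' is a selector
    and is returned unchanged.
    """
    rule = rule.strip()
    if rule.startswith("{"):
        return rule
    return '{__name__="%s"}' % rule


def fix_match_arg(arg: str) -> str:
    """Rewrite a --match=<rule> argument so the rule is a selector."""
    prefix = "--match="
    if not arg.startswith(prefix):
        # Not a match argument; leave it alone.
        return arg
    return prefix + to_name_selector(arg[len(prefix):])


if __name__ == "__main__":
    print(fix_match_arg("--match=openshift_logging:log_forwarder_pipelines:sum"))
```

For the example argument above, this prints `--match={__name__="openshift_logging:log_forwarder_pipelines:sum"}`, which is the unescaped form of the expected value.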
Other metrics in the deployment can be checked for comparison. Because of the wrong format, these metrics are not added to the telemetry-config configmap:
$ oc -n openshift-monitoring get cm telemetry-config -ojsonpath='{.data.metrics\.yaml}' | grep "{__name__=" | grep -E "openshift_logging:log_forwarder_pipelines:sum|openshift_logging:log_forwarders:sum|openshift_logging:log_forwarder_input_type:sum|openshift_logging:log_forwarder_output_type:sum|openshift_logging:vector_component_received_bytes_total:rate5m"
(no result)
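The same check can be sketched in Python for use in a verification script. The function name `missing_metrics` is hypothetical. The sketch assumes, as the grep does, that each allowed metric appears in metrics.yaml as a `{__name__="<metric>"}` string. The sample text is made up for illustration and is not the real configmap content.

```python
EXPECTED = [
    "openshift_logging:log_forwarder_pipelines:sum",
    "openshift_logging:log_forwarders:sum",
    "openshift_logging:log_forwarder_input_type:sum",
    "openshift_logging:log_forwarder_output_type:sum",
    "openshift_logging:vector_component_received_bytes_total:rate5m",
]


def missing_metrics(metrics_yaml: str, names=EXPECTED) -> list:
    """Return the names that have no {__name__="..."} entry in the text.

    This is a plain substring search, equivalent in spirit to the grep
    pipeline; it does not parse the YAML.
    """
    return [n for n in names if '{__name__="%s"}' % n not in metrics_yaml]


if __name__ == "__main__":
    # Illustrative sample only: one of the five metrics is present.
    sample = '- \'{__name__="openshift_logging:log_forwarders:sum"}\'\n'
    for name in missing_metrics(sample):
        print("missing:", name)
```

An empty list means every expected metric is present. On an affected cluster, the saved output of the `oc ... get cm telemetry-config` command would be expected to yield all five names.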
The metrics format in the telemeter server allow list is also wrong:
https://github.com/rhobs/configuration/blob/main/configuration/telemeter/metrics.json#L2-L6
The issue was introduced by:
telemeter client PR: https://github.com/openshift/cluster-monitoring-operator/pull/2512
telemeter server PR: https://github.com/rhobs/configuration/pull/704
Version-Release number of selected component (if applicable):
4.18.0-0.nightly-2024-11-15-113437. The issue affects only 4.18.
How reproducible:
Always
Steps to Reproduce:
See the description above.
- is depended on by
-
OCPBUGS-44971 Backport new telemetry for Cluster Logging Operator
- Verified
- is related to
-
MON-4051 Send metrics from Cluster Logging Operator via Telemetry
- In Progress
- links to
-
RHEA-2024:6122 OpenShift Container Platform 4.18.z bug fix update