Type: Bug
Resolution: Unresolved
Affects Version: 4.21
Severity: Moderate
Description of problem:
With the telco-ran CR ReduceMonitoringFootprint.yaml applied to a spoke running OCP 4.20
through ZTP/TALM 4.21 (IBU target host), the rendered ConfigMap looks garbled:
apiVersion: v1
data:
  config.yaml: "alertmanagerMain:\n enabled: false\ntelemeterClient:\n enabled:
    false\nnodeExporter:\n collectors:\n buddyinfo: {}\n cpufreq: {}\n ksmd:
    {}\n mountstats: {}\n netclass: {}\n netdev: {}\n processes: {}\n
    \ systemd: {}\n tcpstat: {}\nprometheusK8s:\n additionalAlertmanagerConfigs:\n
    \ - apiVersion: v2\n bearerToken:\n key: token\n name: observability-alertmanager-accessor\n
    \ scheme: https\n staticConfigs:\n - \n tlsConfig:\n ca:\n key:
    service-ca.crt\n name: hub-alertmanager-router-ca\n insecureSkipVerify:
    false\n externalLabels:\n managed_cluster: b9ca26b2-55ed-4ec8-826b-3917eb8e24c7\n
    \ retention: 24h\n"
kind: ConfigMap
metadata:
  creationTimestamp: "2025-11-24T23:01:14Z"
  name: cluster-monitoring-config
  namespace: openshift-monitoring
  resourceVersion: "12472"
  uid: 0e5629d5-8da9-4c48-b001-75c3ad713bb5
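For readability, the escaped config.yaml string above decodes to roughly the following (indentation is approximated, since the paste collapsed the embedded spaces; the empty staticConfigs list entry appears exactly as rendered):
alertmanagerMain:
  enabled: false
telemeterClient:
  enabled: false
nodeExporter:
  collectors:
    buddyinfo: {}
    cpufreq: {}
    ksmd: {}
    mountstats: {}
    netclass: {}
    netdev: {}
    processes: {}
    systemd: {}
    tcpstat: {}
prometheusK8s:
  additionalAlertmanagerConfigs:
  - apiVersion: v2
    bearerToken:
      key: token
      name: observability-alertmanager-accessor
    scheme: https
    staticConfigs:
    -
    tlsConfig:
      ca:
        key: service-ca.crt
        name: hub-alertmanager-router-ca
      insecureSkipVerify: false
  externalLabels:
    managed_cluster: b9ca26b2-55ed-4ec8-826b-3917eb8e24c7
  retention: 24h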
######
When this CR is applied to a spoke via ZTP/TALM 4.20, the ConfigMap renders correctly
and the prometheus-k8s-0 pod is stable.
apiVersion: v1
data:
  config.yaml: |
    alertmanagerMain:
      enabled: false
    telemeterClient:
      enabled: false
    prometheusK8s:
      retention: 24h
kind: ConfigMap
metadata:
  annotations:
    ran.openshift.io/ztp-deploy-wave: "1"
  creationTimestamp: "2025-11-24T23:19:41Z"
  name: cluster-monitoring-config
  namespace: openshift-monitoring
  resourceVersion: "21867"
  uid: 2c762218-020b-40a1-8916-1225287b96e1
With ZTP 4.21, if the CR is completely overridden in the PolicyGenTemplate (PGT), i.e. like this:
- fileName: ReduceMonitoringFootprint.yaml
  data:
    config.yaml: |
      alertmanagerMain:
        enabled: false
      telemeterClient:
        enabled: false
      prometheusK8s:
        retention: 24h
and the prometheus-k8s-0 pod is deleted after the policy change has updated the
cluster-monitoring-config ConfigMap in openshift-monitoring to match the 4.20 rendering above,
then the pod starts and is stable.
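For reference, a minimal sketch of that manual recovery step on the spoke (assuming kubeadmin access; these are generic oc commands, not taken verbatim from the report):
$ oc -n openshift-monitoring get configmap cluster-monitoring-config -o yaml   # confirm config.yaml now matches the 4.20 rendering above
$ oc -n openshift-monitoring delete pod prometheus-k8s-0                       # the prometheus-k8s StatefulSet recreates the pod, which then stays stable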
Version-Release number of selected component (if applicable):
In both cases the spoke is deployed with OCP 4.20.
The hubs for the working and non-working spoke environments have the following in common:
$ oc get clusterversions.config.openshift.io
NAME      VERSION       AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.21.0-ec.3   True        False         3d6h    Cluster version is 4.21.0-ec.3
[kni@kni-qe-82 dgonyier]$ source ./csv-list
advanced-cluster-management.v2.15.0
multicluster-engine.v2.10.0
openshift-gitops-operator.v1.18.1
packageserver
Differences on the hubs:
working:
  ZTP 4.20
  TALM 4.20.1
non-working:
  ZTP 4.21
  TALM 4.21.0
How reproducible:
Always
Steps to Reproduce:
1. Deploy a spoke with the RAN DU profile, using the hub CSV versions for the failing case as shown above.
2. Review the monitoring policy on the hub and the live cluster-monitoring-config CR on the spoke.
Actual results:
The rendered cluster-monitoring-config ConfigMap is mangled, as shown in the first example above.
Expected results:
The ConfigMap renders cleanly, as it does with ZTP/TALM 4.20 (second example above).
Additional info:
Currently the source CR is identical between the main and release-4.20 branches.
I am not sure whether this is caused by ZTP 4.21 or TALM 4.21.
Workarounds:
Don't use the ReduceMonitoringFootprint.yaml CR,
OR
override the `data:` field in the policy template with the working config as described above (see the sketch below).
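For reference, a sketch of how that override workaround sits inside a full PolicyGenTemplate. The metadata name, namespace, bindingRules and policyName below are placeholders, not values from the affected environment; only the sourceFiles entry mirrors the override quoted earlier:
apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: group-du-sno             # placeholder
  namespace: ztp-group           # placeholder
spec:
  bindingRules:
    group-du-sno: ""             # placeholder cluster label
  sourceFiles:
    - fileName: ReduceMonitoringFootprint.yaml
      policyName: config-policy  # placeholder
      data:
        config.yaml: |
          alertmanagerMain:
            enabled: false
          telemeterClient:
            enabled: false
          prometheusK8s:
            retention: 24h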