Bug
Resolution: Done
Quality / Stability / Reliability
Tracing Sprint # 282
Problem:
The OpenTelemetry Operator fails to create a ServiceMonitor for its own metrics on OpenShift clusters, causing operator controller metrics (e.g., controller_runtime_reconcile_time_seconds_count{controller="opentelemetrycollector"}) to be unavailable in Prometheus/Thanos.
Root Causes:
Two issues were identified:
1. Missing operator args in OpenShift deployment (introduced in PR #4576)
When kube-rbac-proxy was removed in commit 5379a2e8, the manager_auth_proxy_tls_patch.yaml file was modified to include only the TLS cert args. Because this file is a strategic merge patch and args is a plain list of strings, the patch replaces the entire args array instead of merging into it, overwriting the correct args set by manager-patch.yaml (which is a JSON patch).
As a result, critical flags such as --create-sm-operator-metrics=true, --enable-leader-election, and --enable-cr-metrics=true were missing from the final deployment.
2. Uppercase scheme value rejected by CRD validation (introduced in PR #3858)
The prometheus-operator dependency bump changed the SchemeHTTPS constant from lowercase "https" to uppercase "HTTPS". However, the ServiceMonitor CRD validation only accepts lowercase values ("http", "https"), causing the ServiceMonitor creation to fail with:
spec.endpoints[0].scheme: Unsupported value: "HTTPS": supported values: "http", "https"
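The first root cause can be illustrated with a minimal Go sketch. It is not kustomize code: strategicMergeArgs is a hypothetical helper that only mimics how a strategic merge patch treats a list of plain strings, and the TLS flag values are placeholders.

```go
package main

import "fmt"

// strategicMergeArgs is a hypothetical stand-in for the strategic merge
// behaviour on a list of plain strings such as container args: there is no
// merge key to match items on, so a list present in the patch replaces the
// base list wholesale. A nil patch means the patch has no args section.
func strategicMergeArgs(base, patch []string) []string {
	if patch == nil {
		return base // no args section in the patch, the base list survives
	}
	return append([]string(nil), patch...)
}

func main() {
	// Args as set by manager-patch.yaml (subset named in this ticket).
	operatorArgs := []string{
		"--create-sm-operator-metrics=true",
		"--enable-leader-election",
		"--enable-cr-metrics=true",
	}
	// TLS args; the values are placeholders, not the real paths.
	tlsArgs := []string{
		"--metrics-tls-cert-file=<cert>",
		"--metrics-tls-key-file=<key>",
	}

	// Bug: the TLS patch carries its own args list, so the operator args vanish.
	fmt.Println("with args in TLS patch:   ", strategicMergeArgs(operatorArgs, tlsArgs))

	// Fix: all args live in one list and the TLS patch has no args section.
	allArgs := append(append([]string(nil), operatorArgs...), tlsArgs...)
	fmt.Println("without args in TLS patch:", strategicMergeArgs(allArgs, nil))
}
```

In the first case only the two TLS flags remain, which matches the symptom described above; in the second case the combined list passes through untouched.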
Fix:
1. Add TLS cert args (--metrics-tls-cert-file, --metrics-tls-key-file) to config/overlays/openshift/manager-patch.yaml and remove the args section from manager_auth_proxy_tls_patch.yaml to prevent overwriting.
2. Change ptr.To(monitoringv1.SchemeHTTPS) to ptr.To(monitoringv1.Scheme("https")) in internal/operator-metrics/metrics.go so that the ServiceMonitor endpoint uses the lowercase scheme value accepted by the CRD.
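The effect of the second fix can be sketched in self-contained Go. Scheme and to are local stand-ins for monitoringv1.Scheme and ptr.To, and validateScheme is a hypothetical function that only mimics the case-sensitive enum check from the error message; none of this is the operator's or the API server's actual code.

```go
package main

import "fmt"

// Scheme is a local stand-in for the string-based monitoringv1.Scheme type.
type Scheme string

// SchemeHTTPS mirrors the constant's value after the dependency bump,
// as described in this ticket.
const SchemeHTTPS Scheme = "HTTPS"

// to is a local stand-in for ptr.To from k8s.io/utils/ptr.
func to[T any](v T) *T { return &v }

// validateScheme is a hypothetical imitation of the CRD enum validation:
// only the exact strings "http" and "https" are accepted.
func validateScheme(s Scheme) error {
	switch s {
	case "http", "https":
		return nil
	}
	return fmt.Errorf("spec.endpoints[0].scheme: Unsupported value: %q: supported values: \"http\", \"https\"", string(s))
}

func main() {
	before := to(SchemeHTTPS)    // value used before the fix
	after := to(Scheme("https")) // value used after the fix

	fmt.Println("before:", validateScheme(*before))
	fmt.Println("after: ", validateScheme(*after))
}
```

Converting the literal with Scheme("https") keeps the field's type unchanged while avoiding any dependence on the casing of the upstream constant.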
Affected Components:
- config/overlays/openshift/manager-patch.yaml
- config/overlays/openshift/manager_auth_proxy_tls_patch.yaml
- internal/operator-metrics/metrics.go
Testing:
Run the e2e-openshift monitoring test:
chainsaw test --skip-delete tests/e2e-openshift/monitoring