Bug
Resolution: Done
Major
RHODS_1.24.0_GA
Testable
Pending
ML Serving Sprint 1.27, ML Serving Sprint 1.28
Medium
Description of problem:
The redhat-ods-applications namespace does not get the cluster-monitoring=true label, and there is no option to enable cluster monitoring at operator install time.
It looks like this label is set manually on the operator namespace (which causes another issue) here: https://github.com/red-hat-data-services/odh-deployer/blob/main/deploy.sh#L148
However, it is never set on the applications namespace.
With the label missing on the redhat-ods-applications namespace, user workload monitoring (UWM) produces alerts and logs like this:
level=warn ts=2023-04-13T18:57:37.539617816Z caller=operator.go:2255 component=prometheusoperator msg="skipping servicemonitor" error="it accesses file system via bearer token file which Prometheus specification prohibits" servicemonitor=redhat-ods-applications/odh-model-controller-metrics-monitor namespace=openshift-user-workload-monitoring prometheus=user-workload
AlertManager alert: PrometheusOperatorRejectedResources
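To confirm which ServiceMonitor is being rejected, the warning above can be picked apart mechanically. The following is only a rough sketch: it assumes the log line is plain logfmt (key=value pairs with optional double quotes), and the helper names parse_logfmt and rejected_servicemonitor are hypothetical, not part of any Prometheus or OpenShift tooling.

```python
import shlex

# The warning line quoted in this report, split across string literals.
SAMPLE = (
    'level=warn ts=2023-04-13T18:57:37.539617816Z caller=operator.go:2255 '
    'component=prometheusoperator msg="skipping servicemonitor" '
    'error="it accesses file system via bearer token file which Prometheus '
    'specification prohibits" '
    'servicemonitor=redhat-ods-applications/odh-model-controller-metrics-monitor '
    'namespace=openshift-user-workload-monitoring prometheus=user-workload'
)


def parse_logfmt(line):
    """Split a logfmt-style line into a dict (hypothetical helper)."""
    fields = {}
    # shlex keeps quoted values with spaces together and strips the quotes.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def rejected_servicemonitor(line):
    """Return (namespace, name) of a skipped ServiceMonitor, or None."""
    fields = parse_logfmt(line)
    if fields.get("msg") != "skipping servicemonitor":
        return None
    namespace, _, name = fields.get("servicemonitor", "").partition("/")
    return namespace, name


if __name__ == "__main__":
    print(rejected_servicemonitor(SAMPLE))
```

Run against the operator log, this kind of filter would show whether every rejected ServiceMonitor lives in redhat-ods-applications or whether other namespaces are affected as well.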
Prerequisites (if any, like setup, operators/versions):
N/A
Steps to Reproduce
- Fresh install
Actual results: Alerts and warning logs
Expected results: No alerts or warn logs
Reproducibility (Always/Intermittent/Only Once): Always
Build Details: 1.24
Workaround: Add the label manually: https://access.redhat.com/solutions/6706741
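As an illustration of the manual workaround, a minimal sketch follows. It assumes that the label meant in this report is the standard OpenShift key openshift.io/cluster-monitoring=true and that the oc CLI is logged in to the cluster; the function names are hypothetical, and the linked solution article remains the authoritative procedure.

```python
import subprocess

# Assumption: the "cluster-monitoring=true" label in this report refers to
# the standard OpenShift label key below.
LABEL = "openshift.io/cluster-monitoring=true"


def label_command(namespace):
    """Build the oc invocation that labels a namespace (hypothetical helper)."""
    return ["oc", "label", "namespace", namespace, LABEL, "--overwrite"]


def apply_label(namespace, dry_run=True):
    """Print the command, and run it only when dry_run is False."""
    cmd = label_command(namespace)
    print(" ".join(cmd))
    if not dry_run:
        # Requires a logged-in oc session with permission to edit namespaces.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    apply_label("redhat-ods-applications")
```

The dry-run default only prints the command, so it can be reviewed before anything is changed on the cluster.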
Additional info: