Story
Resolution: Done
Minor
HIVE-1581 tracked some work to have hive set up metrics configuration. Specifically, via #1525, setting HiveConfig.Spec.ExportMetrics=true would cause hive-operator to do the following to the TargetNamespace:
- Add the openshift.io/cluster-monitoring: "true" label
- Create ServiceMonitors for hive-controllers and hive-clustersync
- Create a Role and RoleBinding allowing the cluster's prometheus instances (both the cluster and user monitoring ones) to read Services, Endpoints and Pods
HIVE-1655 resulted from testing the above, complaining that this configuration was not undone when ExportMetrics was removed or set to false. #1577 made it so.
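The apply/undo behavior described above can be sketched roughly as follows. This is a simplified model in Python, not hive-operator's actual code. The namespace labels are a plain dict, resources are `(kind, name)` tuples, and the function name `reconcile_metrics` is made up for illustration. The Role/RoleBinding name and the exact ServiceMonitor names are assumptions.

```python
# Simplified model of the ExportMetrics reconcile behavior (not hive's real code).
CLUSTER_MONITORING_LABEL = "openshift.io/cluster-monitoring"

# Assumption: illustrative (kind, name) pairs; the real resource names may differ.
MANAGED_RESOURCES = frozenset({
    ("ServiceMonitor", "hive-controllers"),
    ("ServiceMonitor", "hive-clustersync"),
    ("Role", "prometheus-reader"),         # hypothetical name
    ("RoleBinding", "prometheus-reader"),  # hypothetical name
})


def reconcile_metrics(export_metrics, namespace_labels, resources):
    """Return the (labels, resources) the target namespace should end up with.

    export_metrics: bool, modelling HiveConfig.Spec.ExportMetrics.
    namespace_labels: dict of the namespace's current labels.
    resources: set of (kind, name) tuples currently in the namespace.
    """
    labels = dict(namespace_labels)
    result = set(resources)
    if export_metrics:
        # Opt the namespace in to scraping and lay down the monitoring resources.
        labels[CLUSTER_MONITORING_LABEL] = "true"
        result |= MANAGED_RESOURCES
    else:
        # Undo everything, matching purely by kind and name.
        labels.pop(CLUSTER_MONITORING_LABEL, None)
        result -= MANAGED_RESOURCES
    return labels, result
```

Note that in this model the undo path matches only on kind and name, so it has no way to tell who created a resource.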
Couple issues here:
- The monitoring infrastructure has matured since then. It is not clear that the configuration laid down when ExportMetrics=true conforms to current recommended/supported best practices. For instance, I think the cluster-monitoring label will cause the metrics to be scraped by the cluster monitoring stack as opposed to the user one. This is (probably? usually?) not appropriate for an operator installed via OLM, which according to HIVE-1581 is what this was meant to help with.
- Consumers attempting to set things up manually (without ExportMetrics) can unexpectedly have their configuration removed. This happened in real life when OpenShift CI tried to create ServiceMonitors with the same name as the ones hive manages. Fortunately, they were able to work around the issue by a) renaming their ServiceMonitors; and b) using user workload monitoring, which uses the openshift.io/workload-monitoring: "true" label rather than the cluster-monitoring one.
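One possible way to keep cleanup from touching manually created resources is to delete only objects that carry an ownership marker. The sketch below is a Python model of that idea, not a proposal for the actual implementation. The label key `MANAGED_LABEL` and the function `resources_to_delete` are hypothetical and do not exist in hive today.

```python
# Hypothetical ownership marker; hive does not define this label.
MANAGED_LABEL = "example.hive.openshift.io/export-metrics-managed"


def resources_to_delete(existing, managed_names):
    """Pick the resources that cleanup may safely remove.

    existing: list of dicts with "kind", "name" and optional "labels".
    managed_names: set of (kind, name) tuples the operator would create.
    A resource is selected only if its name matches AND it carries the marker,
    so a same-named object created by hand is left alone.
    """
    selected = []
    for res in existing:
        key = (res["kind"], res["name"])
        owned = res.get("labels", {}).get(MANAGED_LABEL) == "true"
        if key in managed_names and owned:
            selected.append(res)
    return selected


if __name__ == "__main__":
    managed = {("ServiceMonitor", "hive-controllers")}
    by_operator = {"kind": "ServiceMonitor", "name": "hive-controllers",
                   "labels": {MANAGED_LABEL: "true"}}
    by_hand = {"kind": "ServiceMonitor", "name": "hive-controllers"}
    print(resources_to_delete([by_hand], managed))      # nothing selected
    print(resources_to_delete([by_operator], managed))  # the operator's object
```

Whether a label, an annotation, or an owner reference is the right marker is an open design question. The point is only that a name match alone is not enough to decide that hive owns a resource.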
This card is to crisp up our story around ExportMetrics. What scenarios is it supposed to be used for? Is it (still) necessary/desirable to provide something in this space? If so, let's clean it up so it doesn't interfere with manual setup. If not, let's deprecate and disable it. And either way, let's make docs/monitoring.md reflect reality.