Productize something like the setup we currently have with the ops team by adding it to the operator.
Define and implement:
- Grafana dashboards
- Prometheus rules
- Zync components exposing metrics
- kube-state-metrics (https://issues.redhat.com/browse/THREESCALE-4679)
- aggregated health endpoint metrics (https://issues.redhat.com/browse/THREESCALE-4681)
The Zync team has the domain knowledge to define useful dashboards and effective Prometheus rules. The ops team can help with the syntax.
I will provide an environment where the loop is easy to run: update -> check the dashboards and Prometheus rules.
- Cluster: https://console-openshift-console.apps.dev-eng-operator-ocp4-2.dev.3sca.net (github auth provider)
- Namespace to use: 3scale-metrics
- Prometheus admin site: https://prometheus-route-application-monitoring.apps.dev-eng-operator-ocp4-2.dev.3sca.net
- Grafana admin site: https://grafana-route-application-monitoring.apps.dev-eng-operator-ocp4-2.dev.3sca.net
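The "check" half of that loop can also be scripted against the Prometheus HTTP API. The following is only a minimal sketch: it builds an instant-query URL for the `/api/v1/query` endpoint and extracts the series from a response body. The PromQL expression, and in particular the `namespace` label, is an assumption about how targets are labelled in this cluster, and any authentication the route may require is not handled here.

```python
import json
from urllib.parse import urlencode

# Prometheus admin site from the environment list above.
PROMETHEUS_URL = (
    "https://prometheus-route-application-monitoring"
    ".apps.dev-eng-operator-ocp4-2.dev.3sca.net"
)

# Assumption: scraped targets carry a `namespace` label. Adjust the
# selector to whatever labels the Zync targets really have.
EXAMPLE_QUERY = 'up{namespace="3scale-metrics"}'


def build_query_url(base_url, promql):
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def series_from_response(body):
    """Parse an instant-query JSON body into (labels, value) pairs."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    # Each vector sample looks like {"metric": {...}, "value": [ts, "1"]}.
    return [
        (sample["metric"], float(sample["value"][1]))
        for sample in payload["data"]["result"]
    ]


if __name__ == "__main__":
    # Print the URL to fetch with curl or a browser session that is logged in.
    print(build_query_url(PROMETHEUS_URL, EXAMPLE_QUERY))
```

An empty result list after an update usually means the metric is not being scraped yet, which is worth ruling out before touching the dashboard.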
System team internal tracking doc: https://docs.google.com/document/d/1aLI2VoK0kPds4YsuPPCeXmhmvxo99_XKAVCCm5ruucs/edit?usp=sharing
Suggested implementation steps:
- Make sure the code is ready, i.e., it exports the metrics we want in the dashboard
- Deploy a version of 3scale that exports the metrics
- Configure the Grafana dashboard for Zync; pair with ops to reuse their experience configuring the Zync Grafana dashboard for SaaS
- Pair with the operator team to "productize" the newly configured dashboard, i.e., have the operator install it on a fresh deployment of 3scale so customers get the dashboard out of the box.
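For the first step, one way to confirm that the code exports the wanted metrics is to compare a scraped metrics payload against a list of expected names. This is a rough sketch that assumes the component serves the Prometheus text exposition format. The metric names in the sample are placeholders, not real Zync metrics, and the real list has to come from the Zync team.

```python
import re

# A sample line starts with the metric name, optionally followed by labels.
_METRIC_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")

# Placeholder payload; the names are hypothetical, not actual Zync metrics.
SAMPLE_EXPOSITION = """\
# HELP example_requests_total Placeholder counter.
# TYPE example_requests_total counter
example_requests_total{status="200"} 12
example_queue_size 3
"""


def exposed_metric_names(exposition_text):
    """Collect the sample names found in a text-format metrics payload."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _METRIC_NAME.match(line)
        if match:
            names.add(match.group(1))
    return names


def missing_metrics(exposition_text, expected):
    """Return the expected names that the payload does not expose, sorted."""
    return sorted(set(expected) - exposed_metric_names(exposition_text))


if __name__ == "__main__":
    wanted = ["example_requests_total", "example_latency_seconds_count"]
    print(missing_metrics(SAMPLE_EXPOSITION, wanted))
```

Note that histograms and summaries appear under suffixed sample names (`_bucket`, `_sum`, `_count`), so the expected list should use those forms.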