- Story
- Resolution: Duplicate
- RHIDP-1431 - Engineering Improvements
[2138854594] Upstream Reporter: Armel Soro
Upstream issue status: Closed
Upstream description:
See https://sdk.operatorframework.io/docs/overview/operator-capabilities/#level-4---deep-insights
Goal: Set up full monitoring and alerting for your operand. All resources, such as Prometheus rules (alerts) and Grafana dashboards, should be created by the operator when the operand CR is instantiated.
TODO
- ☐ Add the ability in the CRD to create a ServiceMonitor resource (follow-up to https://github.com/janus-idp/operator/issues/180)
- ☐ Implement Prometheus metrics in the RHDH Operator for Backstage CR reconciliation failure/success
- ☐ Implement Grafana dashboards to monitor a) whether the operator is up and running, and how long it has been running, and b) memory and CPU consumption by the operator
- ☐ Implement alerts so that when the operator is down, certain actions are triggered, e.g., a notification is sent to the user's Slack channel, a Jira ticket is created, etc.
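For illustration only, the ServiceMonitor and "operator down" alert items above could produce resources shaped roughly like the ones built below. This is a minimal sketch, not the operator's actual implementation: it only assembles the manifests as plain Python dicts using the Prometheus Operator `monitoring.coreos.com/v1` kinds. The resource names, namespace, labels, port name, job name, and the `OperatorDown` alert name are all hypothetical placeholders.

```python
import json


def service_monitor(name, namespace, match_labels, port="metrics", interval="30s"):
    """Build a ServiceMonitor manifest as a plain dict.

    `port` must match a named port on the Service selected by `match_labels`.
    All argument values used in this sketch are hypothetical.
    """
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": dict(match_labels)},
            "endpoints": [{"port": port, "path": "/metrics", "interval": interval}],
        },
    }


def operator_down_rule(name, namespace, job, for_duration="5m"):
    """Build a PrometheusRule that fires when no target of `job` reports up == 1."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "groups": [
                {
                    "name": f"{name}.rules",
                    "rules": [
                        {
                            "alert": "OperatorDown",  # hypothetical alert name
                            "expr": f'absent(up{{job="{job}"}} == 1)',
                            "for": for_duration,
                            "labels": {"severity": "critical"},
                            "annotations": {
                                "summary": f"No healthy scrape target for job {job}"
                            },
                        }
                    ],
                }
            ]
        },
    }


if __name__ == "__main__":
    # Hypothetical names; a real operator would derive these from the Backstage CR.
    sm = service_monitor("my-backstage", "rhdh", {"app": "my-backstage"})
    rule = operator_down_rule("rhdh-operator", "rhdh", "rhdh-operator-metrics")
    print(json.dumps(sm, indent=2))
    print(json.dumps(rule, indent=2))
```

Routing the alert to Slack or Jira would be configured in Alertmanager (receivers and routes) rather than in the rule itself, so it is left out of this sketch.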
Upstream URL: https://github.com/janus-idp/operator/issues/205
- is related to
- RHIDP-1069 Include build and runtime information on RHDH Metrics
- New
- links to