-
Epic
-
Resolution: Done
-
Critical
-
ACM 2.11.0
-
MCO Fleetview Dashboard Optimisations
-
False
-
-
False
-
Not Selected
-
To Do
-
0% To Do, 0% In Progress, 100% Done
Epic Goal
...
Currently, the fleet view dashboards that ship alongside MCO rely on raw metrics from managed clusters. As a result, the dashboards are often very heavyweight, do not load quickly for users, and have poor cacheability on failed queries.
The dashboards should rely on summary metrics where possible: pre-aggregated recording rules that reduce dashboard cardinality from raw metrics to summary series per managed cluster (MC)/namespace.
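To illustrate the cardinality reduction described above, here is a minimal Python sketch of what a per-MC/namespace summary does conceptually. The sample data, label names, and the `summarize` helper are hypothetical and are not taken from the MCO code base; a real recording rule would be evaluated by the metrics stack, not by a script like this.

```python
from collections import defaultdict

# Hypothetical raw samples: (cluster, namespace, pod, cpu_cores).
# Every distinct pod is its own series, so cardinality grows with pod count.
RAW_SAMPLES = [
    ("cluster-a", "default", "pod-1", 0.25),
    ("cluster-a", "default", "pod-2", 0.5),
    ("cluster-a", "kube-system", "pod-3", 0.125),
    ("cluster-b", "default", "pod-4", 1.0),
    ("cluster-b", "default", "pod-5", 0.5),
]


def summarize(samples):
    """Sum values per (cluster, namespace), dropping the pod label."""
    totals = defaultdict(float)
    for cluster, namespace, _pod, value in samples:
        totals[(cluster, namespace)] += value
    return dict(totals)


summary = summarize(RAW_SAMPLES)
print(f"raw series: {len(RAW_SAMPLES)}, summary series: {len(summary)}")
for (cluster, namespace), value in sorted(summary.items()):
    print(f"{cluster}/{namespace}: {value}")
```

A dashboard that reads the summary only has to fetch one series per MC/namespace pair, regardless of how many pods sit behind it.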
Why is this important?
For our large customers (hundreds of MCs), the dashboards are effectively unusable in their current state, and in some cases they cannot view trending fleet metrics beyond 6 days.
Scenarios
...
Acceptance Criteria
- Fleet view metrics should be responsive and load up to the maximum retention configured on default MCOs.
- All Grafana dashboard queries should be based on recording rules, not raw metric queries. These recording rules can be evaluated in the spokes or in the central Hub MCO recording rule stack.
- The new optimized Grafana dashboard should be released side by side with the old one (for how many releases?), as the recording rules need to accumulate data for a while.
- If possible, release to z-streams, as there should be no backwards-incompatible changes (might need to check that adding recording rules doesn't negatively impact resource usage on spokes).
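The "recording rules only" criterion could be checked mechanically. The sketch below is one possible heuristic, not an existing MCO or Grafana tool: it assumes recording rules follow the common Prometheus `level:metric:operations` naming convention (so their names contain a colon), and that the dashboard JSON keeps queries under `panels[].targets[].expr`. The helper names and the `cluster:cpu_usage_cores:sum` rule in the example are hypothetical, and the regex-based extraction is not a full PromQL parser.

```python
import re

# PromQL keywords that look like identifiers but are not metric names.
KEYWORDS = {"by", "without", "on", "ignoring", "group_left", "group_right",
            "bool", "offset", "and", "or", "unless"}
GROUPING = r"\b(?:by|without|on|ignoring|group_left|group_right)\s*\([^)]*\)"
# An identifier that is not the tail of a number, duration, or $variable.
IDENT = re.compile(r"(?<![0-9A-Za-z_:.$])[A-Za-z_:][A-Za-z0-9_:]*")


def metric_names(expr):
    """Heuristically extract metric names from a PromQL expression."""
    expr = re.sub(r'"[^"]*"', "", expr)    # string literals
    expr = re.sub(r"\{[^}]*\}", "", expr)  # label matchers
    expr = re.sub(r"\[[^\]]*\]", "", expr)  # range selectors
    expr = re.sub(GROUPING, "", expr)      # grouping clauses hold label names
    names = []
    for match in IDENT.finditer(expr):
        name = match.group(0)
        rest = expr[match.end():].lstrip()
        if name in KEYWORDS or rest.startswith("("):
            continue  # keyword, function, or aggregation operator
        names.append(name)
    return names


def raw_metric_queries(dashboard):
    """Return (panel title, metric) pairs that do not look like recording rules."""
    findings = []

    def walk(panels):
        for panel in panels:
            for target in panel.get("targets", []):
                for name in metric_names(target.get("expr", "")):
                    if ":" not in name:  # assumption: rules are named a:b:c
                        findings.append((panel.get("title", ""), name))
            walk(panel.get("panels", []))  # row panels can nest panels

    walk(dashboard.get("panels", []))
    return findings


# Hypothetical dashboard with one rule-based panel and one raw-metric panel.
DASHBOARD = {
    "panels": [
        {"title": "CPU by cluster", "targets": [
            {"expr": 'sum by (cluster) (cluster:cpu_usage_cores:sum{cluster=~"$cluster"})'}]},
        {"title": "Pod restarts", "targets": [
            {"expr": 'sum(rate(kube_pod_container_status_restarts_total{namespace="default"}[5m])) by (cluster)'}]},
    ]
}

findings = raw_metric_queries(DASHBOARD)
for title, metric in findings:
    print(f"panel {title!r} queries raw metric {metric}")
```

A check like this could run in CI against the shipped dashboard JSON, but it would need an allowlist if any recording rules do not follow the colon naming convention.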
Dependencies (internal and external)
- ...
Previous Work (Optional):
- ...
Open questions:
- …
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - Downstream build attached to advisory: <link to errata>
- QE - Test plans in Polarion: <link or reference to Polarion>
- QE - Automated tests merged: <link or reference to automated tests>
- DOC - Doc issue opened with a completed template. Separate doc issue opened for any deprecation, removal, or any current known issue/troubleshooting removal from the doc, if applicable.
- relates to: ACM-12061 Observability - Choosing a date range past 30 days hangs the Grafana UI (Closed)