Epic Goal

Why is this important?

While it is possible to define how long metrics are retained on disk, it is not possible to tell the cluster monitoring operator how much data to keep. For OSD/ROSA in particular, being able to configure the retention size based on the persistent volume size would make the fleet easier to manage: it would prevent the storage from filling up and monitoring from going down when too many metrics are produced.

As a cluster admin, I want to define the maximum amount of data to be retained on the persistent volume.

CI - MUST be running successfully with tests automated
Release Technical Enablement - Provide necessary release enablement details and documents.
The cluster-monitoring-config and user-workload-monitoring-config configmaps allow configuring the retention size for:
- Prometheus (Platform and UWM)
- Thanos Ruler (to be confirmed)
Proper validation is in place to prevent bad user input from breaking the stack.
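
As an illustration of the validation criterion, here is a minimal sketch in Python. It is not the operator's actual code. It assumes that sizes are written as a whole number followed by one of the units B, KB, MB, GB, TB, PB or EB (the units Prometheus documents for its size-based retention flag), that units are binary multiples, and that a retention size should stay below the persistent volume size. The names parse_size and validate_retention_size are hypothetical.

```python
import re

# Assumption: units are binary multiples (KB = 1024 bytes).
_UNITS = {
    "B": 1,
    "KB": 1024,
    "MB": 1024**2,
    "GB": 1024**3,
    "TB": 1024**4,
    "PB": 1024**5,
    "EB": 1024**6,
}

# A whole number immediately followed by a known unit, e.g. "10GB".
_SIZE_RE = re.compile(r"(\d+)(B|KB|MB|GB|TB|PB|EB)")


def parse_size(value):
    """Return the size in bytes, or raise ValueError on malformed input."""
    if not isinstance(value, str):
        raise ValueError(f"size must be a string, got {type(value).__name__}")
    match = _SIZE_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"invalid size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def validate_retention_size(retention_size, volume_size):
    """Hypothetical check: the retention size must be well-formed and
    strictly smaller than the persistent volume size."""
    retention = parse_size(retention_size)
    volume = parse_size(volume_size)
    if retention >= volume:
        raise ValueError(
            f"retention size {retention_size} must be smaller than "
            f"the volume size {volume_size}"
        )
    return retention
```

With this sketch, validate_retention_size("80GB", "100GB") returns the retention size in bytes, while a malformed value such as "ten gigs" or a retention size equal to the volume size raises ValueError before the setting could reach the stack. How much headroom to leave below the volume size is left open here.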

TBD

documents

MON-2193 Size-based retention

is related to

MON-2336 Docs Tracker
