  1. Red Hat Advanced Cluster Management
  2. ACM-27800

ACM Observability: Provide guidance on how to implement very high retention


    • Feature
    • Resolution: Unresolved
    • Major
    • Observability

      Feature Overview

      The summary below needs to be verified and enhanced; it should not be assumed to work out of the box (OOTB).

      Summary: Multi-Year Metrics Retention via ACM Observability

      Overview:

      Implement long-term (5–10 year) metrics retention by leveraging the native Thanos architecture within ACM Observability. This is a configuration-only task requiring no custom code, focusing on the MultiClusterObservability Custom Resource (CR).

      Technical Implementation
      Retention is managed via three resolution tiers: raw samples, 5-minute downsampled data, and 1-hour downsampled data. To achieve multi-year storage, the 1-hour resolution parameter must be explicitly defined.

      Key Configuration (Hub Cluster): Modify the MultiClusterObservability CR to include the retentionConfig block:

      YAML

      spec:
        advanced:
          retentionConfig:
            retentionResolutionRaw: 30d   # Raw high-fidelity data
            retentionResolution5m: 180d  # Mid-tier resolution
            retentionResolution1h: 5y    # Long-term retention target (e.g., 5y or 10y)
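
      For context, the fragment above would sit inside the full CR roughly as sketched below. This is a minimal sketch, not a verified configuration: the object storage secret name and key (thanos-object-storage, thanos.yaml) are assumed from a typical ACM Observability install and are not stated in this ticket, and whether the operator accepts year-based values such as 5y still needs to be confirmed as part of this feature.

      ```yaml
      apiVersion: observability.open-cluster-management.io/v1beta2
      kind: MultiClusterObservability
      metadata:
        name: observability
      spec:
        observabilityAddonSpec: {}
        storageConfig:
          metricObjectStorage:
            name: thanos-object-storage   # assumed secret name
            key: thanos.yaml              # assumed key inside the secret
        advanced:
          retentionConfig:
            retentionResolutionRaw: 30d   # raw samples
            retentionResolution5m: 180d   # 5-minute downsampled data
            retentionResolution1h: 5y     # 1-hour downsampled data; value format to be verified
      ```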
      

      Infrastructure Requirements
      To support 5+ years of data, the following infrastructure dependencies must be validated:

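      Object Storage (S3/Ceph): Ensure the backend bucket has enough capacity (or quota headroom) for multi-year growth and that no lifecycle or expiration policy deletes objects before the configured retention elapses. This serves as the primary data store for all historical metrics.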
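
      As a reminder of where the bucket is wired in, the object store is typically supplied to Thanos through a Secret on the hub cluster. The sketch below assumes the default observability namespace and secret name; every value in upper case is a placeholder, not something taken from this ticket.

      ```yaml
      apiVersion: v1
      kind: Secret
      metadata:
        name: thanos-object-storage                        # assumed name
        namespace: open-cluster-management-observability   # assumed namespace
      type: Opaque
      stringData:
        thanos.yaml: |
          type: s3
          config:
            bucket: YOUR_S3_BUCKET        # placeholder
            endpoint: YOUR_S3_ENDPOINT    # placeholder
            insecure: false
            access_key: YOUR_ACCESS_KEY   # placeholder
            secret_key: YOUR_SECRET_KEY   # placeholder
      ```

      The bucket named here is the one whose quota and lifecycle rules need auditing before retention is extended.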

      Thanos Compactor: Must be allocated sufficient CPU/Memory to handle the downsampling and block restructuring process for massive datasets.

      Thanos Query/Store Gateway: Resource limits should be increased to prevent timeouts when users query large historical date ranges (5+ years), because the Store Gateway serves historical blocks from object storage and has to load index data for every block in the queried range.
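
      The component sizing described above could be expressed in the same CR roughly as follows. This is a sketch under assumptions: the field layout follows the advanced section of the MultiClusterObservability CR as commonly documented, and all CPU, memory, replica, and volume figures are hypothetical placeholders that would have to come from actual load testing.

      ```yaml
      spec:
        storageConfig:
          compactStorageSize: 500Gi   # hypothetical; compactor scratch space grows with block size
        advanced:
          compact:
            resources:
              requests:
                cpu: "2"              # hypothetical
                memory: 8Gi           # hypothetical
              limits:
                memory: 16Gi          # hypothetical
          store:
            replicas: 3               # hypothetical
            resources:
              limits:
                memory: 8Gi           # hypothetical
          query:
            replicas: 3               # hypothetical
            resources:
              limits:
                memory: 4Gi           # hypothetical
      ```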

      Action Items
      [ ] Update the MultiClusterObservability CR with the 5y/10y retention parameters.

      [ ] Audit S3 bucket storage quotas and lifecycle policies.

      [ ] Monitor Compactor and Query component resource utilization post-change.

       

      Requirements

      This Section: A list of specific needs or objectives that a Feature must
      deliver to satisfy the Feature. Some requirements will be flagged as MVP.
      If an MVP gets shifted, the feature shifts. If a non MVP requirement slips,
      it does not shift the feature.

      Requirement | Notes | isMvp?
      CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES
      Release Technical Enablement | Provide necessary release enablement details and documents. | YES

      (Optional) Use Cases

      This Section:

      • Main success scenarios - high-level user stories
      • Alternate flow/scenarios - high-level user stories
      • ...

      Questions to answer

      • ...

      Out of Scope

      Background, and strategic fit

      This Section: What does the person writing code, testing, documenting
      need to know? What context can be provided to frame this feature?

      Assumptions

      • ...

      Customer Considerations

      • ...

      Documentation Considerations

      Questions to be addressed:

      • What educational or reference material (docs) is required to support this
        product feature? For users/admins? Other functions (security officers, etc)?
      • Does this feature have a doc impact?
      • New Content, Updates to existing content, Release Note, or No Doc Impact
      • If unsure and no Technical Writer is available, please contact Content
        Strategy.
      • What concepts do customers need to understand to be successful in
        [action]?
      • How do we expect customers will use the feature? For what purpose(s)?
      • What reference material might a customer want/need to complete [action]?
      • Is there source material that can be used as reference for the Technical
        Writer in writing the content? If yes, please link if available.
      • What is the doc impact (New Content, Updates to existing content, or
        Release Note)?

              Assignee: Unassigned
              Reporter: Christian Stark (rhn-support-cstark)