  1. Observability and Data Analysis Program
  2. OBSDA-1078

Observability supporting OpenShift Virtualization use cases


    • Type: Outcome
    • Resolution: Unresolved
    • Priority: Undefined
    • Market Problem
    • 25% To Do, 50% In Progress, 25% Done

      This outcome integrates Observability components and features with the specialized requirements of OpenShift Virtualization to improve visibility into virtualized workloads across single-cluster and multicluster environments. Embedding the Perses dashboarding framework into Advanced Cluster Management (ACM) gives users customizable dashboards tailored to virtualization metrics, while features from OpenShift Analytics provide insights into virtualized workloads and improve troubleshooting.

      Key capabilities:

      • Dashboarding framework: Internal teams will use the open specification for dashboards. The framework (Perses) will be available within ACM, delivered with Multi-Cluster Observability.
      • Custom Dashboards: Users can create, modify, and personalize dashboards to monitor critical virtualization metrics, ensuring visibility into both cluster-level and workload-level health.
      • Actionable Links from Dashboards: Interactive links within dashboards allow users to navigate directly to related resources or actions, streamlining issue resolution and reducing mean time to recovery (MTTR).
      • Multicluster Alerting UI (high priority): A unified alerting interface provides a centralized view of alerts across multiple clusters, reducing alert fatigue and improving incident management workflows.
      • Alert Aggregation and Toggling: Alerts from multiple clusters are aggregated and prioritized based on severity and impact. Users can toggle alerts on/off based on cluster, namespace, or resource type to focus on what matters most.
      • Networking: TBD
      • Cluster and Components Health dashboard in ACM and OCP: Create a dashboard in ACM that helps identify clusters with health issues. Allow users to drill down to the component that has the issue, focusing on core operators and nodes. Preferably, connect the dashboard to alerts for easier troubleshooting. Expose the statuses (for example, as recording rules or metrics) so that users can also collect them in external monitoring solutions. Provide a way to add additional components that the user wants to monitor.
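
      To make the "Alert Aggregation and Toggling" capability above more concrete, here is a minimal sketch of one way it could behave. It is an illustration only, not the planned ACM design: the Alert and Mute types, the severity ranking, the alert names, and the aggregate function are all hypothetical, and toggling is shown for cluster and namespace only (resource type would follow the same pattern).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# Hypothetical severity ranking (assumption): a lower rank sorts first.
SEVERITY_RANK = {"critical": 0, "warning": 1, "info": 2}


@dataclass(frozen=True)
class Alert:
    name: str
    cluster: str
    namespace: str
    severity: str


@dataclass(frozen=True)
class Mute:
    """A toggle that hides matching alerts; None means 'match anything'."""
    cluster: Optional[str] = None
    namespace: Optional[str] = None

    def matches(self, alert: Alert) -> bool:
        return (self.cluster in (None, alert.cluster)
                and self.namespace in (None, alert.namespace))


def aggregate(alerts, mutes=()):
    """Group un-muted alerts by (name, severity) across clusters."""
    groups = defaultdict(set)
    for alert in alerts:
        if any(mute.matches(alert) for mute in mutes):
            continue  # toggled off by the user
        groups[(alert.name, alert.severity)].add(alert.cluster)

    rows = [
        {"alert": name, "severity": severity, "clusters": sorted(clusters)}
        for (name, severity), clusters in groups.items()
    ]
    # Most severe first, then widest cluster impact, then name for stable output.
    rows.sort(key=lambda row: (
        SEVERITY_RANK.get(row["severity"], len(SEVERITY_RANK)),
        -len(row["clusters"]),
        row["alert"],
    ))
    return rows


# Illustrative input: alert and cluster names are made up for this sketch.
sample_alerts = [
    Alert("VMMigrationStuck", "cluster-a", "vms-prod", "warning"),
    Alert("VMMigrationStuck", "cluster-b", "vms-prod", "warning"),
    Alert("VMMigrationStuck", "cluster-c", "vms-dev", "warning"),
    Alert("VirtControllerDown", "cluster-a", "openshift-cnv", "critical"),
]

# Toggle off everything in the vms-dev namespace, on any cluster.
for row in aggregate(sample_alerts, mutes=[Mute(namespace="vms-dev")]):
    print(row)
```

      In this sketch the same alert firing on several clusters collapses into one row that lists the affected clusters, which is one possible reading of "aggregated and prioritized based on severity and impact". A real implementation would likely build on the existing alerting stack rather than reimplement grouping.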

      These Observability capabilities will empower operators and SRE teams to efficiently monitor, diagnose, and manage OpenShift Virtualization environments across multicluster deployments, ensuring high availability and performance of virtualized workloads in hybrid cloud scenarios.

       

      Dependencies:

      TBD

       

      Roadmap:

      RHACM Roadmap

              rhn-engineering-rvokal Radek Vokal