Observability UI / OU-374

Create an Observability vision for troubleshooting a cluster


    • Cluster troubleshooting journey
    • NEW
    • To Do
    • QE Needed, Docs Needed, TE Needed, Customer Facing, PX Needed
    • NEW
    • 20% To Do, 60% In Progress, 20% Done

      Description

      As an OpenShift Observability user, I want to make use of Observability data and correlated signals via the OpenShift UI to ease my day-to-day debug journey.

      As part of this journey, we will explore introducing an Observability Overview page. This could serve as a central location for Observability, where SREs could see cluster health status and critical alerts. This page could also serve as the starting point for a cluster troubleshooting journey.

      We also want to leverage Korrel8r (signal correlation) technology for this effort.

      Note from Alan Conway (Architect, Correlation of Signals Initiative) 

      The traditional observable signals in a cluster include:

      • Logs (text records emitted by containers, structured or unstructured records)
      • Metrics (numeric values collected periodically)
      • Alerts (structured records indicating an important transition in metric values) 
      • Traces (coming soon, tree-structured records of function calls or network requests, traceable across multiple containers)

      We also consider these to be signals:

      • K8s Events (effectively structured logs stored as API objects instead of log file records)
      • Network Events (coming soon, records of network-level events)
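
      To make the signal types above concrete, the following is a minimal sketch of how they might be modeled and naively related to one another. Everything here is an assumption for illustration: the names SignalKind, Signal, and related_signals are hypothetical, and the matching rule (same namespace and pod, within a time window) is a simple stand-in, not how Korrel8r correlates signals.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class SignalKind(Enum):
    """Signal types named in the description (hypothetical enum)."""
    LOG = "log"
    METRIC = "metric"
    ALERT = "alert"
    TRACE = "trace"
    K8S_EVENT = "k8s_event"
    NETWORK_EVENT = "network_event"


@dataclass(frozen=True)
class Signal:
    """One observed signal, reduced to the fields this sketch needs."""
    kind: SignalKind
    timestamp: datetime
    namespace: str
    pod: str
    payload: str


def related_signals(anchor, candidates, window=timedelta(minutes=5)):
    """Return candidates from the anchor's pod within `window` of it.

    Assumption: proximity in time plus a shared namespace/pod is a
    plausible first-pass notion of "related". Real correlation would
    use richer rules.
    """
    return [
        s for s in candidates
        if s is not anchor
        and s.namespace == anchor.namespace
        and s.pod == anchor.pod
        and abs(s.timestamp - anchor.timestamp) <= window
    ]
```

      Starting from a critical alert, a call such as related_signals(alert, all_signals) would return the logs, events, and metrics from the same pod around the time the alert fired, which is the kind of jump between signals a troubleshooting journey needs.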

      Other aspects to this effort could be:

      • Alert grouping
      • Anomaly detection
      • Incidents

      Goals & Outcomes

      Product Requirements:

      Map out a cluster troubleshooting journey for SREs and Platform Engineers, from start to finish.

      Engineering/Data Analytics Requirements:

      [List here]

      Success KPIs

      [If applicable]

      Documentation

      Google doc

      High-level flows 

      Design mockup flow

      Open Questions

      Admin vs Developer Perspective?

            fkargbo Foday Kargbo