OpenShift Virtualization / CNV-35467

Actionable Telemetry for internal clusters - 4.16


    •  cnv-actionable-telemetry-4.16

      Create bugs for suspicious alerts on internal clusters, based on observations of RH internal environments (cnv.engineering2).

      Unclutter alerting on internal clusters so that they show only relevant alerts that require action.

      Add information about non-reporting accounts to the weekly report.

    • Green
    • To Do
    • CNV-25453 - SD: Fleet Alerting Dashboards
    • 0% To Do, 50% In Progress, 50% Done
    • dev-ready, doc-ready, po-ready, qe-ready, ux-ready

      2024-03-14: in progress...


      Goal

      Use existing telemetry data to trigger practical actions. The aim is to process the data we already have rather than to add new metrics or alerts. Examples are included in the user stories below.

      Based on CNV-31126, we created a doc (https://docs.google.com/spreadsheets/d/1xYCyn-Y35ZA3ABZNYowt9ZonCIoUtNps0P7GhrLMTFY/edit?usp=sharing) that lists the logs we need to check. This epic includes the spikes to go over each of the issues found in those logs.

      User Stories

      • As an OpenShift Virtualization team member, I'd like to see trends in the telemetry to get a feel for what's going on in the field. For example, alerts could be grouped by z-stream and y-stream so that, with each released version, we can see trends in the telemetry data, compare them with previous versions, and take action.
      • As an OpenShift Virtualization team member, I'd like to be notified about any suspicious trends in alerts (through a dedicated section in the weekly telemeter report).
      • As an OpenShift Virtualization engineer/manager, I'd like a bug to be opened for suspicious-looking alerts so that my team can investigate them.
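
      The grouping mentioned in the first user story could look roughly like the sketch below. It is only an illustration: the record layout (an "alertname" and a "version" field), the sample alert names, and the helper names are all assumptions, not the actual telemetry schema or tooling.

```python
from collections import Counter


def split_version(version):
    """Return (y_stream, z_stream) for a version string such as '4.15.3'.

    Assumes a 'major.minor.patch' layout; this is an assumption about
    how versions appear in the data, not a confirmed format.
    """
    major, minor, patch = version.split(".")[:3]
    y_stream = f"{major}.{minor}"
    z_stream = f"{major}.{minor}.{patch}"
    return y_stream, z_stream


def count_alerts_by_stream(alerts):
    """Count firing alerts per y-stream and per z-stream."""
    by_y = Counter()
    by_z = Counter()
    for alert in alerts:
        y_stream, z_stream = split_version(alert["version"])
        by_y[y_stream] += 1
        by_z[z_stream] += 1
    return by_y, by_z


# Hypothetical sample records; the alert names are placeholders.
SAMPLE_ALERTS = [
    {"alertname": "ExampleAlertA", "version": "4.15.1"},
    {"alertname": "ExampleAlertA", "version": "4.15.1"},
    {"alertname": "ExampleAlertB", "version": "4.15.2"},
    {"alertname": "ExampleAlertA", "version": "4.16.0"},
]

by_y, by_z = count_alerts_by_stream(SAMPLE_ALERTS)
for stream, count in sorted(by_y.items()):
    print(f"y-stream {stream}: {count} alerts")
for stream, count in sorted(by_z.items()):
    print(f"z-stream {stream}: {count} alerts")
```

      Comparing these counts between consecutive versions is one possible way to spot a trend worth mentioning in the weekly report.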

      Non-Requirements

      • List of things not included in this epic, to alleviate any doubt raised during the grooming process.

      Notes

      • Any additional details or decisions made/needed

      Done Checklist

      Who  What                                                 Reference
      DEV  Upstream roadmap issue (or individual upstream PRs)  <link to GitHub Issue>
      DEV  Upstream documentation merged                        <link to meaningful PR>
      DEV  gap doc updated                                      <name sheet and cell>
      DEV  Upgrade consideration                                <link to upgrade-related test or design doc>
      DEV  CEE/PX summary presentation                          label epic with cee-training and add a <link to your support-facing preso>
      QE   Test plans in Polarion                               <link or reference to Polarion>
      QE   Automated tests merged                               <link or reference to automated tests>
      DOC  Downstream documentation merged                      <link to meaningful PR>

            sradco Shirly Radco
            kmajcher@redhat.com Krzysztof Majcher
            Debarati Basu-Nag