Red Hat Advanced Cluster Management / ACM-2152

Service monitoring and alerting for Server Foundation components

      Epic Goal

      • Define SLIs for the components that will be used by Service Delivery.
      • Instrument the code for the agreed-upon SLIs and expose the resulting metrics.
      • Define alerting rules for the SLIs.
      • Determine a starting SLO based on an aggregation of our SLIs.
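
      As a rough illustration of the last two goals, the sketch below computes a per-component SLI as a ratio of good events to total events, aggregates those ratios into a single figure, and derives a candidate starting SLO from it. Everything here is an assumption made for illustration: the SLI class, the function names, the event-ratio definition of an SLI, and the safety margin are hypothetical and are not taken from this Epic or from any Server Foundation code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SLI:
    """Hypothetical event-based SLI for one component."""
    component: str
    good_events: int
    total_events: int

    @property
    def ratio(self) -> float:
        # Assumption: a window with no events counts as fully healthy.
        if self.total_events == 0:
            return 1.0
        return self.good_events / self.total_events


def aggregate_sli(slis: List[SLI]) -> float:
    """Event-weighted aggregate across all components."""
    good = sum(s.good_events for s in slis)
    total = sum(s.total_events for s in slis)
    return 1.0 if total == 0 else good / total


def starting_slo(slis: List[SLI], margin: float = 0.001) -> float:
    """Candidate starting SLO: observed aggregate minus a safety margin.

    The margin value is an illustrative assumption, not an agreed target.
    """
    return max(0.0, aggregate_sli(slis) - margin)


def breaching(slis: List[SLI], slo: float) -> List[str]:
    """Names of components whose SLI is below the SLO (alert candidates)."""
    return [s.component for s in slis if s.ratio < slo]
```

      With made-up counts for two components, aggregate_sli gives the combined success ratio, starting_slo proposes a target slightly below it, and breaching lists the components that an alerting rule would flag against that target.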

      Why is this important?

      • Meet the SLA requirements that will be established as part of Service Delivery (SD).
      • Service monitoring and alerting are essential for quick root cause analysis (RCA) and resolution of service disruptions across environments.

      Scenarios

      Every code path executed in any of the Server Foundation components must be reviewed and worked on as part of this Epic.

      The component list is taken from: https://docs.google.com/spreadsheets/d/1d7nfEl7OhvDe69HDK132NX9NzWRLjOmHrm2nDlJOaXw/edit#gid=1946150399

      • Components:
        • cluster-manager
        • registration-controller
        • registration-webhook
        • work-webhook
        • placement-controller
        • managedcluster-import-controller
        • ocm-controller
        • ocm-proxyserver?
        • ocm-webhook?
        • klusterlet-addon-controller

      Acceptance Criteria

      • CI - MUST be running successfully with tests automated
      • Release Technical Enablement - Provide necessary release enablement details and documents.
      • ...

      Dependencies (internal and external)

      1. ...

      Previous Work (Optional):

      1. Server Foundation F2F 2022 discussion

      Open questions:

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - Downstream build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - Downstream documentation merged: <link to meaningful PR>

              Le Yang (leyan@redhat.com)
              Sho Weimer (showeimer)
              Yuanyuan He
              Le Yang
              Song Lai
              Qiu Jian (Inactive)
              Sho Weimer