Hybrid Cloud Console / RHCLOUD-28943

Define SLO for notifications-gw


      The importance of the notifications-gw component is increasing with OCM onboarding. The current set of SLOs does not seem to cover notifications-gw: https://gitlab.cee.redhat.com/service/app-interface/-/blob/master/data/services/insights/notifications/slo-documents/notifications.yml?ref_type=heads

      We should ensure SLOs are defined for this component.

      • Metrics to include in SLOs

      As Jozef mentioned in the following Slack thread:

      Hello, firstly, there seems to be a mismatch between https://gitlab.cee.redhat.com/core-platform-apps/notifications-docs/-/blob/master/modules/SLO-document/pages/SLO-definitions.adoc and https://gitlab.cee.redhat.com/service/app-interface/-/blob/master/data/services/insights/notifications/slo-documents/notifications.yml?ref_type=heads

      The former defines availability using the up() metric (i.e. are any pods running?), while the latter defines availability as the proportion of non-500 responses in the public API (excluding notifications-gw) and sets a goal of <1% (presumably for the share of 500 responses). Both measure latency of the public APIs only (i.e. excluding notifications-gw).
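      To make the second definition concrete, here is a minimal Python sketch of an availability SLI computed as the proportion of non-5xx responses. The `RequestCounts` type, the function names, and the 1% error target are illustrative assumptions based on the description above, not the contents of notifications.yml.

```python
from dataclasses import dataclass


@dataclass
class RequestCounts:
    """Request totals over an SLO window (hypothetical container)."""
    total: int
    server_errors: int  # responses with a 5xx status code


def error_ratio(counts: RequestCounts) -> float:
    """Share of requests that returned a 5xx status."""
    if counts.total == 0:
        # No traffic in the window: treat as no errors (an assumption).
        return 0.0
    return counts.server_errors / counts.total


def availability(counts: RequestCounts) -> float:
    """Availability as the proportion of non-5xx responses."""
    return 1.0 - error_ratio(counts)


def meets_slo(counts: RequestCounts, max_error_ratio: float = 0.01) -> bool:
    """True if the 5xx share stays below the target (assumed to be <1%)."""
    return error_ratio(counts) < max_error_ratio
```

      The same calculation could be applied to notifications-gw request counts once the component is included in the SLO document.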

      As for the alerts, the only alert that touches notifications-gw seems to be the one you linked (i.e. an alert will fire if all the pods disappear). There seems to be no alert for when notifications-gw keeps returning non-2xx responses or takes forever to respond.
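      The two missing alert conditions could be expressed roughly as follows. This is only a sketch of the logic in Python: the function names and all thresholds are hypothetical, and a real alert would more likely be defined as a monitoring rule rather than application code.

```python
from typing import List, Sequence


def non_2xx_ratio(status_codes: Sequence[int]) -> float:
    """Share of responses whose status code is outside the 2xx range."""
    if not status_codes:
        return 0.0
    bad = sum(1 for code in status_codes if not 200 <= code < 300)
    return bad / len(status_codes)


def slow_ratio(latencies_seconds: Sequence[float], threshold_seconds: float) -> float:
    """Share of requests that took longer than the latency threshold."""
    if not latencies_seconds:
        return 0.0
    slow = sum(1 for latency in latencies_seconds if latency > threshold_seconds)
    return slow / len(latencies_seconds)


def alert_reasons(
    status_codes: Sequence[int],
    latencies_seconds: Sequence[float],
    max_non_2xx_ratio: float = 0.05,       # hypothetical threshold
    latency_threshold_seconds: float = 2.0,  # hypothetical threshold
    max_slow_ratio: float = 0.10,          # hypothetical threshold
) -> List[str]:
    """Return the conditions that would fire for the given window."""
    reasons = []
    if non_2xx_ratio(status_codes) > max_non_2xx_ratio:
        reasons.append("non_2xx")
    if slow_ratio(latencies_seconds, latency_threshold_seconds) > max_slow_ratio:
        reasons.append("slow")
    return reasons
```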

      mbarcina@redhat.com to add details from previous Slack

       

      -kriedese 

              Unassigned Unassigned
              rhn-engineering-jharting Jozef Hartinger