OpenStack as Infra / OSASINFRA-2405

Kuryr: Monitor status of critical OpenStack resources

    • Kuryr: Monitor status of OpenStack resources
    • Done
    • 0% To Do, 0% In Progress, 100% Done

      Goal

      As an operator running OCP with Kuryr, I'd like to have metrics and alerts for important Kuryr resources, e.g. when the K8s API or DNS LB goes into ERROR status or some of its members cannot be created.

      Problem

      Currently there is no way to monitor the OpenStack resources created by Kuryr other than querying the OpenStack APIs ourselves. We need to identify viable metrics and implement a utility that calculates and exposes them, along with alerts that fire when something goes wrong.
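
      A minimal sketch of what such a utility could compute, assuming the load balancer state has already been fetched from the OpenStack APIs. The `LoadBalancerState` class, the `render_metrics` function and the metric names `kuryr_lb_error` and `kuryr_lb_members_missing` are hypothetical, chosen only for illustration; the output follows the Prometheus text exposition format.

```python
from dataclasses import dataclass


@dataclass
class LoadBalancerState:
    """Hypothetical snapshot of one load balancer created by Kuryr."""
    name: str
    provisioning_status: str  # status string reported by OpenStack, e.g. "ACTIVE" or "ERROR"
    expected_members: int     # members Kuryr intends the LB to have
    actual_members: int       # members that actually exist in OpenStack


def render_metrics(lbs):
    """Render the snapshots as Prometheus text-format gauges.

    Metric names are illustrative assumptions, not existing Kuryr metrics.
    LB names are assumed to contain no quotes or backslashes.
    """
    lines = ["# TYPE kuryr_lb_error gauge"]
    for lb in lbs:
        # 1 when the LB is in ERROR provisioning status, 0 otherwise.
        in_error = 1 if lb.provisioning_status == "ERROR" else 0
        lines.append(f'kuryr_lb_error{{name="{lb.name}"}} {in_error}')

    lines.append("# TYPE kuryr_lb_members_missing gauge")
    for lb in lbs:
        # Number of members that should exist but do not; never negative.
        missing = max(lb.expected_members - lb.actual_members, 0)
        lines.append(f'kuryr_lb_members_missing{{name="{lb.name}"}} {missing}')

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    snapshot = [
        LoadBalancerState("openshift-api", "ACTIVE", 3, 1),
        LoadBalancerState("openshift-dns", "ERROR", 2, 2),
    ]
    print(render_metrics(snapshot), end="")
```

      An alert could then fire whenever either gauge is above zero for a given load balancer; how the snapshots are collected and how the text is served are left out of this sketch.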

      Why is this important

      We had an incident in which the OCP cluster was considered healthy even though the OpenShift API LB was missing 2 members. Had this been signaled to the user, they could have avoided downtime and an escalation.

       

      Estimate (XS, S, M, L, XL, XXL): M

          1. QE Tracker Sub-task Closed Undefined Unassigned
          2. Docs Tracker Sub-task Closed Undefined Unassigned
          3. TE Tracker Sub-task Closed Undefined Unassigned

              mdemaced Maysa De Macedo Souza
              mdulko Michał Dulko (Inactive)
