  1. OpenShift SDN
  2. SDN-3392

OVN-K alerts: add ovn db connection alerts

Details

    • Story
    • Resolution: Done
    • Major
    • None
    • None
    • OVN Kubernetes
    • 1
    • False
    • None
    • False
    • Need add manual test
    • SDN Sprint 223, SDN Sprint 224
    • 0
    • 0.0

    Description

      The OVN northbound and southbound databases each run as a RAFT cluster. For background on RAFT, see: https://web.stanford.edu/~ouster/cgi-bin/cs190-winter21/lecture.php?topic=raft

      This story focuses on creating alerts based on the metrics ovn_db_cluster_inbound_connections_error_total / ovn_db_cluster_inbound_connections_total.

       

      ovn_db_cluster_inbound_connections_total gauge

      • Description: The total number of inbound connections to the server, labeled by database name, cluster UUID, and server UUID. Inbound connections come only from other RAFT cluster members.
      • Normal/Expected values: The number of RAFT cluster members minus one. For example, a 3-member RAFT cluster should report 2 inbound connections.
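
The expected value described above can be sketched as a small calculation. This is only an illustration: the function names `expected_inbound_connections` and `missing_inbound_connections` are hypothetical and do not come from the ticket or the referenced PR.

```python
def expected_inbound_connections(cluster_size: int) -> int:
    """Return the inbound connection count a healthy member should report."""
    if cluster_size < 1:
        raise ValueError("a RAFT cluster needs at least one member")
    # Each member accepts one inbound connection from every other member.
    return cluster_size - 1


def missing_inbound_connections(cluster_size: int, observed: int) -> int:
    """Return how many peer connections are missing (0 when none are)."""
    return max(expected_inbound_connections(cluster_size) - observed, 0)


# A 3-member cluster where one peer is not connected.
print(expected_inbound_connections(3))
print(missing_inbound_connections(3, 1))
```

A nonzero result from `missing_inbound_connections` is the kind of condition an alert on this metric might look for.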

      ovn_db_cluster_inbound_connections_error_total gauge

      • Description: The total number of failed inbound connections to the server, labeled by database name, cluster UUID, and server UUID. Inbound connection errors are generated only by other RAFT cluster members, not by the CMS, northd, or ovn-controller.
      • Normal/Expected values: 0
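
To illustrate how the two metrics relate to their normal values, here is a minimal sketch that checks one sample against them. It does not reproduce the alerts from the referenced commit. The `DbSample` type, the `connection_problems` function, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DbSample:
    """One reading of both metrics for a single database server (hypothetical shape)."""
    db_name: str
    cluster_size: int               # number of RAFT cluster members
    inbound_connections: int        # ovn_db_cluster_inbound_connections_total
    inbound_connection_errors: int  # ovn_db_cluster_inbound_connections_error_total


def connection_problems(sample: DbSample) -> List[str]:
    """Return a description of each deviation from the normal values."""
    problems = []
    expected = sample.cluster_size - 1  # normal value: members minus one
    if sample.inbound_connections < expected:
        problems.append(
            f"{sample.db_name}: {sample.inbound_connections} inbound "
            f"connections, expected {expected}"
        )
    if sample.inbound_connection_errors > 0:  # normal value: 0
        problems.append(
            f"{sample.db_name}: {sample.inbound_connection_errors} "
            f"failed inbound connections, expected 0"
        )
    return problems


# Sample values are made up for the example.
healthy = DbSample("OVN_Northbound", 3, 2, 0)
degraded = DbSample("OVN_Southbound", 3, 1, 4)
print(connection_problems(healthy))
for line in connection_problems(degraded):
    print(line)
```

A real alert would normally also require the condition to persist for some time before firing. That duration is a design choice this sketch leaves out.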

      PR that includes the commit for this change: https://github.com/openshift/cluster-network-operator/pull/1526

      See commit titled: OVN-K alerts: add ovn db connection alerts

      See the commit code for the 8 alerts.

            People

              mkennell@redhat.com Martin Kennelly