  1. OpenShift SDN
  2. SDN-3391

OVN-K alerts: add ovn db cluster member error

Details

    • Story
    • Resolution: Done
    • Major
    • OVN Kubernetes
    • Need add manual test
    • SDN Sprint 223, SDN Sprint 224

    Description

      The OVN northbound and southbound databases each run as a Raft cluster. For background on Raft, see: https://web.stanford.edu/~ouster/cgi-bin/cs190-winter21/lecture.php?topic=raft

      This story adds alerts based on the metric ovn_db_cluster_server_status.

      ovn_db_cluster_server_status gauge

      • Description
      • A metric with a constant value of '1', labeled by database name, cluster UUID, server UUID, and server status. The ‘server_status’ label represents the Raft status of the database server and can be: ‘joining cluster’, ‘leaving cluster’, ‘left cluster’, ‘failed’, ‘disconnected from the cluster (election timeout)’ or ‘cluster member’.
      • Normal/Expected values
      • For each database (nb or sb), for the vast majority of the time, there is an entry whose ‘server_status’ label is ‘cluster member’.
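
      The condition the alerts need to detect can be sketched as follows. This is a minimal illustration in Python, not the alert rules from the PR: it assumes the metric samples have already been scraped into a list of label dictionaries. Only the ‘server_status’ label name and its values come from this story; the label keys `db_name` and `server_id`, the database names passed as `expected_dbs`, and both function names are hypothetical.

```python
from typing import Dict, List, Sequence

# The only status this story treats as normal.
HEALTHY_STATUS = "cluster member"


def servers_in_error(samples: Sequence[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the label sets whose server_status is not 'cluster member'."""
    return [s for s in samples if s.get("server_status") != HEALTHY_STATUS]


def databases_without_member(
    samples: Sequence[Dict[str, str]],
    expected_dbs: Sequence[str] = ("nb", "sb"),  # assumed database label values
) -> List[str]:
    """Return the expected databases that have no 'cluster member' entry."""
    healthy = {
        s.get("db_name")  # hypothetical label key for the database name
        for s in samples
        if s.get("server_status") == HEALTHY_STATUS
    }
    return [db for db in expected_dbs if db not in healthy]


if __name__ == "__main__":
    # Made-up samples for illustration only.
    example = [
        {"db_name": "nb", "server_id": "a1", "server_status": "cluster member"},
        {"db_name": "sb", "server_id": "b1", "server_status": "failed"},
    ]
    print(servers_in_error(example))
    print(databases_without_member(example))
```

      In a real alert, a short-lived ‘joining cluster’ state during startup would presumably be tolerated by requiring the condition to hold for some duration before firing; that duration is not specified in this story.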

       

      PR that includes the commit for this change: https://github.com/openshift/cluster-network-operator/pull/1526

      See commit titled: OVN-K alerts: add ovn db cluster member error

      The commit adds two alerts, both of which fire when there is a cluster member error.

            People

              mkennell@redhat.com Martin Kennelly