  1. OpenShift SDN
  2. SDN-3597

[alert] Netlink overflow

Details

    • Story
    • Resolution: Done
    • Major
    • None
    • None
    • OVN Kubernetes
    • None
    • 3
    • False
    • None
    • False
    • OCPSTRAT-558 - OVN Health Monitoring with Prometheus
    • ---
    • SDN Sprint 232
    • 0
    • 0.0

    Description

      Create a warning-severity alert to notify the admin that packet loss is occurring due to failed ovs-vswitchd lookups. This may happen when vswitchd is CPU constrained while also handling numerous lookups.

      Use the metric ovs_vswitchd_netlink_overflow, which shows netlink messages dropped by the vswitchd daemon due to buffer overflow in userspace.

      For the kernel equivalent, use the metric ovs_vswitchd_dp_flows_lookup_lost. Both metrics usually have the same value but may differ if vswitchd restarts.

      Both of these metrics should be aggregated into a single alert that fires if the value has increased recently.
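
      The "increased recently" condition can be sketched as below. This is only an illustration of the intended logic, not the rule merged to CNO: the helper names counter_increase and should_alert are hypothetical, the samples are assumed to be the counter values scraped inside whatever recent window is chosen, and a drop in value is assumed to mean the counter reset because vswitchd restarted. Prometheus's own increase() additionally extrapolates to the window edges, which this sketch skips.

```python
from typing import Sequence


def counter_increase(samples: Sequence[float]) -> float:
    """Sum the growth of a counter across consecutive samples.

    A sample lower than its predecessor is treated as a counter reset
    (for example after a vswitchd restart), so growth restarts from zero.
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        total += (cur - prev) if cur >= prev else cur
    return total


def should_alert(
    netlink_overflow: Sequence[float],
    dp_flows_lookup_lost: Sequence[float],
) -> bool:
    """Single alert decision covering both metrics.

    The two counters usually move together, so the larger of the two
    increases is used; any growth in either one triggers the alert.
    """
    increase = max(
        counter_increase(netlink_overflow),
        counter_increase(dp_flows_lookup_lost),
    )
    return increase > 0


if __name__ == "__main__":
    # Hypothetical samples: the userspace counter is flat, the kernel one grew.
    print(should_alert([5, 5, 5], [3, 3, 4]))
```

      Taking the maximum rather than the sum avoids double counting the same drops, since both metrics normally report the same events from the userspace and kernel side.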

       

      DoD: QE test case, code merged to CNO, metrics document updated ( https://docs.google.com/document/d/1lItYV0tTt5-ivX77izb1KuzN9S8-7YgO9ndlhATaVUg/edit )

          People

            mkennell@redhat.com Martin Kennelly