-
Story
-
Resolution: Done
-
Major
-
None
-
None
-
None
-
BU Product Work
-
3
-
False
-
None
-
False
-
OCPSTRAT-558 - OVN Health Monitoring with Prometheus
-
---
-
-
-
SDN Sprint 232
-
0
-
0.000
Create a severity warning alert to alert to admin that there is packet loss occurring due to failed ovs vswitchd lookups. This may occur if vswitchd is cpu constrained and there are also numerous lookups.
Use metric ovs_vswitchd_netlink_overflow which shows netlink messages dropped by the vswitchd daemon due to buffer overflow in userspace.
For the kernel equivalent, use metric ovs_vswitchd_dp_flows_lookup_lost . Both metrics usually have the same value but may differ if vswitchd may restart.
Both these metrics should be aggregate into a single alert if the value has increased recently.
DoD: QE test case, code merged to CNO, metrics document updated ( https://docs.google.com/document/d/1lItYV0tTt5-ivX77izb1KuzN9S8-7YgO9ndlhATaVUg/edit )