-
Feature Request
-
Resolution: Unresolved
-
Normal
-
None
-
openshift-4.14, openshift-4.16, 4.16, openshift-4.16.z
-
Improvement
1. Proposed title of this feature request
Document metrics for measuring the performance of ovn-kubernetes and OVN
2. What is the nature and description of the request?
Document how to use OVS/OVN/ovn-kubernetes metrics to spot performance issues.
3. Why does the customer need this? (List the business requirements here)
We run large clusters with ovn-kubernetes on Azure. While scaling up, we hit various performance issues (e.g. most recently https://issues.redhat.com/browse/FDP-399 or https://issues.redhat.com/browse/FDP-509). For these particular issues, we found that we could monitor the metric ovs_vswitchd_interface_up_wait_seconds_total to find which node was affected, but this is an ad-hoc metric.
We would like your advice on which OVS/OVN metrics we should monitor to detect performance issues on our clusters. Our goal is to set up alerting so that we can react faster when cluster performance degrades.
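To illustrate the kind of check we currently do by hand, here is a minimal sketch in Python against the standard Prometheus HTTP query API. It assumes the metric is a counter (as its `_total` suffix suggests), that the `instance` label identifies the node, and that a threshold of 0.5 is meaningful; the label name, the threshold, and the helper names are all our own placeholders, not documented values.

```python
import json
import urllib.parse
import urllib.request

METRIC = "ovs_vswitchd_interface_up_wait_seconds_total"


def build_query(window="5m"):
    # Per-second increase of the counter over the window, one series per label set.
    return f"rate({METRIC}[{window}])"


def query_prometheus(base_url, query, token=None):
    # Instant query via the Prometheus HTTP API (GET /api/v1/query).
    # base_url and token are deployment-specific and must be supplied by the caller.
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": query})
    req = urllib.request.Request(url)
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def slow_nodes(response, threshold, node_label="instance"):
    # Return {node: value} for every series whose value exceeds the threshold.
    # node_label is an assumption; use whichever label names the node in your setup.
    flagged = {}
    for series in response["data"]["result"]:
        node = series["metric"].get(node_label, "<unknown>")
        value = float(series["value"][1])  # value is [timestamp, "number as string"]
        if value > threshold:
            flagged[node] = max(value, flagged.get(node, 0.0))
    return flagged
```

Something like `slow_nodes(query_prometheus(base_url, build_query()), threshold=0.5)` would then list the candidate nodes. What we are missing is guidance on which metrics and thresholds are actually worth checking this way.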
4. List any affected packages or components.
OCP / ovn-kubernetes documentation