Epic
Resolution: Unresolved
OVS/OVN (p)sample testing for OCP
To Do
rhel-sst-network-fastdatapath
100% To Do, 0% In Progress, 0% Done
ssg_networking
Goal: quantify the impact on both the control plane and the data plane when OVS (and, optionally, OVN) sampling is configured. The impact must be measured in scenarios that are relevant to, and similar to, OpenShift deployments with NetObserv enabled. Tests must be performed at each layer (kernel, OVS, OVN) so that potential causes of performance degradation can be pinpointed easily.
Components and versions needed for testing:
- RHEL 9.4 (for OCP 4.17) and RHEL 9.6 (for OCP 4.18)
- openvswitch3.4
- ovn24.09
Kernel:
- Metrics to collect and compare (according to configuration variables)
  - packet forwarding latency
  - traffic throughput impact
  - CPU and memory usage
- Configuration variables
  - X - number of OVS datapath flows concurrently forwarding traffic
  - Y - number of psample actions per datapath flow
  - Z - probability that each psample action samples a packet
  - psample listener configuration:
    - no listener registered and no eBPF program attached to the psample hook
    - no listener registered and an eBPF program attached to the psample hook (accessing the packet headers and psample metadata)
    - netlink listener attached to the psample netlink multicast group
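The kernel-layer variables above define a test matrix. A minimal sketch of enumerating it follows. The sweep values and the names `KernelCase` and `kernel_matrix` are hypothetical and not part of this epic. It also assumes that a run with Y = 0 (no psample action) is the baseline, so Z and the listener mode are not varied for it.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical sweep values; the epic does not prescribe any of them.
FLOW_COUNTS = [1, 100, 10_000]            # X: concurrent datapath flows
ACTIONS_PER_FLOW = [0, 1, 2]              # Y: psample actions per flow (0 = baseline)
PROBABILITIES = [0.01, 0.5, 1.0]          # Z: sampling probability per action
LISTENERS = ["none", "ebpf", "netlink"]   # psample listener configuration


@dataclass(frozen=True)
class KernelCase:
    flows: int
    actions: int
    probability: float
    listener: str


def kernel_matrix():
    """Return every kernel-layer test case, with one baseline per flow count."""
    cases = []
    for x, y, z, listener in product(
        FLOW_COUNTS, ACTIONS_PER_FLOW, PROBABILITIES, LISTENERS
    ):
        # Assumption: without psample actions, Z and the listener do not
        # matter, so keep a single baseline combination per flow count.
        if y == 0 and (z != PROBABILITIES[0] or listener != LISTENERS[0]):
            continue
        cases.append(KernelCase(x, y, z, listener))
    return cases


if __name__ == "__main__":
    for case in kernel_matrix():
        print(case)
```

Keeping a single baseline per flow count avoids repeating runs that differ only in parameters the datapath never evaluates.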
OVS:
- Metrics to collect and compare (according to configuration variables)
  - CPU and memory usage of ovs-vswitchd
  - latency added to the installation of datapath flow rules that sample packets (if possible)
- Configuration variables
  - X - number of OpenFlow rules
  - Y - number of sample actions per OpenFlow rule
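For the OVS layer, the X and Y variables can be turned into a flow file. A minimal sketch follows, assuming the flows use the OVS `sample` action and that collector sets with IDs 1..Y are already configured in the OVS database. The match fields, priority, output port and the helper names `sample_action` and `build_flows` are hypothetical choices for illustration.

```python
def sample_action(collector_set_id, probability=65535):
    """Format one OVS `sample` action (probability is out of 65535)."""
    return (
        f"sample(probability={probability},"
        f"collector_set_id={collector_set_id},"
        "obs_domain_id=0,obs_point_id=0)"
    )


def build_flows(num_rules, samples_per_rule, out_port=2):
    """Build X OpenFlow rules, each carrying Y sample actions.

    Each rule matches a distinct destination IP (hypothetical addressing
    in 10.0.0.0/8) so that the rules do not overlap.
    """
    flows = []
    for i in range(num_rules):
        nw_dst = f"10.{(i >> 16) & 255}.{(i >> 8) & 255}.{i & 255}"
        actions = [sample_action(c + 1) for c in range(samples_per_rule)]
        actions.append(f"output:{out_port}")
        flows.append(
            f"table=0,priority=100,ip,nw_dst={nw_dst},"
            f"actions={','.join(actions)}"
        )
    return flows


if __name__ == "__main__":
    # Print a small flow file: 3 rules with 2 sample actions each.
    print("\n".join(build_flows(num_rules=3, samples_per_rule=2)))
```

The output is intended to be saved to a file and loaded in bulk (for example with `ovs-ofctl add-flows`), so that ovs-vswitchd CPU and memory usage can be compared against the same rule set generated with `samples_per_rule=0`.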
OVN:
- Metrics to collect and compare (according to configuration variables)
  - CPU and memory usage of OVN components (NB/SB/ovn-northd/ovn-controller)
  - control plane metrics:
    - number of logical flows (size of SB database)
    - time to recompute NB -> SB translation (in ovn-northd)
    - time to recompute SB -> OF translation (in ovn-controller)
  - data plane metrics (measured in an OVN topology simulating an OpenShift deployment):
    - packet forwarding latency
    - traffic throughput impact
- Configuration variables
  - X - number of ACLs with sampling enabled
  - Y - number of collectors configured for each ACL sample
  - Z - number of nodes (hypervisors)
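At every layer, each metric is compared between a run with sampling configured and a run without it. A minimal sketch of that comparison follows, assuming each configuration is measured several times and the mean is compared. The function names and the example numbers are hypothetical.

```python
from statistics import mean


def relative_change(baseline, measured):
    """Percent change of `measured` relative to `baseline`.

    Negative values mean a decrease (for example, lost throughput);
    positive values mean an increase (for example, added latency).
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (measured - baseline) * 100.0 / baseline


def summarize(baseline_runs, sampled_runs):
    """Compare the mean of repeated runs with and without sampling."""
    base = mean(baseline_runs)
    sampled = mean(sampled_runs)
    return {
        "baseline": base,
        "sampled": sampled,
        "change_pct": relative_change(base, sampled),
    }


if __name__ == "__main__":
    # Hypothetical throughput measurements in Gbit/s, not real results.
    print(summarize([10.0, 10.0, 10.0], [9.0, 9.0, 9.0]))
```

Reporting the change as a percentage of the baseline keeps results comparable across the kernel, OVS and OVN layers, even though the absolute numbers differ.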