We see very high CPU usage (600%) by the ovnkube-node pod during resource deletion while running the cluster-density-v2 scale test on a 120-node OVN-IC environment. The ovnkube-controller and ovnkube-node containers use 500% and 80% CPU respectively.
When churn is enabled, the cluster-density-v2 test executes the following steps in sequence:
1. Creates 800 namespaces (each namespace with 10 pods, 5 services, 2 routes and 3 network policies).
2. After creating all these resources, deletes 10% of the namespaces (i.e. 80 namespaces) and waits until all of them are deleted.
3. Recreates them (i.e. 80 namespaces with their resources).
4. Idles for 10 minutes, i.e. no resource creation or deletion (CPU usage drops back to normal).
5. Repeats steps 2 to 4 (i.e. deletion, creation and idle) for a duration of 1 hour. In total there are 4 deletion, creation and idle cycles in the entire test run.
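To put a number on how much is removed in each deletion phase, the figures from the steps above can be worked through in a short sketch. The constant and function names are illustrative only and are not part of the test tooling; the counts are the ones listed in this report.

```python
# Workload figures taken from the test description in this report.
TOTAL_NAMESPACES = 800
CHURN_PERCENT = 10
OBJECTS_PER_NAMESPACE = {
    "pods": 10,
    "services": 5,
    "routes": 2,
    "network_policies": 3,
}


def churn_summary():
    """Return the namespaces churned per cycle and the objects deleted with them."""
    churned_namespaces = TOTAL_NAMESPACES * CHURN_PERCENT // 100
    per_cycle = {
        kind: count * churned_namespaces
        for kind, count in OBJECTS_PER_NAMESPACE.items()
    }
    return churned_namespaces, per_cycle


namespaces, per_cycle = churn_summary()
print(f"namespaces deleted per cycle: {namespaces}")
for kind, count in per_cycle.items():
    print(f"{kind} deleted per cycle: {count}")
print(f"total namespaced objects deleted per cycle: {sum(per_cycle.values())}")
```

Under these figures, each deletion phase removes 80 namespaces containing 1600 objects in total (800 pods, 400 services, 160 routes and 240 network policies).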
During the resource deletion phase we see very high CPU usage by the OVN-Kubernetes components:
1. The ovnkube-controller container uses 500% CPU during deletion, while it uses only 20% during creation.
2. The ovnkube-node container uses 80% CPU during deletion vs 6% during creation.
Because of this, the ovnkube-node pod in OVN-IC uses 600% CPU during the deletion phase.
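For a quick sense of scale, the deletion-to-creation CPU ratio per container can be computed from the numbers reported above. This is only arithmetic on the observed values in this report; the dictionary layout is illustrative.

```python
# Approximate CPU usage (percent of one core) observed in this report.
cpu_percent = {
    "ovnkube-controller": {"deletion": 500, "creation": 20},
    "ovnkube-node": {"deletion": 80, "creation": 6},
}

# Ratio of CPU used during deletion to CPU used during creation.
ratios = {
    name: usage["deletion"] / usage["creation"]
    for name, usage in cpu_percent.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x more CPU during deletion than creation")
```

With these values, ovnkube-controller uses about 25x more CPU during deletion than during creation, and the ovnkube-node container about 13x more.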
As this issue is always reproducible, I can share the environment with any engineer who wants to debug it.
Grafana snapshots -