-
Bug
-
Resolution: Cannot Reproduce
-
Undefined
-
None
-
4.13, 4.12
-
Moderate
-
None
-
SDN Sprint 228, SDN Sprint 229, SDN Sprint 230
-
3
-
Rejected
-
False
-
Description of problem:
The network-check-source pod, which is part of openshift-network-diagnostics and checks connectivity in the cluster, consumes a high amount of CPU (~1-1.5 cores) at regular intervals (~8 minutes); a screenshot of the Prometheus metrics and the logs are attached. This reduces the resources available, especially on a HyperShift Management cluster, where the worker nodes run the hosted cluster control plane pods and lower resource consumption saves cost. A potential solution is to run the connectivity checks less frequently, or to reduce their CPU overhead if possible.
Version-Release number of selected component (if applicable):
4.11 and 4.12
How reproducible:
Always
Steps to Reproduce:
1. Install a HyperShift Management cluster managing 3 hosted clusters with 7 nodes each.
2. Observe the CPU usage of the network-check-source pod in the openshift-network-diagnostics namespace.
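For anyone trying to reproduce the observation step, here is a minimal sketch of a PromQL query for the pod's CPU usage. It assumes the standard cAdvisor metric container_cpu_usage_seconds_total is scraped with namespace and pod labels; the helper names, the 1m rate window, the 30s step, and the base URL in the usage lines are illustrative assumptions, not taken from the attached screenshot.

```python
from urllib.parse import urlencode


def cpu_rate_query(namespace="openshift-network-diagnostics",
                   pod_prefix="network-check-source",
                   window="1m"):
    """Build a PromQL expression for per-pod CPU usage in cores.

    Assumes the cAdvisor metric container_cpu_usage_seconds_total;
    container!="" drops the pod-level aggregate series.
    """
    selector = f'namespace="{namespace}",pod=~"{pod_prefix}.*",container!=""'
    return (
        f"sum(rate(container_cpu_usage_seconds_total{{{selector}}}[{window}]))"
        " by (pod)"
    )


def query_range_url(base_url, query, start, end, step="30s"):
    """Build a URL for the Prometheus HTTP API /api/v1/query_range endpoint.

    start and end are Unix timestamps (or RFC 3339 strings).
    """
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


if __name__ == "__main__":
    # Hypothetical Prometheus endpoint; replace with the cluster's own route.
    q = cpu_rate_query()
    print(q)
    print(query_range_url("https://prometheus.example.com", q, 1700000000, 1700003600))
```

Graphing this expression over an hour or more should make the spikes described above visible, if they occur, as peaks roughly 8 minutes apart.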
Actual results:
The network-check-source pod consumes a high amount of CPU at regular intervals.
Expected results:
The network-check-source pod consumes less CPU, or the connectivity checks run less frequently.
Additional info:
Logs: http://dell-r510-01.perf.lab.eng.rdu2.redhat.com/chaos/hypershift/network-diagnostics/