- Bug
- Resolution: Done
- Major
- Quality / Stability / Reliability
- Critical
- NetObserv - Sprint 224, NetObserv - Sprint 225, NetObserv - Sprint 226, NetObserv - Sprint 227
With the out-of-the-box (OOTB) FlowCollector CRD, the eBPF flow collectors (the netobserv-ebpf-agent pods) repeatedly CrashLoop with reason OOMKilled under a very modest network load.
- AWS cluster with 9 m5.2xlarge workers
- Install NetObserv (NO) with its default 100Mi memory limit in the FlowCollector CRD
- Run the hey-ho app (https://github.com/jotak/hey-ho) with 10 projects, 10 deployments and 1 replica
- Run oc get pods and watch the netobserv-ebpf-agent pods CrashLoop
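
The last step can be sketched as two small shell helpers: one to watch the agent pods, one to print why each pod last terminated. This is only a sketch. The namespace netobserv-privileged is an assumption about where the agent pods run (override it with NS), and the helper names are hypothetical.

```shell
# Assumption: the eBPF agent pods run in this namespace; override with NS=... if not.
NS="${NS:-netobserv-privileged}"

# Watch the agent pods and look for CrashLoopBackOff / rising restart counts.
watch_agents() {
  oc get pods -n "$NS" -w
}

# Print "<pod name><TAB><last termination reason>" for the first container of each pod.
# A pod killed for exceeding its memory limit is expected to show OOMKilled.
last_termination_reasons() {
  oc get pods -n "$NS" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}'
}
```

Calling last_termination_reasons after a few restarts should show whether the restarts are due to the memory limit rather than another failure.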
Network traffic per node is 20K-300K flows/minute and roughly 200 Mb/s, spread across 1-2 pods per node.
We should remove the memory limit for the collector unless we know a correct limit for our target flow and traffic rates. The OOTB default configuration should not crash so easily.