-
Story
-
Resolution: Done
-
Critical
-
None
-
None
-
Proactive Architecture
-
False
-
None
-
False
-
-
-
NetObserv - Sprint 238, NetObserv - Sprint 239, NetObserv - Sprint 240, NetObserv - Sprint 241
While profiling memory usage it became clear that, every 5s, the userspace agent allocates a huge map to hold the entire eBPF hash table and then leaves it to the Go garbage collector to free:
top5
Showing nodes accounting for 95.46GB, 97.45% of 97.96GB total
Dropped 215 nodes (cum <= 0.49GB)
Showing top 5 nodes out of 17
flat flat% sum% cum cum%
40.85GB 41.70% 41.70% 96.11GB 98.12% github.com/netobserv/netobserv-ebpf-agent/pkg/ebpf.(*FlowFetcher).LookupAndDeleteMap <<<<<<<<<<<<<<
16.05GB 16.39% 58.09% 16.05GB 16.39% reflect.unsafe_NewArray
12.89GB 13.16% 71.25% 12.89GB 13.16% encoding/binary.Read
12.85GB 13.12% 84.36% 41.75GB 42.63% github.com/cilium/ebpf.unmarshalPerCPUValue
12.82GB 13.09% 97.45% 12.82GB 13.09% github.com/cilium/ebpf.makeBuffer (inline)
It is a known Go behavior that a map does not release the memory it has allocated, even after its entries are deleted.
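A minimal, self-contained sketch demonstrating this retention behavior (the metrics struct and sizes are made up for illustration, not taken from the agent code):

package main

import (
	"fmt"
	"runtime"
)

// heapMiB forces a GC and reports the current live-heap size in MiB.
func heapMiB() uint64 {
	runtime.GC()
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return ms.HeapAlloc >> 20
}

func main() {
	type metrics struct{ counters [16]uint64 } // 128-byte value, roughly flow-record sized

	fmt.Printf("baseline:     %d MiB\n", heapMiB())

	m := make(map[uint64]metrics)
	for i := uint64(0); i < 500_000; i++ {
		m[i] = metrics{}
	}
	fmt.Printf("after fill:   %d MiB\n", heapMiB())

	// Deleting every entry does not shrink the map's bucket storage, so the
	// heap stays close to its post-fill size for as long as the map is live.
	for i := uint64(0); i < 500_000; i++ {
		delete(m, i)
	}
	fmt.Printf("after delete: %d MiB\n", heapMiB())

	runtime.KeepAlive(m)
}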
So in this user story we propose the following (each point is illustrated by a sketch after the list):
1- Change the data type stored in the map: instead of keeping the entire metrics structure, keep a pointer to the metrics, which reduces the map size (first sketch below).
2- Force a GC at the end of every collection cycle. We expect this to increase CPU slightly, in exchange for promptly returning the large chunk of memory allocated during the cycle (also shown in the first sketch below).
3- Add a GOMEMLIMIT env setting, which makes the GC run aggressively as the resource limit is approached, to avoid OOM conditions (second sketch below).
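First sketch, covering points 1 and 2. The flowKey/flowMetrics types and the fetchFlows helper are hypothetical stand-ins for the agent's real types and its LookupAndDeleteMap step; this is an illustration of the idea, not the actual implementation:

package main

import "runtime"

// Hypothetical types standing in for the agent's real flow key and metrics.
type flowKey struct {
	srcIP, dstIP     [16]byte
	srcPort, dstPort uint16
	proto            uint8
}

type flowMetrics struct {
	bytes, packets     uint64
	startTime, endTime uint64
	flags              uint16
}

// fetchFlows stands in for draining the eBPF hash map once per cycle.
func fetchFlows() map[flowKey]*flowMetrics {
	// Point 1: the map stores *flowMetrics, so each bucket slot holds an
	// 8-byte pointer rather than the full metrics struct. The bucket memory
	// that a Go map never gives back is therefore much smaller per entry.
	return make(map[flowKey]*flowMetrics)
}

func export(flows map[flowKey]*flowMetrics) {
	// ... encode and send the flows ...
	_ = flows
}

// collectCycle runs once per eviction period (every 5s in the description above).
func collectCycle() {
	flows := fetchFlows()
	export(flows)

	// Point 2: force a GC at the end of the cycle so the large, short-lived
	// allocations from this scrape are reclaimed now rather than whenever the
	// next automatic GC would run. Slightly more CPU, much flatter memory curve.
	runtime.GC()
}

func main() {
	collectCycle()
}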
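Second sketch, covering point 3. GOMEMLIMIT can simply be set as an environment variable on the agent container (the Go runtime reads it directly since Go 1.19); the programmatic equivalent is debug.SetMemoryLimit. The AGENT_MEMORY_LIMIT_BYTES name and the 10% headroom below are assumptions for illustration, not the agent's real configuration:

package main

import (
	"os"
	"runtime/debug"
	"strconv"
)

func main() {
	// Derive the soft heap limit from the container memory limit if provided.
	// Exporting GOMEMLIMIT=<bytes> on the container achieves the same effect
	// without any code change.
	if v := os.Getenv("AGENT_MEMORY_LIMIT_BYTES"); v != "" {
		if limit, err := strconv.ParseInt(v, 10, 64); err == nil && limit > 0 {
			// Keep the soft limit ~10% under the cgroup limit so non-heap
			// memory (stacks, cgo, mmaps) does not push the process into an OOM kill.
			debug.SetMemoryLimit(limit * 9 / 10)
		}
	}

	// ... agent startup continues ...
}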
- is related to
-
NETOBSERV-1142 Run performance tests for 1.4 release
- Closed
- relates to
-
NETOBSERV-1470 netobserv-ebpf-agent performance degradation between 1.5 and 1.4.2
- Closed
- links to
-
RHSA-2023:116729 Network Observability 1.4.0 for OpenShift
- mentioned on