Network Observability / NETOBSERV-1107

Improve eBPF agent memory usage

    • Type: Story
    • Resolution: Done
    • Priority: Critical
    • Fix Version/s: netobserv-1.4
    • Component/s: eBPF
    • Proactive Architecture
    • Sprints: NetObserv - Sprint 238, NetObserv - Sprint 239, NetObserv - Sprint 240, NetObserv - Sprint 241

      While profiling the agent's memory it became clear that, every 5s, userspace allocates a huge Go map to hold the entire eBPF hash table and then leaves it to the Go garbage collector to free. The pprof top5 output shows where the allocations come from:
      top5
      Showing nodes accounting for 95.46GB, 97.45% of 97.96GB total
      Dropped 215 nodes (cum <= 0.49GB)
      Showing top 5 nodes out of 17
      flat flat% sum% cum cum%
      40.85GB 41.70% 41.70% 96.11GB 98.12% github.com/netobserv/netobserv-ebpf-agent/pkg/ebpf.(*FlowFetcher).LookupAndDeleteMap <<<<<<<<<<<<<<
      16.05GB 16.39% 58.09% 16.05GB 16.39% reflect.unsafe_NewArray
      12.89GB 13.16% 71.25% 12.89GB 13.16% encoding/binary.Read
      12.85GB 13.12% 84.36% 41.75GB 42.63% github.com/cilium/ebpf.unmarshalPerCPUValue
      12.82GB 13.09% 97.45% 12.82GB 13.09% github.com/cilium/ebpf.makeBuffer (inline)
      It is a known Go behavior that a map, even after its entries are deleted, does not free the memory it has allocated; it is only reclaimed once the whole map becomes garbage and is collected.
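
      Below is a minimal, simplified sketch (not the agent's actual code) of the pattern the profile points at, using hypothetical flowID/flowMetrics types in place of the agent's generated eBPF structs; the real FlowFetcher.LookupAndDeleteMap also deletes entries as it reads, which is omitted here:

      package agent

      import (
          "log"

          "github.com/cilium/ebpf"
      )

      // Hypothetical stand-ins for the agent's generated flow key/metrics types.
      type flowID struct{ SrcPort, DstPort uint16 }
      type flowMetrics struct{ Packets, Bytes uint64 }

      // scrape builds a brand-new Go map holding the whole eBPF per-CPU hash
      // table; called every 5s, the previous map becomes garbage each cycle.
      func scrape(flowsMap *ebpf.Map) map[flowID][]flowMetrics {
          out := make(map[flowID][]flowMetrics) // large allocation on every cycle
          var id flowID
          it := flowsMap.Iterate()
          for {
              var perCPU []flowMetrics // one element per CPU for a per-CPU map
              if !it.Next(&id, &perCPU) {
                  break
              }
              out[id] = perCPU
          }
          if err := it.Err(); err != nil {
              log.Printf("iterating flows map: %v", err)
          }
          return out
      }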

      So in this user story we propose the following (a short sketch for each item follows the list):

      1- Change the data type used by the map: instead of storing the entire metrics structure as the value, store a pointer to the metrics, which reduces the size of the map itself.

      2- Force a GC run at the end of every collection cycle. We expect this to increase CPU usage slightly, in exchange for promptly returning the large chunk of memory allocated during the cycle.

      3- Add a GOMEMLIMIT environment variable setting, which makes the GC run aggressively as the resource limit is approached, to avoid OOM conditions.
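
      Sketch for item 1, with hypothetical type names and field layouts standing in for the agent's generated eBPF structs:

      package agent

      // Hypothetical stand-ins for the agent's generated flow key/metrics structs;
      // the field layout is illustrative only.
      type BpfFlowId struct {
          SrcIp, DstIp     [16]uint8
          SrcPort, DstPort uint16
      }

      type BpfFlowMetrics struct {
          Packets         uint32
          Bytes           uint64
          StartMonoTimeNs uint64
          EndMonoTimeNs   uint64
          Flags           uint16
      }

      // Before: every map bucket stores a full copy of the metrics struct, so the
      // map that userspace rebuilds every 5s is large.
      type flowsByValue = map[BpfFlowId]BpfFlowMetrics

      // After (item 1): buckets store an 8-byte pointer; the metrics struct is
      // allocated once and updated in place, shrinking the rebuilt map.
      type flowsByPtr = map[BpfFlowId]*BpfFlowMetrics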
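
      Sketch for item 2; collect is a placeholder for the agent's per-cycle eviction/collection routine:

      package agent

      import (
          "runtime"
          "time"
      )

      // collectLoop forces a garbage collection after every collection cycle so
      // the large temporary allocations from that cycle are reclaimed right away,
      // at the cost of a small amount of extra CPU per cycle.
      func collectLoop(collect func()) {
          for range time.Tick(5 * time.Second) {
              collect()    // drains the eBPF map into a fresh Go map (allocates heavily)
              runtime.GC() // item 2: reclaim that garbage immediately
          }
      }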
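
      Sketch for item 3. With Go 1.19+ the runtime reads GOMEMLIMIT on its own, so setting the variable on the agent container is enough; the programmatic fallback below and the 800MiB value are illustrative only:

      package agent

      import (
          "os"
          "runtime/debug"
      )

      // applyMemoryLimit sets a soft heap limit so the GC becomes increasingly
      // aggressive as the limit is approached, helping the agent avoid OOM kills.
      func applyMemoryLimit() {
          // If GOMEMLIMIT is set (e.g. "800MiB"), the runtime already honors it.
          if os.Getenv("GOMEMLIMIT") != "" {
              return
          }
          // Otherwise apply an equivalent soft limit programmatically (in bytes).
          debug.SetMemoryLimit(800 << 20)
      }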

              Assignee: Mohamed Mahmoud (mmahmoud@redhat.com)
              Reporter: Mohamed Mahmoud (mmahmoud@redhat.com)
              Nathan Weinberg
              Votes: 0
              Watchers: 6
