- Epic
- Resolution: Done
- Critical
- dynamic-pinning-net
- Green
- To Do
- TELCOSTRAT-37 - Efficient CPU resources allocation
- 0% To Do, 17% In Progress, 83% Done
- dev-ready, doc-ready, po-ready, px-ready, qe-ready
- Telco 5G Core
Epic Goal
- Figure out how to increase the traffic-handling capacity of kernel networking workloads on clusters that do not dedicate all CPUs to guaranteed workloads.
Why is this important?
- Telco Core has only a handful of guaranteed pods, but many burstable kernel networking services. These clusters therefore need CPU partitioning, yet the networking stack must still handle fairly high traffic.
Scenarios
- High traffic through the kernel and OVS while a small guaranteed pod runs on the node. The reserved set uses the smallest number of CPU threads (4/8).
Acceptance Criteria
- CI - MUST be running successfully with tests automated
- Release Technical Enablement - Provide necessary release enablement details and documents.
- Kernel networking must be a supported use case
- OVS must get the necessary number of CPUs, either automatically or by configuration
- All logic must support overrides for emergencies and manual tweaks
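One way to picture the last two criteria is a small selection function. This is a minimal sketch under stated assumptions, not the implemented logic: the function name `select_ovs_cpus`, its parameters, and the policy itself (OVS may use every online CPU that is not exclusively assigned to a guaranteed pod, and a manual override always wins) are hypothetical and chosen only for illustration.

```python
from __future__ import annotations


def select_ovs_cpus(
    online: set[int],
    reserved: set[int],
    exclusive: set[int],
    override: set[int] | None = None,
) -> set[int]:
    """Pick the CPUs OVS may run on (hypothetical policy, for illustration).

    online    -- all CPUs currently online on the node
    reserved  -- CPUs set aside for housekeeping (the reserved set)
    exclusive -- CPUs exclusively assigned to guaranteed pods
    override  -- optional manual CPU set for emergencies and tweaks
    """
    if override is not None:
        # A manual override wins, but it must name CPUs that exist.
        unknown = override - online
        if unknown:
            raise ValueError(f"override names offline CPUs: {sorted(unknown)}")
        return set(override)

    # Automatic case: everything not pinned to a guaranteed pod,
    # always including the reserved CPUs.
    return (online - exclusive) | (reserved & online)


if __name__ == "__main__":
    online = set(range(8))
    print(sorted(select_ovs_cpus(online, reserved={0, 1}, exclusive={4, 5})))
```

With eight online CPUs, a reserved set of `{0, 1}` and a guaranteed pod holding `{4, 5}`, the automatic path leaves OVS six CPUs, while passing `override={0, 1}` restricts it to the reserved set.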
Dependencies (internal and external)
- cri-o / OCI hooks
- systemd / OVS slice configuration
- kernel / RPS mask tunables
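For the RPS mask dependency, the kernel reads the per-queue mask from `/sys/class/net/<dev>/queues/rx-<n>/rps_cpus` as a hexadecimal CPU bitmap written in comma-separated 32-bit groups. The sketch below shows how a CPU list could be turned into that format. The helper names `parse_cpu_list` and `cpus_to_rps_mask` are hypothetical and are not taken from the `set-rps-mask.sh` script or any other component referenced in this epic.

```python
from __future__ import annotations


def parse_cpu_list(spec: str) -> set[int]:
    """Parse a kernel-style CPU list such as "0-3,8,10-11" into CPU ids."""
    cpus: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def cpus_to_rps_mask(cpus: set[int]) -> str:
    """Render a CPU set as a hex bitmap in comma-separated 32-bit groups."""
    bits = 0
    for cpu in cpus:
        bits |= 1 << cpu

    # Emit the least significant 32-bit group first, then reverse so the
    # most significant group is printed first, as the kernel expects.
    groups = []
    while True:
        groups.append(f"{bits & 0xFFFFFFFF:08x}")
        bits >>= 32
        if bits == 0:
            break
    return ",".join(reversed(groups))


if __name__ == "__main__":
    print(cpus_to_rps_mask(parse_cpu_list("0-3")))
    print(cpus_to_rps_mask(parse_cpu_list("0-3,32")))
```

A four-thread reserved set `0-3` yields the mask `0000000f`; adding CPU 32 produces a second group, `00000001,0000000f`.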
Previous Work (Optional):
- IRQ balancing: https://github.com/cri-o/cri-o/blob/9ed9393df13cee1bb056be0f2068ed972e5cc05d/internal/runtimehandlerhooks/high_performance_hooks.go#L76
- RPS mask: https://github.com/openshift/cluster-node-tuning-operator/blob/master/assets/performanceprofile/scripts/set-rps-mask.sh
- https://issues.redhat.com/browse/CNF-1360
- Dynamic OVS pinning: https://docs.google.com/document/d/18BtBkB3tHldt-zLLqNWA94JSd7aDUGnd1XwmYElde4E/edit
Open questions:
- …
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - Downstream build attached to advisory: <link to errata>
- QE - Test plans in Polarion: <link or reference to Polarion>
- QE - Automated tests merged: <link or reference to automated tests>
- DOC - Downstream documentation merged: <link to meaningful PR>