  1. OpenShift SDN
  2. SDN-3840

Better pinning of the networking stack


    • Epic
    • Resolution: Done
    • Critical
    • openshift-4.14
    • None
    • None
    • dynamic-pinning-net
    • False
    • None
    • False
    • Acceptance criteria:
      - Kernel networking must be a supported use case
      - OVS must get the necessary number of CPUs, either automatically or by configuration
      - All logic must support overrides for emergencies and manual tweaks
    • Green
    • To Do
    • TELCOSTRAT-37 - Efficient CPU resources allocation
    • 0% To Do, 17% In Progress, 83% Done
    • dev-ready, doc-ready, po-ready, px-ready, qe-ready
    • 2023-02-21: Dev: [YELLOW] waiting for OVN-K changes, NTO POC PR posted
    • Telco 5G Core
    • ---
    • 0
    • 0

      Epic Goal

      • Determine how to increase the traffic-handling capacity of kernel networking workloads on clusters that do not dedicate all CPUs to guaranteed workloads.

      Why is this important?

      • Telco Core runs only a handful of guaranteed pods but many burstable kernel networking services. It therefore needs CPU partitioning, yet the networking stack must still handle high traffic volumes.

      Scenarios

      1. High traffic over kernel networking and OVS while a small guaranteed pod runs on the node. The reserved set uses the smallest number of CPU threads (4/8).

      Acceptance Criteria

      • CI - MUST be running successfully with tests automated
      • Release Technical Enablement - Provide necessary release enablement details and documents.
      • Kernel networking must be a supported use case
      • OVS must get the necessary number of CPUs, either automatically or by configuration
      • All logic must support overrides for emergencies and manual tweaks
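
      The last three criteria can be read together as a simple policy: OVS may use every CPU that is not exclusively held by a guaranteed pod, and an operator-supplied override always wins. The following is a minimal sketch of that reading, not the epic's actual implementation. The function names (`parse_cpu_list`, `ovs_cpus`) and the policy itself are assumptions made for illustration.

```python
from __future__ import annotations


def parse_cpu_list(spec: str) -> set[int]:
    """Parse a Linux-style CPU list such as "0-3,8" into a set of CPU ids."""
    cpus: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def ovs_cpus(all_cpus, reserved, exclusive, override=None) -> set[int]:
    """Pick the CPUs OVS may run on (hypothetical policy).

    all_cpus:  every online CPU on the node
    reserved:  CPUs reserved for system and infrastructure use
    exclusive: CPUs currently pinned to guaranteed pods
    override:  manual CPU set for emergencies; used as-is when given
    """
    if override is not None:
        return set(override)
    # Everything not exclusively owned by a guaranteed pod, plus the
    # reserved CPUs, which are assumed to be always usable by OVS.
    return (set(all_cpus) - set(exclusive)) | set(reserved)
```

      For example, on an 8-thread node with CPUs 0-1 reserved and CPUs 4-5 pinned to a guaranteed pod, `ovs_cpus(parse_cpu_list("0-7"), {0, 1}, {4, 5})` leaves OVS with CPUs 0-3 and 6-7, and passing `override={0}` restricts it to CPU 0 regardless of the other inputs.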

      Dependencies (internal and external)

      1. cri-o / OCI hooks
      2. systemd / OVS slice configuration
      3. kernel / RPS mask tunables
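
      For the RPS dependency, the kernel exposes a per-receive-queue `rps_cpus` file that takes a hexadecimal CPU bitmap, written as comma-separated 32-bit groups on larger machines. The sketch below shows how a CPU set maps onto that format. The helper name `cpus_to_rps_mask` is hypothetical, and this is not a description of what the existing set-rps-mask.sh script does.

```python
def cpus_to_rps_mask(cpus) -> str:
    """Format a set of CPU ids as a hex bitmap in 32-bit, comma-separated groups."""
    bits = 0
    for cpu in cpus:
        bits |= 1 << cpu  # bit N set means CPU N may process received packets

    hex_digits = f"{bits:x}"
    # Pad to a whole number of 8-digit (32-bit) groups, at least one group.
    width = max(8, -(-len(hex_digits) // 8) * 8)
    hex_digits = hex_digits.zfill(width)

    groups = [hex_digits[i:i + 8] for i in range(0, width, 8)]
    return ",".join(groups)
```

      With CPUs 0-3 the result is `0000000f`. A set that spans the 32-CPU boundary, such as CPUs 0 and 32, yields two groups: `00000001,00000001`.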

      Previous Work (Optional):

      1. IRQ balancing https://github.com/cri-o/cri-o/blob/9ed9393df13cee1bb056be0f2068ed972e5cc05d/internal/runtimehandlerhooks/high_performance_hooks.go#L76
      2. RPS mask https://github.com/openshift/cluster-node-tuning-operator/blob/master/assets/performanceprofile/scripts/set-rps-mask.sh
      3. https://issues.redhat.com/browse/CNF-1360
      4. Dynamic OVS pinning: https://docs.google.com/document/d/18BtBkB3tHldt-zLLqNWA94JSd7aDUGnd1XwmYElde4E/edit

      Open questions:

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - Downstream build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - Downstream documentation merged: <link to meaningful PR>

              rravaiol@redhat.com Riccardo Ravaioli
              msivak@redhat.com Martin Sivak
              Ross Brattain
              Ronan Hennessy
              Votes: 0
              Watchers: 3
