  1. OpenShift Node
  2. OCPNODE-548

Hyper-threading awareness

    • Type: Epic
    • Resolution: Won't Do
    • Priority: Critical
    • 2021Q2 Plan, openshift-4.9
    • Product / Portfolio Work

      This is a tracking/planning Epic to make the dependency between CNF and OCPNODE explicit.

       

      Epic Goal

      • Enhance the existing CPU manager and topology manager kubelet policies, or add new ones, so that latency-optimal container pinning is possible in constrained environments. The main example is RAN-like workers with 20-24 cores, possibly hyperthreaded. Two requirements collide: reducing overhead (using all cores) versus avoiding noisy neighbours.

      Why is this important?

      • There are not enough threads in total if some of them are kept unused
      • A latency-sensitive workload needs to avoid any neighbours on the same core(s)

      Scenarios

      1. The isolated CPU pool contains a partial core (one thread from a core that has a sibling in the reserved pool). The platform needs to make sure that nothing latency-sensitive is pinned to that thread, because otherwise it will be affected by a noisy neighbour. This scenario is useful for minimizing the number of threads used for housekeeping (one thread for reserved and one for infrastructure pods).
      2. A workload that is latency-sensitive must be the only workload running on a core, or it must be rejected or trigger a noisy-neighbour warning of some kind.
      3. A workload that is security-sensitive must be the only workload running on a core, or it must be rejected, to make sure it cannot be compromised using timing and other cache-related attacks (Spectre and other vulnerabilities included).
      4. Being the only workload on a core might mean using all threads or making unused threads unavailable to others
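
      Scenario 1 can be illustrated with a minimal sketch, assuming a hypothetical in-memory topology map (CPU id to the set of hardware threads on the same physical core). On Linux this information is exposed per CPU in /sys/devices/system/cpu/cpuN/topology/thread_siblings_list. The function name and the 4-core example layout are assumptions made for illustration, not part of any kubelet API.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Hypothetical topology type: CPU id -> all threads on the same physical core
# (the CPU itself included).
Siblings = Dict[int, FrozenSet[int]]


def split_isolated_pool(siblings: Siblings, reserved: Set[int]) -> Tuple[Set[int], Set[int]]:
    """Split the non-reserved CPUs into (threads on fully isolated cores,
    threads on partial cores that share a core with a reserved thread)."""
    full: Set[int] = set()
    partial: Set[int] = set()
    for cpu, core_threads in siblings.items():
        if cpu in reserved:
            continue  # reserved CPUs are not part of the isolated pool
        if core_threads & reserved:
            partial.add(cpu)  # a sibling is reserved: noisy neighbour possible
        else:
            full.add(cpu)
    return full, partial


def example_topology() -> Siblings:
    # Assumed layout: 4 cores, 2 threads each; core n holds CPUs n and n + 4.
    return {cpu: frozenset({cpu % 4, cpu % 4 + 4}) for cpu in range(8)}


if __name__ == "__main__":
    full, partial = split_isolated_pool(example_topology(), reserved={0})
    print("safe for latency-sensitive pinning:", sorted(full))
    print("partial-core threads to avoid:", sorted(partial))
```

      With CPU 0 reserved in this layout, CPU 4 is the partial-core thread: it may still run infrastructure pods, but nothing latency-sensitive should be pinned to it.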

      Acceptance Criteria

      • CI - MUST be running successfully with tests automated
      • Release Technical Enablement - Provide necessary release enablement details and documents.
      • E2E or functional test must demonstrate the correct allocation happens.
      • A guaranteed latency-sensitive workload has a way to be isolated from noisy neighbours on sibling threads
      • A guaranteed latency-sensitive workload that does not occupy a whole core (all of its threads) must be rejected with a meaningful error
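
      The last criterion could be checked along the following lines. This is only a sketch under the assumption that "occupying a whole core" is tested by requiring the CPU request to be a multiple of the threads per core; the error class and function name are hypothetical and do not come from the kubelet.

```python
class SMTAlignmentError(ValueError):
    """Hypothetical error for a request that cannot occupy whole cores."""


def validate_cpu_request(requested_cpus: int, threads_per_core: int) -> int:
    """Return the number of whole cores the request occupies, or raise
    SMTAlignmentError with a message explaining the rejection."""
    if requested_cpus <= 0 or threads_per_core <= 0:
        raise ValueError("requested_cpus and threads_per_core must be positive")
    cores, leftover = divmod(requested_cpus, threads_per_core)
    if leftover:
        raise SMTAlignmentError(
            f"requested {requested_cpus} CPUs, which is not a multiple of "
            f"{threads_per_core} threads per core, so the workload would "
            f"share a core with a neighbour"
        )
    return cores


if __name__ == "__main__":
    print(validate_cpu_request(4, 2))  # two whole cores on a 2-way SMT node
    try:
        validate_cpu_request(3, 2)
    except SMTAlignmentError as err:
        print("rejected:", err)
```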

      Dependencies (internal and external)

      1. cpu manager
      2. (topology manager as it shares some data with cpu manager)

      Previous Work (Optional):

      1. TBD

      Open questions:

      1. Upstream or downstream first?
      2. (Related to previous work to some extent) Can the existing cpumanager static policy guarantee the desired behaviour?
      3. Where does the test suite belong? It may not fit k8s (for the same reason as the policy: the use case may be too narrow), but we (telco 5G) want to run it anyway. Perhaps submit it upstream first and take it into ocp/cnf if upstream rejects it?
      4. Is rejection the only option if the pod does not request a whole core? Can the infrastructure "block" the other threads from the rest of the system?
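
      As a sketch of the alternative raised in question 4, the allocator could hand out CPUs only from fully free cores and mark any leftover sibling thread as blocked instead of rejecting the pod. The function below is hypothetical and assumes the same kind of CPU-to-siblings map as the topology exposed by the kernel; it makes no claim about how the kubelet would implement this.

```python
from typing import Dict, FrozenSet, Set, Tuple


def allocate_blocking_siblings(
    free: Set[int], siblings: Dict[int, FrozenSet[int]], requested: int
) -> Tuple[Set[int], Set[int]]:
    """Assign `requested` CPUs taken from fully free cores only.
    Returns (assigned, blocked); blocked threads are siblings left unused
    on the last core and must not be given to any other workload."""
    assigned: Set[int] = set()
    blocked: Set[int] = set()
    # Deduplicate cores and walk them in a deterministic order.
    cores = sorted(set(siblings.values()), key=min)
    for core in cores:
        if len(assigned) == requested:
            break
        if not core <= free:
            continue  # some thread of this core is already in use
        for cpu in sorted(core):
            if len(assigned) < requested:
                assigned.add(cpu)
            else:
                blocked.add(cpu)
    if len(assigned) < requested:
        raise RuntimeError(f"not enough fully free cores for {requested} CPUs")
    return assigned, blocked


if __name__ == "__main__":
    # Assumed layout: 4 cores, 2 threads each; core n holds CPUs n and n + 4.
    topo = {cpu: frozenset({cpu % 4, cpu % 4 + 4}) for cpu in range(8)}
    print(allocate_blocking_siblings({1, 2, 3, 5, 6, 7}, topo, 3))
```

      The trade-off is the one named in the Epic Goal: blocking a sibling avoids the noisy neighbour but leaves a thread unused, which matters on workers with few cores.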

      Risk assessment and work estimate

      There is significant risk here if an upstream solution is expected. We have a design proposal, but the KEP process is lengthy and uncertain. A downstream-only solution depends on the willingness of the OCP team.

      The proposed solution is mostly isolated from the existing code at the node (kubelet) level. However, the policies can have a relevant impact on resource accounting, which puts quick acceptance at risk.

      Work estimate: L

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - Downstream build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - Downstream documentation merged: <link to meaningful PR>

              fromani@redhat.com Francesco Romani
              msivak@redhat.com Martin Sivak
              Sunil Choudhary