-
Epic
-
Resolution: Unresolved
-
Undefined
-
None
-
rhel-9.2.0
-
Allow runtime configuration of isolcpus=managed_irq
-
-
rhel-sst-kernel-ft
-
ssg_core_kernel
-
False
-
Description
In some use cases (e.g., Telco 5G RAN DU), a single server runs a combination of ultra-low-latency workloads and “normal” workloads that do not share the same latency requirements. To support the ultra-low-latency workloads, the kernel must be configured to isolate the CPUs running them from managed interrupts using the isolcpus=managed_irq command-line parameter.
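For reviewers less familiar with the parameter, the following is a minimal sketch of how the managed_irq CPU set can be read out of a kernel command line. It assumes the documented isolcpus=[flag-list,]<cpu-list> form with the flags nohz, domain and managed_irq, and handles only plain CPU numbers and ranges (not strides or keywords such as "all"). The function names and the sample command line are hypothetical, for illustration only.

```python
# Illustrative sketch, not kernel code: extract the CPUs that a kernel
# command line isolates from managed interrupts.

# Flags accepted in the isolcpus= flag list.
ISOLCPUS_FLAGS = {"nohz", "domain", "managed_irq"}


def parse_cpu_list(spec: str) -> set:
    """Expand a simple CPU list such as '2-5,8' into a set of CPU numbers.

    Assumption: only single CPUs and 'lo-hi' ranges are supported here.
    """
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def managed_irq_cpus(cmdline: str) -> set:
    """Return the CPUs listed in isolcpus= entries that carry managed_irq."""
    cpus = set()
    for token in cmdline.split():
        if not token.startswith("isolcpus="):
            continue
        fields = token[len("isolcpus="):].split(",")
        # Leading fields are flags; the remainder is the CPU list.
        flags = []
        while fields and fields[0] in ISOLCPUS_FLAGS:
            flags.append(fields.pop(0))
        if "managed_irq" in flags and fields:
            cpus |= parse_cpu_list(",".join(fields))
    return cpus


# Hypothetical boot command line used only as sample input.
sample = "BOOT_IMAGE=/vmlinuz ro isolcpus=managed_irq,domain,2-5,8 quiet"
isolated = managed_irq_cpus(sample)
```

On a running system the input string could be taken from /proc/cmdline. Because that file only reflects the boot-time configuration, the result cannot change without a reboot.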
However, the managed_irq option can only be changed with a reboot. This is a problem for containerized workloads running on OpenShift (i.e., Kubernetes), where a mix of low-latency and “normal” workloads can be created and destroyed dynamically and the number of CPUs allocated to each workload is often not known at boot time. To work around this limitation, the managed_irq option must be applied at boot to every CPU that could potentially run a low-latency workload. This leaves a much smaller set of CPUs available to process managed interrupts.
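The cost of this over-provisioning can be illustrated with a small worked example. All numbers below (a 32-CPU server, 28 CPUs that might host low-latency pods, 8 that currently do) are hypothetical and chosen only to show the arithmetic. The "runtime" case models the behavior this epic asks for, not anything the kernel supports today.

```python
# Illustrative arithmetic only; CPU counts are made-up assumptions.

def housekeeping_cpus(online: set, isolated: set) -> set:
    """CPUs left to service managed interrupts once 'isolated' is excluded."""
    return online - isolated


online = set(range(32))                    # hypothetical 32-CPU server
potential_low_latency = set(range(4, 32))  # every CPU that might ever run a low-latency workload
current_low_latency = set(range(4, 12))    # CPUs running such workloads right now

# Today: isolation is fixed at boot, so all potential CPUs must be excluded.
boot_time = housekeeping_cpus(online, potential_low_latency)

# Requested: isolation follows the CPUs actually in use at the moment.
runtime = housekeeping_cpus(online, current_low_latency)

print(len(boot_time), "CPUs handle managed interrupts with boot-time isolation")
print(len(runtime), "CPUs would handle them with runtime isolation")
```

With these assumed numbers, boot-time isolation leaves 4 CPUs to handle managed interrupts, whereas runtime configuration would leave 24.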
The long-term goal is to allow runtime configuration of the CPU isolation options (see also RHEL-14487). Runtime configuration of managed_irq is a lower priority than nohz_full/rcu_nocbs, but should be done so that CPU isolation can be configured consistently at runtime.
What SSTs and Layered Product teams should review this?
I think this falls under the RHEL Core Kernel Group. jlelli@redhat.com and philauld would be good people to review.