1. Proposed title of this feature request
Ability to direct exec operations to certain CPUs in a guaranteed QoS class pod
2. What is the nature and description of the request?
When a partner develops a containerized DPDK application, it typically wants to give the busy-loop polling threads exclusive access to their CPUs. The same pod may also run housekeeping processes on a separate CPU.
However, we have seen that certain common Kubernetes operations cannot be used when this type of configuration runs on the RT kernel. These operations are:
- Running oc exec / oc rsh / oc cp / oc rsync on the pod
- Having exec probes for livenessProbe or readinessProbe
- Having an exec postStart or preStop hook
These operations have to be avoided because the new processes started in the pod run at a non-RT priority and can land on one of the CPUs running the busy-loop polling threads. This can add latency to the DPDK application and, in the worst case, cause a deadlock between the non-RT process and a kernel thread. Several support cases have been opened in which vmcore analysis confirmed this issue.
Currently, there is no control over which CPU(s) a newly exec'ed process runs on in a pod. If we had a way to ensure that the new process does not run on the CPUs owned by the busy-loop polling threads, we would be able to run those common admin tasks on the pod.
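To illustrate the requested behaviour, here is a minimal Python sketch of the CPU selection we have in mind: take the CPUs assigned to the container, subtract the ones owned by the polling threads, and restrict the exec'ed process to what is left. The function names and the CPU layout (container on CPUs 4-7, polling threads on 4-6) are hypothetical, and this is not how CRI-O, runc or crun implement anything today. It only shows the intended outcome, using the standard Linux affinity call.

```python
import os
from typing import Set


def parse_cpu_list(spec: str) -> Set[int]:
    """Parse a kernel-style CPU list such as "0-1,4,6-7" into a set of CPU ids."""
    cpus: Set[int] = set()
    for part in spec.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def housekeeping_cpus(allowed: Set[int], polling: Set[int]) -> Set[int]:
    """CPUs an exec'ed process may use: the container's CPUs minus the polling CPUs."""
    remaining = allowed - polling
    if not remaining:
        # Without a spare CPU, any exec would disturb a polling thread.
        raise ValueError("no CPU left for housekeeping")
    return remaining


def pin_self_to(cpus: Set[int]) -> None:
    """Restrict the calling process to the given CPUs (Linux only)."""
    os.sched_setaffinity(0, cpus)


if __name__ == "__main__":
    # Hypothetical layout: the container owns CPUs 4-7, polling threads use 4-6.
    allowed = parse_cpu_list("4-7")
    polling = parse_cpu_list("4-6")
    print(sorted(housekeeping_cpus(allowed, polling)))  # prints [7]
```

In this sketch the process would have to call pin_self_to() on itself, which is exactly the limitation described above: the affinity needs to be applied by the runtime before the exec'ed command starts, so that probes, hooks and oc exec sessions never get a chance to run on a polling CPU.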
3. Why does the customer need this? (List the business requirements here)
As of now, there is a growing list of "forbidden" operations when it comes to running a containerized DPDK application on OpenShift with the RT kernel.
- This creates administrative overhead, since those applications and the clusters running them need to be treated differently.
- It creates extra work on the Red Hat support side, since every time the situation is triggered, a new case to analyze a vmcore is opened.
- It can degrade confidence in OpenShift's ability to run this type of workload.
4. List any affected packages or components.
crio/runc/crun
Related to: OCPSTRAT-1292 Don't interrupt pinned CPU pods by exec probes (In Progress)