-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
4.18.z
-
None
-
False
-
-
None
-
Important
-
None
-
aarch64
-
UAT
Description of problem:
In a deployment running on OpenShift Container Platform (OCP) version 4.18.11, the du-vpp container crashes because it is placed on the shared CPU pool (CPUs 0–3, 71) instead of receiving its designated isolated CPU cores (CPUs 4–6).
Cluster Details:
- Cluster Version: 4.18.11
- Desired Version: 4.18.11
- CNI Plugin: OVNKubernetes
- Network Type: OVNKubernetes
- httpProxy: None
- httpsProxy: None
The hardware configuration and node CPU isolation settings have been verified and confirmed to be correct.
Root Cause:
This behavior is caused by a known CPU Manager race condition in Kubernetes (Upstream Issue #107906). The race can occur when the init container (du-init) and the main container (du-vpp) both request the same integer number of exclusive CPUs (e.g., cpu: "4"). After the init container exits, the kubelet may incorrectly return those CPUs to the shared pool before the main container claims them.
As a result, the main container is assigned CPUs from the shared pool rather than the isolated cores, leading to instability and container crashes.
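A quick way to confirm which pool a container actually landed in is to compare its allowed CPU list with the shared pool. The following is a minimal sketch, not part of the original report: the helper names `parse_cpu_list` and `classify_cpuset` are hypothetical, and the default shared pool `0-3,71` is taken from the description above. It assumes the allowed list is read from the `Cpus_allowed_list` field of `/proc/<pid>/status` for the container process.

```python
def parse_cpu_list(spec: str) -> set[int]:
    """Expand a Linux CPU list such as "0-3,71" into a set of CPU ids."""
    cpus: set[int] = set()
    for part in spec.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def classify_cpuset(allowed: str, shared: str = "0-3,71") -> str:
    """Classify a container's allowed CPUs against the shared pool.

    `shared` defaults to the shared pool quoted in this ticket (an
    assumption; adjust it for other nodes).
    """
    allowed_cpus = parse_cpu_list(allowed)
    shared_cpus = parse_cpu_list(shared)
    if not allowed_cpus:
        return "unknown"
    if allowed_cpus <= shared_cpus:
        return "shared"      # every allowed CPU is in the shared pool
    if allowed_cpus.isdisjoint(shared_cpus):
        return "exclusive"   # no overlap with the shared pool
    return "mixed"           # partial overlap


if __name__ == "__main__":
    # Value as it might appear in Cpus_allowed_list for an affected container.
    print(classify_cpuset("0-3,71"))
```

If du-vpp is affected, the value reported for its main process would be expected to classify as "shared" rather than "exclusive".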