Bug
Resolution: Done-Errata
Major
None
4.18.0
None
Important
Yes
1
Rejected
False
Release Note Not Required
In Progress
This is a clone of issue OCPBUGS-43280. The following is the description of the original issue:
—
Description of problem:
NTO CI started failing with:
• [FAILED] [247.873 seconds]
[rfe_id:27363][performance] CPU Management Verification of cpu_manager_state file when kubelet is restart [It] [test_id: 73501] defaultCpuset should not change [tier-0]
/go/src/github.com/openshift/cluster-node-tuning-operator/test/e2e/performanceprofile/functests/1_performance/cpu_management.go:309
[FAILED] Expected
<cpuset.CPUSet>: { elems: {0: {}, 2: {}}, }
to equal
<cpuset.CPUSet>: { elems: {0: {}, 1: {}, 2: {}, 3: {}}, }
In [It] at: /go/src/github.com/openshift/cluster-node-tuning-operator/test/e2e/performanceprofile/functests/1_performance/cpu_management.go:332 @ 10/04/24 16:56:51.436
The test failed because the test pod could not be admitted after the Kubelet restart. The failure occurs at this line:
https://github.com/openshift/kubernetes/blob/cec2232a4be561df0ba32d98f43556f1cad1db01/pkg/kubelet/cm/cpumanager/policy_static.go#L352
Something has changed in how Kubelet accounts for `availablePhysicalCPUs`.
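To make the mismatch in the assertion explicit, here is a minimal Go sketch that compares the two CPU sets from the failure message. It is only an illustration: `missingCPUs` is a hypothetical helper, not part of the test suite or the kubelet, and plain integer slices stand in for `cpuset.CPUSet`.

```go
package main

import (
	"fmt"
	"sort"
)

// missingCPUs is a hypothetical helper (not from the NTO test suite).
// It returns the CPU IDs present in want but absent from got, sorted ascending.
func missingCPUs(got, want []int) []int {
	seen := make(map[int]struct{}, len(got))
	for _, c := range got {
		seen[c] = struct{}{}
	}
	var missing []int
	for _, c := range want {
		if _, ok := seen[c]; !ok {
			missing = append(missing, c)
		}
	}
	sort.Ints(missing)
	return missing
}

func main() {
	got := []int{0, 2}        // set reported by the failing assertion
	want := []int{0, 1, 2, 3} // set the assertion compared it against
	fmt.Println(missingCPUs(got, want)) // prints [1 3]
}
```

Under these assumptions, CPUs 1 and 3 are the ones absent from the reported set.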
Version-Release number of selected component (if applicable):
4.18 (started happening after OCP was rebased on top of k8s 1.31)
How reproducible:
Always
Steps to Reproduce:
1. Set up a system with 4 CPUs and apply a performance-profile with single-numa-policy
2. Run pao-functests
Actual results:
Tests fail with:
• [FAILED] [247.873 seconds]
[rfe_id:27363][performance] CPU Management Verification of cpu_manager_state file when kubelet is restart [It] [test_id: 73501] defaultCpuset should not change [tier-0]
/go/src/github.com/openshift/cluster-node-tuning-operator/test/e2e/performanceprofile/functests/1_performance/cpu_management.go:309
[FAILED] Expected
<cpuset.CPUSet>: { elems: {0: {}, 2: {}}, }
to equal
<cpuset.CPUSet>: { elems: {0: {}, 1: {}, 2: {}, 3: {}}, }
In [It] at: /go/src/github.com/openshift/cluster-node-tuning-operator/test/e2e/performanceprofile/functests/1_performance/cpu_management.go:332 @ 10/04/24 16:56:51.436
Expected results:
Tests should pass
Additional info:
NOTE: The issue occurs only on systems with a small number of CPUs (4 in our case).
- clones: OCPBUGS-43566 [4.17] E2E: test related to cpumanager state file check during kubelet restart fails (Closed)
- depends on: OCPBUGS-43566 [4.17] E2E: test related to cpumanager state file check during kubelet restart fails (Closed)
- links to: RHBA-2024:8986 OpenShift Container Platform 4.16.z bug fix update