-
Bug
-
Resolution: Done-Errata
-
Normal
-
None
-
4.13.z, 4.14.0
-
+
-
Moderate
-
Yes
-
CNF Compute Sprint 234
-
1
-
False
Description of problem:
When the performance profile is modified from realTime: true to realTime: false, the change in workloadHints does not stop stalld.
Version-Release number of selected component (if applicable):
4.13.0-0.nightly-2023-03-14-053612
How reproducible:
Every time
Steps to Reproduce:
1. Set up a multinode BM cluster with 2 workers.
2. Create an MCP pool named worker-cnf.
3. Create a profile as shown below:
spec:
  cpu:
    balanceIsolated: false
    isolated: 2-39,42-79
    reserved: 0-1,40-41
  machineConfigPoolSelector:
    machineconfiguration.openshift.io/role: worker-cnf
  nodeSelector:
    node-role.kubernetes.io/worker-cnf: ""
  numa:
    topologyPolicy: single-numa-node
  realTimeKernel:
    enabled: true
  workloadHints:
    realTime: true
4. Wait for the nodes to come back.
5. Check that the stalld process is running.
6. Modify the profile to set the realTime workload hint to false, as shown below:
spec:
  cpu:
    balanceIsolated: false
    isolated: 2-39,42-79
    reserved: 0-1,40-41
  machineConfigPoolSelector:
    machineconfiguration.openshift.io/role: worker-cnf
  nodeSelector:
    node-role.kubernetes.io/worker-cnf: ""
  numa:
    topologyPolicy: single-numa-node
  realTimeKernel:
    enabled: false
  workloadHints:
    realTime: false
7. Check that the nodes are in the Ready state.
[root@registry kni]# oc get nodes
NAME       STATUS   ROLES                  AGE    VERSION
master-0   Ready    control-plane,master   5d6h   v1.26.2+bc894ae
master-1   Ready    control-plane,master   5d6h   v1.26.2+bc894ae
master-2   Ready    control-plane,master   5d6h   v1.26.2+bc894ae
worker-0   Ready    worker,worker-cnf      5d5h   v1.26.2+bc894ae
worker-1   Ready    worker                 5d5h   v1.26.2+bc894ae
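For repeated reproduction, the change in step 6 could also be applied as a merge patch instead of editing the profile by hand. This is only a sketch: the function names are made up for illustration, and the profile name is a placeholder because the report does not give the profile's metadata.name.

```shell
#!/usr/bin/env bash
# Sketch only: apply the step 6 change as a JSON merge patch.

# Build the merge patch that disables both the real-time kernel and the
# realTime workload hint, matching the second spec in the steps above.
build_rt_off_patch() {
  printf '%s' '{"spec":{"realTimeKernel":{"enabled":false},"workloadHints":{"realTime":false}}}'
}

# Apply the patch to a PerformanceProfile. The profile name is an argument
# because the report does not state it (hypothetical value at call time).
apply_rt_off_patch() {
  local profile="${1:?profile name required}"
  oc patch performanceprofile "${profile}" --type=merge -p "$(build_rt_off_patch)"
}
```

Calling `apply_rt_off_patch <profile-name>` against a cluster should trigger the same node reconfiguration as the manual edit, after which step 7 applies unchanged.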
Actual results:
[root@registry kni]# oc debug node/worker-0
Temporary namespace openshift-debug-k9knn is created for debugging node...
Starting pod/worker-0-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.46.80.2
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-5.1# systemctl status stalld
● stalld.service - Stall Monitor
     Loaded: loaded (/usr/lib/systemd/system/stalld.service; enabled; preset: disabled)
     Active: active (running) since Tue 2023-03-21 16:36:13 UTC; 12min ago
   Main PID: 1785 (stalld)
      Tasks: 1 (limit: 3299464)
     Memory: 720.0K
        CPU: 66ms
     CGroup: /system.slice/stalld.service
             └─1785 /usr/bin/stalld --systemd -p 1000000000 -r 20000 -d 3 -t 20 --foreground --pidfile /run/stalld.pid

Mar 21 16:36:13 localhost stalld[1785]: lockdown mode is off
Mar 21 16:36:13 localhost stalld[1785]: /sys/kernel/debug/sched/features exists
Mar 21 16:36:13 localhost stalld[1785]: dl_runtime is shorter than 1ms, setting HRTICK_DL
Mar 21 16:36:13 localhost stalld[1785]: /sys/kernel/debug/sched/debug exists
Mar 21 16:36:13 localhost stalld[1785]: boosted pid 0 (undef) using SCHED_DEADLINE
Mar 21 16:36:13 localhost systemd[1]: Started Stall Monitor.
Mar 21 16:36:13 localhost stalld[1785]: using SCHED_DEADLINE for boosting
Mar 21 16:36:13 localhost stalld[1785]: initial config_buffer_size set to 614400
Mar 21 16:36:13 localhost stalld[1785]: detected new task format
Mar 21 16:36:13 localhost stalld[1785]: single threaded mode
sh-5.1# ps -ef | grep stalld
root        1785       1  0 16:36 ?        00:00:00 /usr/bin/stalld --systemd -p 1000000000 -r 20000 -d 3 -t 20 --foreground --pidfile /run/stalld.pid
root       13676   13420  0 16:49 ?        00:00:00 grep stalld
Expected results:
stalld should be disabled
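A check for the expected result could be scripted roughly as follows. This is a sketch, not part of the original report: the function names are hypothetical, and it assumes that `oc debug node/<name> -- chroot /host ...` is permitted on the cluster and that `systemctl is-active` reports the unit state on its last output line.

```shell
#!/usr/bin/env bash
# Sketch only: decide whether a node meets the expected result (stalld not running).

# Map the state printed by `systemctl is-active stalld` to a verdict.
# A running (or starting) stalld after realTime was set to false means the bug reproduces.
stalld_verdict() {
  case "$1" in
    inactive) echo "PASS" ;;
    active|activating|reloading) echo "FAIL" ;;
    *) echo "UNKNOWN" ;;
  esac
}

# Query one node through a debug pod. This needs cluster access,
# so it is defined here but not invoked.
check_node() {
  local node="$1" state
  state="$(oc debug "node/${node}" -- chroot /host systemctl is-active stalld 2>/dev/null | tail -n 1)"
  echo "${node}: stalld is ${state:-unknown} -> $(stalld_verdict "${state}")"
}
```

Running `check_node worker-0` after step 7 would report FAIL for the behavior shown under Actual results, since stalld is still active there.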
Additional info:
- blocks: OCPBUGS-11384 Switching from enabling realTime to disabling Realtime Workloadhint causes stalld to be enabled (Closed)
- is cloned by: OCPBUGS-11384 Switching from enabling realTime to disabling Realtime Workloadhint causes stalld to be enabled (Closed)
- links to: RHEA-2023:5006 rpm