Bug
Resolution: Can't Do
rhel-9.1.0
rhel-sst-kernel-tps
ssg_core_kernel
Description of problem:
The cpuunclaimed script from bcc-tools exits with ERROR "CPU samples arrived at skewed offsets".
Version-Release number of selected component (if applicable):
# cat /etc/redhat-release
Red Hat Enterprise Linux release 9.1 (Plow)
# uname -a
Linux hp-bl460cg9-1.gsslab.pnq2.redhat.com 5.14.0-162.6.1.el9_1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Sep 30 07:36:03 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
# dmidecode | grep -iA 7 "system information"
System Information
Manufacturer: HP
Product Name: ProLiant BL460c Gen9
Version: Not Specified
Serial Number: SGH512VBBE
UUID: 30373237-3132-4753-4835-313256424245
Wake-up Type: Power Switch
SKU Number: 727021-B21
# rpm -qa | grep bcc
bcc-tools-0.24.0-4.el9.x86_64
bcc-0.24.0-4.el9.x86_64
python3-bcc-0.24.0-4.el9.noarch
How reproducible:
Always
Steps to Reproduce:
Go to /usr/share/bcc/tools
and run:
# ./cpuunclaimed
OR
/usr/share/bcc/tools/cpuunclaimed
# ./cpuunclaimed
Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
ERROR: CPU samples arrived at skewed offsets (CPUs may have powered down when idle), spanning 7161382 ns (expected < 4040404 ns). Debug with -J, and see the man page. As output may begin to be unreliable, exiting.
# ./cpuunclaimed 5 10
Sampling run queues... Output every 5 seconds. Hit Ctrl-C to end.
ERROR: CPU samples arrived at skewed offsets (CPUs may have powered down when idle), spanning 6643347 ns (expected < 4040404 ns). Debug with -J, and see the man page. As output may begin to be unreliable, exiting.
When the cpuunclaimed script is run by its full path, it prints one line of output but then exits with the same ERROR:
# /usr/share/bcc/tools/cpuunclaimed
Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
%CPU 0.00%, unclaimed idle 0.00% <<---
ERROR: CPU samples arrived at skewed offsets (CPUs may have powered down when idle), spanning 5163630 ns (expected < 4040404 ns). Debug with -J, and see the man page. As output may begin to be unreliable, exiting.
I also ran a stress-ng test in parallel in another terminal, and the script still throws the error.
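For triage, the spans reported in the three runs above can be compared against the limit with a small script. This is only a sketch: the function names `parse_skew` and `assumed_limit_ns` are made up for illustration, and the idea that the 4040404 ns limit corresponds to 40% of a 99 Hz sample period is an inference from the printed number, not something confirmed from the tool's source.

```python
import re

# Error lines copied from the three runs above (trailing hint text trimmed).
LOG = """\
ERROR: CPU samples arrived at skewed offsets (CPUs may have powered down when idle), spanning 7161382 ns (expected < 4040404 ns).
ERROR: CPU samples arrived at skewed offsets (CPUs may have powered down when idle), spanning 6643347 ns (expected < 4040404 ns).
ERROR: CPU samples arrived at skewed offsets (CPUs may have powered down when idle), spanning 5163630 ns (expected < 4040404 ns).
"""

# Matches the "spanning N ns (expected < M ns)" part of the message.
PATTERN = re.compile(r"spanning (\d+) ns \(expected < (\d+) ns\)")


def parse_skew(text):
    """Return a list of (span_ns, limit_ns) pairs found in the text."""
    return [(int(span), int(limit)) for span, limit in PATTERN.findall(text)]


def assumed_limit_ns(sample_hz=99, tenths=4):
    """ASSUMPTION: limit = (tenths / 10) of one sample period at sample_hz.

    Both defaults are guesses chosen because they reproduce 4040404 ns.
    Integer arithmetic is used to avoid floating-point rounding.
    """
    return tenths * 10**9 // (10 * sample_hz)


if __name__ == "__main__":
    for span, limit in parse_skew(LOG):
        # Ratio above 1.0 means the sample spread exceeded the limit.
        print(f"span={span} ns  limit={limit} ns  ratio={span / limit:.2f}")
    print("assumed limit:", assumed_limit_ns(), "ns")
```

In all three runs the span exceeds the limit, even with stress-ng keeping the CPUs busy.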
Actual results:
Script throws error:
ERROR: CPU samples arrived at skewed offsets (CPUs may have powered down when idle)
Expected results:
It should not throw error "ERROR: CPU samples arrived at skewed offsets"
Additional info:
It looks like there was a related fix in upstream bcc, commit 77f4f663ad567e1ecf4528d25f00af548ac746b9:
$ git show 77f4f663
commit 77f4f663ad567e1ecf4528d25f00af548ac746b9
Author: yonghong-song <ys114321@gmail.com>
Date: Thu Jan 24 12:48:25 2019 -0800
fix cpuunclaimed.py with cfs_rq structure change (#2164)
Similar to runqlen.py, make proper adjustment for
cfs_rq_partial structure so it can align with
what the kernel expects.
Signed-off-by: Yonghong Song <yhs@fb.com>
diff --git a/tools/cpuunclaimed.py b/tools/cpuunclaimed.py
index b862bad2..75ee9324 100755
--- a/tools/cpuunclaimed.py
+++ b/tools/cpuunclaimed.py
@@ -62,8 +62,9 @@ from time import sleep, strftime
from ctypes import c_int
import argparse
import multiprocessing
-from os import getpid, system
+from os import getpid, system, open, close, dup, unlink, O_WRONLY
import ctypes as ct
+from tempfile import NamedTemporaryFile
# arguments
examples = """examples:
@@ -98,6 +99,66 @@ wakeup_s = float(1) / wakeup_hz
ncpu = multiprocessing.cpu_count() # assume all are online
debug = 0
+# Linux 4.15 introduced a new field runnable_weight
+# in linux_src:kernel/sched/sched.h as
+# struct cfs_rq
+# and this tool requires to access nr_running to get
+# runqueue len information.
+#
+# The commit which introduces cfs_rq->runnable_weight
+# field also introduces the field sched_entity->runnable_weight
+# where sched_entity is defined in linux_src:include/linux/sched.h.
+#
+# To cope with pre-4.15 and 4.15/post-4.15 releases,
+# we run a simple BPF program to detect whether
+# field sched_entity->runnable_weight exists. The existence of
+# this field should infer the existence of cfs_rq->runnable_weight.
+#
+# This will need maintenance as the relationship between these
+# two fields may change in the future.
+#
.
.
.
<..>