-
Bug
-
Resolution: Not a Bug
-
Minor
-
4.8
-
Quality / Stability / Reliability
-
Low
Description of problem:
node_exporter intermittently consumes high CPU under load. The behavior looks like a spinlock race across multiple CPUs.
Version-Release number of selected component (if applicable):
4.8.23, on a host with a large number of CPUs (e.g. 96 CPUs)
How reproducible:
Always, in the customer environment
Steps to Reproduce:
1. Generate load and monitor node resources using the top command at a 10-second interval
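As an alternative to watching top by hand, the sampling in step 1 could be scripted. The following is a minimal sketch, not part of the original report: it assumes a Linux host with /proc mounted, and the function names (`parse_cpu_ticks`, `cpu_percent`, `watch`) are hypothetical. It reads utime and stime from /proc/<pid>/stat and reports CPU usage per interval in the same way top does, where 100% corresponds to one fully busy CPU.

```python
import os
import time


def parse_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so only split the text that follows the last ')'.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is the process state (field 3 in proc(5)); utime and stime
    # are fields 14 and 15, which are indexes 11 and 12 here.
    return int(fields[11]) + int(fields[12])


def cpu_percent(prev_ticks: int, cur_ticks: int,
                interval_s: float, ticks_per_s: int) -> float:
    """CPU usage over the interval; 100.0 means one fully busy CPU."""
    return 100.0 * (cur_ticks - prev_ticks) / (ticks_per_s * interval_s)


def read_ticks(pid: int) -> int:
    """Read the current CPU tick total for the given PID."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_cpu_ticks(f.read())


def watch(pid: int, interval_s: float = 10.0, samples: int = 6) -> None:
    """Print one CPU usage sample per interval (hypothetical helper)."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    prev = read_ticks(pid)
    for _ in range(samples):
        time.sleep(interval_s)
        cur = read_ticks(pid)
        pct = cpu_percent(prev, cur, interval_s, ticks_per_s)
        print(f"{time.strftime('%H:%M:%S')} pid={pid} cpu={pct:.1f}%")
        prev = cur
```

Calling `watch(pid)` with the node_exporter PID prints one sample every 10 seconds. Values well above 100% would correspond to the N * 100% spikes described under Actual results.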
Actual results:
node_exporter sometimes uses N * 100% CPU for a while. Normally it uses only 5%.
Expected results:
No unexpectedly high CPU usage from node_exporter
Additional info:
Similar high CPU usage caused by a spinlock race has been reported upstream when the cpufreq collector is enabled. Here, the spinlock race appears to occur even without the cpufreq collector, in situations where node_exporter cannot collect metrics smoothly for some reason.
https://github.com/prometheus/node_exporter/issues/1963
https://github.com/prometheus/node_exporter/pull/1964
https://github.com/prometheus/node_exporter/issues/1880