OpenShift Bugs / OCPBUGS-61881

Reduced process performance and stability when running on high-core-count hardware

      Description of problem:

      Go applications in OpenShift clusters (such as kube-apiserver) can exhibit performance issues when running on high-core-count systems (e.g. 512 cores). These applications create threads in line with the number of CPUs visible to the process (in its cpuset), which can result in less than optimal scheduling performance, and significant CPU spikes may be observed during operations such as garbage collection that utilise a large number of threads.
      
      This default behaviour can cause resource contention, reduced cluster performance, and instability, even when the applications are not under load. Reducing the number of logical cores available to OpenShift (e.g. reducing a 512-core system to 128 cores) can improve performance, which is counter-intuitive given that it reduces the effective capability of the hardware.
      
      This was seen on a customer cluster running OpenShift 4.19: 
      - 3 master nodes, each with a 512-core CPU and 2TB RAM 
      - 1 worker node with a 512-core CPU and 2TB RAM 
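
      For illustration, a minimal (hypothetical) Go sketch that prints the CPU count the runtime sees, the resulting default GOMAXPROCS, and the process's OS thread count; on a high-core-count node the first two track the size of the cpuset, and the OS thread count tends to grow with them:
      ```
      package main

      import (
          "fmt"
          "os"
          "runtime"
          "strings"
      )

      // threadCount reads the "Threads:" field from /proc/self/status (Linux only).
      func threadCount() string {
          data, err := os.ReadFile("/proc/self/status")
          if err != nil {
              return "unknown"
          }
          for _, line := range strings.Split(string(data), "\n") {
              if strings.HasPrefix(line, "Threads:") {
                  return strings.TrimSpace(strings.TrimPrefix(line, "Threads:"))
              }
          }
          return "unknown"
      }

      func main() {
          // Logical CPUs usable by this process (its affinity mask / cpuset),
          // queried by the runtime at startup.
          fmt.Println("runtime.NumCPU():     ", runtime.NumCPU())
          // GOMAXPROCS(0) reports the current setting without changing it; it
          // defaults to NumCPU unless the GOMAXPROCS environment variable is set.
          fmt.Println("runtime.GOMAXPROCS(0):", runtime.GOMAXPROCS(0))
          // OS threads currently owned by the process.
          fmt.Println("OS threads:           ", threadCount())
      }
      ```
      On the 512-core nodes above, this relationship is what drives the 500+ threads per process observed in the results below.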

      Version-Release number of selected component (if applicable):

       OpenShift 4.19

      How reproducible:

      Reproducible to varying degrees on high-core-count nodes (>128 cores); higher core counts introduce more significant peaks.
      
      This was tested and verified on a bare-metal SNO lab cluster with a 128-core CPU count: https://docs.google.com/document/d/1_yE2DPHjqH89box3d1lv5moyWC_VZFkN50hIgTUneH4/edit?tab=t.0#heading=h.1outy5ivqwra
      
      Reproduced on a 576-core bare-metal cluster:
      https://docs.google.com/document/d/1GU4PdsuoVbqhQ3rxl3f39t3ALlzniIeaVYL0fjuQ0rI/edit

      Steps to Reproduce:

      1. Deploy an OpenShift cluster on high-core-count hardware (128+ cores)
      
      2. Check thread count for Go applications: 
        $ oc exec -n openshift-apiserver <pod> -- sh -c 'ps -eLf | grep apiserver | grep -v grep | wc -l'   
      
      3. Observe CPU performance directly via pidstat or some other means:
        $ pidstat -p $PID 1 | awk 'NR>3 {print $1,$8}'
        06:34:13 18.00
        06:34:14 17.00
        06:34:15 12.00
        06:34:16 12.00
        06:34:17 4416.00
        06:34:18 14.00
        06:34:19 14.00

      Actual results:

      Significant CPU utilisation peaks, with high thread counts for Go-based applications
      
      e.g.
       - 512-core cluster: Go applications create 500+ threads per process
       - 128-core cluster: Go applications create 120+ threads per process 

      Expected results:

      - CPU utilisation should remain stable on a system, regardless of the number of cores enabled on the hardware.
        i.e. under the same load on the same hardware, a 512-core system should perform as well as (or better than) a smaller core-count system.

      Additional info:

      We believe that the Go runtime, and in particular the number of threads Go uses for its operations (as determined internally by GOMAXPROCS), is a significant contributor to this performance inconsistency.
      
      The Go runtime uses sched_getaffinity() to calculate the number of CPUs available to the process, which determines the default GOMAXPROCS value. By artificially reducing this via the GOMAXPROCS environment variable, we are able to override the default value and emulate how the application might run on a smaller core-count system, which in several instances has improved the performance and stability of the OpenShift components under test.
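
      As a rough illustration (a hypothetical sketch using the golang.org/x/sys/unix package, not code from the affected components), the affinity-derived CPU count and the effective GOMAXPROCS can be compared as follows:
      ```
      package main

      import (
          "fmt"
          "os"
          "runtime"

          "golang.org/x/sys/unix"
      )

      func main() {
          // The same information the Go runtime consults at startup: the number
          // of CPUs in this process's affinity mask (sched_getaffinity).
          var set unix.CPUSet
          if err := unix.SchedGetaffinity(0, &set); err != nil {
              fmt.Fprintln(os.Stderr, "sched_getaffinity:", err)
              os.Exit(1)
          }
          fmt.Println("CPUs in affinity mask:", set.Count())

          // The effective GOMAXPROCS: defaults to the value above unless
          // overridden by the GOMAXPROCS environment variable or a
          // runtime.GOMAXPROCS(n) call.
          fmt.Println("effective GOMAXPROCS: ", runtime.GOMAXPROCS(0))
          fmt.Println("GOMAXPROCS env:       ", os.Getenv("GOMAXPROCS"))
      }
      ```
      Running the same binary with and without GOMAXPROCS set (e.g. GOMAXPROCS=32) should show the affinity-derived count unchanged while the effective GOMAXPROCS follows the override.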
      
      This forced adjustment of GOMAXPROCS can be applied container-runtime-wide by using a CRI-O config similar to the following:
      ``` 
      [crio.runtime]
      default_env = [
          "NSS_SDB_USE_CACHE=no",
          "GOMAXPROCS=32",
      ]
      ```
      
      WARNING: by adjusting GOMAXPROCS artificially, such as via the environment variable above, you are effectively restricting the number of CPUs that Go will use concurrently.
      
      
      For core workloads, applying a PerformanceProfile (and optionally Workload Partitioning) may help to isolate them and to change the cpuset available to processes, which affects the default Go runtime behaviour. This, however, has no impact on workloads running outside of the systemd system.slice.
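      
      As an illustration only (the profile name and CPU ranges below are placeholders for a 512-core node, not a recommendation for the affected cluster), a minimal PerformanceProfile of the kind referred to above might look similar to:
      ```
      apiVersion: performance.openshift.io/v2
      kind: PerformanceProfile
      metadata:
        name: high-core-count-example   # placeholder name
      spec:
        cpu:
          # Example split only: reserve a small set of cores for system/management
          # daemons and isolate the remainder; real ranges depend on the hardware.
          reserved: "0-3"
          isolated: "4-511"
        nodeSelector:
          node-role.kubernetes.io/worker: ""
      ```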
      
