Type: Bug
Resolution: Unresolved
Priority: Critical
Fix Version/s: None
Affects Version/s: 4.20.0
Component/s: Node / CRI-O
Labels:
- cnv
- triaged

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
Approved
Sprint:
OCP Node Sprint 273 (blue), OCP Node Sprint 274 (blue), OCP Node Sprint 275 (blue)
sprint_count:
3

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
In Progress
Release Note Type:
Bug Fix
Release Note Text:
Fixed incorrect cpu.max settings when running Guaranteed QoS pods with the CPU Manager.

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

When a VMI is started with Guaranteed QoS, the resulting virt-launcher pod is very slow. Analysis shows that the cgroups are configured as follows:
```
/<path-to-pod>/<our-container>/cpu.max (1000 100000)
/<path-to-pod>/<our-container>/container/cpu.max (max 100000)
```
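This means that the CPU is throttled: the first cgroup allows only 1000 microseconds of CPU time per 100000-microsecond period, that is, 1% of one CPU.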
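To make the throttling concrete, here is a minimal sketch that converts a cpu.max value into a CPU count. It assumes the cgroup v2 format of "<quota> <period>" in microseconds, with "max" meaning no limit; the helper name `parse_cpu_max` is only illustrative.

```python
from typing import Optional


def parse_cpu_max(text: str) -> Optional[float]:
    """Return the limit in CPUs, or None when the quota is 'max' (no limit)."""
    fields = text.split()
    quota = fields[0]
    # The period is the second field; 100000 is the kernel default.
    period = int(fields[1]) if len(fields) > 1 else 100000
    if quota == "max":
        return None
    return int(quota) / period


# Values taken from the cgroup listings in this report.
for value in ("1000 100000", "max 100000", "200000 100000"):
    print(f"{value!r} -> {parse_cpu_max(value)}")
```

With the values from this report, the 4.20 setting works out to 0.01 CPUs, while the 4.19 setting works out to 2 CPUs.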

For the same pod on 4.19, the cgroups are:
```
/<path-to-pod>/<our-container>/cpu.max (200000 100000)
/<path-to-pod>/<our-container>/container/cpu.max (200000 100000)
```

Version-Release number of selected component (if applicable):

4.20

How reproducible:

Every time

Steps to Reproduce:

    1. Start a VM with dedicated CPUs
    2. Look at the cgroups on the node where the pod has been scheduled
    3. Try to open a console to the VMI
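Step 2 can be scripted. The following is a rough sketch that walks a pod's cgroup directory on the node and reports every cpu.max it finds. It assumes a cgroup v2 hierarchy; the function names are hypothetical, and the pod cgroup path must be supplied by the caller because it depends on the node's layout.

```python
import os
from typing import Iterator, Optional, Tuple


def read_cpu_max(path: str) -> Optional[float]:
    """Parse one cpu.max file; None means no limit ('max')."""
    with open(path) as f:
        fields = f.read().split()
    if fields[0] == "max":
        return None
    return int(fields[0]) / int(fields[1])


def walk_cpu_max(pod_cgroup: str) -> Iterator[Tuple[str, Optional[float]]]:
    """Yield (relative cgroup path, limit in CPUs) for each cpu.max found."""
    for dirpath, dirnames, filenames in os.walk(pod_cgroup):
        dirnames.sort()  # visit child cgroups in a stable order
        if "cpu.max" in filenames:
            rel = os.path.relpath(dirpath, pod_cgroup)
            yield rel, read_cpu_max(os.path.join(dirpath, "cpu.max"))


def report(pod_cgroup: str) -> None:
    """Print one line per cgroup below pod_cgroup that has a cpu.max file."""
    for rel, limit in walk_cpu_max(pod_cgroup):
        shown = "unlimited" if limit is None else f"{limit:g} CPUs"
        print(f"{rel}: {shown}")
```

Calling `report()` with the pod's cgroup directory prints one line per cgroup. Because a child cgroup cannot use more CPU than its ancestors allow, the smallest value along a path is the one that matters.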

Actual results:

The VMI boots very slowly

Expected results:

The VMI boots at normal speed

Additional info:

Upstream CRI-O issue: https://github.com/cri-o/cri-o/issues/9251
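The fix linked from this issue mentions a quota of -1. By the CFS convention a quota of -1 means "no limit", which cgroup v2 expresses as "max" in cpu.max. The sketch below shows that mapping only as an illustration; it is not taken from the CRI-O or crun source, and the function name `cpu_max_from_cfs` is hypothetical.

```python
def cpu_max_from_cfs(quota: int, period: int = 100000) -> str:
    """Map a CFS quota/period pair (microseconds) to a cgroup v2 cpu.max string."""
    if period <= 0:
        raise ValueError("period must be positive")
    if quota < 0:
        # A negative quota (conventionally -1) means no CPU limit.
        return f"max {period}"
    return f"{quota} {period}"


print(cpu_max_from_cfs(-1))      # unlimited
print(cpu_max_from_cfs(200000))  # two full CPUs per period
```

Under this assumption, an unlimited quota should never surface as a small numeric value such as 1000.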


We also noticed general slowness in CI runs with provider 1.33. With the 1.32 provider the test suite takes ~1h30m, compared with 4h20m with provider 1.33.
My personal suspicion is that something is wrong even with non-guaranteed QoS.

blocks

CNV-63585 [Tracker Bug] VMI pod slowness due to CPU throttling with k8s 1.33

Closed

is blocked by

RHEL-101023 Update crun to 1.22

Closed

split from

CNV-64418 kubevirtci 1.33 workaround: disable fg to make lanes work as before

Dev Complete

links to

Fix incorrectly set cpu.max when quota is -1. #1794

Assignee: Ayato Tokubi

Reporter: Federico Fossemo

Need Info From: None

Contributors: None

QA Contact: Min Li

Doc Contact: None

Votes: 0

Watchers: 8

Created: 2025/06/12 12:13 PM

Updated: 2025/08/14 3:51 AM
