Type: Bug
Resolution: Done-Errata
Priority: Critical
Fix Version/s: CNV v4.14.0
Affects Version/s: None
Component/s: Storage Platform
Labels:
- blocker+
- cnv-4+
- cnvbugsm
- devel_ack+
- needinfo?
- pm_ack+
- qa_ack+

Blocked:
False
Blocked Reason:

None
Ready:
False
BZ Status:
CLOSED
BZ URL:
https://bugzilla.redhat.com/show_bug.cgi?id=2236223
Bugzilla Bug:
RHBZ: 2236223
[QE] How to address?:
---
[QE] Why QE missed?:
---
Market:

Severity:
Urgent

Regression:
No

SFDC Cases Links:
SFDC Cases Counter:
SFDC Cases Open:

Description of problem:
On recent OpenShift nightlies, simple image imports (e.g. Fedora) do not converge
unless the memory limit on CDI pods is raised to a very high value (1600M),
which suggests that memory throttling is taking place on the importer pod.

Version-Release number of selected component (if applicable):
OCP 4.14.0-0.nightly-2023-08-28-154013
CNV v4.14.0.rhel9-1796

How reproducible:
100%

Steps to Reproduce:
1. Create a DataVolume (DV) that imports a qcow2 image over HTTP (example manifest under Additional info)

Actual results:
The import essentially never converges.

Expected results:
The import succeeds in a timely manner.

Additional info:
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  annotations:
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
  name: test-dv-node-import-needs-convert
spec:
  source:
    http:
      url: http://.../Fedora-Cloud-Base-35-1.2.x86_64.qcow2
  pvc:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 12Gi
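
To make "converges in a timely manner" measurable, the DataVolume phase can be polled with a deadline. The following is a minimal sketch, not part of the original report: it assumes the `oc` CLI is logged in to the cluster, that the DV reports `Succeeded` or `Failed` in `.status.phase`, and the function names and the 600-second default timeout are hypothetical.

```python
import subprocess
import time
from typing import Callable, Optional


def oc_dv_phase(name: str, namespace: Optional[str] = None) -> str:
    """Return .status.phase of a DataVolume via the `oc` CLI (needs cluster access)."""
    cmd = ["oc", "get", "datavolume", name, "-o", "jsonpath={.status.phase}"]
    if namespace:
        cmd += ["-n", namespace]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


def wait_for_dv(get_phase: Callable[[], str],
                timeout_s: float = 600.0,   # hypothetical "timely" threshold
                interval_s: float = 5.0) -> bool:
    """Poll until the phase is Succeeded; return False on Failed or timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        phase = get_phase()
        if phase == "Succeeded":
            return True
        if phase == "Failed":
            return False
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For the manifest above, a call might look like `wait_for_dv(lambda: oc_dv_phase("test-dv-node-import-needs-convert"))`. Passing the phase lookup as a callable keeps the polling logic testable without a cluster.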

To observe how the issue is alleviated, edit HCO.spec with:
resourceRequirements:
  storageWorkloads:
    limits:
      cpu: 750m
      memory: 1600M
    requests:
      cpu: 100m
      memory: 60M
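
The same HCO.spec change can be expressed as a JSON merge patch. This is a sketch under assumptions: the helper name `storage_workloads_patch` is hypothetical, and only the field names and values shown in this report are used.

```python
import json


def storage_workloads_patch(cpu_limit: str, memory_limit: str,
                            cpu_request: str, memory_request: str) -> dict:
    """Build a merge-patch body for HCO spec.resourceRequirements.storageWorkloads."""
    return {
        "spec": {
            "resourceRequirements": {
                "storageWorkloads": {
                    "limits": {"cpu": cpu_limit, "memory": memory_limit},
                    "requests": {"cpu": cpu_request, "memory": memory_request},
                }
            }
        }
    }


# Values taken from the workaround described in this report.
patch = storage_workloads_patch("750m", "1600M", "100m", "60M")
patch_json = json.dumps(patch)
print(patch_json)
```

The printed JSON could then be applied with something like `oc patch hyperconverged kubevirt-hyperconverged -n openshift-cnv --type merge -p '<json>'`; the resource name and namespace here are the usual defaults and are an assumption, not something stated in this bug.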

Inspection of the same issue on GCP clusters importing a Windows image
showed high memory usage values (though not as high as the limit); the data is attached to the bug.

Some notes:

- Is it possible that the entire image stays in the page cache?
  Note that this is before qemu-img convert.
- Why did OOMs/throttles not happen before, say, in 4.14.0-ec.3?
- For some images, 2x the CDI pod limits unclogs the import;
  the limits have to go a lot higher for large images (Windows) to work, though.
- cgroupsv2 is the default now (throttles instead of OOM - https://kubernetes.io/blog/2021/11/26/qos-memory-resources/)
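
One way to probe the page-cache and throttling questions above is to read the importer container's cgroup v2 memory files. The sketch below is illustrative only: it assumes cgroup v2 with `memory.stat` and `memory.events` readable under `/sys/fs/cgroup` inside the container, and the function names are hypothetical. In `memory.stat`, `file` is page-cache memory and `anon` is anonymous memory; in `memory.events`, `high` counts throttling events at `memory.high`, `max` counts hits on the hard limit, and `oom_kill` counts kills.

```python
from pathlib import Path


def parse_flat_keyed(text: str) -> dict:
    """Parse a cgroup v2 flat-keyed file: one 'key value' pair per line."""
    out = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            out[parts[0]] = int(parts[1])
    return out


def summarize(stat_text: str, events_text: str) -> dict:
    """Extract the counters relevant to page cache use and throttling."""
    stat = parse_flat_keyed(stat_text)
    events = parse_flat_keyed(events_text)
    return {
        "anon_bytes": stat.get("anon", 0),
        "page_cache_bytes": stat.get("file", 0),
        "high_events": events.get("high", 0),
        "max_events": events.get("max", 0),
        "oom_kill_events": events.get("oom_kill", 0),
    }


def read_cgroup(root: str = "/sys/fs/cgroup") -> dict:
    """Read and summarize the current cgroup (path is an assumption)."""
    base = Path(root)
    return summarize((base / "memory.stat").read_text(),
                     (base / "memory.events").read_text())
```

A `page_cache_bytes` value close to the memory limit together with rising `high_events` or `max_events` would be consistent with the image sitting in the page cache while the pod is being throttled rather than OOM-killed.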

blocks

CNV-30277 Comments in "feat: add tekton-tasks-operator" PR #532

Closed

external trackers

PnT-DevOps Jira CNV-28673

Red Hat Errata Tool 113931

Red Hat Issue Tracker OCPBUGS-18965

Red Hat Product Errata RHSA-2023:6817

Assignee: Alex Kalenyuk

Reporter: Alex Kalenyuk

QA Contact: Natalie Gavrielov

Votes: 0

Watchers: 2

Created: 2023/08/30 5:44 PM

Updated: 2023/11/13 2:12 PM

Resolved: 2023/11/08 2:06 PM
