CNV-16324

[2052689] Overhead Memory consumption calculations are incorrect



      Description of problem:

      The overhead memory calculations that OpenShift Virtualization uses are incorrect.

      The VMI's Pod is getting OOM-killed, causing the VM guest to terminate.

      The sum of the RSS memory within the VMI's pod exceeds the Pod's memory limit. This means that the combined memory of all the processes in this pod (virt-launcher, qemu, libvirt, etc.) somehow exceeds the memory overhead we calculate as being needed for the Pod which hosts the guest VM.

      Version-Release number of selected component (if applicable):

      4.x; the current customer case takes place in 4.8.

      How reproducible:

      Based on the information in this PR, it appears possible for the memory limit to be exceeded. Here's the evidence I've gathered.

      When the VM has a request of 22 Gi set on the VM's spec, the corresponding pod request is 22.23 Gi. That extra 0.23 Gi on the pod spec is meant to represent the overhead required for virt-launcher and libvirt to run within the same environment as the qemu guest.
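
      To make that arithmetic concrete, here is a minimal sketch (not the actual KubeVirt overhead function) of how the virt-launcher pod's memory request ends up larger than the VM's memory request. The estimatedOverhead helper and its per-vCPU term are illustrative assumptions; only the 138Mi constant comes from the KubeVirt code quoted further down.

      package main

      import (
          "fmt"

          "k8s.io/apimachinery/pkg/api/resource"
      )

      // estimatedOverhead is a stand-in for KubeVirt's real overhead calculation.
      // Only the 138Mi constant is taken from the KubeVirt code quoted below; the
      // per-vCPU term is an illustrative placeholder (the real calculation includes
      // additional terms such as pagetable overhead).
      func estimatedOverhead(vcpus int64) resource.Quantity {
          overhead := resource.MustParse("0Mi")
          // Fixed allowance for shared libraries and the non-qemu processes in the pod.
          overhead.Add(resource.MustParse("138Mi"))
          // Illustrative per-vCPU component: 8Mi per vCPU.
          overhead.Add(*resource.NewQuantity(vcpus*8*1024*1024, resource.BinarySI))
          return overhead
      }

      func main() {
          vmRequest := resource.MustParse("22Gi")

          podRequest := vmRequest.DeepCopy()
          podRequest.Add(estimatedOverhead(4))

          // With these illustrative terms the pod request comes out a bit above 22Gi;
          // in the customer case the real calculation produced 22.23Gi.
          fmt.Printf("VM request:  %s\n", vmRequest.String())
          fmt.Printf("Pod request: %s\n", podRequest.String())
      }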

      Looking at the ps aux output, the actual overhead of all the processes (minus qemu) within the Pod is around 0.21 Gi for libvirt and virt-launcher. That only leaves 0.02 Gi (about 20Mi) of overhead for the qemu process itself.

      Looking at the KubeVirt code, I see this being added as overhead to account for shared libraries and the non-qemu processes within the pod:

      overhead.Add(resource.MustParse("138Mi"))
      The reality of what I'm seeing in the ps aux screenshot is that the RSS of all the non-qemu processes in the virt-launcher pod is more like 214Mi (0.21 Gi), not 138Mi.
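
      As a quick sanity check on those numbers (all taken from this report, with 236Mi assumed as the Mi equivalent of the ~0.23 Gi pod headroom), the same resource.Quantity arithmetic shows the shortfall:

      package main

      import (
          "fmt"

          "k8s.io/apimachinery/pkg/api/resource"
      )

      func main() {
          fixedAllowance := resource.MustParse("138Mi") // constant quoted from the KubeVirt code above
          observedRSS := resource.MustParse("214Mi")    // libvirt + virt-launcher RSS summed from ps aux
          podHeadroom := resource.MustParse("236Mi")    // assumed Mi equivalent of the ~0.23 Gi pod overhead

          // How far the observed non-qemu RSS overshoots the fixed 138Mi allowance.
          shortfall := observedRSS.DeepCopy()
          shortfall.Sub(fixedAllowance)
          fmt.Printf("non-qemu RSS exceeds the 138Mi allowance by %s\n", shortfall.String()) // 76Mi

          // What is left of the pod's headroom for qemu's own non-guest memory.
          leftForQemu := podHeadroom.DeepCopy()
          leftForQemu.Sub(observedRSS)
          fmt.Printf("headroom left for qemu's own overhead: %s\n", leftForQemu.String()) // 22Mi
      }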

      Steps to Reproduce:

      Not sure

      Actual results:

      The fact that the customer is hitting the memory limits and the VMI pod is being killed shows that our memory calculations aren't completely accurate. More memory is being consumed than expected.

      Expected results:

      Based on the memory calculations this shouldn't be an issue, but the VMI is constantly getting OOM-killed, and a workaround had to be established in order to stabilize the customer environment. If our overhead calculations here are inaccurate, more customers will be impacted by OOM errors soon.

      Additional info:

      As a workaround, shift-virtualization established a trick to increase the Pod's memory overhead and avoid the OOM. This trick is basically the inverse of how we document memory overcommit here.

      The trick is to tell the Guest OS that it has less memory than what we're actually requesting for the Pod's environment.

      On the domain spec, if we originally set the request/limit to 22 Gi like this...

      resources:
        requests:
          cpu: '4'
          memory: 22Gi
        limits:
          cpu: '4'
          memory: 22Gi
      We'd change that section to look like this...

      memory:
        guest: 22Gi
      resources:
        requests:
          cpu: '4'
          memory: 23Gi
        limits:
          cpu: '4'
          memory: 23Gi

      That tells the guest it has only 22Gi of memory, while giving the Pod's environment more memory.
