Bug
Resolution: Done-Errata
Major
4.15
Important
No
Approved
False
N/A
In Progress
Description of problem:
In the Reliability test (a long-running test under a stable load), the memory usage of the 3 multus pods increased from under 100 MiB to more than 700 MiB over 7 days. The multus pods have a memory request of 65Mi and no memory limit, so if the test runs longer and the memory keeps increasing, this issue can impact the nodes' resources.
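To verify the configured requests and limits, a minimal check (assuming the DaemonSet is named multus, which matches the pod names in the output under Additional info):

% oc get daemonset multus -n openshift-multus -o jsonpath='{.spec.template.spec.containers[?(@.name=="kube-multus")].resources}'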
Version-Release number of selected component (if applicable):
4.15.0-0.nightly-2023-11-13-174800
How reproducible:
Hit this for the first time; I did not see this in 4.14's Reliability test.
Steps to Reproduce:
1. Install an AWS compact cluster with 3 masters (the workers are colocated on the master nodes).
2. Run the reliability-v2 test: https://github.com/openshift/svt/tree/master/reliability-v2. The test runs for a long time and simulates multiple customers using the cluster. Config: 1 admin, 5 dev-test, 5 dev-prod, 1 dev-cron.
3. Monitor the metric container_memory_rss{container="kube-multus",namespace="openshift-multus"} (a sampling sketch follows the steps).
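A minimal sketch for sampling this metric from the command line, assuming the thanos-querier route in openshift-monitoring is reachable and the logged-in user is allowed to query it (jq is used only to flatten the JSON response):

# Sample kube-multus RSS every 5 minutes via the Thanos querier API.
TOKEN=$(oc whoami -t)
HOST=$(oc get route thanos-querier -n openshift-monitoring -o jsonpath='{.spec.host}')
while true; do
  curl -sk -H "Authorization: Bearer $TOKEN" "https://$HOST/api/v1/query" \
    --data-urlencode 'query=container_memory_rss{container="kube-multus",namespace="openshift-multus"}' \
    | jq -r '.data.result[] | "\(.metric.pod) \(.value[1])"'
  sleep 300
done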
Actual results:
The memory usage of the 3 multus pods increased from under 100 MiB to more than 700 MiB in 7 days. After the test load stopped, the memory stopped increasing but did not drop back down.
Expected results:
Memory should not continuously increase.
Additional info:
% oc adm top pod -n openshift-multus --containers=true --sort-by memory -l app=multus
POD            NAME          CPU(cores)   MEMORY(bytes)
multus-xp474   kube-multus   12m          1275Mi
multus-xp474   POD           0m           0Mi
multus-xt64s   kube-multus   21m          971Mi
multus-xt64s   POD           0m           0Mi
multus-d9xcs   kube-multus   6m           757Mi
multus-d9xcs   POD           0m           0Mi
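To cross-check the kubelet-reported values against the container's own cgroup accounting, one option (a sketch: the pod name comes from the output above, and a cgroup v2 node with cgroup namespaces is assumed) is:

% oc exec -n openshift-multus multus-xp474 -c kube-multus -- cat /sys/fs/cgroup/memory.stat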
Monitoring screenshot:
multus-memory-increase-stop.png
Must-gather: must-gather.local.4628887688332215806.tar.gz
Links to:
RHSA-2023:7198 OpenShift Container Platform 4.15 security update