  1. OpenShift Bugs
  2. OCPBUGS-23475

[Reliability][regression]multus pods memory increased from <100M to 700+M in 7 days


    • Important
    • No
    • Approved
    • False
    • N/A
    • In Progress

      In the Reliability test (a loaded long run with a stable load), the memory of the 3 multus pods increased from <100 MiB to 700+ MiB in 7 days.
      
      The multus pods have a memory request of 65Mi and no memory limit. If the test runs longer and the memory keeps increasing, this issue can impact the nodes' resources.

      Version-Release number of selected component (if applicable):

      4.15.0-0.nightly-2023-11-13-174800

      How reproducible:

      This is the first time I have seen this. I did not see it in 4.14's Reliability test.

      Steps to Reproduce:

      1. Install an AWS compact cluster with 3 masters; the workers run on the master nodes too.
      2. Run the reliability-v2 test: https://github.com/openshift/svt/tree/master/reliability-v2. The test is a long run that simulates usage of the cluster by multiple customers.
      Config: 1 admin, 5 dev-test, 5 dev-prod, 1 dev-cron.
      3. Monitor the metrics: container_memory_rss{container="kube-multus",namespace="openshift-multus"}
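To turn the monitored metric in step 3 into a single number, a growth rate can be estimated from exported samples. The following is a minimal sketch, not part of the original report: it assumes the `container_memory_rss` series has already been exported as `(unix_seconds, bytes)` pairs, and the function name `growth_mib_per_day` and the sample values are hypothetical. It fits a least-squares line and reports the slope in MiB per day.

```python
from typing import Sequence, Tuple

MIB = 1024 * 1024
SECONDS_PER_DAY = 86400


def growth_mib_per_day(samples: Sequence[Tuple[float, float]]) -> float:
    """Return the least-squares slope of (unix_seconds, rss_bytes) samples in MiB/day."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    # Slope is in bytes per second; convert to MiB per day.
    return (cov / var_t) * SECONDS_PER_DAY / MIB


# Hypothetical data: one sample per day, rising linearly from 100 MiB to 700 MiB.
example = [(day * SECONDS_PER_DAY, (100 + 600 * day / 7) * MIB) for day in range(8)]
rate = growth_mib_per_day(example)
print(f"estimated growth: {rate:.1f} MiB/day")
```

A slope that stays clearly positive while the load is stable is the pattern described in this report; a slope near zero would match the expected behavior.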

      Actual results:

      The memory of the 3 multus pods increased from <100 MiB to 700+ MiB in 7 days.
      After the test load stopped, the memory stopped increasing, but it did not drop.

      Expected results:

      Memory should not increase continuously.

      Additional info:

      % oc adm top pod -n openshift-multus --containers=true --sort-by memory -l app=multus
      POD            NAME          CPU(cores)   MEMORY(bytes)
      multus-xp474   kube-multus   12m          1275Mi
      multus-xp474   POD           0m           0Mi
      multus-xt64s   kube-multus   21m          971Mi
      multus-xt64s   POD           0m           0Mi
      multus-d9xcs   kube-multus   6m           757Mi
      multus-d9xcs   POD           0m           0Mi
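For tracking these numbers over repeated runs, the command output above can be parsed into a structure. The following is a rough sketch under stated assumptions: the function name `parse_top_containers` is hypothetical, the output is assumed to be whitespace-separated with four columns, and memory is assumed to be reported in `Mi` as it is here.

```python
from typing import Dict, Tuple

# Output copied from this report.
SAMPLE = """\
POD NAME CPU(cores) MEMORY(bytes)
multus-xp474 kube-multus 12m 1275Mi
multus-xp474 POD 0m 0Mi
multus-xt64s kube-multus 21m 971Mi
multus-xt64s POD 0m 0Mi
multus-d9xcs kube-multus 6m 757Mi
multus-d9xcs POD 0m 0Mi
"""


def parse_top_containers(text: str) -> Dict[Tuple[str, str], int]:
    """Map (pod, container) to memory in Mi, skipping the header and malformed rows."""
    usage: Dict[Tuple[str, str], int] = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != 4 or fields[:2] == ["POD", "NAME"]:
            continue
        pod, container, _cpu, memory = fields
        if not memory.endswith("Mi"):
            continue  # Assumption: only Mi values are handled in this sketch.
        usage[(pod, container)] = int(memory[:-2])
    return usage


usage = parse_top_containers(SAMPLE)
multus_total = sum(mi for (_, name), mi in usage.items() if name == "kube-multus")
print(f"kube-multus total: {multus_total}Mi")
```

Comparing each `kube-multus` value against the 65Mi request gives a quick sense of how far each pod has drifted.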

      The monitoring screenshots:

      multus-memory-increase.png

      multus-memory-increase-stop.png

      Must-gather: must-gather.local.4628887688332215806.tar.gz

              tohayash@redhat.com Tomofumi Hayashi
              rhn-support-qili Qiujie Li
              Qiujie Li Qiujie Li