Bug
Resolution: Duplicate
rhel-8
rhel-net-ovn
ssg_networking
Problem Description: Clearly explain the issue.
Memory usage of ovn-controller on some worker nodes is extremely high, which triggers the OOM killer repeatedly.
# oc adm top pods --containers -n openshift-ovn-kubernetes
POD                  NAME                          CPU(cores)   MEMORY(bytes)
:
ovnkube-node-aaaaa   kube-rbac-proxy               0m           48Mi
ovnkube-node-aaaaa   kube-rbac-proxy-ovn-metrics   0m           50Mi
ovnkube-node-aaaaa   ovn-acl-logging               0m           2Mi
ovnkube-node-aaaaa   ovn-controller                992m         113667Mi
ovnkube-node-aaaaa   ovnkube-node                  122m         102Mi
ovnkube-node-bbbbb   kube-rbac-proxy               0m           47Mi
ovnkube-node-bbbbb   kube-rbac-proxy-ovn-metrics   0m           48Mi
ovnkube-node-bbbbb   ovn-acl-logging               0m           2Mi
ovnkube-node-bbbbb   ovn-controller                993m         114988Mi
ovnkube-node-bbbbb   ovnkube-node                  120m         100Mi
:
# oc adm node-logs -l kubernetes.io/hostname=<node_name> --path='journal' | grep "Out of memory"
kernel: Out of memory: Killed process 801747 (td-agent) total-vm:623012kB, anon-rss:55908kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:508kB oom_score_adj:998
kernel: Out of memory: Killed process 801659 (fluentd-entrypo) total-vm:12068kB, anon-rss:360kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:68kB oom_score_adj:998
kernel: Out of memory: Killed process 829234 (docker-entrypoi) total-vm:12212kB, anon-rss:572kB, file-rss:4kB, shmem-rss:0kB, UID:1000 pgtables:68kB oom_score_adj:995
kernel: Out of memory: Killed process 835139 (sh) total-vm:12080kB, anon-rss:476kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:72kB oom_score_adj:995
kernel: Out of memory: Killed process 800413 (ovs-vswitchd) total-vm:4414620kB, anon-rss:278016kB, file-rss:41924kB, shmem-rss:0kB, UID:800 pgtables:1008kB oom_score_adj:0
kernel: Out of memory: Killed process 800497 (NetworkManager) total-vm:393344kB, anon-rss:6944kB, file-rss:2356kB, shmem-rss:0kB, UID:0 pgtables:376kB oom_score_adj:0
kernel: Out of memory: Killed process 852934 (systemd-udevd) total-vm:104272kB, anon-rss:4016kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:188kB oom_score_adj:0
kernel: Out of memory: Killed process 852926 (systemd-udevd) total-vm:104272kB, anon-rss:4016kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:188kB oom_score_adj:0
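If it helps with triage, ovn-controller's own memory accounting can be dumped from inside the affected container. A minimal sketch, assuming the pod name ovnkube-node-aaaaa from the output above and that ovn-appctl can reach the ovn-controller control socket inside that container:
# oc exec -n openshift-ovn-kubernetes ovnkube-node-aaaaa -c ovn-controller -- ovn-appctl -t ovn-controller memory/show
The memory/show output lists ovn-controller's internal memory consumers and may indicate which structure accounts for the growth.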
Impact Assessment: Describe the severity and impact (e.g., network down, availability of a workaround, etc.).
The high memory usage of ovn-controller triggers the OOM killer, which kills many other processes, and the affected worker nodes flap between Ready and NotReady as a result.
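For reference, the flapping can be confirmed and correlated with the OOM events like this (a sketch; <node_name> is a placeholder for an affected worker):
# oc get nodes -w
# oc adm node-logs -l kubernetes.io/hostname=<node_name> --path='journal' | grep -c "Out of memory"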
Software Versions: Specify the exact versions in use (e.g., openvswitch3.1-3.1.0-147.el8fdp).
ovn22.12-22.12.0-18.el8fdp.x86_64
ovn22.12-central-22.12.0-18.el8fdp.x86_64
ovn22.12-vtep-22.12.0-18.el8fdp.x86_64
ovn22.12-host-22.12.0-18.el8fdp.x86_64
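A sketch of how these can be collected from inside the ovn-controller container (assuming the pod name ovnkube-node-aaaaa from the output above and that rpm is available in the image):
# oc exec -n openshift-ovn-kubernetes ovnkube-node-aaaaa -c ovn-controller -- rpm -qa 'ovn*'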
Issue Type: Indicate whether this is a new issue or a regression (if a regression, state the last known working version).
Not sure.
Reproducibility: Confirm if the issue can be reproduced consistently. If not, describe how often it occurs.
Not sure.
The high memory usage started suddenly one day on some worker nodes.
We tried restarting ovn-controller and the worker nodes, but memory usage grows back quickly and the OOM killer is triggered again.
Reproduction Steps: Provide detailed steps or scripts to replicate the issue.
Not sure.
Expected Behavior: Describe what should happen under normal circumstances.
Memory usage of ovn-controller stays low and stable.
Observed Behavior: Explain what actually happens.
Memory usage of ovn-controller is extremely high (over 110 GiB per ovn-controller container, as shown in the output above).
Troubleshooting Actions: Outline the steps taken to diagnose or resolve the issue so far.
We tried restarting ovn-controller and the affected worker nodes, but memory usage climbs back up shortly afterwards and the OOM killer is triggered again.
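For reference, one way to restart the ovn-controller container on an affected node is to delete its ovnkube-node pod so the DaemonSet recreates it, then re-check memory usage (a sketch, assuming the pod name from the output above):
# oc delete pod -n openshift-ovn-kubernetes ovnkube-node-aaaaa
# oc adm top pods --containers -n openshift-ovn-kubernetes | grep ovn-controller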