OpenShift Virtualization / CNV-8948

[1906151] High CPU/Memory usage of Kube API server following a CNV installation

    • Urgent
    • No

      Description of problem:
      I'm not really convinced this falls under the installation component, but for lack of a better place, I will open it against that for now. Please feel free to move it where it needs to be.

      Immediately following the install of CNV on a 500-node bare-metal cluster (3 masters + 497 workers), we see 3511 CNV-related pods brought up, which is quite a lot. Along with that, kube-apiserver memory and CPU consumption increases at the time of the CNV install and remains high afterwards. It seems we need to investigate whether these pods are doing any unwanted polling or opening too many watches.

      I will update the bug with more info as I find it. I'm also happy to provide access to the cluster while it's up.

      Version-Release number of selected component (if applicable):
      2.4.3 (OCP 4.6.4)

      How reproducible:
      100% following a CNV install on a large-scale environment.

      Steps to Reproduce:
      1. Install a large OCP cluster (500 nodes)
      2. Install CNV
      3.

      Actual results:

      API server on all masters is overloaded

      Expected results:

      The increase in load on the API server should not be that significant

      Additional info:
      [kni@e16-h18-b03-fc640 ansible]$ oc get csv -n openshift-cnv
      NAME                                      DISPLAY                    VERSION   REPLACES                                  PHASE
      kubevirt-hyperconverged-operator.v2.4.3   OpenShift Virtualization   2.4.3     kubevirt-hyperconverged-operator.v2.4.2   Installing
      [kni@e16-h18-b03-fc640 ansible]$ oc get ds
      NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                   AGE
      bridge-marker                  498       498       498     498          498         beta.kubernetes.io/arch=amd64   4h57m
      hostpath-provisioner           495       495       495     495          495         <none>                          4h32m
      kube-cni-linux-bridge-plugin   498       498       498     498          498         beta.kubernetes.io/arch=amd64   4h57m
      kubevirt-node-labeller         495       495       495     495          495         <none>                          4h57m
      nmstate-handler                498       498       498     498          498         beta.kubernetes.io/arch=amd64   4h57m
      ovs-cni-amd64                  498       498       498     498          498         beta.kubernetes.io/arch=amd64   4h57m
      virt-handler                   495       495       495     495          495         <none>                          4h53m
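
      As a rough sanity check on the pod count, the DESIRED values from the daemonset listing can be summed and compared with the 3511 CNV-related pods mentioned in the description. This is only a sketch: it assumes every daemonset pod listed here is included in that 3511 figure, and the variable names are illustrative.

```python
# DESIRED counts copied from the `oc get ds` output in this report.
daemonset_desired = {
    "bridge-marker": 498,
    "hostpath-provisioner": 495,
    "kube-cni-linux-bridge-plugin": 498,
    "kubevirt-node-labeller": 495,
    "nmstate-handler": 498,
    "ovs-cni-amd64": 498,
    "virt-handler": 495,
}

# Total CNV-related pod count reported in the description.
# Assumption: the daemonset pods above are all part of this total.
TOTAL_CNV_PODS = 3511

daemonset_pods = sum(daemonset_desired.values())
other_pods = TOTAL_CNV_PODS - daemonset_pods
share = daemonset_pods / TOTAL_CNV_PODS

print(f"daemonset pods: {daemonset_pods} of {TOTAL_CNV_PODS} ({share:.1%})")
print(f"pods not accounted for by daemonsets: {other_pods}")
```

      Under that assumption, the seven daemonsets account for 3477 of the 3511 pods, leaving 34 from other workloads. If so, the per-node agents are the first place to look for excess polling or watches, since their load on the API server scales with node count.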

      In the attached pictures, you can see that around 13:30, when the CNV install first happened, API server CPU/memory usage increased and stayed high.

              ellorent Felix Enrique Llorente Pastora
              smalleni@redhat.com Sai Sindhur Malleni
              Meni Yakove Meni Yakove