  1. OpenShift Bugs
  2. OCPBUGS-58270

autoSizingReserved allocated inadequate CPU reservation in control plane nodes


    • Quality / Stability / Reliability
    • False
    • None
    • Low
    • No
    • None
    • None
    • OCP Node Sprint 273 (blue)
    • 1
    • Done
    • Release Note Not Required
    • N/A
    • None
    • None
    • None
    • None

      Description of problem:

      Setting `autoSizingReserved: true` reserves too little CPU on control plane nodes: 90 millicores, versus the 500 millicore default that applies when autosizing is disabled. The resulting `systemReserved` allocation is inadequate and disrupts control plane operations.

      Version-Release number of selected component (if applicable):

      OCP 4.16

      How reproducible:

      Always

      Steps to Reproduce:

      1. Enable `autoSizingReserved: true` in a KubeletConfig.
      2. After the KubeletConfig has been applied, check the CPU reservation in /etc/node-sizing.env on the nodes.
      
      https://docs.openshift.com/container-platform/4.10/nodes/nodes/nodes-nodes-resources-configuring.html#nodes-nodes-resources-configuring-auto_nodes-nodes-resources-configuring 
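The reproduction steps above can be sketched roughly as follows. This is a minimal sketch, not the reporter's exact procedure: the resource name `dynamic-node-master` and the pool selector label are assumptions for targeting the control plane pool, and `<node-name>` is a placeholder. The `oc` commands are left as comments because they need access to a cluster.

```shell
#!/bin/sh
# Sketch only: the resource name and the pool selector label are assumptions,
# chosen to target the control plane (master) MachineConfigPool.
manifest=autosizing-kubeletconfig.yaml

cat > "$manifest" <<'EOF'
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: dynamic-node-master
spec:
  autoSizingReserved: true
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/master: ""
EOF

# Step 1 (requires cluster access): apply the KubeletConfig.
#   oc apply -f "$manifest"
#
# Step 2 (after the pool has finished updating): inspect the computed reservation.
#   oc debug node/<node-name> -- chroot /host cat /etc/node-sizing.env
echo "wrote $manifest"
```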

      Actual results:

      $ awk 'NR == 1 || $3 ~ "2805"' sos_commands/process/ps_alxwww
      F   UID     PID    PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
      4     0    2805       1  20   0 5164428 339568 -    Rsl  ?        125:45 /usr/bin/kubelet --config=/etc/kubernetes/kubelet.conf --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig --kubeconfig=/var/lib/kubelet/kubeconfig --container-runtime-endpoint=/var/run/crio/crio.sock --runtime-cgroups=/system.slice/crio.service --node-labels=node-role.kubernetes.io/control-plane,node-role.kubernetes.io/master,node.openshift.io/os_id=rhcos --node-ip=0.0.0.0 --minimum-container-ttl-duration=6m0s --cloud-provider=external --volume-plugin-dir=/etc/kubernetes/kubelet-plugins/volume/exec --image-credential-provider-bin-dir=/usr/libexec/kubelet-image-credential-provider-plugins --image-credential-provider-config=/etc/kubernetes/credential-providers/acr-credential-provider.yaml --hostname-override= --provider-id= --register-with-taints=node-role.kubernetes.io/master=:NoSchedule --pod-infra-container-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6485f12a32f3b065c6e9ada0f0cd0a766def9458e88ab423cac16a6ba6e9cc7 --system-reserved=cpu=0.09,memory=5Gi,ephemeral-storage=1Gi --v=2
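To make the shortfall in the output above explicit, the sketch below pulls the CPU value out of the `--system-reserved` flag and compares it with the 500m default mentioned in this report. The helper `to_millicores` is hypothetical and handles only plain core counts (such as `0.09`) and millicore values (such as `500m`).

```shell
#!/bin/sh
# Kubelet arguments copied from the ps output above, trimmed to the relevant flag.
cmdline='--system-reserved=cpu=0.09,memory=5Gi,ephemeral-storage=1Gi --v=2'

# Baseline from this report: 500m is reserved by default when autosizing is disabled.
default_m=500

# Hypothetical helper: convert a CPU quantity given in cores ("0.09")
# or in millicores ("500m") to an integer number of millicores.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%d\n", c * 1000 + 0.5 }' ;;
  esac
}

# Extract the cpu= value from the --system-reserved flag.
cpu=$(printf '%s\n' "$cmdline" |
  sed -n 's/.*--system-reserved=[^ ]*cpu=\([^, ]*\).*/\1/p')
reserved_m=$(to_millicores "$cpu")

if [ "$reserved_m" -lt "$default_m" ]; then
  echo "system-reserved cpu is ${reserved_m}m, $((default_m - reserved_m))m below the ${default_m}m default"
else
  echo "system-reserved cpu is ${reserved_m}m, at or above the ${default_m}m default"
fi
```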

      Expected results:

      `autoSizingReserved` should not reserve less CPU than would be reserved by default without any KubeletConfig in place. Allocations calculated by the `autoSizingReserved` feature should be added on top of the base CPU reservation of 500m.
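The expected behavior can be illustrated with a small calculation. The tier percentages in `autosized_cpu_millicores` below are an assumption: they follow a commonly used node-sizing heuristic (6% of the first core, 1% of the second, 0.5% of each of cores 3 and 4, 0.25% of each core above 4) and are not taken from this report. The 8-core node is likewise a hypothetical input. Only the 500m base value comes from the report.

```shell
#!/bin/sh
# Base reservation from this report (the default when autosizing is disabled).
base_m=500

# ASSUMPTION: tiered percentages from a commonly used node-sizing heuristic;
# not confirmed by this report. Takes a whole number of cores, prints millicores.
autosized_cpu_millicores() {
  awk -v n="$1" 'BEGIN {
    m = 0
    if (n >= 1) m += 60                          # 6% of the first core
    if (n >= 2) m += 10                          # 1% of the second core
    if (n > 2)  m += 5 * ((n < 4 ? n : 4) - 2)   # 0.5% of each of cores 3-4
    if (n > 4)  m += 2.5 * (n - 4)               # 0.25% of each core above 4
    printf "%d\n", m + 0.5                       # round to whole millicores
  }'
}

cores=8   # hypothetical node size
auto_m=$(autosized_cpu_millicores "$cores")

# Behavior requested in this report: add the autosized value to the base.
expected_m=$((base_m + auto_m))
echo "autosized: ${auto_m}m, expected reservation: ${expected_m}m"
```

Under these assumed tiers an 8-core node yields an autosized value that is far below the 500m base on its own, which is why the report asks for the two to be summed rather than for the autosized value to replace the base.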

      Additional info:

      In OpenShift Container Platform 4.10 and newer, half a CPU core (500 millicores) is reserved for the system slice by default. However, when `autoSizingReserved: true` is set, this is significantly reduced to 90 millicores...
      
      https://docs.openshift.com/container-platform/4.10/scalability_and_performance/recommended-host-practices.html#master-node-sizing_recommended-host-practices

              svanka@redhat.com Sai Ramesh Vanka
              rhn-support-rsandu Robert Sandu
              None
              None
              Bhargavi Gudi Bhargavi Gudi
              None