Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-63498

Uneven HCP pods scheduling: 352 pods from 27 HCP scheduled on a single worker node

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Unresolved
    • Icon: Critical Critical
    • None
    • 4.17.z
    • HyperShift
    • Quality / Stability / Reliability
    • False
    • Hide

      None

      Show
      None
    • None
    • None
    • None
    • None
    • None
    • None
    • None
    • Customer Escalated
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      Description of problem:

      In a management cluster with 55 Hosted Control Planes (HCPs) and (18 nodes, 3 master+15 worker nodes), 352 HCP pods from 27 different hosted clusters (49% of all HCPs) were scheduled onto a single worker node (we08d23). Cluster is experiencing severe latency issues, with upto 20+ second delays for basic system operations like `busctl` calls to systemd-hostnamed. Some nodes are experiencing high workload pressure from these HCP workloads that are observed to be unevenly distributes among the 15 worker nodes. 
      
      19 kube-apiserver pods were scheduled to that worker. 

       

      $ cat sos_commands/crio/crictl_pods | grep "clusters-" | awk '{print $6}' | grep kube-apiserver | wc -l
      19 
      $ cat sos_commands/crio/crictl_pods | grep "clusters-" | wc -l
      352
      $ cat sos_commands/crio/crictl_pods | grep "clusters-" | awk '{print $7}' | cut -d- -f1-2 | sort -u | wc -l
      27
      

      Environment:

       

       

      OpenShift Version: 4.17
      Worker Nodes: 15 total (256 CPU)
      - Reserved CPUs for housekeeping: 32 (CPUs 0-31) - via Performance Profile
      - Isolated CPUs: 224 (CPUs 32-255)
      Hosted Control Planes: ~55 

       

      Steps to Reproduce:

          1. 50+ Hosted Control Planes with the cluster spec above.
          2. Worker nodes with CPU isolation (limited allocatable CPUs)        

      Actual results:

      19 kube-apiservers in one node     

      Expected results:

      55 apiservers ÷ 15 nodes * 3(HA) = 11 per node for kube-apiservers

      Additional info:

      All 19 apiservers are from different hosted clusters, meaning the affinity is working correctly. 

              cewong@redhat.com Cesar Wong
              abdullahsikder Abdullah Sikder
              None
              None
              Ke Wang Ke Wang
              None
              Votes:
              2 Vote for this issue
              Watchers:
              12 Start watching this issue

                Created:
                Updated: