OpenShift Bugs / OCPBUGS-25273

workload not distributed evenly in the cluster


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Undefined
    • None
    • 4.12
    • Documentation / Node
    • None
    • No
    • False
    • Release Note Not Required
    • In Progress

      Description of problem:

      The customer's cluster is designed with multi-AZ support for both block storage and compute. The customer is encountering issues with the default kube-scheduler, specifically with the LowNodeUtilization profile. Despite having approximately a dozen nodes per availability zone (AZ-A, AZ-B, AZ-C), workloads are not evenly distributed, which leaves AZ-C underutilized. As a result, nodes in the other zones become blocked once they reach the maximum Pod capacity of 250.
      
      Two solutions are currently under consideration:
      
      Transitioning to HighNodeUtilization:
      
      Leveraging node binpacking and static Kubelet reservations to establish an upper limit on resources.
      
      Aiming for a more balanced distribution of workloads across nodes and availability zones.
      
      Increasing Maximum Pod Count:
      
      Adjusting the maximum Pod count to a value that aligns with the customer's specific use case.
      
      Addressing the underutilization in AZ-C by accommodating more Pods on the available nodes.
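To make the difference between the two profiles concrete, here is a minimal Python sketch of least-allocated versus most-allocated node scoring. It assumes that LowNodeUtilization and HighNodeUtilization behave like these two upstream scoring strategies. The node names, capacities, and requests are hypothetical, and the sketch ignores filters (including the per-node Pod limit) and all other scoring plugins.

```python
def utilization(requested, capacity):
    """Mean requested/capacity percentage across the resources in `capacity`."""
    return sum(100 * requested[r] / capacity[r] for r in capacity) / len(capacity)


def least_allocated_score(requested, capacity):
    """Spreading: emptier nodes score higher."""
    return 100 - utilization(requested, capacity)


def most_allocated_score(requested, capacity):
    """Bin packing: fuller nodes score higher."""
    return utilization(requested, capacity)


def pick_node(nodes, score_fn):
    """Return the name of the node with the highest score."""
    return max(
        nodes,
        key=lambda name: score_fn(nodes[name]["requested"], nodes[name]["capacity"]),
    )


# Hypothetical nodes: one busy node in AZ-A, one mostly idle node in AZ-C.
nodes = {
    "node-az-a": {
        "capacity": {"cpu": 16, "memory": 64},
        "requested": {"cpu": 12, "memory": 48},
    },
    "node-az-c": {
        "capacity": {"cpu": 16, "memory": 64},
        "requested": {"cpu": 4, "memory": 16},
    },
}

print(pick_node(nodes, least_allocated_score))
print(pick_node(nodes, most_allocated_score))
```

In this sketch both strategies score on CPU and memory requests only, so the number of Pods already on a node does not influence the ranking.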
      
      
      Customer Questions:
      Which approach, HighNodeUtilization or increasing the maximum Pod count, would be more suitable for the customer's use case?
      
      Is it possible to fine-tune scheduling beyond the existing profiles through additional supported configurations?
      
      References:
      
      [1] https://docs.openshift.com/container-platform/4.12/nodes/scheduling/nodes-scheduler-profiles.html
      
      [2] https://docs.openshift.com/container-platform/4.12/nodes/nodes/nodes-nodes-resources-configuring.html
      
      To be more specific about fine-tuning scheduling beyond the existing profiles, the customer would also like to know whether the `requestedToCapacityRatio` scoringStrategy [3] is supported.
      
      [3] https://kubernetes.io/docs/concepts/scheduling-eviction/resource-bin-packing/#tuning-the-score-function
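For context on what the customer is asking about, the following Python sketch illustrates the idea behind the upstream `requestedToCapacityRatio` score function: each resource's utilization is mapped to a score through a piecewise-linear shape, and the per-resource scores are combined as a weighted average. The function names, shape points, weights, and resource figures are illustrative assumptions, not the scheduler's actual code, and the sketch says nothing about whether the strategy is supported in OpenShift.

```python
def shape_score(utilization, shape):
    """Linearly interpolate a score from (utilization %, score) points.

    `shape` must be sorted by utilization; values outside the range
    are clamped to the first or last point.
    """
    if utilization <= shape[0][0]:
        return shape[0][1]
    for (u0, s0), (u1, s1) in zip(shape, shape[1:]):
        if utilization <= u1:
            return s0 + (s1 - s0) * (utilization - u0) / (u1 - u0)
    return shape[-1][1]


def node_score(requested, capacity, weights, shape):
    """Weighted average of the per-resource shape scores."""
    total = 0.0
    weight_sum = 0
    for resource, weight in weights.items():
        util = 100 * requested[resource] / capacity[resource]
        total += weight * shape_score(util, shape)
        weight_sum += weight
    return total / weight_sum


# Hypothetical bin-packing shape: score rises from 0 to 10 with utilization.
shape = [(0, 0), (100, 10)]
weights = {"cpu": 3, "memory": 1}

score = node_score(
    requested={"cpu": 3, "memory": 2},
    capacity={"cpu": 4, "memory": 8},
    weights=weights,
    shape=shape,
)
print(score)  # 6.25
```

Reversing the shape, so that the score falls as utilization rises, would favor emptier nodes instead of fuller ones.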
          

      How reproducible:

          

      Steps to Reproduce:

          1.
          2.
          3.
          

      Actual results:

          

      Expected results:

          

      Additional info:

          

            ocp-docs-bot OCP DocsBot
            rhn-support-bhab Bharathi B
            Rama Kasturi Narra
            Votes: 0
            Watchers: 2
