OpenShift Bugs / OCPBUGS-167

AWS WinC: 240 pods per node causes NodeNotReady


    • Bug
    • Resolution: Obsolete
    • Normal
    • None
    • 4.11
    • Windows Containers
    • Moderate
    • 3
    • False

      Description of problem:
      On AWS, when running the node-pod-density workload to fill a worker node to the maximum of 250 pods per node, I am unable to reach 250 pods. Windows workers cannot reliably stay Ready above 230 pods.
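
      To confirm how close each worker actually gets to the limit, the scheduled pods can be counted per node. The following is a minimal sketch, assuming the output of `oc get pods -A -o json` has already been parsed into a dict; the function names, node names, and sample data are hypothetical, and the 230 threshold is simply the value observed in this report.

```python
from collections import Counter


def pods_per_node(pod_list):
    """Count scheduled pods per node in a parsed pod list document."""
    counts = Counter()
    for pod in pod_list.get("items", []):
        node = pod.get("spec", {}).get("nodeName")
        if node:  # pods that are not yet scheduled have no nodeName
            counts[node] += 1
    return counts


def nodes_at_or_above(counts, threshold=230):
    """Return the sorted names of nodes running at least `threshold` pods."""
    return sorted(name for name, count in counts.items() if count >= threshold)


def _pod(node=None):
    # Hypothetical helper that builds a minimal pod object for the sample.
    return {"spec": {"nodeName": node} if node else {}}


# Hypothetical sample: 231 pods on win-a, 5 on win-b, 1 unscheduled.
sample_pods = {"items": [_pod("win-a")] * 231 + [_pod("win-b")] * 5 + [_pod()]}
counts = pods_per_node(sample_pods)
print(dict(counts))
print(nodes_at_or_above(counts))
```

      Against a live cluster, the dict would come from something like `json.loads` applied to the `oc get pods -A -o json` output rather than from the sample above.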

      Version-Release number of selected component (if applicable):
      OCP 4.11
      WMCO 6.0.0

      How reproducible:
      80%

      Steps to Reproduce:
      1. Provision a 4.11 cluster in AWS with 2-8 Windows workers (i.e., OVN hybrid networking, WMCO 6.0.0)
      2. Run the node-pod-density workload with PODS_PER_NODE>=230

      Actual results:
      One or more Windows workers become NotReady.
      Sometimes they reboot themselves; other times they remain in a NotReady state and are unrecoverable.

      Expected results:
      Windows workers remain Ready under a 250-pod load.

      Additional info:
      Windows worker instance type tested: m5.2xlarge
      Load pod images tested: mcr.microsoft.com/oss/kubernetes/pause:3.6, k8s.gcr.io/pause:3.6
      node-density workload from: https://github.com/afcollins/e2e-benchmarking/commit/d09ea2accd5d9b0aed6defe6da02282840459e04
      kube-burner env values (used as-is, except with PODS_PER_NODE=240): https://github.com/afcollins/airflow-kubernetes/blob/0b7514875cb960cf0853ff9517a903cf5c1af683/dags/openshift_nightlies/config/benchmarks/control-plane-winc.json

            team-winc Team WinC
            ancollin@redhat.com Andrew Collins
            Votes: 0
            Watchers: 5
