OCPBUGS-9979

IBM Cloud: Node density tests fail due to increased pod latency times


    • Important
    • No
    • Rejected
    • False

      Description of problem:

      Node density tests fail due to increased pod latency times on IBM Cloud

       

      Version-Release number of selected component (if applicable):

      4.13.0-0.nightly-2023-03-05-104719

       

      How reproducible: 100%

      Steps to reproduce:

      1. Build/create an IBM Cloud cluster with the following parameters:
      vm_type_masters: bx2-8x32
      vm_type_workers: bx2-4x16
      region: 'us-east'
      installer_payload_image: latest 4.13 nightly build

      2. Run the kube-burner node-density test with the following parameters:
      VARIABLE: 200 (pods per node; reported as PODS_PER_NODE in the results)
      NODE_COUNT: 40 (you can scale up to 40 worker nodes during the kube-burner job or before)
      QPS=50
      BURST=50
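
      For reference, a rough sketch of how the requested density relates to the iteration count that kube-burner logs for this run ("Job node-density: 7460 iterations with 1 Pod replicas"). The names pod_budget and already_running are hypothetical, and the reading that the gap consists of pods already running on the worker nodes is an assumption, not something the job output confirms.

```python
# Values taken from the reproduction steps and the kube-burner log in this report.
NODE_COUNT = 40           # worker nodes
PODS_PER_NODE = 200       # the VARIABLE parameter
ITERATIONS_LOGGED = 7460  # "Job node-density: 7460 iterations with 1 Pod replicas"


def pod_budget(node_count: int, pods_per_node: int) -> int:
    """Total pods the cluster would hold at the requested density."""
    return node_count * pods_per_node


total = pod_budget(NODE_COUNT, PODS_PER_NODE)

# ASSUMPTION: the difference is made up of pods that were already running
# on the worker nodes before the job started. The log does not state this.
already_running = total - ITERATIONS_LOGGED

print(f"requested density: {total} pods")
print(f"iterations logged: {ITERATIONS_LOGGED}")
print(f"gap (assumed pre-existing pods): {already_running}")
```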

       

      Actual results:

      The job fails: the P99 pod Ready latency (11.60s) is higher than the configured threshold of 5s.

      OCP Version                  4.13.0-0.nightly-2023-03-05-104719
      Flexy Id                     183166
      Scale Ci Job                 2159
      Grafana URL                  59eabcb1-d8b9-4d79-a7ee-f31578738625
      Status                       FAIL
      Cloud                        ibmcloud
      Arch Type                    amd64
      Network Type                 OVN
      Worker Count                 40
      PODS_PER_NODE                200
      NODES                        40
      Avg Pod Ready (ms)           3900
      Avg Pod Scheduled (ms)       0
      Avg Initialized (ms)         7
      Avg Containers Ready (ms)    3900
      Google Sheet Data            (no value)
      Time/Date                    2023-03-06 10:10:35.961914-05:00
      ENV_VARS                     QPS=50, BURST=50

       

      03-06 10:02:27.095  ###############################################
      03-06 10:02:27.095  Mon Mar  6 15:02:26 UTC 2023 Indexing enabled, using metrics from metrics-profiles/metrics.yaml
      03-06 10:02:27.095  ~/ws/workspace/multibranch-pipeline_kube-burner/workloads/kube-burner/workloads/node-pod-density ~/ws/workspace/multibranch-pipeline_kube-burner/workloads/kube-burner
      03-06 10:02:27.095  time="2023-03-06 15:02:26" level=info msg="🔥 Starting kube-burner (0.17.3@c38fe7eb37c62686b68e2b64bdc8311a4d73d8f1) with UUID 59eabcb1-d8b9-4d79-a7ee-f31578738625"
      03-06 10:02:27.095  time="2023-03-06 15:02:26" level=info msg="👽 Initializing prometheus client"
      03-06 10:02:27.095  time="2023-03-06 15:02:26" level=info msg="📁 Creating indexer: elastic"
      03-06 10:02:27.095  time="2023-03-06 15:02:26" level=info msg="📈 Creating measurement factory"
      03-06 10:02:27.095  time="2023-03-06 15:02:26" level=info msg="Registered measurement: podLatency"
      03-06 10:02:27.095  time="2023-03-06 15:02:26" level=info msg="Preparing create job: node-density"
      03-06 10:02:27.095  time="2023-03-06 15:02:26" level=info msg="Job node-density: 7460 iterations with 1 Pod replicas"
      03-06 10:02:27.095  time="2023-03-06 15:02:26" level=info msg="Pre-load: images from job node-density"
      03-06 10:02:27.095  time="2023-03-06 15:02:26" level=info msg="Pre-load: Creating DaemonSet using image gcr.io/google_containers/pause:3.1 in namespace preload-kube-burner"
      03-06 10:02:27.095  time="2023-03-06 15:02:26" level=info msg="Pre-load: Sleeping for 2m0s"
      03-06 10:04:33.463  time="2023-03-06 15:04:26" level=info msg="Pre-load: Deleting namespace preload-kube-burner"
      03-06 10:04:33.463  time="2023-03-06 15:04:27" level=info msg="Deleting namespaces with label kube-burner-preload=true"
      03-06 10:04:33.463  time="2023-03-06 15:04:27" level=info msg="Waiting for namespaces to be definitely deleted"
      03-06 10:04:45.616  time="2023-03-06 15:04:44" level=info msg="Triggering job: node-density"
      03-06 10:04:45.616  time="2023-03-06 15:04:44" level=info msg="Creating Pod latency watcher for node-density"
      03-06 10:04:45.616  time="2023-03-06 15:04:44" level=info msg="QPS: 50"
      03-06 10:04:45.616  time="2023-03-06 15:04:44" level=info msg="Burst: 50"
      03-06 10:04:45.616  time="2023-03-06 15:04:44" level=info msg="Running job node-density"
      03-06 10:07:22.020  time="2023-03-06 15:07:13" level=info msg="Waiting up to 1h0m0s for actions to be completed"
      03-06 10:07:28.544  time="2023-03-06 15:07:28" level=info msg="Actions in namespace 59eabcb1-d8b9-4d79-a7ee-f31578738625 completed"
      03-06 10:07:28.544  time="2023-03-06 15:07:28" level=info msg="Finished the create job in 2m44s"
      03-06 10:07:28.544  time="2023-03-06 15:07:28" level=info msg="Verifying created objects"
      03-06 10:07:35.177  time="2023-03-06 15:07:34" level=info msg="pods found: 7460 Expected: 7460"
      03-06 10:07:35.177  time="2023-03-06 15:07:34" level=info msg="Stopping measurement: podLatency"
      03-06 10:07:35.177  time="2023-03-06 15:07:34" level=info msg="Evaluating latency thresholds"
      03-06 10:07:35.177  time="2023-03-06 15:07:34" level=error msg="❗ P99 Ready latency (11.60s) higher than configured threshold: 5s"
      03-06 10:08:01.654  time="2023-03-06 15:07:58" level=info msg="node-density: PodScheduled 50th: 0 99th: 6 max: 119 avg: 0"
      03-06 10:08:01.654  time="2023-03-06 15:07:58" level=info msg="node-density: ContainersReady 50th: 3339 99th: 11599 max: 24006 avg: 3900"
      03-06 10:08:01.654  time="2023-03-06 15:07:58" level=info msg="node-density: Initialized 50th: 0 99th: 141 max: 1286 avg: 7"
      03-06 10:08:01.654  time="2023-03-06 15:07:58" level=info msg="node-density: Ready 50th: 3339 99th: 11599 max: 24006 avg: 3900"
      03-06 10:08:01.654  time="2023-03-06 15:07:58" level=info msg="Job node-density took 194.93 seconds"
      03-06 10:08:01.654  time="2023-03-06 15:07:58" level=info msg="Indexing metadata information for job: node-density"
      03-06 10:08:01.654  time="2023-03-06 15:07:59" level=info msg="Waiting 30s extra before scraping prometheus"
      03-06 10:08:33.660  time="2023-03-06 15:08:29" level=info msg="🔍 Scraping prometheus metrics for benchmark from 2023-03-06 15:04:44.064206001 +0000 UTC to 2023-03-06 15:08:29.273780395 +0000 UTC"
      03-06 10:09:55.151  time="2023-03-06 15:09:43" level=info msg="Finished execution with UUID: 59eabcb1-d8b9-4d79-a7ee-f31578738625"
      03-06 10:09:55.151  time="2023-03-06 15:09:43" level=info msg="👋 Exiting kube-burner"
      03-06 10:09:55.151  ~/ws/workspace/multibranch-pipeline_kube-burner/workloads/kube-burner 
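
      To compare runs without reading the log by hand, the latency summary lines above can be pulled out with a short script. This is a minimal sketch, not part of kube-burner: the names parse_latencies and exceeds_threshold are hypothetical, the regular expression only assumes the line format shown in this log, and the values are treated as milliseconds because the averages match the "Avg ... (ms)" columns in the results.

```python
import re

# Matches summary lines such as:
#   msg="node-density: Ready 50th: 3339 99th: 11599 max: 24006 avg: 3900"
# The pattern is an assumption based only on the log lines in this report.
SUMMARY_RE = re.compile(
    r'msg="(?P<job>[\w-]+): (?P<phase>\w+) '
    r'50th: (?P<p50>\d+) 99th: (?P<p99>\d+) max: (?P<max>\d+) avg: (?P<avg>\d+)"'
)

SAMPLE_LOG = """\
time="2023-03-06 15:07:58" level=info msg="node-density: PodScheduled 50th: 0 99th: 6 max: 119 avg: 0"
time="2023-03-06 15:07:58" level=info msg="node-density: ContainersReady 50th: 3339 99th: 11599 max: 24006 avg: 3900"
time="2023-03-06 15:07:58" level=info msg="node-density: Initialized 50th: 0 99th: 141 max: 1286 avg: 7"
time="2023-03-06 15:07:58" level=info msg="node-density: Ready 50th: 3339 99th: 11599 max: 24006 avg: 3900"
"""


def parse_latencies(log_text: str) -> dict:
    """Return {phase: {"p50", "p99", "max", "avg"}} with integer values."""
    result = {}
    for match in SUMMARY_RE.finditer(log_text):
        result[match.group("phase")] = {
            key: int(match.group(key)) for key in ("p50", "p99", "max", "avg")
        }
    return result


def exceeds_threshold(latencies: dict, phase: str = "Ready",
                      threshold_ms: int = 5000) -> bool:
    """True if the phase's P99 latency is above the threshold (5s = 5000 ms)."""
    return latencies[phase]["p99"] > threshold_ms


latencies = parse_latencies(SAMPLE_LOG)
for phase, stats in latencies.items():
    flag = "FAIL" if exceeds_threshold(latencies, phase) else "ok"
    print(f"{phase:16} p99={stats['p99']:>6} ms  {flag}")
```

      With the lines from this run, Ready and ContainersReady both report a P99 of 11599 ms, which is above the 5000 ms threshold, while PodScheduled and Initialized stay well below it.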

      Expected results:

      The node-density test passes (pod latency results are within the accepted thresholds).

       

              lhorsley@redhat.com Lena Horsley