  1. OpenShift Bugs
  2. OCPBUGS-4127

High load average on SNO with DU profile running non-RT kernel causing kube-apiserver instability


    • Quality / Stability / Reliability
    • Rejected

      Description of problem:

      When running a vDU test app workload on an SNO with the DU profile and the non-RT kernel, the load average rises to ~130. While the node is under this load, the kube-apiserver sometimes fails to recover after a rollout.
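
The load figure quoted above can be checked on the node while the test app is running. The following is a minimal sketch, not part of the original report: `load_exceeds` is a hypothetical helper name, the threshold of 100 is arbitrary, and the script is assumed to run on the node itself (for example over SSH), since `oc` may be unusable while the API is unreachable.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): succeeds when the 1-minute
# load average in a /proc/loadavg-formatted line is above the threshold.
# $1 = a line such as "130.25 120.10 90.00 5/1200 12345", $2 = threshold
load_exceeds() {
  printf '%s\n' "$1" | awk -v t="$2" '{ exit !($1 > t) }'
}

# Assumption: run directly on the SNO node; 100 is an arbitrary threshold.
if [ -r /proc/loadavg ] && load_exceeds "$(cat /proc/loadavg)" 100; then
  echo "1-minute load average is above 100: $(cut -d' ' -f1 /proc/loadavg)"
fi
```

Sampling this in a loop before and during the rollout would show whether the failures coincide with the load peak.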

      Version-Release number of selected component (if applicable):

      4.12.0-rc.0 with 4.18.0-372.36.1.el8_6.x86_64 kernel

      How reproducible:

      Consistently

      Steps to Reproduce:

      1. Deploy and configure SNO with the DU profile and the 4.18.0-372.36.1.el8_6.x86_64 kernel
      
      2. Deploy test app
      
      3. Force a kube api server rollout:
      oc patch kubeapiserver cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge 
      
      4. Wait for kube-apiserver to reach a new revision
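
Steps 3 and 4 can be scripted so that a failed recovery shows up as a timeout. This is a sketch under stated assumptions rather than the reporter's procedure: `rollout_complete` and `wait_for_rollout` are hypothetical helper names, the timeout and polling interval are arbitrary, and the check assumes that the `kubeapiserver` resource exposes `status.latestAvailableRevision` and `status.nodeStatuses[].currentRevision`.

```shell
#!/bin/sh
# Hypothetical helpers (not from the report).

# rollout_complete PREV LATEST CURRENT...
# Succeeds when LATEST is newer than PREV and every node reports LATEST.
rollout_complete() {
  [ "$#" -ge 3 ] || return 1
  prev=$1; want=$2
  shift 2
  [ "$want" -gt "$prev" ] 2>/dev/null || return 1
  for rev in "$@"; do
    [ "$rev" = "$want" ] || return 1
  done
  return 0
}

# wait_for_rollout PREV [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Polls the operator status until the rollout completes or the timeout expires.
# Failed oc calls (API unreachable) count as "not complete yet".
wait_for_rollout() {
  before=$1; timeout_s=${2:-1800}; interval_s=${3:-15}; elapsed=0
  while [ "$elapsed" -lt "$timeout_s" ]; do
    latest=$(oc get kubeapiserver cluster \
      -o jsonpath='{.status.latestAvailableRevision}' 2>/dev/null)
    current=$(oc get kubeapiserver cluster \
      -o jsonpath='{.status.nodeStatuses[*].currentRevision}' 2>/dev/null)
    # Word splitting of $current is intended: one argument per node.
    if rollout_complete "$before" "$latest" $current; then
      echo "kube-apiserver is at revision $latest"
      return 0
    fi
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
  echo "timed out waiting for kube-apiserver rollout" >&2
  return 1
}

# Example usage (commented out so that sourcing this file has no side effects):
#   before=$(oc get kubeapiserver cluster -o jsonpath='{.status.latestAvailableRevision}')
#   oc patch kubeapiserver cluster --type=merge \
#     -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}'
#   wait_for_rollout "$before"
```

Recording the revision before the patch avoids a false pass when the status is read before the operator has created the new revision.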

      Actual results:

      Sometimes the kube-apiserver does not recover and remains unreachable.

      Expected results:

      kube-apiserver always recovers

      Additional info:

       

              bwensley@redhat.com Bart Wensley
              mcornea@redhat.com Marius Cornea