OpenShift Bugs: OCPBUGS-24274

API server restart during stress-ng load testing left the cluster inaccessible

    • Critical
    • No
    • False

      • During a 12-hour stability test with stress-ng at 60% load, the API server restarted and never recovered. As a result, all the stress pods stayed running and continued to load the cluster. The console showed that a cgroup OOM had occurred. Master 0 and master 2 could not be reached via SSH. Master 1 was reachable but could not execute any oc command. After master 2 was restarted via KVM, it was possible to delete the stress deployment. However, master 0 and master 2 had to be restarted again before the cluster became more stable.

      • It was confirmed that the 4.14.4 release being used was identical to the GA version.

              Unassigned
              rhn-support-vismishr Vishvranjan Mishra
              Xingxing Xia
              Votes: 0
              Watchers: 5
