-
Bug
-
Resolution: Can't Do
-
Critical
-
None
-
4.14.z
- When running the 12-hour stability test with stress-ng at 60% load, the API server was restarted and never recovered. As a result, all the stress pods stayed running and continued to load the cluster. The console showed that a cgroup OOM had occurred. Master 0 and master 2 could not be accessed via SSH. Master 1 was reachable but could not execute any oc command. After master 2 was restarted via KVM, it was possible to delete the stress deployment. However, master 0 and master 2 needed to be restarted again before the cluster started to become stable again.
It was confirmed that the 4.14.4 set being used was identical to the GA version.