Add troubleshooting information to the EAP on OpenShift docs that includes advice on how to identify and troubleshoot OOM problems. This is mainly an issue for OpenShift Online, but I think it would also be useful for OCP.
From email thread with kwills@redhat.com and bozpeyni@redhat.com:
>> I've spent some time testing this, forcing out-of-memory conditions etc.,
>> and I'm not sure that any special end-user documentation is required
>> for this. OpenShift (at least in my testing) clearly stops the pod
>> in the case of OOM, and the log message is explicit about the cause
>> (out of heap, etc.).
>
> ^^^ Thanks Ken. When this happens (the pod is stopped and a new pod is
> possibly restarted by OpenShift), how do I, as an EAP administrator,
> monitor this event? What do I need to do when it happens? Which logs do I
> check to help me troubleshoot EAP and reconfigure the default memory or
> take other possible actions? How and where are my parameters set, and how
> do I re-test? Such guidance for EAP DevOps staff on the container environment
> would be invaluable.

I believe you should look in the event log for the pod. For logs, you
can use 'oc logs --previous <pod>' to get the application logs for the
previous pod (i.e. the one that was killed) to see what happened
within the application.

Pod restarts should be configured via the strategy in the deployment
configuration. This field specifies what will be done if a pod dies.
I believe most of the templates use Recreate or Rolling.
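
As a rough illustration of the monitoring question above, the following Python sketch scans a pod's status for containers that were terminated with the reason OOMKilled, which is what Kubernetes records when a container exceeds its memory limit. It assumes the pod JSON was obtained with something like 'oc get pod <pod> -o json'; the sample data, the container name eap-app, and the function name find_oom_killed are hypothetical and only there to make the sketch runnable. A JVM heap exhaustion (java.lang.OutOfMemoryError) that does not exceed the container limit would appear in the application logs rather than in this status field. Also note that 'oc logs --previous' returns the logs of the previous container instance within the same pod.

```python
import json

# Hypothetical sample, shaped like the output of `oc get pod <pod> -o json`.
SAMPLE_POD = {
    "metadata": {"name": "eap-app-1-abcde"},
    "status": {
        "containerStatuses": [
            {
                "name": "eap-app",
                "restartCount": 3,
                "state": {"running": {"startedAt": "2018-01-10T10:05:00Z"}},
                "lastState": {
                    "terminated": {
                        "reason": "OOMKilled",
                        "exitCode": 137,
                        "finishedAt": "2018-01-10T10:04:30Z",
                    }
                },
            }
        ]
    },
}


def find_oom_killed(pod):
    """Return one record per container whose last or current termination was OOMKilled."""
    hits = []
    statuses = (pod.get("status") or {}).get("containerStatuses") or []
    for cs in statuses:
        # Check the previous instance first, then the current one.
        for key in ("lastState", "state"):
            terminated = (cs.get(key) or {}).get("terminated")
            if terminated and terminated.get("reason") == "OOMKilled":
                hits.append(
                    {
                        "container": cs.get("name"),
                        "restarts": cs.get("restartCount", 0),
                        "exit_code": terminated.get("exitCode"),
                        "finished_at": terminated.get("finishedAt"),
                    }
                )
                break  # report each container at most once
    return hits


if __name__ == "__main__":
    print(json.dumps(find_oom_killed(SAMPLE_POD), indent=2))
```

In practice the sample dictionary would be replaced by the parsed output of the 'oc get pod' command, and the result could be cross-checked against 'oc get events' and 'oc describe pod <pod>'.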
- clones JBEAP-14057 [7.1] Openshift: Add troubleshooting information for OOM (Closed)