Type: Bug
Resolution: Done
Priority: Blocker
Fix Version/s: None
Affects Version/s: 2.8 GA
Component/s: Gateway
Labels:
- MGDAPI
- RHMI-3scale

3Scale PT Tested upstream: Not Started
3scale PT Docs: Not Started
3scale PT Product Specs: Not Started
3scale PT Product Update Ready: Not Started
3scale PT Released In Saas: Not Started
3scale PT Verified Product: Not Started
Department: Engineering
Target Release: 2.10 ER1
SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

WHAT

The apicast-production container must terminate when any of its processes is OOM killed, so that k8s handles OOM kills predictably. When a container runs multiple processes and the kernel on the underlying OSD node OOM kills one that is not the main/init process, the container keeps running. k8s therefore never learns that an OOM kill occurred, and the pod that runs the container is not restarted. CSSRE does not detect this either, because the pod and container still appear to be running.

HOW

The options I can think of are:
1. Use one process per container.
2. Ensure the whole container exits when one of its processes gets OOM killed.
3. Use a liveness probe that checks for OOM killed processes other than the main/init process.
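
Option 3 could be approached along the following lines. This is a minimal sketch, not the fix adopted in this issue: an exec-style liveness probe that fails once the container's cgroup has recorded an OOM kill. It assumes cgroup v2 is in use, that the container's own `memory.events` file is readable at `/sys/fs/cgroup/memory.events`, and that the counter starts from zero for each new container. The function names are hypothetical.

```python
from pathlib import Path

# Assumption: cgroup v2, mounted at the usual location inside the container.
MEMORY_EVENTS = Path("/sys/fs/cgroup/memory.events")


def oom_kill_count(events_text):
    """Return the `oom_kill` counter from the text of a cgroup v2 memory.events file."""
    for line in events_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "oom_kill":
            return int(fields[1])
    # Key not present: treat as "no OOM kills recorded".
    return 0


def probe_exit_code(events_path=MEMORY_EVENTS):
    """Return 1 (probe fails) if any process in the cgroup was OOM killed, else 0."""
    try:
        text = events_path.read_text()
    except OSError:
        # Counter unavailable (for example cgroup v1): do not fail the probe.
        return 0
    return 1 if oom_kill_count(text) > 0 else 0
```

A probe script would end with `sys.exit(probe_exit_code())`, so a non-zero exit marks the container as unhealthy and lets k8s restart it. Unlike option 2, this relies on the probe interval, so detection is delayed rather than immediate.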

DONE

The container exits when one of its child processes is OOM killed. This lets k8s apply the pod's restart policy automatically and prevents OOM-killed processes from going undetected inside a container.
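
The behavior described above can be illustrated with a small supervisor running as the container's main process. This is a hedged sketch on a POSIX system, not the change actually made to APIcast: it starts some child processes and returns an exit code as soon as any of them exits, stopping the rest. The kernel OOM killer terminates its victim with SIGKILL, which `subprocess` reports as return code -9. The names `supervise`, `exit_code_for`, `VICTIM`, and `BYSTANDER` are hypothetical, and the two demonstration commands stand in for real worker processes.

```python
import subprocess
import sys
import time


def exit_code_for(returncode):
    """Map a Popen return code to a process exit code.

    Popen reports death by signal N as -N. Shells conventionally report it
    as 128 + N, so SIGKILL (9) becomes 137.
    """
    if returncode < 0:
        return 128 - returncode
    return returncode


def supervise(commands, poll_interval=0.5):
    """Start every command and return as soon as any one of them exits."""
    children = [subprocess.Popen(cmd) for cmd in commands]
    try:
        while True:
            for child in children:
                code = child.poll()
                if code is not None:
                    return exit_code_for(code)
            time.sleep(poll_interval)
    finally:
        # Stop the remaining processes so the whole container goes down.
        for child in children:
            if child.poll() is None:
                child.terminate()
        for child in children:
            child.wait()


# Demonstration commands: one child kills itself with SIGKILL (simulating an
# OOM kill), the other would otherwise keep running.
VICTIM = [sys.executable, "-c",
          "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
BYSTANDER = [sys.executable, "-c", "import time; time.sleep(30)"]
```

Calling `supervise([BYSTANDER, VICTIM])` returns 137 once the victim dies. Passing that value to `sys.exit` in the main process ends the container, at which point k8s sees a terminated container and applies the restart policy.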

is related to

THREESCALE-6047 Add new Prometheus alert on Operator (Closed)

THREESCALE-5965 Worker process metric (Closed)

Assignee: Unassigned
Reporter: Ciaran Byrne (Inactive)
Votes: 0
Watchers: 7
Created: 2020/07/13 8:25 AM
Updated: 2021/10/24 6:34 AM
Resolved: 2020/10/05 7:03 AM
