  1. Red Hat 3scale API Management
  2. THREESCALE-5630

apicast-production container didn't terminate when one of its processes was OOM killed


    • Type: Bug
    • Resolution: Done
    • Priority: Blocker
    • None
    • 2.8 GA
    • Gateway
    • Not Started
    • Engineering

      WHAT

      The apicast-production container must terminate when one of its processes is OOM killed, so that k8s handles OOM kills predictably. When a container runs multiple processes and the kernel on the underlying OSD node OOM kills one that is not the init/main process, the container keeps running. k8s therefore never learns that an OOM kill happened and does not restart the pod that runs the container. CSSRE does not detect the problem either, because the pod and container are still running.
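
To illustrate the failure mode, here is a minimal Python sketch (not taken from APIcast; the function names are hypothetical, and POSIX signal semantics are assumed). An OOM kill reaches the victim as SIGKILL. The parent process survives and only learns about the kill if it inspects the child's exit status:

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Classify a child's exit status as reported by subprocess."""
    if returncode < 0:
        # subprocess reports death by signal as the negated signal number.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with code {returncode}"


def kill_child_and_describe():
    """Start a long-running child, SIGKILL it, and describe how it ended."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    # Imitate the kernel OOM killer, which delivers SIGKILL to its victim.
    child.send_signal(signal.SIGKILL)
    child.wait()
    # The parent is still alive here; nothing forces it to exit.
    return describe_exit(child.returncode)
```

The parent keeps running after the child dies, which is the situation described above: from the outside the container still looks healthy.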

      HOW

      The options I can think of are:
      1. Use one process per container
      2. Ensure the whole container exits when one of its processes gets OOM killed
      3. Use a liveness probe that checks for OOM killed processes other than the main/init process.
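
As a rough sketch of option 3, the check below reads the `oom_kill` counter that cgroup v2 exposes in `memory.events`. It assumes the container sees its own cgroup at `/sys/fs/cgroup` (cgroup v2 with a cgroup namespace), and the function names are hypothetical. It is not how APIcast implements its probes:

```python
from pathlib import Path

# Assumed location of the container's own cgroup v2 event counters.
MEMORY_EVENTS = Path("/sys/fs/cgroup/memory.events")


def parse_oom_kill_count(text):
    """Return the oom_kill counter from cgroup v2 memory.events content."""
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key == "oom_kill":
            return int(value)
    return 0


def liveness_exit_code(events_file=MEMORY_EVENTS):
    """Return 1 (probe failure) if any process in the cgroup was OOM killed."""
    try:
        count = parse_oom_kill_count(events_file.read_text())
    except OSError:
        # File not available (for example on a cgroup v1 host):
        # report healthy rather than guess.
        return 0
    return 1 if count > 0 else 0
```

A script used as an exec liveness probe could end with `sys.exit(liveness_exit_code())`, since k8s treats a non-zero exit status from an exec probe as a failure. The counter is cumulative for the lifetime of the cgroup, so once a process has been OOM killed the probe keeps failing until the container is restarted.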

      DONE

      The container exits when one of its child processes is OOM killed. This lets k8s apply the pod's restart policy automatically and prevents OOM-killed processes from going undetected inside a running container.
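
One way to meet this criterion can be sketched as a small supervisor that runs as the container's main process. This is an illustration under assumptions, not the APIcast fix: `supervise` and its parameters are hypothetical names, and the 128 + signal number exit code follows the usual shell convention (137 for SIGKILL).

```python
import subprocess
import time


def supervise(commands, poll_interval=0.1):
    """Run all commands and return an exit code as soon as any child ends.

    A child killed by a signal (for example SIGKILL from the OOM killer)
    yields 128 + signal number. The remaining children are stopped before
    returning, so the whole group goes down together.
    """
    children = [subprocess.Popen(cmd) for cmd in commands]
    try:
        while True:
            for child in children:
                returncode = child.poll()
                if returncode is None:
                    continue  # this child is still running
                if returncode < 0:
                    # Negative value means the child died from a signal.
                    return 128 + (-returncode)
                return returncode
            time.sleep(poll_interval)
    finally:
        # Stop the survivors and reap every child before returning.
        for child in children:
            if child.poll() is None:
                child.terminate()
        for child in children:
            child.wait()
```

If the container entrypoint calls `sys.exit(supervise(...))`, the container terminates with a non-zero status when a child is killed, and k8s can then act on the pod's restart policy.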

              Assignee: Unassigned
              Reporter: Ciaran Byrne (cbyrne@redhat.com, Inactive)