OpenShift Bugs / OCPBUGS-4887

[4.11] Pods completed + deleted may leak

    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • None
    • 4.10.z
    • None

      Description of problem:

      When a pod runs to completion, we normally rely on the update event that reports the pod as completed. On that update the pod IP is released and the port configuration is removed from OVN. The subsequent delete event for the pod is then ignored, on the assumption that cleanup already happened during the update.
      
      However, the update event reporting completion can be missed. In that case the only event we receive is a delete for a pod that is already completed, and we skip the teardown. As a result the pod is never cleaned up in OVN and its IP address stays allocated, shrinking the address range available for new pods. Over time this can exhaust all IP addresses available for pod allocation on a node.
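
The event ordering described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `Pod`, `PodNetworkTracker`, the `ignore_completed_on_delete` flag, and the sample IP addresses are all hypothetical names invented for illustration and are not taken from the OVN-Kubernetes code base. The flag set to `True` models the leaking behavior; set to `False` it models one possible remedy (always attempting an idempotent release on delete), which is not necessarily the fix that shipped.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    completed: bool = False


@dataclass
class PodNetworkTracker:
    """Hypothetical stand-in for per-node IP and port bookkeeping."""

    free_ips: list
    # True models the behavior in this report: delete events for
    # completed pods are skipped because the update is assumed to
    # have cleaned up already.
    ignore_completed_on_delete: bool = True
    allocated: dict = field(default_factory=dict)  # pod name -> IP

    def on_add(self, pod):
        # Take the next free IP for the new pod.
        self.allocated[pod.name] = self.free_ips.pop(0)

    def _release(self, pod):
        # Idempotent: releasing a pod that is no longer tracked is a no-op,
        # so an update followed by a delete cannot free the same IP twice.
        ip = self.allocated.pop(pod.name, None)
        if ip is not None:
            self.free_ips.append(ip)

    def on_update(self, pod):
        if pod.completed:
            self._release(pod)

    def on_delete(self, pod):
        if pod.completed and self.ignore_completed_on_delete:
            return  # assumes on_update already ran; leaks if it did not
        self._release(pod)


def run_missed_update(ignore_completed_on_delete):
    """Replay the sequence from the report: add, then delete, with no update."""
    tracker = PodNetworkTracker(
        free_ips=["10.128.0.5", "10.128.0.6"],
        ignore_completed_on_delete=ignore_completed_on_delete,
    )
    pod = Pod("job-1")
    tracker.on_add(pod)
    pod.completed = True
    # The "completed" update event is never delivered; only the delete arrives.
    tracker.on_delete(pod)
    return tracker


if __name__ == "__main__":
    print("leaky:", run_missed_update(True).allocated)
    print("fixed:", run_missed_update(False).allocated)
```

In this model, the leaky variant still holds an allocation for `job-1` after the delete, while the variant that always attempts a release ends with no allocations and both addresses back in the free list.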

      Version-Release number of selected component (if applicable):

      4.10.24

      How reproducible:

      Not sure how to reproduce this. I'm guessing some lag in kapi updates can cause the completed update event and the final delete event to be combined into a single event.

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

      Port still exists in OVN, IP remains allocated for a deleted pod.

      Expected results:

      IP should be freed, port should be removed from OVN.

      Additional info:

       

              Tim Rozet (trozet@redhat.com)
              Zhanqi Zhao
              Red Hat Employee
              Arti Sood