  1. OpenShift Bugs
  2. OCPBUGS-76917

Kubelet Cgroup Manager race condition with CRI-O causes pods to get stuck in ContainerCreating with empty imageID

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • 4.18
    • Node / CRI-O

      Description of problem:

          A race condition exists between the Kubelet cgroup manager and CRI-O. When a new pod is created, the cgroup manager receives an inotify event for the new cgroup path before CRI-O has fully registered the container. Kubelet then queries the container status, receives a 404 (container not found), and marks the pod as failed internally, which prevents all future synchronization for that pod.

      Version-Release number of selected component (if applicable):

          

      How reproducible:

          Customer Environment

      Steps to Reproduce:

      1. Deploy a pod with multiple containers (e.g., NetApp Trident controller).

      2. The issue occurs intermittently, often under high system load or when CNI setup takes more than 100 ms.

      Actual results:

       Kubelet Logs: manager.go:1169: "Failed to process watch event... Status  404 returned error can't find the container".
      
      Pod Status: Stuck in Pending / ContainerCreating.
      
      Status Fields: imageID is empty and PodReadyToStartContainers is False, even though crictl ps shows the container is actually Running.

      Expected results:

       Kubelet should implement a retry mechanism with exponential backoff when receiving a 404 error from the runtime during a cgroup watch event, allowing for the natural delay in CRI-O registration.   

      Additional info:

          

              Peter Hunt (pehunt@redhat.com)
              Aryan Namdev (rhn-support-anamdev)
              Min Li