Pod Lifecycle Event Generator (PLEG)

In Kubernetes, Kubelet is a per-node daemon that manages the pods on the node, driving the pod states to match their pod specifications (specs). To achieve this, Kubelet needs to react to changes in both (1) pod specs and (2) container states. For the former, Kubelet watches for pod spec changes from multiple sources; for the latter, Kubelet polls the container runtime periodically (e.g., every 10s) for the latest states of all containers.

Polling incurs non-negligible overhead as the number of pods/containers increases, and the overhead is exacerbated by Kubelet's parallelism: one worker (goroutine) per pod, each querying the container runtime individually. The large number of periodic, concurrent requests causes CPU usage spikes (even when there is no spec/state change), poor performance, and reliability problems due to an overwhelmed container runtime. Ultimately, this limits Kubelet's scalability.

(Related issues reported by users: #10451, #12099, #12082)

Goals and Requirements

The goal of this proposal is to improve Kubelet's scalability and performance by lowering the pod management overhead.

Reduce unnecessary work during periods of inactivity (no spec/state changes).
Lower the number of concurrent requests to the container runtime.
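
One way to picture these two goals is sketched below in Go. This is a minimal illustration and not the design from the proposal. It assumes a single component takes one snapshot of all container states per period and compares it with the previous snapshot, so that per-pod work is triggered only for containers whose state changed. The names Snapshot, Event, ContainerState, and diffSnapshots are hypothetical and do not come from the Kubelet code base.

```go
package main

import (
	"fmt"
	"sort"
)

// ContainerState is a simplified, hypothetical container state.
type ContainerState string

const (
	Running ContainerState = "running"
	Exited  ContainerState = "exited"
)

// Snapshot maps a pod ID to the states of its containers, keyed by container ID.
// It stands in for the result of one "list everything" call to the runtime.
type Snapshot map[string]map[string]ContainerState

// Event is a hypothetical lifecycle event for a single container.
// An empty Old or New state means the container was absent in that snapshot.
type Event struct {
	PodID       string
	ContainerID string
	Old         ContainerState
	New         ContainerState
}

// diffSnapshots returns one event per container whose state differs between
// the two snapshots. Identical snapshots yield no events, so an idle node
// triggers no per-pod work.
func diffSnapshots(old, cur Snapshot) []Event {
	var events []Event
	// Containers that are new or whose state changed.
	for podID, containers := range cur {
		for cid, state := range containers {
			if prev := old[podID][cid]; prev != state {
				events = append(events, Event{PodID: podID, ContainerID: cid, Old: prev, New: state})
			}
		}
	}
	// Containers that disappeared since the previous snapshot.
	for podID, containers := range old {
		for cid, prev := range containers {
			if _, ok := cur[podID][cid]; !ok {
				events = append(events, Event{PodID: podID, ContainerID: cid, Old: prev})
			}
		}
	}
	// Map iteration order is random, so sort for deterministic output.
	sort.Slice(events, func(i, j int) bool {
		if events[i].PodID != events[j].PodID {
			return events[i].PodID < events[j].PodID
		}
		return events[i].ContainerID < events[j].ContainerID
	})
	return events
}

func main() {
	old := Snapshot{"pod-a": {"c1": Running}, "pod-b": {"c2": Running}}
	cur := Snapshot{"pod-a": {"c1": Running}, "pod-b": {"c2": Exited}}
	for _, e := range diffSnapshots(old, cur) {
		fmt.Printf("pod %s: container %s %q -> %q\n", e.PodID, e.ContainerID, e.Old, e.New)
	}
}
```

In this sketch only pod-b produces an event, so a worker for pod-a would have no reason to query the runtime. The runtime is also asked for state once per period instead of once per pod.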

Assignee:: Gaurav Singh

Reporter:: Gaurav Singh

Contributors:: Harshal Patil, Sai Ramesh Vanka

QA Contact:: Aruna Naik

Doc Contact:: Matthew Werner

Architect:: Mrunal Patel

Created:: 2023/03/20 6:41 PM

Updated:: 2024/12/16 10:19 PM

Target end:: 2025/03/02
