-
Bug
-
Resolution: Done
-
Critical
-
4.18.z
-
Quality / Stability / Reliability
-
False
-
-
None
-
Critical
-
None
-
All
-
None
-
None
-
None
-
In Progress
-
Release Note Not Required
This is a clone of issue OCPBUGS-56785. The following is the description of the original issue:
—
Description of problem:
The kubelet podresources endpoint reports the exclusive resources allocated to active pods. The endpoint incorrectly also reports resources allocated to terminated pods.

Two factors combine to create the bug:
1. The podresources API depends on the inner workings of the kubelet to retrieve the list of currently active pods. Previously, the related function incorrectly returned both active and terminated pods.
2. If the podresources API incorrectly considers a terminated pod, we run into another issue in the memory manager. The memory manager collects stale resources (assignments to terminated pods) only in the allocation flow. Thus, if no pods manage to get admitted, the kubelet, through the podresources API, incorrectly reports memory resources assigned to terminated pods.

This reporting is bogus because these resources are not reserved anymore, but the podresources API cannot know that. This does NOT affect the allocation flow (the first thing it does is cleanup), but it does affect the reporting, and this behavior is not fixed upstream.

Why does this affect only memory?
1. Device assignments are explicitly cleaned up by the podresources API endpoint.
2. CPU assignments are not (and they should be), but they are automatically cleaned up every cpuManagerReconcilePeriod, so they recover on their own.

This breaks NUMA-aware scheduling in an unrecoverable way.
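The interaction between the two factors can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyMemoryManager`, `remove_stale`, `allocate`, and `report` are hypothetical names made up for illustration and do not correspond to kubelet identifiers. It shows that stale assignments disappear only when an allocation runs, so a pod list that wrongly includes a terminated pod keeps surfacing its memory.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemoryManager:
    """Toy model only; all names here are hypothetical, not kubelet code."""

    # pod UID -> bytes of exclusively assigned memory
    assignments: dict = field(default_factory=dict)

    def remove_stale(self, active_uids):
        # Drop assignments whose pod is no longer active.
        for uid in list(self.assignments):
            if uid not in active_uids:
                del self.assignments[uid]

    def allocate(self, uid, size, active_uids):
        # The allocation flow cleans up first, so allocation is never wrong.
        self.remove_stale(active_uids | {uid})
        self.assignments[uid] = size

    def report(self, listed_uids):
        # The reporting path does no cleanup; it trusts the pod list it is given.
        return {
            uid: self.assignments[uid]
            for uid in listed_uids
            if uid in self.assignments
        }


mm = ToyMemoryManager()
mm.allocate("pod-a", 256 * 2**20, active_uids=set())

# pod-a terminates and no new pod is admitted, so remove_stale never runs.
buggy = mm.report(listed_uids={"pod-a"})  # pod list wrongly includes pod-a
fixed = mm.report(listed_uids=set())      # pod list holds only active pods

# Admitting another pod triggers the cleanup and hides the problem.
mm.allocate("pod-b", 128 * 2**20, active_uids=set())
after_admission = mm.report(listed_uids={"pod-a", "pod-b"})
```

In this model `buggy` still contains the assignment of the terminated pod, while `fixed` and `after_admission` do not, which mirrors why the misreporting persists for as long as no new pod is admitted.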
Version-Release number of selected component (if applicable):
4.18.z (any); actually reproduced on Server Version: 4.18.0-0.nightly-2025-04-13-142946
How reproducible:
100%
Steps to Reproduce:
1. Configure the kubelet with memory manager policy = Static.
2. Run a job whose pods qualify for memory pinning (see the example manifest below).
3. Query the podresources endpoint on the nodes. The endpoint is node-local and exposed through a unix domain socket, so it has to be queried programmatically. Probably the simplest option is to download the `knit` tool from https://github.com/openshift-kni/debug-tools/releases/tag/v0.2.1 and run `knit podres` with root privileges.

Example manifest:
```
apiVersion: batch/v1
kind: Job
metadata:
  labels:
    app: idle-gu-job-sched-stall
  generateName: generic-pause-
spec:
  backoffLimit: 6
  completionMode: NonIndexed
  completions: 2
  manualSelector: false
  parallelism: 2
  podReplacementPolicy: TerminatingOrFailed
  suspend: false
  template:
    metadata:
      labels:
        app: idle-gu-job-sched-stall
    spec:
      containers:
      - args:
        - 1s
        command:
        - /bin/sleep
        image: quay.io/openshift-kni/pause:test-ci
        imagePullPolicy: IfNotPresent
        name: generic-job-idle
        resources:
          limits:
            cpu: 100m
            memory: 256Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Never
      schedulerName: default-scheduler
      terminationGracePeriodSeconds: 30
      topologySpreadConstraints:
      - labelSelector:
          matchLabels:
            app: idle-gu-job-sched-stall
        matchLabelKeys:
        - pod-template-hash
        maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
```
Actual results:
The kubelet reports memory resources assigned to terminated pods.
Expected results:
Either:
1. the kubelet does not return the terminated pod, or
2. the kubelet returns the terminated pod, but without any resources assigned to it.
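Both acceptable outcomes can be checked with one predicate: a pod that is not active must never be listed with memory attached. The sketch below assumes the endpoint output has already been decoded into plain dictionaries; the dictionary layout (`namespace`, `name`, `containers`, `memory`) is an assumption loosely modeled on the podresources List response, and `stale_memory_reports` is a hypothetical helper, not part of any existing tool.

```python
def stale_memory_reports(pod_resources, active_pods):
    """Return (namespace, name) of non-active pods that still report memory.

    pod_resources: list of dicts in an assumed, simplified layout.
    active_pods: set of (namespace, name) tuples known to be active.
    """
    offenders = []
    for pod in pod_resources:
        key = (pod["namespace"], pod["name"])
        if key in active_pods:
            continue
        # A terminated pod may be listed, but only without memory assigned.
        if any(c.get("memory") for c in pod.get("containers", [])):
            offenders.append(key)
    return offenders


# Illustrative sample data only; pod names and sizes are made up.
sample = [
    {"namespace": "default", "name": "running",
     "containers": [{"name": "c", "memory": [{"size": 268435456}]}]},
    {"namespace": "default", "name": "done-empty",
     "containers": [{"name": "c", "memory": []}]},
    {"namespace": "default", "name": "done-stale",
     "containers": [{"name": "c", "memory": [{"size": 268435456}]}]},
]
offenders = stale_memory_reports(sample, active_pods={("default", "running")})
```

With this sample, only the `done-stale` pod is flagged: `done-empty` matches the second expected result, and a response that omitted both terminated pods would match the first.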
Additional info:
Possibly affects older versions of OpenShift. The active pod listing was solved in upstream Kubernetes by the pod workers refactoring: the podresources endpoint (correctly) ignores terminated pods and only lists active pods.
- clones
-
OCPBUGS-60524 [4.18] kubelet podresources API incorrectly reports memory assignments of terminated pods
-
- Closed
-
- depends on
-
OCPBUGS-60524 [4.18] kubelet podresources API incorrectly reports memory assignments of terminated pods
-
- Closed
-
- is cloned by
-
OCPBUGS-60553 [4.16] kubelet podresources API incorrectly reports memory assignments of terminated pods
-
- Closed
-
- is depended on by
-
OCPBUGS-60553 [4.16] kubelet podresources API incorrectly reports memory assignments of terminated pods
-
- Closed
-
- links to