-
Bug
-
Resolution: Cannot Reproduce
-
Normal
-
None
-
4.12.z
-
No
-
OCPNODE Sprint 238 (Blue)
-
1
-
False
7/27: telco priority wrt 4.12.z backport pending input from field (DM)
Description of problem:
In a single-node k8s cluster, kube-api-server or etcd fails its healthz check for more than 5 minutes. While kube-api-server is down, a previously connected device plugin server is stopped by the application. Kubelet moves the resource from cm.deviceplugin.healthy to cm.deviceplugin.unhealthy but fails to commit the change to k8s etcd. After 5 minutes, kubelet removes the resource from the device plugin manager endpoints map, and again fails to commit the change to k8s etcd. From this point on, the kubelet device plugin manager holds no information about the removed resource.

After the k8s services are stable again, kubelet syncNodeStatus gets the k8s node object from k8s etcd, and that node object still contains the removed resource in its capacity and allocatable. When kubelet updates node status, it starts from the node object fetched from the api-server, applies the changed status, and patches the result to k8s etcd. The removed resource in the node capacity and allocatable therefore never gets removed.

Basically, if kubelet gets a k8s node object whose capacity and allocatable contain a resource that kubelet does not know about, that capacity and allocatable entry can no longer be changed. In k8s source v1.27.1, pkg/kubelet/nodestatus/setters.go func MachineInfo, kubelet updates capacity and allocatable according to what devicePluginResourceCapacityFunc returns, but a resource that no longer exists in the devicePluginResourceCapacityFunc result cannot be removed.
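The behavior described above can be illustrated with a small Go sketch. This is a simplified model written for this report, not the kubelet source: ResourceList, applyDevicePluginCapacity and the resource name example.com/device are hypothetical, and plain integer maps stand in for the real node status types. It assumes the update order described above: resources still reported by the device manager overwrite the node's values, resources reported as removed in the current sync are set to 0, and anything else already on the node object is left alone.

```go
package main

import "fmt"

// ResourceList is a simplified, hypothetical stand-in for the node's
// capacity/allocatable map (resource name -> count).
type ResourceList map[string]int64

// applyDevicePluginCapacity models the update order described in this report:
// resources the device manager still reports overwrite the node's values,
// resources reported as removed in this sync are set to 0, and every other
// entry already present on the node object is left untouched.
func applyDevicePluginCapacity(nodeCapacity, reported ResourceList, removed []string) {
	for name, qty := range reported {
		nodeCapacity[name] = qty
	}
	for _, name := range removed {
		nodeCapacity[name] = 0
	}
}

func main() {
	// Node object fetched from the API server after recovery; it still
	// carries the capacity that was written before the outage.
	node := ResourceList{"cpu": 8, "example.com/device": 1}

	// The device manager dropped the endpoint during the outage, so it now
	// reports neither a capacity nor a removal for the resource.
	applyDevicePluginCapacity(node, ResourceList{}, nil)
	fmt.Println(node["example.com/device"]) // the stale value survives

	// Had the removal been reported during this sync, the entry would be zeroed.
	applyDevicePluginCapacity(node, ResourceList{}, []string{"example.com/device"})
	fmt.Println(node["example.com/device"])
}
```

In this model the first call leaves the stale entry at 1 because nothing names the resource any more, which matches the symptom in this report; only the second call, where the removal is explicitly reported, resets it to 0.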
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
Precondition: a single-node k8s cluster; the problem is easy to reproduce in a single-node cluster.
1. Start a device plugin with one device, and make sure the resource count is 1 by checking the Allocatable field in "kubectl describe nodes".
2. Make kube-api-server unhealthy so that kubelet fails to communicate with it; the unhealthy state should last 5 minutes.
3. Stop the device plugin.
4. Wait for kubelet to remove the device plugin once the graceful stop period expires. You can check for the log "Set capacity for removed resource to 0 on device removal" in the kubelet log.
5. Recover kube-api-server to healthy.
You can see that the Allocatable field in "kubectl describe nodes" remains 1 even though the device plugin has stopped.
Actual results:
Expected results:
Additional info: