Type: Bug
Resolution: Done
Priority: Normal
Affects Version: 4.14
Category: Quality / Stability / Reliability
Severity: Moderate
Description of problem:
Pods are stuck in a CrashLoopBackOff state even though the node reports available memory.
Version-Release number of selected component (if applicable):
RHOCP 4.14
How reproducible:
Steps to Reproduce:
sh-4.4# oc get pods -A -o wide | egrep -i "pending|error|crash"
ibm-satellite-storage   satellite-storage-operator-695d8d5cdd-6rbf9   0/1   CrashLoopBackOff   4192 (22s ago)     13d   172.30.106.xx    mylocalnode65   <none>   <none>
openshift-servicemesh   istio-cni-node-v2-6-g8l8n                     0/1   CrashLoopBackOff   8620 (4m14s ago)   27d   172.30.106.xx    mylocalnode65   <none>   <none>
openshift-servicemesh   istio-cni-node-v2-6-grnzf                     0/1   CrashLoopBackOff   8456 (4m25s ago)   27d   172.30.181.xxx   mylocalnode66   <none>   <none>
openshift-console       downloads-569f5c5d58-2pgqn                    0/1   CrashLoopBackOff   8217               25d   172.30.106.x     mylocalnode65   <none>   <none>
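To narrow down whether these restarts are kubelet OOM kills or application-level failures, the last termination reason and previous-container logs are worth capturing (diagnostic commands suggested here, not part of the original report; pod and namespace names are taken from the listing above):
~~~
# Why did the container last terminate? (e.g. OOMKilled vs. Error)
oc -n ibm-satellite-storage get pod satellite-storage-operator-695d8d5cdd-6rbf9 \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}'

# Logs from the previous (crashed) container instance
oc -n ibm-satellite-storage logs satellite-storage-operator-695d8d5cdd-6rbf9 --previous
~~~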
Pod logs:
~~~
2025-04-03T15:25:27Z ERROR controller-runtime.source.EventHandler failed to get informer from cache {"error": "Timeout: failed waiting for *v1alpha1.ClusterServiceVersion Informer to sync"}
2025-04-03T15:25:27Z ERROR controller-runtime.source.EventHandler failed to get informer from cache {"error": "Timeout: failed waiting for *v1.Infrastructure Informer to sync"}
~~~
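The informer sync timeouts above suggest the pod is losing its API connection rather than being OOM-killed outright. Whether the kubelet itself is reporting memory pressure can be checked from the node conditions (a diagnostic sketch using standard oc queries):
~~~
# Node conditions reported by the kubelet (look for MemoryPressure=True)
oc get node mylocalnode65 \
  -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\n"}{end}'

# Any recent eviction or pressure events referencing the node
oc get events -A --field-selector involvedObject.name=mylocalnode65
~~~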
____________________________________________________
Node mylocalnode65:
------------------
MEMORY
Stats graphed as percent of MemTotal:
MemUsed ▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊▊. 97.8%
RAM:
41 GiB total RAM
40 GiB (98%) used
____________________________________________________
sh-4.4# free -m
              total        used        free      shared  buff/cache   available
Mem:          41937       23945        3103          17       14888       17503
Swap:             0           0           0
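Note that the 97.8% MemUsed graph above appears to include reclaimable page cache: free -m reports roughly 14.9 GiB of buff/cache and about 17.1 GiB available, so only around 57% of RAM (23945 of 41937 MiB) is actually in use by processes. A quick way to confirm this on the node (assuming a debug shell, as in the free output above):
~~~
sh-4.4# egrep 'MemTotal|MemFree|MemAvailable' /proc/meminfo
~~~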
____________________________________________________
$ oc get --raw /apis/metrics.k8s.io/v1beta1/nodes/mylocalnode65 | jq .
{
  "kind": "NodeMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "mylocalnode65",
    "creationTimestamp": "2025-04-03T15:22:57Z",
    "labels": {
      "arch": "amd64",
      "beta.kubernetes.io/arch": "amd64",
      "beta.kubernetes.io/instance-type": "upi",
      "beta.kubernetes.io/os": "linux",
      <<snip>>
    }
  },
  "timestamp": "2025-04-03T15:22:57Z",
  "window": "5m0s",
  "usage": {
    "cpu": "1274m",
    "memory": "24123244Ki"
  }
}
Actual results:
Pods go into CrashLoopBackOff even though memory is still available on the node.
Expected results:
The kubelet should take the node's available memory (not just free memory) into account, so that pods remain in the Running state.
Additional info:
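If kubelet eviction thresholds are suspected, the effective kubelet configuration on the node can be inspected directly (a sketch; the config file path is an assumption for RHCOS and may differ by release):
~~~
# Dump the kubelet's running configuration, including any evictionHard settings
# (path assumed; adjust if kubelet.conf lives elsewhere on this image)
oc debug node/mylocalnode65 -- chroot /host cat /etc/kubernetes/kubelet.conf
~~~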