-
Bug
-
Resolution: Cannot Reproduce
-
Normal
-
None
-
4.13.z
-
Moderate
-
No
-
ppc64le
-
False
-
Description of problem:
On an OCP 4.13 cluster on IBM Power, container creation fails on every node if the container manifest has resources.limits.memory equal to 9223372036854775807 (Go's math.MaxInt64).
Version-Release number of selected component (if applicable):
OCP 4.13.10
How reproducible:
Consistently reproducible on an IBM Power OCP 4.13 cluster, even with a simple demo pod (see below).
Steps to Reproduce:
1. OCP 4.13 cluster on IBM Power
2. Apply a demo pod with a container having resources.limits.memory = "9223372036854775807", e.g. the following:
~~~
oc new-project prjtest20240206
cat << EOF | oc apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: podtest20240206
  labels:
    app: httpd
  namespace: prjtest20240206
spec:
  nodeName: <NODE>
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: contest20240206
    image: 'image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest'
    ports:
    - containerPort: 8080
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
    resources:
      limits:
        cpu: "40"
        memory: "9223372036854775807" # <-------- NOTE
      requests:
        cpu: "2"
        memory: 400Mi
EOF
~~~
3. Check the kubelet logs for the following error:
~~~
runc create failed: json: cannot unmarshal number 18446744073709551615 into Go struct field LinuxMemory.linux.resources.memory.swap of type int64
~~~
Actual results:
Container is not created at all on the node, and kubelet logs show:
~~~
runc create failed: json: cannot unmarshal number 18446744073709551615 into Go struct field LinuxMemory.linux.resources.memory.swap of type int64
~~~
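For reference, the error is Go's encoding/json refusing to decode a number larger than math.MaxInt64 into a signed 64-bit field: the swap value in the generated runtime config is 18446744073709551615 (math.MaxUint64), which no int64 can hold. A minimal, self-contained sketch of the failure mode (the struct below only mimics the shape of the OCI LinuxMemory type, it is not runc's actual code):
~~~
package main

import (
	"encoding/json"
	"fmt"
)

// Rough stand-in for the OCI runtime-spec memory settings: signed 64-bit
// fields, as in the runtime-spec Go types (illustration only).
type linuxMemory struct {
	Limit *int64 `json:"limit,omitempty"`
	Swap  *int64 `json:"swap,omitempty"`
}

func main() {
	var m linuxMemory

	// 18446744073709551615 (math.MaxUint64) does not fit into an int64, so
	// decoding fails with the same kind of error seen in the kubelet logs.
	bad := []byte(`{"limit": 9223372036854775807, "swap": 18446744073709551615}`)
	fmt.Println(json.Unmarshal(bad, &m))

	// Both values at 9223372036854775807 (math.MaxInt64) decode fine, which
	// matches the inspect output listed under "Expected results" below.
	good := []byte(`{"limit": 9223372036854775807, "swap": 9223372036854775807}`)
	fmt.Println(json.Unmarshal(good, &m), *m.Limit, *m.Swap)
}
~~~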
Expected results:
Container is created, and its inspect output shows:
~~~
.info.runtimeSpec.linux.resources.memory.limit = 9223372036854775807;
.info.runtimeSpec.linux.resources.memory.swap = 9223372036854775807;
~~~
Additional info:
If I try the same demo pod YAML above on a demo cluster on x86_64, I do NOT see the error. I was not able to test the same demo pod on a different demo cluster on IBM Power. If 9223372036854775807 is lowered by one (that is, 9223372036854775806), the issue does NOT appear anymore, also on IBM Power. It is not clear whether this issue originates in kubelet or in CRI-O, since it seems to emerge from the interaction between the two.
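One possibly relevant observation (an assumption on my side, not a confirmed root cause): the rejected value 18446744073709551615 is exactly math.MaxUint64, which is both 2*math.MaxInt64+1 and the unsigned bit pattern of int64(-1), the usual "unlimited" sentinel. So somewhere between kubelet and CRI-O the swap limit may be going through an unsigned 64-bit conversion or an overflowing addition on this code path. A tiny Go snippet showing the identities:
~~~
package main

import (
	"fmt"
	"math"
)

func main() {
	const rejected uint64 = 18446744073709551615 // value from the kubelet/runc error

	fmt.Println(rejected == math.MaxUint64)             // true
	fmt.Println(rejected == 2*uint64(math.MaxInt64)+1)  // true: consistent with an overflowing mem+swap sum (hypothesis)

	// The same bit pattern, read as a signed value, is -1 (the common
	// "unlimited" sentinel), so a -1 pushed through a uint64 field would
	// also surface as 18446744073709551615 in the generated config.json.
	v := rejected
	fmt.Println(int64(v)) // -1
}
~~~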