Type: Bug
Resolution: Unresolved
Priority: Major
Affects Version: 4.19
Customer Impact: Important
Problem Statement:
In containerized environments, especially with operators such as the Spark operator that dynamically generate many pods carrying large environment-variable sets or argument lists, the hard limit imposed by ARG_MAX causes job failures. Workarounds such as periodic cleanup or restructuring workloads exist, but they are neither sustainable nor production-friendly.
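For context, the space a pod's environment consumes is easy to measure from inside the container. The variable names below are hypothetical, standing in for the many per-executor settings an operator such as Spark injects:

```shell
#!/bin/sh
# Approximate bytes used by the current environment block
# (env prints one "NAME=value" line per variable; argv and envp
# share the same ARG_MAX budget at execve time):
env | wc -c

# Simulate an operator injecting 50 variables of ~1 KiB each
# (SPARK_CONF_* is a made-up name for illustration):
i=1
while [ "$i" -le 50 ]; do
    export "SPARK_CONF_$i=$(printf 'v%.0s' $(seq 1 1000))"
    i=$((i + 1))
done
env | wc -c   # roughly 50 KiB larger than before
```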
Current Behavior:
- ARG_MAX is statically defined at kernel build time (#define ARG_MAX 131072 in include/uapi/linux/limits.h). Since kernel 2.6.23 the total argv+envp space scales with RLIMIT_STACK, but the per-string limit (MAX_ARG_STRLEN, 32 pages) remains a compile-time constant.
- Neither limit is exposed as a runtime tunable: there is no sysctl for it, and raising ulimit -s does not lift the per-string cap.
- When exceeded, execve() fails with E2BIG ("Argument list too long"), breaking container startup and workload automation.
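The behavior above can be observed from user space on any Linux host; a minimal reproduction, assuming only a POSIX shell and coreutils:

```shell
#!/bin/sh
# Total argv+envp budget reported to user space; on modern kernels this
# scales with RLIMIT_STACK (typically 2097152 with an 8 MiB stack):
getconf ARG_MAX

# MAX_ARG_STRLEN (32 pages = 131072 bytes on 4 KiB-page systems) caps any
# single argument or environment string; exceeding it fails with E2BIG
# regardless of ulimit settings:
big=$(printf 'x%.0s' $(seq 1 200000))   # one 200000-byte argument
/bin/true "$big" 2>/dev/null || echo "execve failed: Argument list too long"
```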
Proposed Enhancement:
- Introduce a kernel tunable (sysctl or cgroup-level parameter) to adjust the maximum allowable argument list size (ARG_MAX) within safe boundaries.
- Alternatively, provide documented best practices or kernel enhancements for better handling of large argument/environment spaces in containerized workloads.
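As an illustration of the second bullet, the standard userspace mitigation today is to keep bulk data off the command line, either by streaming it via stdin or by letting xargs split the work into multiple execs, each sized under the limit (a generic sketch, not OpenShift-specific):

```shell
#!/bin/sh
# A single exec with 500000 arguments would exceed ARG_MAX; xargs instead
# batches the input into as many invocations as the limit allows
# (one output line per echo invocation):
seq 1 500000 | xargs echo | wc -l

# Bulk payloads can bypass argv entirely by arriving on stdin:
seq 1 500000 | wc -l
```

This is operationally workable but is exactly the kind of per-workload restructuring the proposed tunable would make unnecessary.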
Business Impact:
- Affects OpenShift customers using Spark and similar data-intensive operators.
- Requires manual or cron-based workarounds, which are operationally inefficient.
- A configurable or adaptive mechanism would improve reliability and scalability in modern containerized applications.