1. Proposed title of this feature request
Create metrics in RHCOS to prevent "error loading seccomp filter: errno 524".
2. What is the nature and description of the request?
OCPBUGS-62648 discussed creating metrics to help prevent the "error loading seccomp filter: errno 524" error.
OCPBUGS-49423 and RHOCPPRIO-419 established that the newer occurrences are not a regression of RHELPLAN-167394. In versions that include that fix, the issue appears when the number of zombie processes is very high, around 20k.
Metrics for bpf_jit and bpf_jit_limit could be useful, and so could other metrics such as the number of zombie processes (and possibly others).
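As a rough illustration of the kind of data such metrics could be built on, the sketch below counts zombie processes by reading `/proc/<pid>/stat` and reads the kernel's `net.core.bpf_jit_limit` sysctl (in bytes). The function names are hypothetical and this is not an existing RHCOS or CRI-O collector. It assumes a Linux `/proc` layout, and reading `bpf_jit_limit` may require root privileges. Current BPF JIT memory usage is not covered here.

```python
import os
from typing import Optional


def parse_state(stat_line: str) -> str:
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The comm field is wrapped in parentheses and may itself contain spaces
    # or ')', so split on the last ')' instead of on whitespace.
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_zombies(proc_root: str = "/proc") -> int:
    """Count processes whose state is 'Z' (zombie)."""
    zombies = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            path = os.path.join(proc_root, entry, "stat")
            with open(path, errors="replace") as f:
                if parse_state(f.read()) == "Z":
                    zombies += 1
        except (OSError, IndexError):
            # The process exited during the scan, or the line was malformed.
            continue
    return zombies


def read_bpf_jit_limit(proc_root: str = "/proc") -> Optional[int]:
    """Return net.core.bpf_jit_limit in bytes, or None if it cannot be read."""
    path = os.path.join(proc_root, "sys/net/core/bpf_jit_limit")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # Missing file, insufficient privileges, or unexpected content.
        return None


if __name__ == "__main__":
    print("zombie_processes", count_zombies())
    print("bpf_jit_limit_bytes", read_bpf_jit_limit())
```

Both functions take the `/proc` root as a parameter so they can be pointed at a host mount inside a container or at a fixture directory in tests.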
3. Why does the customer need this? (List the business requirements here)
To receive alerts before reaching the limits that trigger "error loading seccomp filter: errno 524", so that the error can be prevented.
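A minimal sketch of how such an alert could be evaluated against a zombie-process count. The 20,000 default reflects the approximate figure reported in this request. The `warn_ratio` value and the function name are hypothetical, and real thresholds would need to be tuned per cluster.

```python
def zombie_alert_level(
    zombies: int,
    observed_failure_count: int = 20_000,  # approximate figure from this request
    warn_ratio: float = 0.5,               # hypothetical early-warning fraction
) -> str:
    """Classify a zombie-process count as 'ok', 'warning', or 'critical'."""
    if zombies >= observed_failure_count:
        return "critical"
    if zombies >= warn_ratio * observed_failure_count:
        return "warning"
    return "ok"
```

Warning well below the observed failure point is meant to leave time to find and fix the parent process that is not reaping its children.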
4. List any affected packages or components.
RHCOS
CRI-O
- is caused by
  - OCPBUGS-62648 error loading seccomp filter: errno 524 (ASSIGNED)
- links to