- Spike
- Resolution: Done
- Critical
Impact assessment for OCPBUGS-30096
Which 4.y.z to 4.y'.z' updates increase vulnerability?
This issue is present in the following releases, and updating to them exposes the cluster:
- 4.12.49 through 4.12.51
- 4.11.58
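As a rough illustration, the affected ranges listed above can be encoded as a small lookup for checking a candidate update target. This is a minimal sketch: the names `AFFECTED_RANGES`, `parse_version`, and `is_affected` are hypothetical, and only the version numbers come from this assessment.

```python
# Affected release ranges as stated in this assessment (inclusive bounds).
AFFECTED_RANGES = [
    ((4, 12, 49), (4, 12, 51)),
    ((4, 11, 58), (4, 11, 58)),
]


def parse_version(version: str) -> tuple[int, int, int]:
    """Split an 'x.y.z' release string into a tuple of integers."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def is_affected(version: str) -> bool:
    """Return True if the release falls inside one of the listed ranges."""
    parsed = parse_version(version)
    return any(low <= parsed <= high for low, high in AFFECTED_RANGES)
```

For example, `is_affected("4.12.50")` is true while `is_affected("4.12.48")` is false. The list would need updating if later releases turn out to be affected or fixed.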
Which types of clusters?
- Likely all clusters, or at least those operating near the limits of their I/O capacity
What is the impact? Is it serious enough to warrant removing update recommendations?
- After rebooting into kernel-4.18.0-372.88.1.el8_6 or a later kernel, nodes experience high load average and elevated io_wait times
- Nodes may fail to start or stop pods, and probes may fail
- Workload and host processes may become unresponsive, and workloads may be disrupted
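One way to check whether a node's kernel is at or past the first affected build is to compare the numeric parts of the release string. The sketch below is an assumption-laden illustration, not a supported tool: the helper names are hypothetical, it only handles RHEL 8.6 (`el8_6`) release strings, and it treats every later 8.6 kernel as affected because this assessment says the regression is not yet fixed there.

```python
import re

# First affected kernel build, as named in this assessment.
FIRST_AFFECTED = "4.18.0-372.88.1.el8_6"


def kernel_key(release: str) -> tuple[int, ...]:
    """Turn an el8_6 kernel release string into a comparable tuple of ints."""
    release = release.removeprefix("kernel-")
    match = re.match(r"^(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.el8_6", release)
    if match is None:
        raise ValueError(f"not an el8_6 kernel release: {release!r}")
    version, build = match.groups()
    return tuple(int(p) for p in version.split(".")) + tuple(
        int(p) for p in build.split(".")
    )


def is_possibly_affected(release: str) -> bool:
    """True if the kernel is the first affected build or a later 8.6 build.

    Assumption: no fixed 8.6 kernel exists yet, per the assessment text.
    """
    return kernel_key(release) >= kernel_key(FIRST_AFFECTED)
```

On a node, the string returned by `platform.release()` (the same value as `uname -r`) could be passed to `is_possibly_affected`. Once a fixed 8.6 kernel ships, the check would need an upper bound.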
How involved is remediation?
- The kernel would need to be overridden to an unaffected version
Is this a regression?
- Yes, this is a kernel regression introduced in kernel-4.18.0-372.88.1.el8_6 and not yet fixed in 8.6 kernels.
Note: since OCP 4.11 is EOL, we will not ship a subsequent 4.11.z that addresses this. Please either apply the workaround or upgrade to 4.12 once a fix becomes available there.
blocks
- OCPBUGS-30278 [4.11][Tracker for RHEL-26706] High Load and Pods Stuck Terminating (Verified)
- OCPBUGS-30096 [4.12][Tracker for RHEL-26706] High Load and Pods Stuck Terminating (Closed)
- links to