OCPBUGS-74172: CRI-O is frequently segfaulting and restarting due to an attempt to pull a non-existent image

• Type: Bug
• Resolution: Unresolved
• Priority: Major
• Affects Version: 4.19.z
• Component: Containers
• Sprint: RUN 283, RUN 284

Issue:
CRI-O is frequently segfaulting and restarting due to an attempt to pull a non-existent image, causing pods to enter "ImagePullBackOff" and disrupting services. The issue was initially reported on May 2, 2025, and escalated to high severity due to its impact on the development stack and the potential to cause problems in production.

Environment:
OCP 4.19.21

Actual Issue:
Core dump analysis revealed a "makeslice: len out of range" error, indicating a memory allocation issue within CRI-O when attempting to pull the specific image. The panic is raised by Go's `makeslice` runtime function, suggesting that the image's layers or metadata are driving an invalid or excessively large slice allocation.
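For context, this panic string comes from the Go runtime itself: `make` panics with "makeslice: len out of range" whenever the requested slice length is negative or larger than the runtime can address. The following is a minimal, self-contained sketch of that behavior, not CRI-O code; `layerSize` is a hypothetical stand-in for a corrupt size field read from image metadata.

```go
// Minimal illustration of the panic seen in the core dump: make() panics
// with "makeslice: len out of range" when given an invalid length.
package main

import "fmt"

func main() {
	// Hypothetical corrupt value, e.g. a length field read from bad image metadata.
	var layerSize int64 = -1

	defer func() {
		if r := recover(); r != nil {
			// Prints: runtime error: makeslice: len out of range
			fmt.Println("recovered:", r)
		}
	}()

	buf := make([]byte, layerSize) // panics here
	_ = buf
}
```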

Steps so far:
1. Customer reported CRI-O segfaulting due to an attempt to pull a non-existent image, causing pods to enter the "ImagePullBackOff" state.
2. Customer provided journalctl logs showing CRI-O attempting to access and pull the problematic image.
3. Engineer requested a coredump, an sosreport, and additional debugging information.
4. Core dump analysis confirmed a "makeslice: len out of range" error, indicating a memory allocation issue.
5. Customer manually pulled the image using `crictl` and `podman`, which also failed with the same error.
6. Customer confirmed the issue was resolved by restarting the affected pods, though the root cause was not fully understood.
7. Customer reopened the case on January 20, 2026, reporting that the issue persists and indicating a need for more resilience in CRI-O's container image layer management (see the sketch after this list).
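As a rough illustration of the resilience called out in step 7, the sketch below validates an untrusted size before allocating, so a corrupt value surfaces as an ordinary error that fails a single pull rather than a runtime panic that restarts the whole process. This is a hypothetical helper under assumed names (`maxLayerBytes`, `readLayer`), not CRI-O's actual code path.

```go
// Hypothetical sketch: reject an implausible layer size before calling make(),
// turning a would-be "makeslice: len out of range" panic into an ordinary error.
package main

import (
	"errors"
	"fmt"
)

// Assumed upper bound for a single layer buffer (10 GiB); the real limit
// would be a design decision in CRI-O, not this number.
const maxLayerBytes = int64(10) << 30

var errBadLayerSize = errors.New("invalid layer size in image metadata")

func readLayer(reportedSize int64) ([]byte, error) {
	if reportedSize < 0 || reportedSize > maxLayerBytes {
		return nil, fmt.Errorf("%w: %d bytes", errBadLayerSize, reportedSize)
	}
	return make([]byte, reportedSize), nil
}

func main() {
	if _, err := readLayer(-1); err != nil {
		fmt.Println("pull failed gracefully:", err) // process keeps running; the pod can retry
	}
}
```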

Next steps:
1. Customer to provide an sosreport and journal logs from the affected node the next time the issue occurs.
2. Discuss and implement a feature request to make CRI-O more resilient to this class of image-pull failure, preventing future occurrences and ensuring stability.

Ask:
Enhance CRI-O's ability to handle this class of image-pull failure without crashing, preventing future occurrences and ensuring node stability.

Assignee: Jan Kaluza (jkaluza)
Reporter: Novonil Choudhuri (rhn-support-nchoudhu)