Type: Feature Request
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: None
Component/s: RHEL CoreOS
Labels:
- pscomponent:kernel

Target Version:
None
Activity Type:
Product / Portfolio Work
Status Summary:
None
Blocked:
False
Blocked Reason:

None
Products:
None
Hierarchy Progress Bar:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
None
PX Impact Score:
PX Impact Range:
None
PX Priority Data:
None
PX Technical Impact:
None
PX Technical Impact Notes:
None
PX Scheduling Request:
None

1. Proposed title of this feature request:

A new tool or utility in OCP that saves the core dump file of a segfaulted containerized process using a proper naming format (container name + process name).

2. What is the nature and description of the request?

I have a customer who is looking for a feature in OCP: users should be able to map a core dump file to the container and pod that produced it. Currently the core dump is stored on the host OS itself, and there are two problems with this approach:

Not every customer has access to the node / host OS.
Customers cannot map the core dump file to the container / pod.

So there should be a utility at the OCP level that can be applied cluster-wide, so that users can retrieve their core dumps easily and trace each core dump file back to the actual pod that crashed.
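To make the requested naming format concrete, here is a minimal sketch in Python, not a proposed implementation. It assumes a helper that the kernel invokes through a piped `core_pattern` (which can pass `%e`, `%p`, and `%t` for executable name, PID, and dump time) and that can read `/proc/<pid>/cgroup` for the crashing process. The cgroup path layout shown (CRI-O with the systemd cgroup driver), the function names, and the resulting file name scheme are all assumptions made for illustration.

```python
import re

# Assumed cgroup layout (CRI-O, systemd cgroup driver); other runtimes differ.
# Example line from /proc/<pid>/cgroup:
#   0::/kubepods.slice/.../kubepods-burstable-pod<uid>.slice/crio-<id>.scope
POD_RE = re.compile(r"pod([0-9a-f]{8}(?:[_-][0-9a-f]{4}){3}[_-][0-9a-f]{12})")
CONTAINER_RE = re.compile(r"crio-([0-9a-f]{64})\.scope")


def parse_cgroup(cgroup_text):
    """Return (pod_uid, container_id) from cgroup text; None for missing parts."""
    pod = POD_RE.search(cgroup_text)
    ctr = CONTAINER_RE.search(cgroup_text)
    # systemd slice names use underscores where the pod UID has dashes.
    pod_uid = pod.group(1).replace("_", "-") if pod else None
    container_id = ctr.group(1) if ctr else None
    return pod_uid, container_id


def core_file_name(cgroup_text, comm, pid, epoch):
    """Build a hypothetical core file name that identifies pod and container."""
    pod_uid, container_id = parse_cgroup(cgroup_text)
    # Keep the process name safe for use in a file name.
    safe_comm = re.sub(r"[^A-Za-z0-9._-]", "_", comm)
    return "core.pod-{}.ctr-{}.{}.{}.{}".format(
        pod_uid or "unknown",
        (container_id or "unknown")[:12],
        safe_comm,
        pid,
        epoch,
    )
```

This sketch yields only the pod UID and container ID. Resolving them to the pod and container names the request asks for would need an extra lookup against the container runtime or the Kubernetes API, which is left out here.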

3. Why does the customer need this? (List the business requirements here)

This customer has hundreds of C++-based processes running in hundreds of pods / containers. Their containerized processes often segfault, and by OCP design the core dump is captured on the host OS / node. Collecting these core dumps from the nodes is difficult for them for several reasons, for example:

a) This design either cripples their troubleshooting capabilities or causes friction with their clients, who usually own the clusters.
b) They can't get the core dump unless they ask the cluster administrator.
c) There is a risk of OS file system pollution if core files are generated frequently.

You can visualize the situation like this: a private, medium-sized OpenShift cluster (40–50 nodes) hosts about 20 tenant organizations. Each tenant runs multiple C++ workloads, resulting in hundreds of C++ pods overall. Two tenants deploy application patches on the same day. The first patch introduces a defect that causes the application to generate hundreds of core files. Independently, the second tenant's patch also introduces a defect that produces its own set of core files. Without a reliable mechanism to attribute, isolate, and deliver core files to the correct tenant, the shared accumulation of core dumps becomes an immediate operational obstacle and delays root cause analysis for all affected parties. Even if such an event occurs rarely (e.g., once per year), the inability to map core files to their originating workloads significantly impairs timely troubleshooting in a multi-tenant environment.

Now imagine a 2,000-node cluster with 500 tenants.

4. List any affected packages or components.

OCP 4.XX

Assignee: Mark Russell

Reporter: Yogesh Babar

Need Info From: None

Votes: 0

Watchers: 2

Created: 2025/08/29 6:21 AM

Updated: 2025/09/23 5:39 AM

Target start: None

Target end: None
