Type: Feature Request
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: None
Component/s: openshift-virtualization
Labels:
None

Target Version:
None
Activity Type:
Product / Portfolio Work
Status Summary:
None
Blocked:
False
Blocked Reason:

None

Products:
None
Hierarchy Progress Bar:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
None
PX Impact Score:
PX Impact Range:
None
PX Priority Data:
None
PX Technical Impact:
None
PX Technical Impact Notes:
None
PX Scheduling Request:
None

1. Proposed title of this feature request
Better Warnings and Safeguards for Over-Utilization of ODF Storage with OpenShift Virtualization

2. What is the nature and description of the request?
In the customer's (CU) environment, which primarily runs OpenShift Virtualization, they have encountered critical issues when ODF storage utilization exceeds 80%. At that point, the cluster becomes unresponsive and most operations halt, including critical diagnostics such as must-gather.

While this issue has been previously reported and addressed reactively, their current workaround is to over-provision their clusters with additional flash-backed worker nodes to create a buffer. This is both resource-inefficient and costly.

The default 80% utilization alert is useful for container-based workloads, but is insufficient for VM-heavy environments, where VMs tend to consume larger and more persistent storage blocks over time. They currently have no proactive warnings that take into account the impact of adding new VMs with storage-backed PVCs.

3. Why does the customer need this? (List the business requirements here)
Today we have to overbuild our clusters to prevent this issue, which results in approximately 30% more resources than we would need if safeguards were in place.

4. List any affected packages or components.
ODF 4.18.7
Virt 4.18.11, 4.19

Additional Details from the CU:

They propose the following improvements to help prevent ODF-related outages in virtualized environments:

1. Pre-scheduling Warnings for VM Additions:

When provisioning a new VM, evaluate whether its storage footprint (PVC size) could push the cluster’s ODF utilization beyond a safe threshold.

Warn users if the cumulative VM storage usage could create a risk of reaching critical utilization levels.
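A minimal sketch of what such a pre-scheduling check could compute, for illustration only. The function name, the 70% warning ratio, and the 3x replication factor are assumptions chosen for the example; they are not existing OpenShift Virtualization or ODF behavior, and the inputs would have to come from whatever capacity metrics the cluster exposes.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass(frozen=True)
class CapacityCheck:
    projected_ratio: float   # raw utilization if the new PVC were completely filled
    exceeds_threshold: bool  # True when the projection reaches the warning ratio


def check_new_vm(raw_used_bytes: int,
                 raw_capacity_bytes: int,
                 pvc_request_bytes: int,
                 replica_count: int = 3,      # assumption: 3 copies of every block
                 warn_ratio: float = 0.70) -> CapacityCheck:  # assumption: warn below 80%
    """Project raw storage utilization after adding one VM disk (hypothetical helper)."""
    if raw_capacity_bytes <= 0:
        raise ValueError("raw_capacity_bytes must be positive")
    # Each logical byte of the PVC consumes replica_count raw bytes once written.
    projected_used = raw_used_bytes + pvc_request_bytes * replica_count
    ratio = projected_used / raw_capacity_bytes
    return CapacityCheck(projected_ratio=ratio, exceeds_threshold=ratio >= warn_ratio)


if __name__ == "__main__":
    result = check_new_vm(600 * GIB, 1000 * GIB, 50 * GIB)
    if result.exceeds_threshold:
        print(f"Warning: projected utilization {result.projected_ratio:.0%}")
```

The projection deliberately counts the full requested PVC size rather than current usage, so the warning fires before the disk is actually written to.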

2. Intelligent Capacity Awareness, taking into account:

Existing VM PVC usage and future write patterns (e.g., thick vs. thin provisioning).

Current data distribution across worker nodes.
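As a rough illustration of why data distribution matters, the sketch below compares the cluster-wide average with the fullest node. It assumes that the fullest node, rather than the average, is the limiting factor; the node names, the input shape, and the 70% warning ratio are hypothetical.

```python
from typing import Dict, Tuple


def fullest_node(node_usage: Dict[str, Tuple[int, int]],
                 warn_ratio: float = 0.70) -> Tuple[str, float, float, bool]:
    """Return (node, node_ratio, cluster_average, warn) for the fullest node.

    node_usage maps a node name to (used_bytes, capacity_bytes).
    """
    ratios = {node: used / cap
              for node, (used, cap) in node_usage.items() if cap > 0}
    if not ratios:
        raise ValueError("no node with positive capacity")
    worst = max(ratios, key=ratios.get)
    total_used = sum(used for used, cap in node_usage.values() if cap > 0)
    total_cap = sum(cap for _, cap in node_usage.values() if cap > 0)
    average = total_used / total_cap
    # Warn on the fullest node even when the average still looks healthy.
    return worst, ratios[worst], average, ratios[worst] >= warn_ratio


if __name__ == "__main__":
    usage = {"worker-a": (80, 100), "worker-b": (40, 100), "worker-c": (30, 100)}
    node, ratio, avg, warn = fullest_node(usage)
    print(f"{node}: {ratio:.0%} used, cluster average {avg:.0%}, warn={warn}")
```

In this example the average sits at 50% while one node is already at 80%, which an average-only alert would miss.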

3. Predictive Overcommitment Guardrails:

Introduce a predictive model or threshold system that simulates “worst-case” storage growth for VMs (e.g., if all disks hit 100% utilization), and warns users well before that scenario materializes.
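The worst-case scenario described above can be sketched as a simple comparison of current versus fully written disks. This is an illustrative model only: the `VmDisk` type, the 3x replication factor, and the use of 80% as the critical ratio (taken from the utilization level reported in this request) are assumptions, and a real implementation would also need to account for snapshots, clones, and non-VM consumers.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Union


@dataclass(frozen=True)
class VmDisk:
    name: str
    provisioned_bytes: int  # size requested by the PVC
    used_bytes: int         # bytes actually written so far (thin provisioning)


def worst_case_report(disks: Iterable[VmDisk],
                      raw_capacity_bytes: int,
                      replica_count: int = 3,        # assumption: 3x replication
                      critical_ratio: float = 0.80   # level cited in this request
                      ) -> Dict[str, Union[float, bool]]:
    """Compare current raw usage with the case where every disk is 100% full."""
    if raw_capacity_bytes <= 0:
        raise ValueError("raw_capacity_bytes must be positive")
    disks = list(disks)
    current_raw = sum(d.used_bytes for d in disks) * replica_count
    worst_raw = sum(d.provisioned_bytes for d in disks) * replica_count
    worst_ratio = worst_raw / raw_capacity_bytes
    return {
        "current_ratio": current_raw / raw_capacity_bytes,
        "worst_case_ratio": worst_ratio,
        "at_risk": worst_ratio >= critical_ratio,
    }


if __name__ == "__main__":
    gib = 1024 ** 3
    report = worst_case_report(
        [VmDisk("vm-a-root", 500 * gib, 100 * gib),
         VmDisk("vm-b-data", 400 * gib, 200 * gib)],
        raw_capacity_bytes=3000 * gib,
    )
    print(report)
```

Here the cluster is only 30% full today, yet it would reach 90% if both disks filled up, so the guardrail would warn long before the critical level is reached.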

Business Impact

To prevent storage-induced outages in their virtualized workloads, they are forced to overbuild their clusters by approximately 30%. This includes additional:

Worker nodes with SSD/flash storage

Infrastructure capacity, just to maintain stability

These measures inflate operational costs and reduce cluster efficiency. A more intelligent warning and enforcement mechanism would allow them to:

Optimize cluster sizing

Improve reliability

Avoid catastrophic failures tied to storage overconsumption

Please reference SF Ticket: 04210788

Assignee: Peter Lauterbach
Reporter: Darren Carpenter
Need Info From: None
Votes: 2
Watchers: 3
Created: 2025/07/30 6:06 PM
Updated: 2025/11/27 2:19 PM
Target start: None
Target end: None
