Type: Bug
Resolution: Done
Priority: Normal
Fix Version/s: 4.17.z
Affects Version/s: 4.16, 4.17, 4.18, 4.19, 4.20, 4.21
Component/s: Etcd
Labels:
None

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
Low
Regression:
No

Target Backport Versions:
None
Target Version:

4.17.z
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
Done
Release Note Type:
Enhancement
Release Note Text:

With this update, the `cluster-etcd-operator` Operator implements a multi-stage notification system for the `etcdDatabaseQuotaLowSpace` alert to proactively manage etcd storage quotas. This enhancement is designed to prevent API server instability by providing earlier warnings of low database space. As etcd disk space usage reaches 65%, 75%, and 85%, administrators receive alerts with a severity level of info, warning, or critical. (link:https://issues.redhat.com/browse/OCPBUGS-61337[OCPBUGS-61337])

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

This is a clone of issue OCPBUGS-60443. The following is the description of the original issue:
---
This is a clone of issue OCPBUGS-60237. The following is the description of the original issue:
---
Description of problem:

There is a single alert bundled with cluster-etcd-operator, etcdDatabaseQuotaLowSpace, that fires when a cluster is using 95% of its etcd quota. As seen by Managed OpenShift, this alert often fires too late and doesn't give administrators enough time to correct issues before the API server is impacted.
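
To make the "too late" point concrete, here is a rough arithmetic sketch of how much quota headroom remains when usage first reaches a given threshold. It assumes the default quota from the reproduction steps is 8 GiB and compares the original 95% threshold with the 65%, 75%, and 85% levels named in the release note; the helper name `headroom_bytes` is hypothetical and is not part of cluster-etcd-operator.

```python
GIB = 1024 ** 3  # bytes per GiB


def headroom_bytes(quota_bytes: int, threshold_percent: int) -> int:
    """Return the bytes still free under the quota when usage hits the threshold."""
    if not 0 <= threshold_percent <= 100:
        raise ValueError("threshold_percent must be between 0 and 100")
    return quota_bytes * (100 - threshold_percent) // 100


# Assumed quota of 8 GiB; thresholds taken from this issue's text.
for pct in (65, 75, 85, 95):
    remaining_gib = headroom_bytes(8 * GIB, pct) / GIB
    print(f"alert at {pct}% leaves about {remaining_gib:.2f} GiB of headroom")
```

The lower the threshold, the more room an administrator has to delete resources or defragment before writes start failing.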

Version-Release number of selected component (if applicable):

How reproducible:

Very

Steps to Reproduce:

    1. Create a Managed OpenShift (or OCP) cluster with the default control plane size and the default 8 GB quota.
    2. Run a loop that creates many large Secrets or ConfigMaps.
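
One possible shape for the loop in step 2 is sketched below. It is an illustration only: the object names (`etcd-fill-N`), the `default` namespace, and the 512 KiB payload size are assumptions, and the helper functions are hypothetical. It builds ConfigMap manifests as JSON and, when `dry_run` is turned off, pipes each one to `kubectl create -f -`. Run it only against a disposable test cluster.

```python
import json
import subprocess


def build_configmap(index: int, payload_bytes: int = 512 * 1024,
                    namespace: str = "default") -> dict:
    """Return a ConfigMap manifest carrying payload_bytes of filler data."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": f"etcd-fill-{index}", "namespace": namespace},
        "data": {"filler": "x" * payload_bytes},
    }


def create_configmaps(count: int, dry_run: bool = True) -> int:
    """Build count manifests; submit them with kubectl unless dry_run is set."""
    created = 0
    for i in range(count):
        manifest = json.dumps(build_configmap(i))
        if not dry_run:
            # Assumes kubectl is on PATH and already logged in to the cluster.
            subprocess.run(["kubectl", "create", "-f", "-"],
                           input=manifest, text=True, check=True)
        created += 1
    return created
```

Calling `create_configmaps(10000, dry_run=False)` would attempt to write roughly 5 GB of ConfigMap data under these assumptions; the default `dry_run=True` only builds the manifests and touches nothing.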

Actual results:

The API server becomes unstable, and the only solution is to resize the control plane (or the pods backing etcd, if in HCP), perform a defrag, and try to get back in to delete resources.

Expected results:

Cluster administrators are alerted at info, warning, and then critical levels for etcdDatabaseQuotaLowSpace.
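
The staged behavior can be sketched as a simple classifier. This is not the operator's implementation (the real alerts are Prometheus rules vendored from the etcd mixin); it only illustrates the escalation using the 65%, 75%, and 85% thresholds from the release note text. Pairing each threshold with a severity in listed order, and treating the comparison as inclusive, are both assumptions.

```python
from typing import Optional

# (threshold percent, severity), highest first. The pairing is assumed from
# the order in which the release note lists thresholds and severities.
THRESHOLDS = [(85.0, "critical"), (75.0, "warning"), (65.0, "info")]


def severity_for_usage(db_size_bytes: int, quota_bytes: int) -> Optional[str]:
    """Return the highest severity whose threshold is reached, or None."""
    if quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    usage_percent = 100.0 * db_size_bytes / quota_bytes
    for threshold, severity in THRESHOLDS:
        if usage_percent >= threshold:
            return severity
    return None
```

Checking the highest threshold first means a database at 90% of quota is reported once as critical rather than at all three levels.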

Additional info:

blocks

OCPBUGS-61505 [4.16] Singular etcdDatabaseQuotaLowSpace critical PrometheusRule isn't sufficient

Closed

clones

OCPBUGS-61235 [4.18] Singular etcdDatabaseQuotaLowSpace critical PrometheusRule isn't sufficient

Closed

is blocked by

OCPBUGS-61235 [4.18] Singular etcdDatabaseQuotaLowSpace critical PrometheusRule isn't sufficient

Closed

is cloned by

OCPBUGS-61505 [4.16] Singular etcdDatabaseQuotaLowSpace critical PrometheusRule isn't sufficient

Closed

links to

openshift/cluster-etcd-operator#1480: [release-4.17] OCPBUGS-61337: Vendor latest mixin, including additional and modified alerts for etcdDatabaseQuotaLowSpace

openshift/cluster-etcd-operator#1482: [release-4.16] OCPBUGS-61337: Vendor latest mixin, including additional and modified alerts for etcdDatabaseQuotaLowSpace


Assignee: Dean West

Reporter: Josh Branham

QA Contact: Sandeep Kundu

Need Info From: None

Votes: 0

Watchers: 5

Created: 2025/09/08 6:04 AM

Updated: 2025/09/24 5:12 AM

Resolved: 2025/09/24 5:12 AM
