OpenShift Bugs / OCPBUGS-73881

10m catalog sync interval contributes to unbounded etcd growth

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Affects Version: 4.21.0
    • Fix Versions: 4.18.z, 4.19.z, 4.20.z, 4.21.z, 4.22.0
    • Component: OLM
    • Severity: Critical
    • Sprint: Vaporeon Sprint 282, Weedle Sprint 283
    • Release Note Type: Bug Fix
      * Before this update, the catalog sync ran every 10 minutes and triggered high I/O on the control plane nodes where etcd runs. This could force etcd leader elections, which reset TTL counters on keys and caused etcd events to persist, degrading cluster performance. With this release, the default catalog polling interval is increased from 10 minutes to 4 hours, reducing I/O load and etcd event churn and improving overall cluster performance.

      The default catalog polling interval has been increased from 10 minutes to 4 hours to reduce load on catalog sources.

      This is a clone of issue OCPBUGS-69441. The following is the description of the original issue:

      Description of problem:

      It's been observed that the catalog sync triggers high I/O on the masters where etcd runs. This in turn triggers an etcd leader election, which resets TTL counters on keys; in particular, etcd events never clear.

      It seems unlikely that a 10-minute catalog update interval factors critically into anyone's operational plans. We should therefore increase the catalog source sync interval to four hours, avoiding the etcd knock-on effects in the local cluster while also reducing load on quay.io or local mirrors by ~95%.
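      For a grpc CatalogSource, the poll interval is configurable per catalog through `spec.updateStrategy.registryPoll.interval`. A minimal sketch of the proposed default (the catalog name and image below are placeholders, not the actual affected catalogs):

      ```yaml
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      metadata:
        name: example-catalog            # hypothetical name
        namespace: openshift-marketplace
      spec:
        sourceType: grpc
        image: quay.io/example/catalog-index:latest  # placeholder image
        updateStrategy:
          registryPoll:
            interval: 240m               # proposed 4-hour default, up from 10m
      ```

      Admins who explicitly set a `registryPoll.interval` on their own catalogs would be unaffected by a change to the default.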

      Version-Release number of selected component (if applicable):

      All, but let's only bother with 4.18-4.22

      How reproducible:

      100% 

      Steps to Reproduce:

      Observe the catalog sync interval
          

      Actual results:

      Happens every 10 minutes

      Expected results:

      Happens every 240 minutes

      Additional info:

      While I suspect the backend load on our infrastructure or the customer's infrastructure isn't horrible, it would be good to ensure an appropriate jitter is added so that we avoid any thundering herd effect from a mass reboot, such as after a datacenter outage. A random sleep of up to 10 minutes is probably sufficient. We should consider whether an admin wishing to update the catalog right now needs a method to skip the jitter, but "restart the pod and wait up to 10 minutes" is probably not horrible.
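      The jitter idea above can be sketched as follows; the constants and function name are illustrative, not the actual OLM implementation (which is written in Go):

      ```python
      import random

      MAX_JITTER = 10 * 60      # seconds: random startup delay of up to 10 minutes
      BASE_INTERVAL = 240 * 60  # seconds: the proposed 4-hour sync interval

      def next_sync_delay(rng: random.Random) -> float:
          """Base interval plus uniform jitter so mass-rebooted pods desynchronize."""
          return BASE_INTERVAL + rng.uniform(0, MAX_JITTER)

      # Pods restarted at the same instant end up polling at different times.
      rng = random.Random(42)  # seeded only to make the sketch reproducible
      delays = [next_sync_delay(rng) for _ in range(2)]
      assert all(BASE_INTERVAL <= d <= BASE_INTERVAL + MAX_JITTER for d in delays)
      ```

      Uniform jitter spreads the first post-reboot sync across a 10-minute window, which bounds the worst case an impatient admin would wait after restarting the pod.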
      
      We should also make sure that our release notes mention this change and that we document the preferred path for updating the catalog source right now.

              Assignee: Rashmi Gottipati (rashmigottipati)
              Reporter: Scott Dodson (rhn-support-sdodson)
              QA Contact: Jian Zhang