Type: Bug
Resolution: Done-Errata
Priority: Normal
Fix Version/s: 4.13.z
Affects Version/s: 4.10
Component/s: Image Registry
Labels:
- ServiceDeliveryImpact
- groomed

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None

Story Points:
None
Severity:
None
Regression:
No

Target Backport Versions:
None
Target Version:

4.13.z
Release Blocker:
None
Sprint:
Sprint 244, Sprint 245
sprint_count:
2

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
Done
Release Note Type:
Bug Fix
Release Note Text:

Previously, the Image Registry Operator made API calls to the Storage Account List endpoint as part of obtaining access keys every 5 minutes. In projects with many {product-title} clusters, this could lead to API limits being reached, causing 429 errors when attempting to create new clusters. With this release, the time between calls is increased from 5 minutes to 20 minutes. (link:https://issues.redhat.com/browse/OCPBUGS-22126[*OCPBUGS-22126*])

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

This is a clone of issue OCPBUGS-18469. The following is the description of the original issue:
—
Description of problem:

The image registry operator in Azure by default has two replicas. Every 5 minutes, each of those replicas makes a call to the StorageAccount List operation for the image registry storage account.

Azure publishes throttling limits for storage account operations. For list operations, the limit is 100 calls every 5 minutes per subscription and region pair.

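Because of this, customers are limited to <50 clusters per subscription and region in Azure: with two replicas per cluster, each calling once every 5 minutes, every cluster consumes 2 of the 100 available calls per 5-minute window. This number can change based on the number of image registry replicas as well as customer activity on List storage account operations within that subscription and region.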
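
The arithmetic behind this limit can be sketched as follows. This is an illustrative calculation, not code from the operator: the function name `max_clusters` and its parameters are hypothetical, and it assumes each replica makes exactly one List call per refresh interval, that calls are spread evenly over time, and that nothing else in the subscription and region consumes the quota.

```python
def max_clusters(limit_per_window: int = 100,
                 window_minutes: int = 5,
                 replicas_per_cluster: int = 2,
                 refresh_minutes: int = 5) -> float:
    """Number of clusters at which List calls reach the throttling limit."""
    # Calls that one cluster contributes to a single throttling window.
    calls_per_cluster = replicas_per_cluster * window_minutes / refresh_minutes
    return limit_per_window / calls_per_cluster


print(max_clusters())                    # 50.0 with a 5-minute refresh
print(max_clusters(refresh_minutes=20))  # 200.0 with a 20-minute refresh
```

Under the same assumptions, the 20-minute interval named in the release note raises the ceiling by a factor of four. Any other List activity in the subscription and region lowers it.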

On the Azure Red Hat OpenShift managed service, customers (including internal customers running demos) occasionally exceed these limits, which prevents them from creating new clusters within that subscription & region.

Version-Release number of selected component (if applicable):

N/A

How reproducible:

Always.

Steps to Reproduce:

1. Scale up the number of image registry pods to hit the 100 / 5 minute List limit (50 replicas, or enough clusters within a given subscription & region)
2. Attempt to create a new cluster
3. Cluster installation may fail because the image-registry cluster operator never becomes healthy, or because the installer cannot generate a storage account key for the bootstrap node to fetch its ignition config.
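
The steps above can be approximated offline with a toy model. This is only a sketch: Azure's real throttling algorithm is not described in this issue, so the sliding window used here is an assumption, and `WindowThrottle` and `simulate` are hypothetical names, not part of any Azure SDK or of the operator.

```python
from collections import deque


class WindowThrottle:
    """Toy model of a limit of `limit` calls per `window_s` seconds."""

    def __init__(self, limit: int = 100, window_s: int = 300):
        self.limit = limit
        self.window_s = window_s
        self.calls = deque()  # timestamps of accepted calls

    def request(self, now: float) -> int:
        # Drop timestamps that have left the observation window.
        while self.calls and now - self.calls[0] >= self.window_s:
            self.calls.popleft()
        if len(self.calls) >= self.limit:
            return 429  # throttled
        self.calls.append(now)
        return 200


def simulate(replicas: int, refresh_s: int = 300, duration_s: int = 900) -> int:
    """Count 429 responses when every replica calls once per refresh."""
    throttle = WindowThrottle()
    throttled = 0
    for t in range(0, duration_s, refresh_s):
        for _ in range(replicas):
            if throttle.request(float(t)) == 429:
                throttled += 1
    return throttled


print(simulate(100))  # 0: exactly at the limit, nothing is rejected
print(simulate(102))  # 6: two calls are rejected in each of three windows
```

In this model, any call beyond the hundredth in a window is rejected, regardless of which cluster or tool issued it. That matches the report that a new cluster install fails once the subscription and region are saturated.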

Actual results:

storage.AccountsClient#ListAccountSAS: Failure responding to request: StatusCode=429 -- Original Error: autorest/azure: Service returned an error. Status=429 Code="TooManyRequests" Message="The request is being throttled as the limit has been reached for operation type - Read_ObservationWindow_00:05:00. For more information, see - https://aka.ms/srpthrottlinglimits"

Expected results:

Cluster installs successfully

Additional info:

Raising this as a bug since this issue will be persistent across all cluster installations should one exceed the threshold. It will also impact the image-registry pod health.
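
The linked fix is titled "increase storage account key cache expiration". A minimal sketch of that idea, assuming a simple time-based cache, is shown below. The operator itself is not written in Python and its actual implementation is not shown in this issue; `CachedKey`, `fetch`, and `ttl_seconds` are hypothetical names used only for illustration.

```python
import time
from typing import Callable, Optional


class CachedKey:
    """Caches a fetched value and refetches it only after the TTL expires."""

    def __init__(self, fetch: Callable[[], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # stands in for the remote key lookup
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the cache can be tested
        self._value: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            # The only place the remote API would be called.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


# Usage with a fake clock: a 20-minute TTL yields one fetch per 20 minutes.
fake_now = [0.0]
fetches = []
key = CachedKey(lambda: fetches.append(1) or "example-key",
                ttl_seconds=20 * 60, clock=lambda: fake_now[0])
for minute in range(0, 40, 5):
    fake_now[0] = minute * 60.0
    key.get()
print(len(fetches))  # 2 fetches over 40 minutes instead of 8
```

A longer expiration trades fewer List calls against a longer delay before a rotated key is picked up.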

blocks

OCPBUGS-22125 Azure Image Registry Operator Making too Many Storage Account List Calls

Closed

clones

OCPBUGS-18469 Azure Image Registry Operator Making too Many Storage Account List Calls

Closed

is blocked by

OCPBUGS-22127 Azure Image Registry Operator Making too Many Storage Account List Calls

Closed

links to

openshift/cluster-image-registry-operator#940: [release-4.13] OCPBUGS-22126: increase storage account key cache expiration

RHBA-2023:7604 OpenShift Container Platform 4.13.z bug fix update

Assignee: Flavian Missi

Reporter: OpenShift Prow Bot

QA Contact: Wen Wang

Need Info From: None

Votes: 0

Watchers: 6

Created: 2023/10/19 3:00 PM

Updated: 2025/07/25 5:36 AM

Resolved: 2023/12/06 12:34 AM
