OpenShift Request For Enhancement / RFE-8733

Implement rate limiting for Argo CD notification controller


    • Feature Request
    • Resolution: Unresolved
    • GitOps
    • Product / Portfolio Work

      Feature Overview

      • Executive Summary: Implement a proactive rate-limiting mechanism within the Argo CD Notification Controller. The controller should monitor the GitHub API rate-limit headers (`X-RateLimit-Remaining`, `X-RateLimit-Reset`) and pause outgoing notification requests before the quota is exhausted, avoiding 429 errors and potential token/IP throttling.
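
The header monitoring described above could start from something like the following minimal sketch. It is written in Python for brevity and is not taken from the Argo CD code base; `RateLimitState` and `parse_rate_limit` are hypothetical names. It assumes the headers arrive as a plain name-to-value mapping and that `X-RateLimit-Reset` carries UTC epoch seconds, as GitHub documents.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitState:
    remaining: int    # requests left in the current window
    reset_epoch: int  # UTC epoch seconds at which the window resets


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimitState]:
    """Return the rate-limit state, or None if the headers are missing or malformed."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimitState(
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_epoch=int(lowered["x-ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None
```

Returning `None` rather than raising lets a caller treat an unreadable response as "quota unknown" and carry on sending, which matches the current behavior when no limit information is available.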

      Business Goal/Objective

      • Goal: Improve the stability and reliability of the notification service by preventing API rate-limit breaches.
      • Urgency: Medium. Large-scale environments with high deployment frequencies are currently experiencing intermittent notification failures and potential service blacklisting due to unmanaged API consumption.
      • Prioritization: Customer demand for enterprise-scale stability, plus a strategic gap in API "good citizenship."

      Goals

      • Who benefits: DevOps engineers and developers relying on timely Argo CD notifications.
      • Current State: The controller aggressively sends updates until the GitHub API returns a 429 error, leading to reactive failures and potential temporary bans.
      • Future State: The controller intelligently yields when limits are low, pausing until the reset window opens, resulting in a predictable and compliant notification flow.

      Requirements

      Requirements | Notes | MVP

      Use Cases

      • Scenario: An enterprise team has 500+ Applications in Argo CD. During a massive sync event, the notification controller spikes in activity. Instead of hitting a 429 and failing randomly, the controller sees it has only 10 requests left, pauses for 3 minutes until the reset, and then resumes cleanly.
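
The scenario above can be sketched as a small gate that the controller consults before each send. This is an illustration under stated assumptions, not the proposed implementation: `RateLimitGate` is a hypothetical name, the threshold of 10 is taken from the scenario rather than from any agreed default, and the clock and sleep functions are injectable so that the pause can be exercised without actually waiting.

```python
import time
from typing import Callable, Optional


class RateLimitGate:
    """Blocks before a send when the remaining quota is at or below a threshold."""

    def __init__(self, threshold: int,
                 clock: Callable[[], float] = time.time,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.threshold = threshold
        self._clock = clock
        self._sleep = sleep
        self._remaining: Optional[int] = None  # unknown until a response is seen
        self._reset_epoch = 0.0

    def update(self, remaining: int, reset_epoch: float) -> None:
        """Record the quota reported by the most recent API response."""
        self._remaining = remaining
        self._reset_epoch = reset_epoch

    def wait_if_needed(self) -> float:
        """Sleep until the reset time if the quota is low; return seconds slept."""
        if self._remaining is None or self._remaining > self.threshold:
            return 0.0
        delay = max(0.0, self._reset_epoch - self._clock())
        if delay > 0:
            self._sleep(delay)
        # After the reset, the quota is unknown again until the next response.
        self._remaining = None
        return delay
```

With `threshold=10`, a call to `update(remaining=10, reset_epoch=now + 180)` followed by `wait_if_needed()` sleeps for 180 seconds, the three-minute pause in the scenario, and the next call returns immediately.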

      Out of Scope

      • Persistent Queuing: This feature does not guarantee the delivery of every missed notification during the pause; it focuses on the controller's behavior to stay within API limits.
      • Other Providers: Initial implementation focuses on GitHub API; other providers (Slack, Teams) are out of scope for this specific RFE.

      Dependencies

      • GitHub API: Dependent on GitHub continuing to provide standard rate-limit headers in their REST/GraphQL responses.

      Background and Strategic Fit

      • < What does the person writing code, testing, documenting need to know? >

      Assumptions

      • < Are there assumptions being made regarding prerequisites and dependencies?>
      • < Are there assumptions about hardware, software, or people resources?>

      Customer Considerations

      • Customers in highly active environments may notice a "delay" in notifications. This is an intentional trade-off to ensure the long-term health of the API token and service.

      Documentation/QE Considerations

      • Doc Impact: Updates to existing notification controller documentation.
      • Content: New section on "Rate Limit Management" and documentation for new CLI flags/environment variables.
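
To give the documentation discussion something concrete, the sketch below shows one way such settings might be read from the environment. Every name in it is a placeholder invented for illustration: the RFE does not define any flags or variables, and the `EXAMPLE_` names, `RateLimitConfig`, and the defaults shown are assumptions, not existing Argo CD settings.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RateLimitConfig:
    enabled: bool = True
    threshold: int = 10  # pause when remaining requests <= threshold


def load_config(env: Mapping[str, str] = os.environ) -> RateLimitConfig:
    """Build the configuration from environment variables, falling back to defaults."""
    # The variable names below are hypothetical placeholders, not real Argo CD settings.
    defaults = RateLimitConfig()
    enabled_raw = env.get("EXAMPLE_GITHUB_RATE_LIMIT_ENABLED", "true")
    threshold_raw = env.get("EXAMPLE_GITHUB_RATE_LIMIT_THRESHOLD", str(defaults.threshold))
    return RateLimitConfig(
        enabled=enabled_raw.strip().lower() == "true",
        threshold=int(threshold_raw),
    )
```

Whatever names are eventually chosen, documenting the default threshold and the way to disable the feature would cover the two settings that operators are most likely to look for.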

      Impact

      • < If the feature is ordered with other work, state the impact of this feature on the other work >

      Related Architecture/Technical Documents

      • <links>

      Definition of Ready

      • The objectives of the feature are clearly defined and aligned with the business strategy.
      • All feature requirements have been clearly defined by Product Owners.
      • The feature has been broken

              showeimer Sho Weimer
              rhn-support-jorbell Jordan Bell