Connectivity Link / CONNLINK-482

Input Token Rate Limiting at Gateway and HTTPRoute Level


    • Type: Story
    • Resolution: Unresolved

      Needs validation as a user story. Models have their own input token limits, so enforcing limits in two places is questionable. Also, as of v1alpha1, the token usage reported in the response already includes input tokens for TokenRateLimitPolicy.

      As a platform engineer, I want to enforce input token rate limits at the Gateway and HTTPRoute level so that I can prevent excessive usage of expensive LLM APIs before requests reach the model server.
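
      To make the request concrete, here is a minimal sketch of the behavior the story describes: counting input tokens against a budget before a request is forwarded to the model server. It is not the TokenRateLimitPolicy API or any Connectivity Link implementation. The names InputTokenLimiter and estimate_input_tokens are hypothetical, the fixed-window strategy is an assumption, and the whitespace-based token count is only a stand-in for a real model tokenizer.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


def estimate_input_tokens(prompt: str) -> int:
    """Crude stand-in for a tokenizer (assumption): one token per whitespace-separated word."""
    return len(prompt.split())


@dataclass
class InputTokenLimiter:
    """Hypothetical fixed-window budget of input tokens per key.

    The key could be a Gateway name, an HTTPRoute name, or a user ID.
    """
    limit: int              # maximum input tokens allowed per window
    window_seconds: float   # window length in seconds
    # key -> (window start time, tokens used in that window)
    _windows: Dict[str, Tuple[float, int]] = field(default_factory=dict, init=False)

    def allow(self, key: str, input_tokens: int, now: Optional[float] = None) -> bool:
        """Return True and consume budget if the request fits, else False."""
        now = time.monotonic() if now is None else now
        start, used = self._windows.get(key, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired, so start a fresh one.
            start, used = now, 0
        if used + input_tokens > self.limit:
            # Reject before the request reaches the model server.
            self._windows[key] = (start, used)
            return False
        self._windows[key] = (start, used + input_tokens)
        return True
```

      For example, a limiter created with InputTokenLimiter(limit=10, window_seconds=60) accepts a 6-token prompt for a given route, rejects a following 5-token prompt in the same window, and accepts requests again once the window has elapsed. Whether such a pre-request check adds value on top of the response-based token usage counting is the open question raised in the comment above.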

              Assignee: Unassigned
              Reporter: David Martin (davmarti@redhat.com)