OpenShift Pipelines / SRVKP-9104

Add per-role model selection with global defaults for LLM analysis


    • Type: Story
    • Resolution: Unresolved
    • Priority: Undefined
    • Component: Pipelines as Code

      Story (Required)

      As a platform engineer trying to optimize the cost and performance of LLM analysis, I want the ability to specify different LLM models for different analysis roles, so that I can use expensive, powerful models for critical tasks and cheaper models for simple ones.

      This feature enables selecting specific LLM models (e.g., GPT-4, GPT-3.5-turbo, Gemini-1.5-Pro) at both the global level and per-role level. This allows teams to optimize costs by using cheaper, faster models for simple categorization tasks while reserving expensive, capable models for complex failure analysis. It also enables testing new models on specific roles before rolling out globally.

      Background (Required)

      Currently, LLM analysis configuration specifies the provider (OpenAI, Gemini) but not the specific model. This means:

      • All roles use the same model, regardless of task complexity
      • No ability to use GPT-3.5-turbo for simple tasks and GPT-4 for complex analysis
      • Cannot take advantage of new models (GPT-4-turbo, Gemini-1.5-Flash) without changing the entire repository configuration
      • No cost optimization through model selection
      • Cannot experiment with different models for different use cases

      LLM providers offer multiple models with different capabilities, speeds, and costs. Enabling model selection allows teams to match the right model to each task.

      Related: Current LLM analysis implementation at docs/content/docs/guide/llm-analysis.md

      Out of scope

      • Automatic model selection based on task complexity or context size
      • Model performance benchmarking or recommendations
      • Fallback to cheaper models when expensive models fail
      • Dynamic model switching based on budget or rate limits
      • Model version pinning or auto-upgrade strategies

      Approach (Required)

      High-level technical approach:

      Add a model field to the global AI configuration as the default model for all roles

      Add a model field to each AnalysisRole configuration to override the global default

      Support provider-specific model names:

      • OpenAI: gpt-4, gpt-4-turbo, gpt-3.5-turbo, etc.
      • Gemini: gemini-1.5-pro, gemini-1.5-flash, gemini-pro, etc.
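
      To make the shape concrete, here is a minimal sketch of what the configuration types could look like in Go; the package, type, and field names (AISettings, AnalysisRole, Model, Roles) are illustrative assumptions, not the existing Repository CRD API:

{code:go}
// Hypothetical configuration types; all names are illustrative only.
package settings

// AISettings is the global AI/LLM analysis configuration.
type AISettings struct {
	// Provider is the LLM provider, e.g. "openai" or "gemini".
	Provider string `json:"provider"`

	// Model is the global default model for all roles, e.g. "gpt-4" or
	// "gemini-1.5-pro". Optional; empty means "use the provider default".
	Model string `json:"model,omitempty"`

	// Roles maps role names (e.g. "categorization", "failure-analysis")
	// to their per-role configuration.
	Roles map[string]AnalysisRole `json:"roles,omitempty"`
}

// AnalysisRole configures a single analysis role.
type AnalysisRole struct {
	// Model, when set, overrides the global default for this role only.
	Model string `json:"model,omitempty"`
}
{code}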

      When a role doesn't specify a model, use the global default

      When a role specifies a model, use that model for that specific role
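
      A sketch of that resolution order (per-role override, then global default, then empty for the provider default), continuing the hypothetical types above; ResolveModel is an assumed helper, not existing code:

{code:go}
package settings

// ResolveModel picks the model to use for a role: the role's own model wins,
// then the global default; an empty result tells the provider client to fall
// back to its built-in default model.
func ResolveModel(cfg AISettings, roleName string) string {
	if role, ok := cfg.Roles[roleName]; ok && role.Model != "" {
		return role.Model
	}
	return cfg.Model // may be empty: provider default applies
}
{code}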

      Pass the model name to the LLM provider API when making analysis requests

      Validate that specified models are supported by the configured provider
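
      Validation could check the resolved model against a per-provider allow-list and return an error naming the supported options; the supportedModels map below is an assumed, non-exhaustive example (newly released provider models would still need a path through, per the edge cases below):

{code:go}
package settings

import (
	"fmt"
	"strings"
)

// supportedModels is an illustrative, non-exhaustive allow-list per provider.
var supportedModels = map[string][]string{
	"openai": {"gpt-4", "gpt-4-turbo", "gpt-3.5-turbo"},
	"gemini": {"gemini-1.5-pro", "gemini-1.5-flash", "gemini-pro"},
}

// ValidateModel returns a clear error listing the supported models when the
// requested model is not known for the configured provider.
func ValidateModel(provider, model string) error {
	if model == "" {
		return nil // no model set: the provider default is always acceptable
	}
	known := supportedModels[strings.ToLower(provider)]
	for _, m := range known {
		if strings.EqualFold(m, model) {
			return nil
		}
	}
	return fmt.Errorf("model %q is not supported for provider %q; supported models: %s",
		model, provider, strings.Join(known, ", "))
}
{code}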

      Include model information in analysis result metadata and logs
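
      For observability, the model actually used could be carried on the result and emitted in logs; AnalysisResult and logResult below are hypothetical, shown only to illustrate where the metadata would surface:

{code:go}
package settings

import "log"

// AnalysisResult is a hypothetical result record; the Model field records the
// model actually sent to the provider ("" meaning the provider default).
type AnalysisResult struct {
	Role     string
	Provider string
	Model    string
	Summary  string
}

// logResult emits one line per analysis so operators can correlate cost and
// behaviour with the model that was used.
func logResult(r AnalysisResult) {
	log.Printf("llm analysis role=%s provider=%s model=%s", r.Role, r.Provider, r.Model)
}
{code}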

      Support model aliases or shortcuts for common configurations
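
      Aliases could be a simple per-provider name map expanded before validation and before the API call; the shortcut names ("fast", "best") are purely illustrative assumptions:

{code:go}
package settings

// modelAliases maps convenience shortcuts to concrete, provider-specific
// model names.
var modelAliases = map[string]map[string]string{
	"openai": {"fast": "gpt-3.5-turbo", "best": "gpt-4"},
	"gemini": {"fast": "gemini-1.5-flash", "best": "gemini-1.5-pro"},
}

// expandAlias returns the concrete model name for an alias, or the input
// unchanged when it is not a known alias.
func expandAlias(provider, model string) string {
	if concrete, ok := modelAliases[provider][model]; ok {
		return concrete
	}
	return model
}
{code}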

      The feature should be backward compatible: if no model is specified, providers should use their default model.

      Dependencies

      • Existing LLM analysis infrastructure and provider clients
      • Repository CRD must support model field at both global and role levels
      • Provider client implementations must support passing custom model names
      • Documentation of supported models per provider

      Acceptance Criteria (Mandatory)

      Given a global AI config with model: "gpt-4", When a role doesn't specify a model, Then that role uses GPT-4 for analysis

      Given a global model and a role with model: "gpt-3.5-turbo", When that role runs, Then it uses GPT-3.5-turbo regardless of the global setting

      Given multiple roles with different models configured, When analyses run, Then each role uses its specified model correctly

      Given an invalid or unsupported model name, When validating configuration, Then the system returns a clear error indicating which models are supported for that provider

      Given no model specified globally or per-role, When analysis runs, Then the provider's default model is used

      Given a Gemini provider with model gemini-1.5-flash, When analysis runs, Then the request is sent to the correct Gemini model endpoint

      Given analysis completes, When viewing logs or results, Then the model used is clearly indicated in metadata
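
      The precedence scenarios above map naturally onto a table-driven test; this sketch reuses the hypothetical AISettings/ResolveModel from the Approach section and is illustrative only:

{code:go}
package settings

import "testing"

func TestResolveModel(t *testing.T) {
	cfg := AISettings{
		Provider: "openai",
		Model:    "gpt-4",
		Roles: map[string]AnalysisRole{
			"categorization":   {Model: "gpt-3.5-turbo"},
			"failure-analysis": {}, // no override: falls back to global
		},
	}
	cases := []struct {
		role string
		want string
	}{
		{"failure-analysis", "gpt-4"},       // global default applies
		{"categorization", "gpt-3.5-turbo"}, // per-role override wins
		{"unknown-role", "gpt-4"},           // unconfigured role: global default
	}
	for _, c := range cases {
		if got := ResolveModel(cfg, c.role); got != c.want {
			t.Errorf("role %s: got %q, want %q", c.role, got, c.want)
		}
	}
}
{code}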

      Edge cases to consider:

      • Model names that are valid for one provider but not another
      • Deprecated or sunset models that are no longer available
      • Model name typos or case sensitivity
      • Different token limits across models affecting context size
      • Cost implications visible in logging/monitoring
      • New models released by providers that aren't yet known to the system

              Assignee: Unassigned
              Reporter: Chmouel Boudjnah (cboudjna@redhat.com)