Story
Resolution: Unresolved
Story (Required)
As a platform engineer trying to optimize the cost and performance of LLM analysis, I want the ability to specify different LLM models for different analysis roles, so that I can use expensive, powerful models for critical tasks and cheaper models for simple ones.
This feature enables selecting specific LLM models (e.g., GPT-4, GPT-3.5-turbo, Gemini-1.5-Pro) at both the global level and per-role level. This allows teams to optimize costs by using cheaper, faster models for simple categorization tasks while reserving expensive, capable models for complex failure analysis. It also enables testing new models on specific roles before rolling out globally.
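As a rough sketch of the intended configuration shape (the Go type names and yaml tags below are illustrative, not the actual Repository CRD definitions), the global AI settings would carry a default model and each AnalysisRole could override it:

```go
package config

// AISettings is a hypothetical stand-in for the global AI analysis config.
type AISettings struct {
	Provider string         `yaml:"provider"`        // e.g. "openai" or "gemini"
	Model    string         `yaml:"model,omitempty"` // global default model, e.g. "gpt-4"
	Roles    []AnalysisRole `yaml:"roles,omitempty"`
}

// AnalysisRole configures a single analysis role; Model, when set,
// overrides the global default above.
type AnalysisRole struct {
	Name  string `yaml:"name"`
	Model string `yaml:"model,omitempty"`
}
```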
Background (Required)
Currently, LLM analysis configuration specifies the provider (OpenAI, Gemini) but not the specific model. This means:
- All roles use the same model, regardless of task complexity
- No ability to use GPT-3.5-turbo for simple tasks and GPT-4 for complex analysis
- Cannot take advantage of new models (GPT-4-turbo, Gemini-1.5-Flash) without changing the entire repository config
- No cost optimization through model selection
- Cannot experiment with different models for different use cases
LLM providers offer multiple models with different capabilities, speeds, and costs. Enabling model selection allows teams to match the right model to each task.
Related: Current LLM analysis implementation at docs/content/docs/guide/llm-analysis.md
Out of scope
- Automatic model selection based on task complexity or context size
- Model performance benchmarking or recommendations
- Fallback to cheaper models when expensive models fail
- Dynamic model switching based on budget or rate limits
- Model version pinning or auto-upgrade strategies
Approach (Required)
High-level technical approach:
- Add model field to global AI configuration as the default model for all roles
- Add model field to individual AnalysisRole configuration to override the global default
- Support provider-specific model names:
  - OpenAI: gpt-4, gpt-4-turbo, gpt-3.5-turbo, etc.
  - Gemini: gemini-1.5-pro, gemini-1.5-flash, gemini-pro, etc.
- When a role doesn't specify a model, use the global default
- When a role specifies a model, use that model for that specific role
- Pass the model name to the LLM provider API when making analysis requests
- Validate that specified models are supported by the configured provider
- Include model information in analysis result metadata and logs
- Support model aliases or shortcuts for common configurations
The feature should be backward compatible: if no model is specified, providers should use their default model.
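A possible resolution and validation flow, sketched in the same hypothetical package as the types above; the per-provider model lists here are examples only and would have to track the providers' real catalogs:

```go
package config

import "fmt"

// supportedModels is an illustrative allow-list; a real implementation would
// source this from provider documentation or configuration.
var supportedModels = map[string][]string{
	"openai": {"gpt-4", "gpt-4-turbo", "gpt-3.5-turbo"},
	"gemini": {"gemini-1.5-pro", "gemini-1.5-flash", "gemini-pro"},
}

// ResolveModel applies the precedence described above: role override first,
// then the global default; an empty result means "use the provider default".
func ResolveModel(global AISettings, role AnalysisRole) string {
	if role.Model != "" {
		return role.Model
	}
	return global.Model
}

// ValidateModel rejects model names unknown to the configured provider and
// reports which models are supported.
func ValidateModel(provider, model string) error {
	if model == "" {
		return nil // no model set: fall back to the provider default
	}
	for _, known := range supportedModels[provider] {
		if known == model {
			return nil
		}
	}
	return fmt.Errorf("model %q is not supported for provider %q; supported models: %v",
		model, provider, supportedModels[provider])
}
```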
Dependencies
- Existing LLM analysis infrastructure and provider clients
- Repository CRD must support model field at both global and role levels
- Provider client implementations must support passing custom model names
- Documentation of supported models per provider
Acceptance Criteria (Mandatory)
- Given a global AI config with model: "gpt-4", When a role doesn't specify a model, Then that role uses GPT-4 for analysis
- Given a global model and a role with model: "gpt-3.5-turbo", When that role runs, Then it uses GPT-3.5-turbo regardless of the global setting
- Given multiple roles with different models configured, When analyses run, Then each role uses its specified model correctly
- Given an invalid or unsupported model name, When validating configuration, Then the system returns a clear error indicating which models are supported for that provider
- Given no model specified globally or per-role, When analysis runs, Then the provider's default model is used
- Given a Gemini provider with model gemini-1.5-flash, When analysis runs, Then the request is sent to the correct Gemini model endpoint
- Given analysis completes, When viewing logs or results, Then the model used is clearly indicated in metadata
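The precedence cases above could be pinned down with a small table-driven test against the hypothetical ResolveModel helper from the Approach sketch:

```go
package config

import "testing"

// TestResolveModel exercises the precedence rules from the acceptance criteria:
// a role override beats the global default, and an empty result means the
// provider's own default model is used.
func TestResolveModel(t *testing.T) {
	cases := []struct {
		name   string
		global string
		role   string
		want   string
	}{
		{"global default only", "gpt-4", "", "gpt-4"},
		{"role overrides global", "gpt-4", "gpt-3.5-turbo", "gpt-3.5-turbo"},
		{"nothing set, provider default", "", "", ""},
	}
	for _, c := range cases {
		got := ResolveModel(AISettings{Model: c.global}, AnalysisRole{Model: c.role})
		if got != c.want {
			t.Errorf("%s: got %q, want %q", c.name, got, c.want)
		}
	}
}
```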
Edge cases to consider:
- Model names that are valid for one provider but not another
- Deprecated or sunset models that are no longer available
- Model name typos or case sensitivity
- Different token limits across models affecting context size
- Cost implications visible in logging/monitoring
- New models released by providers that aren't yet known to the system
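For the case-sensitivity and not-yet-known-model edge cases, one option (not a decided behavior, just an illustration of the trade-off) is to normalize names and log a warning instead of failing, so newly released models stay usable before the allow-list is updated:

```go
package config

import (
	"log"
	"strings"
)

// ValidateModelLenient normalizes the model name and, rather than rejecting
// names missing from the illustrative allow-list, logs a warning and passes
// them through so newly released models remain usable.
func ValidateModelLenient(provider, model string) string {
	normalized := strings.ToLower(strings.TrimSpace(model))
	if normalized == "" {
		return "" // use the provider default
	}
	for _, known := range supportedModels[provider] {
		if known == normalized {
			return normalized
		}
	}
	log.Printf("warning: model %q is not in the known list for provider %q; passing it through", normalized, provider)
	return normalized
}
```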