Type: Spike
Resolution: Unresolved
Feature Overview (mandatory - Complete while in New status)
An elevator pitch (value statement) that describes the Feature in a clear, concise way, i.e. an executive summary of the user goal or problem being solved: why does this matter to the user? The "What & Why"...
To support multiple models (inference, student, and teacher) and enable the end-to-end InstructLab workflow with Llama-Stack, we must ensure that:
- Our RHEL AI Providers are initialized with default models.
- llama-stack (server) CLI users can change these defaults to customize which models they use as student/teacher/inference models.
- llama-stack (server) CLI users can clearly understand which models are supported as teacher models, student models, and inference-only models.
- llama-stack-client CLI users know which models are available for each provider upon running 'llama-stack-client models list', and know how to invoke those commands (especially if more than one model is available at an endpoint, such as inference).
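As a sketch of the defaults-plus-overrides behavior described above: provider defaults are merged with user-supplied overrides per role. All names here (the role set, the default model IDs, and the helper) are hypothetical illustrations for discussion, not the actual llama-stack implementation.

```python
# Roles a model can serve in the InstructLab workflow (assumed set).
ROLES = ("inference", "student", "teacher")

# Hypothetical provider defaults for a RHEL AI provider; real defaults TBD.
PROVIDER_DEFAULTS = {
    "inference": "granite-8b-instruct",
    "student": "granite-8b-starter",
    "teacher": "mixtral-8x7b-instruct",
}

def resolve_models(overrides=None):
    """Merge user overrides over provider defaults, rejecting unknown roles."""
    resolved = dict(PROVIDER_DEFAULTS)
    for role, model_id in (overrides or {}).items():
        if role not in ROLES:
            raise ValueError(f"unknown role: {role!r}")
        resolved[role] = model_id
    return resolved
```

For example, `resolve_models({"teacher": "my-custom-teacher"})` would keep the default inference and student models while swapping only the teacher.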
Goals (mandatory - Complete while in New status)
Provide a high-level goal statement giving user context and the expected user outcome(s) for this Feature:
- Who benefits from this Feature, and how?
- What is the difference between the current state and a world with this Feature?
<your text here>
Requirements (mandatory - Complete while in Refinement status):
A list of specific needs, capabilities, or objectives that this Feature must deliver. Some requirements will be flagged as MVP. If an MVP requirement slips, the Feature shifts; if a non-MVP requirement slips, it does not shift the Feature.
Requirement | Notes | isMVP? |
---|---|---|
Works with models listed in the 'Use Cases' doc (2.0) in PM priorities | | |
Done - Acceptance Criteria (mandatory - Complete while in Refinement status):
Acceptance Criteria articulates and defines the value proposition: what is required to meet the goal and intent of this Feature. The Acceptance Criteria provides a detailed definition of scope and the expected outcomes from a user's point of view.
…
<your text here>
Use Cases - i.e. User Experience & Workflow: (Initial completion while in Refinement status):
Include use case diagrams, main success scenarios, alternative flow scenarios.
<your text here>
Out of Scope (Initial completion while in Refinement status):
High-level list of items or personas that are out of scope.
1. The client CLI also has a workflow to 'register' models. We think it is acceptable to call that workflow unsupported for now, since a model change made through the client CLI would have to force a system restart.
Documentation Considerations (Initial completion while in Refinement status):
Provide information that needs to be considered and planned so that documentation will meet customer needs. If the feature extends existing functionality, provide a link to its current documentation.
<your text here>
Questions to Answer (Initial completion while in Refinement status):
Include a list of refinement / architectural questions that may need to be answered before coding can begin.
- vLLM and Model Management?
- RHEL AI Inference Provider?
- Do we download models on the llama-stack (server) side?
- How do we deal with model 'register' in the short-term?
- Artifact management
- Where can models be pulled from? HF, OCI, S3, local, on server/filesystem? And, how?
- CLI vs SDK workflow
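One of the open questions above is where models can be pulled from (HF, OCI, S3, local filesystem). A minimal sketch of how artifact sources might be dispatched by URI scheme follows; the scheme names and the URI-based dispatch itself are assumptions for discussion, not a decided llama-stack design.

```python
from urllib.parse import urlparse

# Hypothetical mapping of URI schemes to model-artifact sources.
SUPPORTED_SCHEMES = {
    "hf": "Hugging Face Hub",
    "oci": "OCI registry",
    "s3": "S3 bucket",
    "file": "local filesystem on the server",
}

def classify_model_source(uri):
    """Return a human-readable source kind for a model URI or bare path."""
    scheme = urlparse(uri).scheme or "file"  # bare paths count as local
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported model source scheme: {scheme!r}")
    return SUPPORTED_SCHEMES[scheme]
```

A dispatch table like this would make the "where can models be pulled from, and how?" question answerable per scheme, with each source getting its own fetcher behind a common interface.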
Background and Strategic Fit (Initial completion while in Refinement status):
Provide any additional context needed to frame the feature.
Today, on the Llama-Stack (server) CLI, 'available' models are listed like this upon server setup:
- metadata: {}
  model_id: meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo
  model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
  - llm
  provider_id: together
  provider_model_id: meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo
(All supported models with a particular distribution stack are listed like this)
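The fields shown in the listing above can be mirrored in a small structure when reasoning about what the client CLI needs to display per model. This is an illustrative sketch only; the real model class lives in llama_stack.apis.models and may differ in fields and types.

```python
from dataclasses import dataclass, field

# Illustrative mirror of one entry in the listing above (not the real class).
@dataclass
class ModelEntry:
    model_id: str
    provider_id: str
    provider_model_id: str
    model_type: str = "llm"
    metadata: dict = field(default_factory=dict)

    def display_row(self):
        """One line per model, as a client CLI listing might render it."""
        return f"{self.provider_id}\t{self.model_id}\t{self.model_type}"

entry = ModelEntry(
    model_id="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    provider_id="together",
    provider_model_id="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
)
```

Note that the dump above carries a Python-specific YAML tag (`!!python/object/apply:...`) for `model_type`, which plain `yaml.safe_load` cannot parse; a stable, client-facing listing format would avoid such tags.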
Customer Considerations (Initial completion while in Refinement status):
Provide any additional customer-specific considerations that must be made when designing and delivering the Feature.
<your text here>
Team Sign Off (Completion while in Planning status)
- All required Epics (known at the time) are linked to this Feature
- All required Stories, Tasks (known at the time) for the most immediate Epics have been created and estimated
- Add the reviewer's name and team name
- Acceptance marks the Feature as "Ready": well understood, scope is clear, and the Acceptance Criteria (scope) is elaborated, well defined, and understood
- Note: Only set FixVersion/s: on a Feature if the delivery team agrees they have the capacity and have committed that capability for that milestone
Reviewed By | Team Name | Accepted | Notes |
---|---|---|---|
…
Issue links:
- is depended on by: RHELAI-3527 Set up and initialization of RHEL AI 2.0 Components (New)
- relates to: RHELAI-3618 Explore a RHEL AI Inference Provider (New)
- relates to: RHELAI-3620 Inference-only with Granite, third-party supported and NM-optimized models (Refinement)