  1. Red Hat Enterprise Linux AI
  2. RHELAI-3620

Inference-only with Granite, third-party supported and NM-optimized models


    • Feature
    • Resolution: Unresolved
    • Engine/Runtime

      Feature Overview (mandatory - Complete while in New status)
      An elevator pitch (value statement) that describes the Feature in a clear, concise way, i.e., an executive summary of the user goal or problem being solved. Why does this matter to the user? The “What & Why”.

      Proposed flow: 

      Step 1: The client can connect to the server side with a URL and an API key (llama-stack-configure).

      • The client must know the RHEL AI port; this needs to be documented.

      Step 2: The client CLI/SDK calls ‘llama-stack-client model list’ (or something less clunky) to list the models available to be served, possibly flagging which models are optimized for inference with NM.

      Step 3: The client CLI/SDK calls ‘llama-stack-client model serve’, specifying one of the listed models.
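
      The three steps above can be sketched as a small in-memory model. This is only an illustration of the proposed flow, not the llama-stack client API: `FakeServer`, `Client`, `model_list`, `model_serve`, the `nm_optimized` flag, and the placeholder model names are all hypothetical, and no network calls are made.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    model_id: str
    nm_optimized: bool = False  # hypothetical flag: optimized for inference with NM


class FakeServer:
    """In-memory stand-in for the server side (hypothetical, no networking)."""

    def __init__(self, api_key, models):
        self._api_key = api_key
        self._models = {m.model_id: m for m in models}
        self.serving = None  # id of the model currently being served, if any

    def _check(self, api_key):
        if api_key != self._api_key:
            raise PermissionError("invalid API key")

    def list_models(self, api_key):
        self._check(api_key)
        return list(self._models.values())

    def serve(self, api_key, model_id):
        self._check(api_key)
        if model_id not in self._models:
            raise KeyError(f"unknown model: {model_id}")
        self.serving = model_id
        return model_id


class Client:
    """Hypothetical client mirroring the three proposed steps."""

    def __init__(self, server, api_key):
        # Step 1: connect with the server location and an API key.
        self._server = server
        self._api_key = api_key

    def model_list(self, nm_only=False):
        # Step 2: list models available to be served, optionally NM-optimized only.
        models = self._server.list_models(self._api_key)
        return [m for m in models if m.nm_optimized or not nm_only]

    def model_serve(self, model_id):
        # Step 3: serve one of the listed models.
        return self._server.serve(self._api_key, model_id)


if __name__ == "__main__":
    server = FakeServer("secret", [Model("example-model-a", True), Model("example-model-b")])
    client = Client(server, "secret")
    for model in client.model_list():
        print(model.model_id, "(NM-optimized)" if model.nm_optimized else "")
    client.model_serve("example-model-a")
```

      In this sketch an unlisted model is rejected at serve time, which is one possible reading of "specifies listed model"; the real behavior is still to be defined.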

      Requirements: 

      1. Works with models listed in: PM priorities - RHEL AI 1.5 and 2.0 [P0] Inference-only with models 

      Goals (mandatory - Complete while in New status)
      Provide high-level goal statement, providing user context and expected user outcome(s) for this Feature

      • Who benefits from this Feature, and how? 
      • What is the difference between today’s current state and a world with this Feature?

      <your text here>

      Requirements (mandatory - Complete while in Refinement status):
      A list of specific needs, capabilities, or objectives that a Feature must deliver to satisfy the Feature. Some requirements will be flagged as MVP. If an MVP gets shifted, the Feature shifts. If a non MVP requirement slips, it does not shift the feature.

      Requirement | Notes | isMVP?

       

      Done - Acceptance Criteria (mandatory - Complete while in Refinement status):
      Acceptance Criteria articulate and define the value proposition: what is required to meet the goal and intent of this Feature. The Acceptance Criteria provide a detailed definition of scope and the expected outcomes from a user's point of view.

      <your text here>

      Use Cases - i.e. User Experience & Workflow: (Initial completion while in Refinement status):
      Include use case diagrams, main success scenarios, alternative flow scenarios.
      <your text here>

      Out of Scope (Initial completion while in Refinement status):
      High-level list of items or personas that are out of scope.
      <your text here>

      Documentation Considerations (Initial completion while in Refinement status):
      Provide information that needs to be considered and planned so that documentation will meet customer needs. If the feature extends existing functionality, provide a link to its current documentation.
      <your text here>

       

      Questions to Answer (Initial completion while in Refinement status):
      Include a list of refinement / architectural questions that may need to be answered before coding can begin.

      Dependencies/Questions to be answered: 

      1. Where can models be pulled from? HF, OCI, S3, local, on server/filesystem? 
      2. ‘Register’ models - can be done on the client side. How do we carry that forward, if at all?
      3. What are the inputs/outputs of the remote inference API today?
      4. How does this work with agentic RAG?
      5. Remote vLLM provider?? What about inline vLLM and ollama?
      6. How do we 'change' models that are inferred? Do we want to?
      7. What about non-llama models? Do we need additional plumbing? 
      8. What happens if someone tries to serve and chat when training is running?
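
      For question 1, one way the candidate model sources could be told apart is by a scheme prefix on the model reference. The sketch below is an assumption for discussion only: the `hf`, `oci`, `s3`, and `file` scheme names and the `classify_source` helper are hypothetical and do not describe how RHEL AI or llama-stack resolve models today.

```python
from urllib.parse import urlparse

# Hypothetical mapping from reference scheme to the source kinds named in question 1.
KNOWN_SCHEMES = {
    "hf": "Hugging Face",
    "oci": "OCI registry",
    "s3": "S3 bucket",
    "file": "local filesystem",
}


def classify_source(ref: str) -> str:
    """Return the source kind for a model reference such as 's3://bucket/key'."""
    scheme = urlparse(ref).scheme
    if not scheme:
        # A bare path is treated as a file on the server or local filesystem.
        return "local filesystem"
    if scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unsupported model source scheme: {scheme!r}")
    return KNOWN_SCHEMES[scheme]
```

      Rejecting unknown schemes outright keeps the set of supported sources explicit, which may help when documenting where models can be pulled from.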

      Background and Strategic Fit (Initial completion while in Refinement status):
      Provide any additional context needed to frame the feature.
      <your text here>

      Customer Considerations (Initial completion while in Refinement status):
      Provide any additional customer-specific considerations that must be made when designing and delivering the Feature.
      <your text here>

      Team Sign Off (Completion while in Planning status)

      • All required Epics (known at the time) are linked to this Feature
      • All required Stories, Tasks (known at the time) for the most immediate Epics have been created and estimated
      • Add - Reviewers name, Team Name
      • Acceptance == Feature as “Ready” - well understood and scope is clear - Acceptance Criteria (scope) is elaborated, well defined, and understood
      • Note: Only set FixVersion/s: on a Feature if the delivery team agrees they have the capacity and have committed that capability for that milestone
      Reviewed By | Team Name | Accepted | Notes

       

              jepandit@redhat.com Jehlum Vitasta Pandit