RHELAI-3598

[ilab] Phi-4-14B ilab model serve (inference functional testing)



      Feature Overview:
      This feature card is part of validating 3rd-party inference models with ilab model serve in the InstructLab component for RHEL AI 1.5.

      3rd-party model for this card: Phi-4-14B

      Goals:

      • Run Phi-4-14B with the ilab model serve command in InstructLab (functional test)
      • No errors or warnings arise
      • Run for all quantized variants of the model (Base, INT4, INT8, FP8); see the sketch below
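
      A minimal smoke-test sketch for the quantized-variant goal, assuming the default InstructLab models directory; the variant paths here are placeholders, not confirmed artifact names (take the real paths from ilab model list):

          # Hypothetical loop over the quantized variants of Phi-4-14B.
          # Model paths are placeholders; substitute the paths reported
          # by `ilab model list` after downloading each variant.
          for variant in phi-4 phi-4-int4 phi-4-int8 phi-4-fp8; do
              ilab model serve --model-path ~/.cache/instructlab/models/"$variant" &
              SERVE_PID=$!
              sleep 60                         # give vLLM time to load the weights
              kill "$SERVE_PID"                # stop before the next variant
              wait "$SERVE_PID" 2>/dev/null    # ensure the port is released
          done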

      Out of Scope [To be updated post-refinement]:

      • Evaluating the performance of the chat
      • Evaluating accuracy
      • Verifying that vLLM works with this model; this will be confirmed before this testing happens

      Requirements:

      • Functional Requirements:
        • Ensure the components of the flow below are functional with the 3rd-party model for the inference use case (see the sketch after this list):
          • ilab model download is able to download the model from Quay or Hugging Face
          • ilab model list can list the downloaded model
          • ilab model serve can serve the model
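
      A hedged sketch of the three commands under test; the repository and model path are illustrative assumptions (the upstream Hugging Face repository for Phi-4 is microsoft/phi-4, but the validated RHEL AI artifact may be published under a different Quay or Hugging Face name):

          # Download the model (repository name is an assumption).
          ilab model download --repository microsoft/phi-4

          # Confirm the downloaded model is visible locally.
          ilab model list

          # Serve the model for inference (path is an assumption based on
          # the default ~/.cache/instructlab/models location).
          ilab model serve --model-path ~/.cache/instructlab/models/phi-4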

      Done - Acceptance Criteria:

      Use Cases - i.e. User Experience & Workflow:

      • User downloads the 3rd-party model from the Red Hat registry (Quay) or Hugging Face via the ilab model download command
      • User can then view the model with ilab model list
      • User can then serve the model for inference on vLLM-ent with ilab model serve, as sketched below
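
      Once the model is serving, inference can be spot-checked against the OpenAI-compatible API that vLLM exposes. The host, port, and model name below are assumptions (recent InstructLab releases serve on 127.0.0.1:8000 by default; adjust to the actual serve configuration):

          # Hypothetical inference check against the served endpoint.
          curl -s http://127.0.0.1:8000/v1/chat/completions \
              -H "Content-Type: application/json" \
              -d '{"model": "phi-4",
                   "messages": [{"role": "user", "content": "Say hello."}]}'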

      Documentation Considerations:

      • Update relevant documentation to expose the new 3rd-party model to users (i.e., Chapter 3, Downloading Large Language Models)

      Questions to answer:

      • Are there any code changes needed to serve a model? Note, the model will already be validated in vllm-ent BEFORE this testing occurs, so this is just ensuring the ilab commands work, NOT that vLLM works with this model.

      Background & Strategic Fit:

      Customers have been asking to leverage the latest and greatest third-party models from Meta, Mistral, Microsoft, Qwen, etc. within Red Hat AI products. As they continue to adopt and deploy open source models, the third-party model validation pipeline provides inference performance benchmarking and accuracy evaluations for third-party models, giving customers confidence and predictability when bringing third-party models to InstructLab and vLLM within RHEL AI and RHOAI.

      See Red Hat AI Model Validation Strategy Doc

      See Red Hat Q1 2025 Third Party Model Validation Presentation
