Red Hat Enterprise Linux AI / RHELAI-3561

[ilab] Running a third-party Llama 3.1 8B as a student model (and inference testing) in the ilab tuning flow


    • RHELAI-3608: RHEL AI Third-Party Model Validation Program [Post-Summit]

      Feature Overview:
      This feature card is part of validating third-party models with the InstructLab component for RHEL AI 1.5

      Third-party model for this card: Llama 3.1 8B

      Goals:

      • Run Llama 3.1 8B successfully as a student model in the InstructLab tuning flow, tuned by the current teacher, Mixtral 8x7B Instruct (mixtral-8x7b-instruct-v0-1); see the command sketch after this list
      • Create a fine-tuned Llama 3.1 8B student model
      • Hand off the model to the PSAP team for model validation (email/Slack rh-ee-rogreenb when completed) so they can run OpenLLM Leaderboard v1/v2 evals between the base model and the fine-tuned model
      • Validate all quantized variants of the model (Base, INT4, INT8, FP8) for the inference use case
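
      As a rough sketch of the tuning flow above, assuming the Hugging Face repository IDs shown and a recent ilab CLI (exact flag names may vary by ilab version):

        # Download the Mixtral teacher and the third-party Llama student
        # (Llama 3.1 8B is a gated Hugging Face repo, so a token is needed)
        ilab model download --repository mistralai/Mixtral-8x7B-Instruct-v0.1
        ilab model download --repository meta-llama/Llama-3.1-8B --hf-token <token>

        # Confirm both models appear locally
        ilab model list

        # Generate synthetic data with the teacher, then tune the student
        ilab data generate
        ilab model train --model-path ~/.cache/instructlab/models/Llama-3.1-8B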

       Out of Scope [To be updated post-refinement]:

      • Matching performance results with the current student model, Granite 3.1 8B Instruct
      • Code changes that accommodate arbitrary models
      • Model management functions (i.e., ilab model upload)
      • Running dk-bench/Ragas evals to evaluate the fine-tuned student model on the newly learned data

      Requirements:

      • Functional Requirements:
        • Ensure the following components of the flow are functional with the third-party student model:
          • ilab model download is able to download the model from Quay/Hugging Face
          • ilab model list is able to list the downloaded model
          • ilab model train is able to tune Llama 3.1 8B
          • ilab model serve can serve the fine-tuned Llama 3.1 8B model
        • Ensure the following components are functional with the third-party model for all quantized variants in the inference use case (see the smoke-test sketch after this list):
          • ilab model download is able to download the model from Quay or Hugging Face
          • ilab model list is able to list the downloaded model
          • ilab model serve can serve the model
      • Accuracy evaluation requirements:
        • Hand off the base and fine-tuned models to the PSAP team (email/Slack rh-ee-rogreenb when completed) to perform OpenLLM Leaderboard v1/v2 evaluation without the math-hard subtask
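
      A minimal smoke-test sketch for the per-variant inference checks above, assuming hypothetical local paths under the default ilab models directory and the default serve endpoint (localhost:8000); variant directory names are illustrative:

        # Serve each compression level and probe the OpenAI-compatible API
        MODELS_DIR=~/.cache/instructlab/models
        for variant in Llama-3.1-8B Llama-3.1-8B-INT4 Llama-3.1-8B-INT8 Llama-3.1-8B-FP8; do
            ilab model serve --model-path "$MODELS_DIR/$variant" &
            SERVE_PID=$!
            sleep 120  # give the server time to load the model
            curl -s http://localhost:8000/v1/models  # should list the served model
            kill "$SERVE_PID"
        done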

      Done - Acceptance Criteria:

      • QE ensures all functional requirements are met
      • Base and fine-tuned models are handed over to PSAP
      • Student model performance before and after tuning on OpenLLM Leaderboard v1/v2 is comparable, with no significant accuracy degradation (within ±5 points)
      • QE ensures inference functional requirements are met for each compression level [HF LINKS UPDATE]
      Model              Quantization level   Confirmed
      Llama 3.1 8B       Baseline
      Llama 3.1 8B INT4  INT4
      Llama 3.1 8B INT8  INT8
      Llama 3.1 8B FP8   FP8

      Use Cases - i.e. User Experience & Workflow:

      • User downloads the third-party model from the Red Hat registry (Quay) or Hugging Face via the ilab model download command
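
      For example, either source might be exercised like this (the OCI path is illustrative; the exact Quay repository for this model is not confirmed):

        # From Hugging Face (gated repository, token required)
        ilab model download --repository meta-llama/Llama-3.1-8B --hf-token <token>

        # From an OCI registry such as Quay (path illustrative)
        ilab model download --repository docker://quay.io/<org>/llama-3.1-8b --release latest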

      Documentation Considerations:

      • Update relevant documentation to expose the new third-party model to users (i.e., Chapter 3, Downloading Large Language Models)

      Questions to answer: 

      • Refer to the open questions in https://issues.redhat.com/browse/RHELAI-3559
      • Do we need to run all quantized versions of the model through the ilab model serve validation step, or can we assume that if the baseline works, the quantized variants will too?

      Background & Strategic Fit:

      Customers have been asking to leverage the latest and greatest third-party models from Meta, Mistral, Microsoft, Qwen, etc. within Red Hat AI products. As they continue to adopt and deploy open-source models, the third-party model validation pipeline provides inference performance benchmarking and accuracy evaluations for third-party models, giving customers confidence and predictability when bringing third-party models to InstructLab and vLLM within RHEL AI and RHOAI.

      See Red Hat AI Model Validation Strategy Doc

      See Red Hat Q1 2025 Third Party Model Validation Presentation
