Red Hat Enterprise Linux AI
RHELAI-3559

[ilab] Running a third-party Llama 3.3 70B Instruct model as teacher model (+ inference functional testing) in ilab tuning flow


      Feature Overview:
      This feature card is part of validating 3rd-party teacher models with the InstructLab component for RHEL AI 1.5.

      3rd-party model for this card: Llama 3.3 70B Instruct 

      Goals:

      • Run Llama 3.3 70B Instruct as a teacher model successfully through the ilab tuning flow (sketched below) to tune the student model Granite 3.1 8B Instruct (granite-3.1-8b-starter-v1)
      • Create a fine-tuned student with Llama 3.3 70B Instruct as the teacher model
      • Hand off the model to the PSAP team for model validation (email/Slack rh-ee-rogreenb when completed) so they can run OpenLLM Leaderboard v1/v2 evals between the base model and the fine-tuned model
      • Run the inference use case for all quantized variants of the model (Base, INT4, INT8, FP8)
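
      A minimal sketch of that flow is below, assuming RHEL AI 1.5 defaults. The registry image path for Llama 3.3 70B Instruct, the local cache/dataset/checkpoint paths, and the exact train flags are assumptions; confirm them against the 1.5 docs and ilab <command> --help.

      # Download the teacher model (image path is hypothetical).
      ilab model download \
          --repository docker://registry.redhat.io/rhelai1/llama-3-3-70b-instruct \
          --release latest
      ilab model list    # verify the teacher appears in the list

      # Generate synthetic data with the third-party teacher
      # (flag names vary by release; check: ilab data generate --help).
      ilab data generate --model ~/.cache/instructlab/models/llama-3-3-70b-instruct

      # Multiphase training of the Granite 3.1 8B student on the generated data.
      ilab model train --strategy lab-multiphase \
          --phased-phase1-data ~/.local/share/instructlab/datasets/<knowledge_data>.jsonl \
          --phased-phase2-data ~/.local/share/instructlab/datasets/<skills_data>.jsonl

      # Serve the resulting student checkpoint for a functional check.
      ilab model serve --model-path <path_to_best_checkpoint>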

       Out of Scope [To be updated post-refinement]:

      • Match the performance results of the Mixtral 8x7B teacher setup
      • Code changes that accommodate arbitrary models
      • Run dk-bench/Ragas evals to evaluate the fine-tuned student model on the newly learned data
      • Model management (TBD)

      Requirements:

      • Functional Requirements:
        • Ensure the components below are functional with the 3rd-party model as the teacher in the fine-tuning use case:
          • ilab model download is able to download the model from Quay or Hugging Face
          • ilab model list shows the downloaded model
          • ilab data generate is able to generate usable synthetic data
          • ilab model train is able to tune Granite 3.1 8B Instruct (granite-3.1-8b-starter-v1)
          • ilab model serve can serve the fine-tuned student model
        • Ensure the components below are functional with the 3rd-party model, for all quantized variants, in the inference use case (see the smoke-test sketch after this list):
          • ilab model download is able to download the model from Quay or Hugging Face
          • ilab model list shows the downloaded model
          • ilab model serve can serve the model
      • Accuracy evaluation requirements:
        • Hand off the base and fine-tuned models to the PSAP team (email/Slack rh-ee-rogreenb when completed) to perform OpenLLM Leaderboard v1/v2 evaluation without the math-hard subtask
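
      A hypothetical smoke test for one quantized variant (FP8 shown). The image path, local model path, served model id, and the default port 8000 are assumptions; list the actual served model ids via the /v1/models endpoint first.

      ilab model download \
          --repository docker://registry.redhat.io/rhelai1/llama-3-3-70b-instruct-fp8 \
          --release latest
      ilab model list
      ilab model serve --model-path ~/.cache/instructlab/models/llama-3-3-70b-instruct-fp8

      # From a second shell, exercise the OpenAI-compatible API the vLLM backend exposes.
      curl -s http://127.0.0.1:8000/v1/models
      curl -s http://127.0.0.1:8000/v1/chat/completions \
          -H "Content-Type: application/json" \
          -d '{"model": "<served-model-id>", "messages": [{"role": "user", "content": "Say hello."}]}'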

      Done - Acceptance Criteria:

      • QE ensures fine-tuning functional requirements are met
      • Base and fine-tuned models are handed over to the PSAP team for model validation
      • Student model performance on OpenLLM Leaderboard v1/v2 before and after tuning is comparable, with no significant accuracy degradation (within +-5 points; exact threshold TBD)
      • QE ensures inference functional requirements are met for each compression level [HF LINKS UPDATE]

      Use Cases - i.e. User Experience & Workflow:

      • User downloads the 3rd-party model from the Red Hat registry (Quay) or Hugging Face via the ilab model download command (see the examples below)
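
      Both download paths are sketched below. The registry image path is an assumption, and the --hf-token flag (needed for gated Llama repositories) should be confirmed against the installed ilab version.

      # From the Red Hat registry (OCI artifact; image path assumed):
      ilab model download \
          --repository docker://registry.redhat.io/rhelai1/llama-3-3-70b-instruct \
          --release latest

      # Directly from Hugging Face (Llama repositories are gated; token required):
      ilab model download \
          --repository meta-llama/Llama-3.3-70B-Instruct \
          --hf-token <your_hf_token>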

      Documentation Considerations:

      • Update relevant documentation to expose the new 3rd-party model to users (i.e., Chapter 3, Downloading Large Language Models)

      Questions to answer:

      • What is the scope of code changes necessary to complete this for 1.5? Would this change if only one teacher and one student model are implemented, rather than arbitrary models?
      • Does RHEL AI allow models to be downloaded directly from Hugging Face?
      • Should this feature rely on the target model already being stored in Quay, or should we perform model validation through the Hugging Face stub flow instead?
      • Should ilab model chat and ilab model upload (model management into S3) be in scope?

      Background & Strategic Fit:

      Customers have been asking to leverage the latest and greatest third-party models from Meta, Mistral, Microsoft, Qwen, etc. within Red Hat AI products. As customers continue to adopt and deploy open-source models, the third-party model validation pipeline provides inference performance benchmarking and accuracy evaluations for third-party models, giving customers confidence and predictability when bringing third-party models to InstructLab and vLLM within RHEL AI and RHOAI.

      See Red Hat AI Model Validation Strategy Doc

      See Red Hat Q1 2025 Third-Party Model Validation Presentation
