Red Hat Enterprise Linux AI / RHELAI-2407

Analysis for vLLM Tensor and Data Parallelism Strategies


    • Related: RHELAI-2403 Support for Preference Tuning (RLAIF)

      Feature Overview

      This feature covers an analysis of tensor and data parallelism strategies in vLLM, focusing on their impact on preference tuning performance. The aim is to optimize the training process by selecting the most efficient strategy based on model size, available GPU memory, number of preference pairs, communication bandwidth, and target training throughput.

      Goals

      • Evaluate the impact of tensor and data parallelism strategies on preference tuning performance in vLLM.
      • Identify the optimal strategy for different scenarios based on model size, available GPU memory, number of preference pairs, communication bandwidth, and target training throughput.

      Requirements

      • Complete analysis of tensor and data parallelism strategies, including their impact on training batch size, memory usage, and communication patterns.
      • Ability to select the most efficient strategy for different scenarios based on given constraints.
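
      The selection requirement above could be prototyped as a simple heuristic. The sketch below is illustrative only: the 30% memory headroom, the candidate TP degrees, and the function name are assumptions, not measured constraints from this analysis.

```python
# Hypothetical strategy-selection heuristic. The headroom factor and the
# candidate tensor-parallel degrees (2, 4, 8) are illustrative assumptions.

def choose_strategy(model_mem_gb: float, gpu_mem_gb: float, num_gpus: int) -> str:
    """Pick a parallelism strategy from coarse memory constraints.

    - If one full replica fits on a single GPU (with headroom left for
      activations and KV cache), data parallelism lets every GPU process
      preference pairs independently, maximizing throughput.
    - Otherwise, tensor parallelism is required to shard the weights,
      at the cost of inter-GPU communication on every layer.
    """
    headroom = 0.7  # assume ~30% of GPU memory reserved for activations/KV cache
    if model_mem_gb <= gpu_mem_gb * headroom:
        return "data_parallel"
    # Pick the smallest tensor-parallel degree whose per-GPU shard fits.
    for tp in (2, 4, 8):
        if tp <= num_gpus and model_mem_gb / tp <= gpu_mem_gb * headroom:
            return f"tensor_parallel(tp={tp})"
    return "infeasible"
```

      A real implementation would also weigh the number of preference pairs and the interconnect bandwidth, since a higher TP degree trades memory relief for per-layer communication.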

      Background

      Models can be parallelized using tensor and data parallelism strategies. Tensor parallelism splits individual model layers across multiple GPUs, reducing memory requirements per GPU but introducing communication overhead. Data parallelism replicates the entire model across GPUs, enabling processing of more preference pairs simultaneously but requiring gradient synchronization between replicas.
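
      The memory side of this trade-off can be made concrete with back-of-the-envelope arithmetic. The sketch below counts only model weights (assuming bf16, 2 bytes per parameter) and uses a hypothetical 70B-parameter model; optimizer state, activations, and KV cache would add to these figures.

```python
def per_gpu_weight_mem_gb(num_params_b: float, tp_size: int,
                          bytes_per_param: int = 2) -> float:
    """Per-GPU memory for model weights alone, in GB.

    Tensor parallelism shards the weights across tp_size GPUs, so each GPU
    holds ~1/tp_size of them; data parallelism (tp_size=1 per replica)
    keeps a full copy on every GPU. num_params_b is in billions.
    """
    return num_params_b * bytes_per_param / tp_size

# A hypothetical 70B-parameter model in bf16:
full_copy = per_gpu_weight_mem_gb(70, tp_size=1)  # DP replica: 140.0 GB/GPU
sharded = per_gpu_weight_mem_gb(70, tp_size=4)    # TP=4 shard:  35.0 GB/GPU
```

      The same arithmetic shows why data parallelism is attractive when the model is small: each replica pays no per-layer communication, only a gradient all-reduce per step.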

      Done

      • [ ] Complete analysis of tensor and data parallelism strategies
      • [ ] Ability to select the most efficient strategy for different scenarios
      • [ ] Consideration of model size, available GPU memory, number of preference pairs, communication bandwidth, and target training throughput

      Questions to Answer

      • What are the specific constraints for model size, available GPU memory, number of preference pairs, communication bandwidth, and target training throughput in our use case?
      • How can we efficiently implement the selected strategy in InstructLab?
      • Are there any potential trade-offs between different strategies that we should consider?

      Out of Scope

      • Detailed implementation of the selected strategy in the model
      • Optimization of communication patterns for specific scenarios

      Customer Considerations

      • Provide clear documentation and examples of how to implement the chosen strategy in InstructLab.
      • Offer support and guidance for customers looking to optimize their preference tuning process using the selected strategy.
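
      Such documentation could anchor its examples on vLLM's own configuration knob: tensor parallelism is selected via the `tensor_parallel_size` argument of the Python API (`--tensor-parallel-size` on the CLI). A minimal configuration sketch, assuming a 4-GPU node; the model path is a placeholder, not a recommendation:

```python
# Configuration sketch (requires GPUs and an installed vLLM; the model
# path is a placeholder). tensor_parallel_size=4 shards each layer's
# weights across 4 GPUs on the node.
from vllm import LLM

llm = LLM(model="<model-path>", tensor_parallel_size=4)
```

      Data parallelism, by contrast, is typically realized by running multiple such replicas and splitting the preference pairs between them.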

              wcabanba@redhat.com William Caban
              Mustafa Eyceoz, Oleg Silkin