Project: Red Hat Enterprise Linux AI
Issue: RHELAI-2500

Support for Preference Tuning (RLHF)

      Outcome Overview

      Preference tuning is used to better align models with human preferences and values. There are two main techniques: Reinforcement Learning from Human Feedback (RLHF) and Reinforcement Learning from AI Feedback (RLAIF). This Outcome covers RLHF.

      In RLHF for LLMs (https://arxiv.org/pdf/2203.02155), the general process is:

      1. Generating multiple candidate responses to a given input
      2. Having humans evaluate and rank these responses for quality, helpfulness, accuracy, and alignment with human values
      3. Using this feedback to train the model to favor responses that humans prefer (a minimal sketch of this step follows the list)
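
      As a rough illustration of how the ranked feedback in steps 2 and 3 becomes a training signal, the sketch below fits a toy reward model on preference pairs with a pairwise (Bradley-Terry style) loss, as used for reward modeling in RLHF. It assumes PyTorch; TinyRewardModel and the random token batches are hypothetical stand-ins for a real LLM-based scorer and a real human-ranked dataset, not InstructLab's implementation.

          import torch
          import torch.nn as nn
          import torch.nn.functional as F

          class TinyRewardModel(nn.Module):
              """Scores a response; a real reward model would wrap an LLM encoder."""
              def __init__(self, vocab_size=1000, dim=32):
                  super().__init__()
                  self.embed = nn.EmbeddingBag(vocab_size, dim)  # mean-pools token embeddings
                  self.score = nn.Linear(dim, 1)

              def forward(self, token_ids):
                  # token_ids: (batch, seq_len) -> one scalar reward per response
                  return self.score(self.embed(token_ids)).squeeze(-1)

          model = TinyRewardModel()
          opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

          # Stand-in preference pairs: for each prompt, token IDs of the response
          # humans ranked higher ("chosen") and one they ranked lower ("rejected").
          chosen = torch.randint(0, 1000, (4, 16))
          rejected = torch.randint(0, 1000, (4, 16))

          # Pairwise ranking loss: push the reward of the chosen response above
          # the reward of the rejected one.
          loss = -F.logsigmoid(model(chosen) - model(rejected)).mean()
          opt.zero_grad()
          loss.backward()
          opt.step()

      In full RLHF, the trained reward model then guides a reinforcement learning step (for example PPO, as in the paper linked above) that updates the base LLM toward higher-reward responses.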

      Success Criteria

      1. A user can apply preference tuning with InstructLab by bringing their own RLHF preference dataset (one possible record format is sketched below).
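
      One plausible shape for such a dataset is pairwise preference records in JSONL, with a prompt plus a preferred ("chosen") and a less-preferred ("rejected") response. The sketch below writes one such record; these field names follow a common convention for preference data and are an assumption here, not a confirmed InstructLab schema (the GitHub issue referenced below tracks the actual design).

          import json

          # Hypothetical pairwise preference record; the field names are assumptions.
          record = {
              "prompt": "Summarize the main idea of RLHF in one sentence.",
              "chosen": ("RLHF fine-tunes a model on human rankings of its responses "
                         "so it learns to favor outputs people prefer."),
              "rejected": "RLHF is a technique for compressing model weights.",
          }

          # Preference datasets are commonly shipped as JSONL: one record per line.
          with open("preference_data.jsonl", "w", encoding="utf-8") as f:
              f.write(json.dumps(record) + "\n")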

      Expected Results

      1. RHEL AI enables preference tuning with RLHF

      GitHub reference: https://github.com/instructlab/training/issues/335

              William Caban (wcabanba@redhat.com)
              Mustafa Eyceoz, Oleg Silkin