Feature
Resolution: Unresolved
Feature Overview
InstructLab's CLI should be extended to support preference tuning with RLHF (Reinforcement Learning from Human Feedback). This feature lets users provide a pairwise preference dataset that contains multiple candidate answers to each question and marks which answer is preferred.
Goals
- Enable users to provide a preference dataset to define their ethical and safety principles for the RLHF process
- Expand ilab CLI by adding a new command or flag for preference tuning
- Anticipated primary users: AI researchers, developers, and AI ethicists
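As a sketch of what the new CLI surface could look like, the snippet below wires up a placeholder subcommand with argparse. The command name `preference-tune` and the `--dataset` flag are purely hypothetical; the actual ilab command or flag is an open design decision for this feature.

```python
import argparse

def build_parser():
    """Build a parser with a hypothetical preference-tuning subcommand.

    Note: "preference-tune" and "--dataset" are placeholder names for
    illustration, not a committed ilab interface.
    """
    parser = argparse.ArgumentParser(prog="ilab")
    subparsers = parser.add_subparsers(dest="command")
    tune = subparsers.add_parser(
        "preference-tune",
        help="preference tuning via RLHF (placeholder name)",
    )
    tune.add_argument(
        "--dataset",
        required=True,
        help="path to the pairwise preference dataset",
    )
    return parser
```

A user would then invoke something like `ilab preference-tune --dataset prefs.jsonl`, with the parser routing to the preference-tuning pipeline.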
Requirements
- The CLI should accept a file or input containing the preference dataset for the ethical and safety principles, in a well-known and defined schema.
- The CLI should validate the input to ensure it follows the structure required by the RLHF technique.
- The CLI should trigger a pipeline to augment the training data with a dataset encoding the provided principles.
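One way the validation requirement might look, assuming a JSONL-style pairwise schema with `prompt`, `chosen`, and `rejected` fields (the concrete file format is still an open question for this card):

```python
import json

REQUIRED_KEYS = {"prompt", "chosen", "rejected"}  # hypothetical schema

def validate_preference_dataset(lines):
    """Check that each JSONL record is a well-formed pairwise preference.

    Returns a list of (line_number, error_message) tuples; an empty list
    means the dataset passed validation.
    """
    errors = []
    for i, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((i, f"invalid JSON: {exc}"))
            continue
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            errors.append((i, f"missing keys: {sorted(missing)}"))
        elif record["chosen"] == record["rejected"]:
            errors.append((i, "chosen and rejected answers are identical"))
    return errors
```

The CLI could run this check before triggering the augmentation pipeline and report the offending line numbers to the user.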
Background
Reinforcement Learning from Human Feedback (RLHF) is a technique used to train AI models by learning from human feedback. This feature will enable users to provide this feedback in the form of preference datasets.
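To illustrate how pairwise feedback is typically consumed, RLHF pipelines commonly train a reward model with a Bradley-Terry style loss that pushes the score of the preferred answer above the rejected one. This is background only; the RLHF implementation itself is out of scope for this card.

```python
import math

def pairwise_preference_loss(score_chosen, score_rejected):
    """Bradley-Terry negative log-likelihood: -log sigmoid(r_chosen - r_rejected).

    The loss shrinks as the reward model scores the preferred answer
    higher than the rejected one.
    """
    margin = score_chosen - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When both answers score equally the loss is log 2; it decreases toward zero as the preferred answer's score pulls ahead.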
Done
- [ ] The CLI accepts a preference dataset that encodes the ethical and safety principles.
- [ ] The CLI validates the input to ensure it follows the RLHF technique structure.
Questions to Answer
- What file format should be used for the preference dataset encoding the ethical and safety principles? (JSON, YAML, etc.)
- Should the AI's training data be updated in real-time or during a separate training process?
Out of Scope
- The implementation of the RLHF technique itself (tracked in a separate card).
Customer Considerations
- Ensure the CLI is user-friendly and easy to understand, even for users without extensive technical knowledge.
- Provide clear documentation and examples to help users define their ethical and safety principles.
- Consider providing a pre-defined or reference preference dataset for users unsure how to define their own.
clones: RHELAI-2408 [ilab] Extend CLI to support preference tuning with RLAIF technique (New)