Red Hat Enterprise Linux AI / RHELAI-2986

[ilab] Phase I: Bring data ingestion and pre-processing into Core


    • Feature
    • Resolution: Unresolved
    • InstructLab - Core
    • RHELAI-2971: [ilab] Modularize ingestion and pre-processing pipeline, SDG and RAG libraries

      Feature Overview (mandatory - Complete while in New status)

      At the end of 1.4, SDG and RAG will both require input data, pass it through docling, and consume the transformed data.

      Because the two workflows are similar (although the inputs they currently need differ slightly), we want to eventually unify them by bringing the ingestion function into Core, where both libraries can consume it.

      Goals (mandatory - Complete while in New status)
      Provide a high-level goal statement, with user context and expected user outcome(s) for this Feature

      • SDG should be able to consume the docling JSONL output plus qna.yaml, with the requisite metadata, from the Core library.
      • RAG should be able to consume the docling JSONL output from the Core library.
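
      The two goals above can be sketched as a pair of Core entry points. This is a minimal illustration only, not the actual Core API: the names IngestedData, load_docling_jsonl, ingest_for_rag, and ingest_for_sdg are hypothetical, the docling output is assumed to be a plain JSON Lines file (one JSON object per line), and the "requisite metadata" mentioned above is not modeled because the ticket does not define it.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class IngestedData:
    """Hypothetical container for what Core hands to a consumer library."""
    documents: list                  # one dict per line of the docling JSONL output
    qna_path: Optional[Path] = None  # set only for SDG, which also needs qna.yaml


def load_docling_jsonl(jsonl_path):
    """Parse a JSON Lines file: one JSON object per non-empty line."""
    documents = []
    with open(jsonl_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                documents.append(json.loads(line))
    return documents


def ingest_for_rag(jsonl_path):
    """RAG consumes only the docling JSONL output."""
    return IngestedData(documents=load_docling_jsonl(jsonl_path))


def ingest_for_sdg(jsonl_path, qna_path):
    """SDG consumes the docling JSONL output plus a qna.yaml file."""
    qna_path = Path(qna_path)
    if not qna_path.is_file():
        raise FileNotFoundError(f"qna.yaml not found: {qna_path}")
    return IngestedData(
        documents=load_docling_jsonl(jsonl_path),
        qna_path=qna_path,
    )
```

      Keeping the JSONL loader shared and varying only the extra inputs reflects the observation in the overview that the two workflows are similar but currently need slightly different inputs.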

      Requirements (mandatory - Complete while in Refinement status):
      A list of specific needs, capabilities, or objectives that a Feature must deliver to satisfy the Feature. Some requirements will be flagged as MVP. If an MVP requirement shifts, the Feature shifts. If a non-MVP requirement slips, it does not shift the Feature.

      • SDG uses docling v2 with hierarchical chunking; RAG uses docling v3 to leverage hybrid chunking. Bringing the ingestion function from both into Core will require Core to maintain both versions of docling for 1.5. In practice, Core must ensure that the SDG and RAG libraries each receive the appropriate docling output.
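
      One way to picture this requirement is a small lookup in Core that maps each consumer library to the docling version and chunking strategy it expects. The sketch below is an assumption for illustration: DoclingProfile, PROFILES, and profile_for are hypothetical names, and no docling API is called; only the version and chunking pairings come from the requirement above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DoclingProfile:
    """Hypothetical record of what a consumer expects from ingestion."""
    docling_version: str
    chunking: str


# Pairings taken from the requirement: SDG -> v2/hierarchical, RAG -> v3/hybrid.
PROFILES = {
    "sdg": DoclingProfile(docling_version="v2", chunking="hierarchical"),
    "rag": DoclingProfile(docling_version="v3", chunking="hybrid"),
}


def profile_for(consumer):
    """Return the docling profile for a consumer, case-insensitively."""
    key = consumer.strip().lower()
    if key not in PROFILES:
        known = ", ".join(sorted(PROFILES))
        raise ValueError(f"unknown consumer {consumer!r}; expected one of: {known}")
    return PROFILES[key]
```

      Failing loudly on an unknown consumer, rather than falling back to a default, is a deliberate choice in this sketch so that a library never silently receives output produced by the wrong docling version.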

       

      Requirement      Notes      isMVP?

       

      Done - Acceptance Criteria (mandatory - Complete while in Refinement status):
      Acceptance Criteria articulate and define the value proposition: what is required to meet the goal and intent of this Feature. They provide a detailed definition of scope and the expected outcomes from a user's point of view.

      <your text here>

      Use Cases - i.e. User Experience & Workflow: (Initial completion while in Refinement status):
      Include use case diagrams, main success scenarios, alternative flow scenarios.
      <your text here>

      Out of Scope (Initial completion while in Refinement status):
      High-level list of items or personas that are out of scope

      1. Pre-processing data for both libraries 

      Documentation Considerations (Initial completion while in Refinement status):

      Provide information that needs to be considered and planned so that documentation will meet customer needs. If the feature extends existing functionality, provide a link to its current documentation.
      <your text here>

       

      Questions to Answer (Initial completion while in Refinement status):
      Include a list of refinement / architectural questions that may need to be answered before coding can begin.
      <your text here>

      Background and Strategic Fit (Initial completion while in Refinement status):
      Provide any additional context needed to frame the feature.
      <your text here>

      Customer Considerations (Initial completion while in Refinement status):
      Provide any additional customer-specific considerations that must be made when designing and delivering the Feature.
      <your text here>

      Team Sign Off (Completion while in Planning status)

      • All required Epics (known at the time) are linked to this Feature
      • All required Stories, Tasks (known at the time) for the most immediate Epics have been created and estimated
      • Add - Reviewers name, Team Name
      • Acceptance == Feature marked as “Ready”: it is well understood and its scope is clear; the Acceptance Criteria (scope) are elaborated, well defined, and understood
      • Note: Only set FixVersion/s: on a Feature if the delivery team agrees they have the capacity and have committed that capability for that milestone
      Reviewed By      Team Name      Accepted      Notes

       

              Unassigned
              jepandit@redhat.com Jehlum Vitasta Pandit