
    • Type: Feature
    • Resolution: Done
    • Priority: Normal
    • ols-2.0
    • Lightspeed
    • Product / Portfolio Work
    • OCPSTRAT-2123 OpenShift Lightspeed 2.0

      Background
      A high-quality RAG process focuses on three areas of optimization:

      1. Contextualized splitter function
      2. Embedding techniques and rich metadata
      3. Retrieval techniques

       
      This Feature card covers the first area, the contextualized splitter function. The idea is to adopt a splitter function that retains context within chunks: for example, keep a complete YAML example in a single chunk, and keep notes in the same chunk as the code block or section they belong to.
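As an illustration of what "retaining context" could mean in practice, here is a minimal sketch in plain Python. It is not the splitter adopted for this card: the names `split_blocks` and `contextual_chunks`, the `max_chars` budget, and the chunk dictionary layout are all hypothetical. The sketch assumes Markdown input, never cuts inside a fenced code block, and carries the paragraph that introduces a code block (for instance a note) into the same chunk as the code.

```python
import re

FENCE = re.compile(r"^\s*(```|~~~)")        # opening or closing code fence
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")  # Markdown ATX heading


def split_blocks(markdown):
    """Split Markdown into blocks at blank lines and headings, never inside a fence."""
    blocks, current, in_fence = [], [], False
    for line in markdown.splitlines():
        if FENCE.match(line):
            in_fence = not in_fence
        is_heading = not in_fence and HEADING.match(line)
        if in_fence or (line.strip() and not is_heading):
            current.append(line)
            continue
        # Blank line or heading outside a fence: close the current block.
        if current:
            blocks.append("\n".join(current))
            current = []
        if is_heading:
            blocks.append(line)
    if current:
        blocks.append("\n".join(current))
    return blocks


def contextual_chunks(markdown, max_chars=800):
    """Pack blocks into chunks; a code block stays whole and keeps its intro text."""
    chunks, buffer, heading = [], [], ""

    def flush():
        if buffer:
            chunks.append({"heading": heading, "text": "\n\n".join(buffer)})
            buffer.clear()

    for block in split_blocks(markdown):
        match = HEADING.match(block)
        if match:
            flush()  # a new section always starts a new chunk
            heading = match.group(2).strip()
            continue
        size = sum(len(b) for b in buffer) + len(block)
        if buffer and size > max_chars:
            # If the next block is code, carry its introducing paragraph along.
            carry = [buffer.pop()] if FENCE.match(block) else []
            flush()
            buffer.extend(carry)
        buffer.append(block)  # oversized blocks are kept whole, not cut
    flush()
    return chunks


sample = (
    "# Install\n\n"
    "Note: apply the manifest as cluster admin.\n\n"
    "~~~yaml\napiVersion: v1\n\nkind: ConfigMap\n~~~\n"
)
for chunk in contextual_chunks(sample, max_chars=40):
    print(chunk["heading"], "->", len(chunk["text"]), "chars")
```

The design choice here is to let a chunk exceed `max_chars` rather than split a code block or separate it from its note. Whether that trade-off helps retrieval is exactly what the evaluation in this card would need to confirm.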
       
      Deliverables

      • Evaluate the quality of retrievals when using the MarkdownHeaderTextSplitter for creating chunks for embeddings
        • Compare against retrievals produced with other splitters:
          • Semantic Chunking
          • RecursiveCharacterTextSplitter
          • CodeTextSplitter
      • Evaluate the quality of retrievals when using a custom splitter function for defining chunks for embeddings
        • Contextualization of code blocks, lists, tables, notes, sections, and images
      • Document the findings on retrieval improvements, broken down by the type of context in the document
      • Update the text splitter in the RAG embedding pipeline based on the findings
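The comparison described in the deliverables could be framed as a small harness that runs the same queries against chunks produced by each candidate splitter. The sketch below is only an assumption about how such an evaluation might look: `hit_rate`, `retrieve`, `fixed_size`, and `by_heading` are hypothetical names, the sample document and queries are made up for illustration, and the token-overlap retriever is a toy stand-in for the real embedding-based retrieval. The splitters named in this card would be passed in as the `splitter` callable.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens of a text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, chunks, k=1):
    """Toy retriever: rank chunks by shared-token count (stand-in for embeddings)."""
    q = tokens(query)
    ranked = sorted(chunks, key=lambda c: len(q & tokens(c)), reverse=True)
    return ranked[:k]


def hit_rate(splitter, document, cases, k=1):
    """Fraction of cases whose expected snippet appears whole in a top-k chunk."""
    chunks = splitter(document)
    hits = sum(
        any(expected in chunk for chunk in retrieve(query, chunks, k))
        for query, expected in cases
    )
    return hits / len(cases)


def fixed_size(text, size=60):
    """Baseline: cut every `size` characters, ignoring document structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def by_heading(text):
    """Structure-aware candidate: one chunk per Markdown heading section."""
    parts = re.split(r"(?m)^(?=#{1,6}\s)", text)
    return [p.strip() for p in parts if p.strip()]


# Made-up sample document and (query, expected snippet) pairs.
DOC = (
    "# Configure storage\n"
    "Set the storage class before installing.\n"
    "storageClassName: fast-ssd\n"
    "# Configure network\n"
    "Set the cluster network type.\n"
    "networkType: OVNKubernetes\n"
)
CASES = [
    ("which storage class to set before installing", "storageClassName: fast-ssd"),
    ("what network type for cluster", "networkType: OVNKubernetes"),
]

for name, splitter in [("fixed_size", fixed_size), ("by_heading", by_heading)]:
    print(name, hit_rate(splitter, DOC, CASES))
```

A hit is counted only when the expected snippet survives intact inside a retrieved chunk, so the metric directly penalizes splitters that separate a setting from the sentence that explains it. A real evaluation would swap in the production embedding model and a larger, representative query set.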

              gausingh@redhat.com Gaurav Singh
              wcabanba@redhat.com William Caban