• Task
    • Resolution: Done
    • Major
    • 1.9.0
    • None
    • Knowledge, Lightspeed
    • None
    • RHDH AI Sprint 3284, RHDH AI Sprint 3285, RHDH AI Sprint 3286

      Task

      As an engineer working on the "AI Notebooks" feature, I need to build a component that, given a URL or a pdf, doc, docx, txt, md, or json file, extracts the document text and, after safety checking, passes it on to the document RAG chunk generator.

       

      For the doc, docx, txt, md, and json extensions, the component should clean the text and delete tokens where necessary.
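A minimal sketch of the cleaning step for the text-like formats, using only the Python standard library. The names `clean_text` and `extract_plain` are hypothetical, and the cleaning rules (Unicode normalization, dropping control characters, collapsing whitespace) are assumptions, since the ticket does not say which tokens count as unwanted. doc and docx are left out because they need a third-party parser.

```python
import json
import re
import unicodedata

# Assumption: "tokens to delete" means non-printing control characters.
# Tab (\x09) and newline (\x0a) are kept on purpose.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def clean_text(raw: str) -> str:
    """Normalize and tidy extracted text before safety checking."""
    text = unicodedata.normalize("NFKC", raw)
    # Unify line endings before stripping control characters.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = _CONTROL_CHARS.sub("", text)
    # Collapse runs of spaces/tabs and runs of blank lines.
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def extract_plain(filename: str, data: bytes) -> str:
    """Extract text from txt, md, or json uploads (hypothetical helper)."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    decoded = data.decode("utf-8", errors="replace")
    if ext in ("txt", "md"):
        return clean_text(decoded)
    if ext == "json":
        # Parsing first rejects malformed JSON; re-serializing gives a
        # compact, consistent string form.
        parsed = json.loads(decoded)
        return clean_text(json.dumps(parsed, ensure_ascii=False))
    raise ValueError(f"unsupported extension: {ext!r}")
```

The returned string is what would go on to the safety check and then the chunk generator; neither of those steps is shown here.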

      For pdf, the scope covers only native PDFs (not scanned PDFs), which are converted into doc.

      For a URL, only the content of that specific page is security checked and added to the vector database.
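The URL case could be sketched as follows, again with only the Python standard library. The function names, the http/https-only scheme check, and the choice to skip `script` and `style` content are assumptions for illustration; the ticket does not define what the security check involves, and a real check would likely go further. Only the single given page is fetched, with no link following.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse
from urllib.request import urlopen


class _VisibleText(HTMLParser):
    """Collects text nodes, skipping non-visible elements."""

    _SKIP = {"script", "style", "noscript", "template"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the visible text of one HTML page, one text node per line."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def fetch_page_text(url: str, timeout: float = 10.0) -> str:
    """Fetch exactly one page and return its text (hypothetical helper)."""
    parsed = urlparse(url)
    # Assumption: only plain web URLs are acceptable input.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"unsupported URL: {url!r}")
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return html_to_text(resp.read().decode(charset, errors="replace"))
```

Rejecting non-http(s) schemes up front keeps inputs such as `file://` paths from ever reaching the fetch step.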

       

      Ensure security and stability

      Background

      Dependencies and Blockers

      QE impacted work
      Documentation impacted work 

      Acceptance Criteria

       

       

              rh-ee-lyoon Lucas Yoon
              rh-ee-lyoon Lucas Yoon
              RHDH AI
