Red Hat Enterprise Linux AI / RHELAI-3610

Identify pages of a PDF document with potential conversion errors after conversion

    • doc conversion error - highlight and fix automatically
    • Done

      Use the evaluations the Docling team has been developing for documents, along with any other non-LLM-based tooling, to assess how well Docling will convert a document.

      Docling has struggled to convert complex tables, images and OCR content, and scientific notation, among other things. Users need a way to assess how well Docling might convert a large document before deciding whether to create a model with it.
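As a rough illustration of what non-LLM page checks could look like, here is a minimal sketch that flags pages of already-converted text. It assumes the converted output is available as a mapping from page number to Markdown text. The function names, the thresholds, and the three heuristics (very little text, garbled characters, table rows with mismatched column counts) are hypothetical and are not the Docling team's evaluations.

```python
import re

# Hypothetical thresholds; tune them against real converted output.
MIN_CHARS = 40            # less text than this may mean lost content (e.g. failed OCR)
MAX_GARBLED_RATIO = 0.02  # tolerated share of replacement/control characters


def ragged_table(page_text: str) -> bool:
    """Return True if consecutive Markdown table rows disagree on column count."""
    prev_cols = None
    for line in page_text.splitlines():
        line = line.strip()
        if len(line) > 1 and line.startswith("|") and line.endswith("|"):
            cols = line.count("|") - 1  # naive: escaped pipes are counted too
            if prev_cols is not None and cols != prev_cols:
                return True
            prev_cols = cols
        else:
            prev_cols = None  # a non-table line ends the current table
    return False


def page_issues(page_text: str) -> list[str]:
    """List the heuristic warnings raised by one page of converted text."""
    issues = []
    stripped = page_text.strip()
    if len(stripped) < MIN_CHARS:
        issues.append("little or no text")
    # U+FFFD and control characters other than tab, newline, carriage return.
    garbled = len(re.findall(r"[\ufffd\x00-\x08\x0b\x0c\x0e-\x1f]", page_text))
    if stripped and garbled / len(page_text) > MAX_GARBLED_RATIO:
        issues.append("garbled characters")
    if ragged_table(page_text):
        issues.append("ragged table")
    return issues


def flag_pages(pages: dict[int, str]) -> dict[int, list[str]]:
    """Return only the pages that raised at least one warning."""
    return {
        number: issues
        for number, text in sorted(pages.items())
        if (issues := page_issues(text))
    }
```

Calling `flag_pages({1: page_one_text, 2: page_two_text})` returns only the page numbers that need review, each with its list of warnings. Checks like these are cheap to run, but they only catch symptoms visible in the output text, so they would complement a proper evaluation, not replace it.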

       

      Acceptance Criteria:

      • A script that takes a PDF or a collection of PDFs and assesses how well the conversion will go.
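One possible shape for such a script is sketched below. It accepts PDF files or directories, converts each document, and prints a score per file. The score used here (the share of characters that are neither U+FFFD nor control characters) is a hypothetical placeholder for a real evaluation, and the names `collect_pdfs`, `score_text`, `assess`, `make_docling_converter`, and the `--threshold` flag are assumptions made for illustration. The conversion step uses Docling's `DocumentConverter` and `export_to_markdown`, imported lazily so the other helpers can be exercised without Docling installed.

```python
import argparse
from pathlib import Path
from typing import Callable, Iterable


def collect_pdfs(inputs: Iterable[str]) -> list[Path]:
    """Expand files and directories into a sorted, de-duplicated list of PDFs."""
    found = set()
    for item in inputs:
        path = Path(item)
        if path.is_dir():
            found.update(
                p for p in path.rglob("*")
                if p.is_file() and p.suffix.lower() == ".pdf"
            )
        elif path.is_file() and path.suffix.lower() == ".pdf":
            found.add(path)
    return sorted(found)


def score_text(text: str) -> float:
    """Hypothetical score in [0, 1]: share of characters that look intact."""
    if not text.strip():
        return 0.0  # nothing was extracted at all
    bad = sum(
        1 for ch in text
        if ch == "\ufffd" or (ord(ch) < 32 and ch not in "\n\r\t")
    )
    return 1.0 - bad / len(text)


def assess(paths: Iterable[Path], convert: Callable[[Path], str]) -> dict[Path, float]:
    """Convert each PDF with the given callable and score the resulting text."""
    return {path: score_text(convert(path)) for path in paths}


def make_docling_converter() -> Callable[[Path], str]:
    """Build a converter callable backed by Docling (requires docling installed)."""
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()  # reuse one converter for all documents
    return lambda path: converter.convert(path).document.export_to_markdown()


def main(argv=None) -> None:
    parser = argparse.ArgumentParser(description="Score PDF conversion quality.")
    parser.add_argument("inputs", nargs="+", help="PDF files or directories")
    parser.add_argument("--threshold", type=float, default=0.98,
                        help="scores below this are marked REVIEW (assumed default)")
    args = parser.parse_args(argv)
    scores = assess(collect_pdfs(args.inputs), make_docling_converter())
    for path, score in scores.items():
        status = "OK" if score >= args.threshold else "REVIEW"
        print(f"{status}\t{score:.3f}\t{path}")
```

For example, `main(["docs/", "paper.pdf"])` would score every PDF under `docs/` plus `paper.pdf`. Passing the conversion step to `assess` as a callable keeps the scoring logic testable on its own and makes it straightforward to swap in a different evaluation later.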

       

       

              aliryan Alina Ryan
              jepandit@redhat.com Jehlum Vitasta Pandit
              Ali Maredia