RHEL AI 1.3 Docling fails to load OCR support with missing system library

    • Critical
    • Approved

      To Reproduce

      Steps to reproduce the behavior:

      1. Open an ilab shell on a recent RHEL AI 1.3 release candidate machine
      2. Execute python -c "from instructlab.sdg.utils.chunkers import resolve_ocr_options; print(resolve_ocr_options())"
      3. The command prints a stack trace like the one below, failing to load the libspatialindex_c library.

      (app-root) /$ python -c "from instructlab.sdg.utils.chunkers import resolve_ocr_options; print(resolve_ocr_options())"
      Traceback (most recent call last):
        File "<string>", line 1, in <module>
        File "/opt/app-root/lib64/python3.11/site-packages/instructlab/sdg/utils/chunkers.py", line 45, in resolve_ocr_options
          from docling.models.tesseract_ocr_model import TesseractOcrModel
        File "/opt/app-root/lib64/python3.11/site-packages/docling/models/tesseract_ocr_model.py", line 10, in <module>
          from docling.models.base_ocr_model import BaseOcrModel
        File "/opt/app-root/lib64/python3.11/site-packages/docling/models/base_ocr_model.py", line 10, in <module>
          from rtree import index
        File "/opt/app-root/lib64/python3.11/site-packages/rtree/__init__.py", line 12, in <module>
          from .index import Index, Rtree # noqa
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/opt/app-root/lib64/python3.11/site-packages/rtree/index.py", line 11, in <module>
          from . import core
        File "/opt/app-root/lib64/python3.11/site-packages/rtree/core.py", line 76, in <module>
          rt = finder.load()
               ^^^^^^^^^^^^^
        File "/opt/app-root/lib64/python3.11/site-packages/rtree/finder.py", line 130, in load
          raise OSError("Could not load libspatialindex_c library")
      OSError: Could not load libspatialindex_c library
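
The traceback shows the failure comes from rtree being unable to load a native library, not from docling or tesserocr directly. As a rough diagnostic, the sketch below asks the system linker whether the library is visible, using only the standard-library ctypes module. The helper name check_shared_library is hypothetical and not part of instructlab, docling, or rtree. It only checks the system linker search path; rtree may look in other locations as well, so a negative result here is a hint, not proof.

```python
import ctypes
import ctypes.util


def check_shared_library(name: str) -> tuple[bool, str]:
    """Report whether a shared library can be located and loaded.

    `name` is the short linker name without the "lib" prefix or the
    file suffix, for example "spatialindex_c".
    Hypothetical helper for illustration only.
    """
    # Ask the platform's lookup mechanism (ldconfig and friends on Linux).
    resolved = ctypes.util.find_library(name)
    if resolved is None:
        return False, f"lib{name} was not found on the linker search path"

    # Finding the file is not enough: loading can still fail, for
    # example when one of its own dependencies is missing.
    try:
        ctypes.CDLL(resolved)
    except OSError as exc:
        return False, f"found {resolved} but could not load it: {exc}"

    return True, f"loaded {resolved}"


if __name__ == "__main__":
    ok, detail = check_shared_library("spatialindex_c")
    print(ok, detail)
```

Running this inside the same ilab shell as the reproduction steps should indicate whether the system library is missing from the image or present but not loadable.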

      Expected behavior

      • The command should report that tesserocr is being loaded, with output like:

      kind='tesserocr' force_full_page_ocr=False bitmap_area_threshold=0.05 lang=['eng'] path=None

              cheimes@redhat.com Christian Heimes
              bbrownin@redhat.com Ben Browning