Red Hat Enterprise Linux AI / RHELAI-2429

[instructlab/sdg] SDG v0.6.0+ fails to clone multiple knowledge sources



      [2674087217] Upstream Reporter: KodieGlosserIBM
      Upstream issue status: Closed
      Upstream description:

      I think I found a potential race condition in the context-aware chunking change: https://github.com/instructlab/sdg/pull/284. If there is more than one knowledge document for git to clone, and several clones start within the same second, they all generate the same output directory: document_output_dir = Path(output_dir) / f"documents-{date_suffix}". SDG then fails because the directory already exists when git clones into it.

      Generating data from a single knowledge document works fine. The failures appear only when there are multiple knowledge documents.
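
The collision described above can be sketched as follows. This is a minimal illustration, not the SDG code or the upstream fix: the function names `timestamped_dir` and `unique_document_dir` are hypothetical, and the timestamp format is an assumption (the report only shows that the directory name ends in `date_suffix`). The first function reproduces the second-resolution naming, and the second shows one possible way to avoid the clash by appending a random component.

```python
import uuid
from datetime import datetime
from pathlib import Path


def timestamped_dir(output_dir, now):
    """Build the directory name the way the report quotes it.

    The strftime format is an assumption; any second-resolution
    format has the same problem.
    """
    date_suffix = now.strftime("%Y-%m-%dT%H_%M_%S")
    return Path(output_dir) / f"documents-{date_suffix}"


def unique_document_dir(output_dir, now):
    """Hypothetical alternative: add a random suffix so two calls
    within the same second still yield different directories."""
    date_suffix = now.strftime("%Y-%m-%dT%H_%M_%S")
    return Path(output_dir) / f"documents-{date_suffix}-{uuid.uuid4().hex}"


if __name__ == "__main__":
    # Two knowledge documents processed within the same second.
    now = datetime.now().replace(microsecond=0)
    first = timestamped_dir("output", now)
    second = timestamped_dir("output", now)
    print("timestamp only, same path:", first == second)

    first = unique_document_dir("output", now)
    second = unique_document_dir("output", now)
    print("with random suffix, same path:", first == second)
```

With timestamp-only naming, the second clone targets a path that the first clone has already populated, which matches the "directory already exists" failure in the report. Other approaches, such as including the document index or repository name in the path, would likely work as well.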


      Upstream URL: https://github.com/instructlab/sdg/issues/404

              bbrownin@redhat.com Ben Browning
              upstream-sync Upstream Sync