  1. Red Hat Enterprise Linux AI
  2. RHELAI-4280

SDG Output has Null Values


    • Type: Bug
    • Resolution: Done
    • rhelai-1.5.2
    • rhelai-1.5
    • InstructLab - SDG

      To Reproduce

      Steps to reproduce the behavior:

      • Run ilab data generate with user data.
      • The user's generated data is attached.

      Expected behavior

      • SDG output should not have null values in assistant content.
      • Nulls were also found in the sdg_document field of the output file.

      Screenshots

      "messages": [
          {
            "content": "You are a Red Hat® Instruct Model, an AI language model developed by Red Hat and IBM Research based on the granite-3.1-8b-base model. Your primary role is to serve as a chat assistant.",
            "role": "system"
          },
          {
            "content": "9.16. Advanced virtual machine management\n9.16.6. Using huge pages with virtual machines\n9.16.6.1. Prerequisites\n- Nodes must have \n\nSummarize the document using extractive techniques.",
            "role": "user"
          },
          {
            "content": null,
            "role": "assistant"
          }
        ],
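A quick way to see how widespread the problem is would be to scan the generated file for null fields. The following is a minimal sketch, not part of InstructLab: it assumes the SDG output is a JSONL file whose records carry a top-level "messages" list and an "sdg_document" field, as in the excerpt above. The helper names find_null_fields and load_jsonl are hypothetical.

```python
import json
from typing import Iterable


def find_null_fields(records: Iterable[dict]) -> list[tuple[int, str]]:
    """Return (record_index, field_description) for every null that is found.

    Assumption: each record has a "messages" list and may have an
    "sdg_document" field, matching the excerpt in this report.
    """
    problems = []
    for idx, record in enumerate(records):
        for pos, msg in enumerate(record.get("messages") or []):
            if msg.get("content") is None:
                problems.append((idx, f"messages[{pos}] role={msg.get('role')}"))
        if "sdg_document" in record and record["sdg_document"] is None:
            problems.append((idx, "sdg_document"))
    return problems


def load_jsonl(path: str) -> list[dict]:
    """Load one JSON object per non-blank line (assumed file layout)."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Calling find_null_fields(load_jsonl(path)) on the generated file lists every affected record, which helps confirm whether the nulls are limited to assistant turns.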

      Device Info (please complete the following information):

      • InstructLab Version: RHEL AI 1.5, instructlab 0.26.1

      Bug impact

      • Training crashes because of the null values in the SDG output.

       

      content = UNMASK_BEGIN_TOKEN + content + UNMASK_END_TOKEN
             ~~~~~~~~~~~~~~~~~~~^~~~~~~~~
      TypeError: can only concatenate str (not "NoneType") to str
      """
      
      The above exception was the direct cause of the following exception:
      
      Traceback (most recent call last):
       File "/opt/app-root/lib64/python3.11/site-packages/instructlab/model/accelerated_train.py", line 233, in accelerated_train
        run_training(train_args=train_args, torch_args=torch_args)
       File "/opt/app-root/lib64/python3.11/site-packages/instructlab/training/__init__.py", line 36, in run_training
        return run_training(torch_args=torch_args, train_args=train_args)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/opt/app-root/lib64/python3.11/site-packages/instructlab/training/main_ds.py", line 699, in run_training
        dp.process_data(
       File "/opt/app-root/lib64/python3.11/site-packages/instructlab/training/data_process.py", line 1092, in process_data
        process_messages_into_input_ids(
       File "/opt/app-root/lib64/python3.11/site-packages/instructlab/training/data_process.py", line 814, in process_messages_into_input_ids
        data_with_input_ids_and_labels = process_samples(data, tokenizer, num_cpu_procs)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/opt/app-root/lib64/python3.11/site-packages/instructlab/training/data_process.py", line 916, in process_samples
        processed_data = data.map(
                 ^^^^^^^^^
       File "/opt/app-root/lib64/python3.11/site-packages/datasets/arrow_dataset.py", line 557, in wrapper
        out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/opt/app-root/lib64/python3.11/site-packages/datasets/arrow_dataset.py", line 3171, in map
        for rank, done, content in iflatmap_unordered(
       File "/opt/app-root/lib64/python3.11/site-packages/datasets/utils/py_utils.py", line 728, in iflatmap_unordered
        [async_result.get(timeout=0.05) for async_result in async_results]
       File "/opt/app-root/lib64/python3.11/site-packages/datasets/utils/py_utils.py", line 728, in <listcomp>
        [async_result.get(timeout=0.05) for async_result in async_results]
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "/opt/app-root/lib64/python3.11/site-packages/multiprocess/pool.py", line 774, in get
        raise self._value
      TypeError: can only concatenate str (not "NoneType") to str
      Accelerated Training failed with 1
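The failure mode in the traceback can be reproduced in isolation: concatenating a string with None raises exactly this TypeError. The sketch below uses placeholder token strings, because the real UNMASK_BEGIN_TOKEN and UNMASK_END_TOKEN values are not shown in this report. The function wrap_unmasked_checked is a hypothetical guard, not existing instructlab code.

```python
# Placeholder values (assumption): the real tokens are defined in
# instructlab.training and are not shown in this report.
UNMASK_BEGIN_TOKEN = "<unmask-begin>"
UNMASK_END_TOKEN = "<unmask-end>"


def wrap_unmasked(content):
    """Mirror the failing line from the traceback."""
    return UNMASK_BEGIN_TOKEN + content + UNMASK_END_TOKEN


def wrap_unmasked_checked(content, sample_index):
    """Hypothetical guard: fail early with a message that names the sample."""
    if not isinstance(content, str):
        raise ValueError(
            f"sample {sample_index}: message content is {content!r}, expected a string"
        )
    return wrap_unmasked(content)


if __name__ == "__main__":
    try:
        wrap_unmasked(None)
    except TypeError as exc:
        print(f"TypeError: {exc}")
```

A check of this kind would not fix the data, but it would point at the offending sample instead of failing inside a worker process.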

      Known workaround

      • Please add any known workarounds.

      Additional context

      • Shiv suggested that empty LLM responses in the SDG output may be getting converted to null in the messages format. Rows with an empty response might have to be dropped.
        A bug needs to be filed.
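Dropping the empty rows, as suggested above, could look roughly like the following. This is a sketch under assumptions rather than the fix that was shipped: it assumes a JSONL file with a "messages" list per record, and it treats a record as unusable when any message content is null, not a string, or blank. The names has_usable_messages and drop_empty_rows are hypothetical.

```python
import json


def has_usable_messages(record: dict) -> bool:
    """True if the record has messages and every content is a non-blank string."""
    messages = record.get("messages")
    if not messages:
        return False
    return all(
        isinstance(m.get("content"), str) and bool(m["content"].strip())
        for m in messages
    )


def drop_empty_rows(src_path: str, dst_path: str) -> tuple[int, int]:
    """Copy usable records from src_path to dst_path; return (kept, dropped)."""
    kept = dropped = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines without counting them
            record = json.loads(line)
            if has_usable_messages(record):
                dst.write(json.dumps(record, ensure_ascii=False) + "\n")
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

Requiring every message to be non-blank is the strictest choice. A looser variant could check only the assistant turns, which are the ones shown as null in this report.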

        1. train_log_error_null.txt
          136 kB
          Aditi Saluja
        2. generation-1c9be78e-34b5-11f0-a9d2-0afffaeb628d.log
          3.55 MB
          Aditi Saluja
        3. taxonomy-virtmachine.tgz
          4.77 MB
          Vladimír Kadlec

              dhiggins@redhat.com Derek Higgins
              rh-ee-asaluja Aditi Saluja