- Bug
- Resolution: Done
- Undefined
- rhelai-1.5
- None
To Reproduce
Steps to reproduce the behavior:
- Run ilab data generate with user data.
- The user's generated data is attached.
Expected behavior
- SDG output should not have null values in assistant content.
- Null values were also found in the sdg_document field of the output file.
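A quick way to confirm both symptoms is to scan the generated file and count the null fields. The sketch below is only an illustration: it assumes the SDG output is a JSON Lines file in which each row has a "messages" list and, optionally, an "sdg_document" field. The function name count_null_fields is hypothetical and not part of instructlab.

```python
import json
from collections import Counter
from typing import Iterable


def count_null_fields(lines: Iterable[str]) -> Counter:
    """Count null message contents (per role) and null sdg_document fields.

    Assumes each non-empty line is one JSON object (JSON Lines layout).
    """
    counts: Counter = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        row = json.loads(line)
        # A key that is present but set to null is what the report describes.
        if "sdg_document" in row and row["sdg_document"] is None:
            counts["sdg_document"] += 1
        for msg in row.get("messages") or []:
            if msg.get("content") is None:
                counts["messages." + msg.get("role", "unknown")] += 1
    return counts
```

To use it on a file, pass an open file handle, for example count_null_fields(open(path, encoding="utf-8")), where path points at the generated output.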
Screenshots
"messages": [
  {
    "content": "You are a Red Hat® Instruct Model, an AI language model developed by Red Hat and IBM Research based on the granite-3.1-8b-base model. Your primary role is to serve as a chat assistant.",
    "role": "system"
  },
  {
    "content": "9.16. Advanced virtual machine management\n9.16.6. Using huge pages with virtual machines\n9.16.6.1. Prerequisites\n- Nodes must have \n\nSummarize the document using extractive techniques.",
    "role": "user"
  },
  {
    "content": null,
    "role": "assistant"
  }
],
Device Info (please complete the following information):
- InstructLab Version: RHEL AI 1.5, instructlab 0.26.1
Bug impact
- Due to nulls in SDG output, training is crashing.
    content = UNMASK_BEGIN_TOKEN + content + UNMASK_END_TOKEN
              ~~~~~~~~~~~~~~~~~~~^~~~~~~~~
TypeError: can only concatenate str (not "NoneType") to str
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.11/site-packages/instructlab/model/accelerated_train.py", line 233, in accelerated_train
    run_training(train_args=train_args, torch_args=torch_args)
  File "/opt/app-root/lib64/python3.11/site-packages/instructlab/training/__init__.py", line 36, in run_training
    return run_training(torch_args=torch_args, train_args=train_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/instructlab/training/main_ds.py", line 699, in run_training
    dp.process_data(
  File "/opt/app-root/lib64/python3.11/site-packages/instructlab/training/data_process.py", line 1092, in process_data
    process_messages_into_input_ids(
  File "/opt/app-root/lib64/python3.11/site-packages/instructlab/training/data_process.py", line 814, in process_messages_into_input_ids
    data_with_input_ids_and_labels = process_samples(data, tokenizer, num_cpu_procs)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/instructlab/training/data_process.py", line 916, in process_samples
    processed_data = data.map(
                     ^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/datasets/arrow_dataset.py", line 557, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/datasets/arrow_dataset.py", line 3171, in map
    for rank, done, content in iflatmap_unordered(
  File "/opt/app-root/lib64/python3.11/site-packages/datasets/utils/py_utils.py", line 728, in iflatmap_unordered
    [async_result.get(timeout=0.05) for async_result in async_results]
  File "/opt/app-root/lib64/python3.11/site-packages/datasets/utils/py_utils.py", line 728, in <listcomp>
    [async_result.get(timeout=0.05) for async_result in async_results]
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/multiprocess/pool.py", line 774, in get
    raise self._value
TypeError: can only concatenate str (not "NoneType") to str
Accelerated Training failed with 1
Known workaround
- Please add any known workarounds.
Additional context
- Shiv suggested that empty LLM responses in the SDG output may be getting converted to null when the data is written in messages format. Rows with an empty response might have to be dropped.
Need to file a bug.
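Dropping the empty rows, as suggested above, could look roughly like the following. This is a minimal sketch and not the fix that shipped: it assumes the SDG output is a JSON Lines file with a "messages" list per row, and the helper names has_empty_content and drop_empty_rows are hypothetical.

```python
import json
from typing import Iterable, Iterator


def has_empty_content(sample: dict) -> bool:
    """Return True if any message has null, missing, or empty-string content."""
    return any(not msg.get("content") for msg in sample.get("messages") or [])


def drop_empty_rows(lines: Iterable[str]) -> Iterator[dict]:
    """Yield only the rows whose messages all carry non-empty content.

    Assumes each non-empty line is one JSON object (JSON Lines layout).
    """
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        sample = json.loads(line)
        if not has_empty_content(sample):
            yield sample
```

Filtering the file this way before training would avoid the str + None concatenation seen in the traceback, at the cost of silently losing samples, so it is worth logging how many rows were dropped.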
- is related to: RHELAI-4356 Require instructlab-sdg==0.8.3 in RHEL AI Wheel Pipeline for RHEL AI >=1.5.2 (Closed)