Red Hat Enterprise Linux AI
RHELAI-4733

data generate: TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'


    • Priority: Critical

      To Reproduce

      Steps to reproduce the behavior:

      1. Prepare RHEL AI 1.5.3 on an NVIDIA instance (tested on both A100 and H100)
      2. Run ilab data generate

      Expected behavior

      • Successful SDG run

      Screenshots

      • Attached Image

      Device Info (please complete the following information):

      • Hardware Specs: NVIDIA A100 (x8) or NVIDIA H100 (x8)
      • OS Version: RHEL AI 1.5.3-2
      • InstructLab Version: ilab, version 0.26.1
      • Bootc image: registry.stage.redhat.io/rhelai1/bootc-azure-nvidia-rhel9:1.5.3-1754022569
      • Output of ilab system info:

      ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
      ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
      ggml_cuda_init: found 8 CUDA devices:
        Device 0: NVIDIA A100-SXM4-40GB, compute capability 8.0, VMM: yes
        Device 1: NVIDIA A100-SXM4-40GB, compute capability 8.0, VMM: yes
        Device 2: NVIDIA A100-SXM4-40GB, compute capability 8.0, VMM: yes
        Device 3: NVIDIA A100-SXM4-40GB, compute capability 8.0, VMM: yes
        Device 4: NVIDIA A100-SXM4-40GB, compute capability 8.0, VMM: yes
        Device 5: NVIDIA A100-SXM4-40GB, compute capability 8.0, VMM: yes
        Device 6: NVIDIA A100-SXM4-40GB, compute capability 8.0, VMM: yes
        Device 7: NVIDIA A100-SXM4-40GB, compute capability 8.0, VMM: yes
      Platform:
        sys.version: 3.11.7 (main, Jun 25 2025, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-4)]
        sys.platform: linux
        os.name: posix
        platform.release: 5.14.0-427.77.1.el9_4.x86_64
        platform.machine: x86_64
        platform.node: fzatlouk-rhelai-1.5-nvidia-test
        platform.python_version: 3.11.7
        os-release.ID: rhel
        os-release.VERSION_ID: 9.4
        os-release.PRETTY_NAME: Red Hat Enterprise Linux 9.4 (Plow)
        memory.total: 885.80 GB
        memory.available: 877.41 GB
        memory.used: 4.46 GB
      InstructLab:
        instructlab.version: 0.26.1
        instructlab-dolomite.version: 0.2.0
        instructlab-eval.version: 0.5.1
        instructlab-quantize.version: 0.1.0
        instructlab-schema.version: 0.4.2
        instructlab-sdg.version: 0.8.3
        instructlab-training.version: 0.10.3
      Torch:
        torch.version: 2.6.0
        torch.backends.cpu.capability: AVX2
        torch.version.cuda: 12.4
        torch.version.hip: None
        torch.cuda.available: True
        torch.backends.cuda.is_built: True
        torch.backends.mps.is_built: False
        torch.backends.mps.is_available: False
        torch.cuda.bf16: True
        torch.cuda.current.device: 0
        torch.cuda.0.name: NVIDIA A100-SXM4-40GB
        torch.cuda.0.free: 39.0 GB
        torch.cuda.0.total: 39.4 GB
        torch.cuda.0.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.1.name: NVIDIA A100-SXM4-40GB
        torch.cuda.1.free: 39.0 GB
        torch.cuda.1.total: 39.4 GB
        torch.cuda.1.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.2.name: NVIDIA A100-SXM4-40GB
        torch.cuda.2.free: 39.0 GB
        torch.cuda.2.total: 39.4 GB
        torch.cuda.2.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.3.name: NVIDIA A100-SXM4-40GB
        torch.cuda.3.free: 39.0 GB
        torch.cuda.3.total: 39.4 GB
        torch.cuda.3.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.4.name: NVIDIA A100-SXM4-40GB
        torch.cuda.4.free: 39.0 GB
        torch.cuda.4.total: 39.4 GB
        torch.cuda.4.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.5.name: NVIDIA A100-SXM4-40GB
        torch.cuda.5.free: 39.0 GB
        torch.cuda.5.total: 39.4 GB
        torch.cuda.5.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.6.name: NVIDIA A100-SXM4-40GB
        torch.cuda.6.free: 39.0 GB
        torch.cuda.6.total: 39.4 GB
        torch.cuda.6.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
        torch.cuda.7.name: NVIDIA A100-SXM4-40GB
        torch.cuda.7.free: 39.0 GB
        torch.cuda.7.total: 39.4 GB
        torch.cuda.7.capability: 8.0 (see https://developer.nvidia.com/cuda-gpus#compute)
      llama_cpp_python:
        llama_cpp_python.version: 0.3.6
        llama_cpp_python.supports_gpu_offload: True
      

       

      Bug impact

      • SDG is not working

      Known workaround

      • N/A

      Additional context

      ilab chat and serve work just fine.

      First issue:

      (VllmWorkerProcess pid=487) Message: 'Cannot use FlashAttention-2 backend for head size %d.'
      (VllmWorkerProcess pid=487) Arguments: (None,)
      (VllmWorkerProcess pid=487) INFO 08-01 10:42:42 [cuda.py:289] Using XFormers backend.
      (VllmWorkerProcess pid=488) --- Logging error ---
      (VllmWorkerProcess pid=488) Traceback (most recent call last):
      (VllmWorkerProcess pid=488)   File "/usr/lib64/python3.11/logging/__init__.py", line 1110, in emit
      (VllmWorkerProcess pid=488)     msg = self.format(record)
      (VllmWorkerProcess pid=488)           ^^^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=488)   File "/usr/lib64/python3.11/logging/__init__.py", line 953, in format
      (VllmWorkerProcess pid=488)     return fmt.format(record)
      (VllmWorkerProcess pid=488)            ^^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=488)   File "/opt/app-root/lib64/python3.11/site-packages/vllm/logging_utils/formatter.py", line 13, in format
      (VllmWorkerProcess pid=488)     msg = logging.Formatter.format(self, record)
      (VllmWorkerProcess pid=488)           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=488)   File "/usr/lib64/python3.11/logging/__init__.py", line 687, in format
      (VllmWorkerProcess pid=488)     record.message = record.getMessage()
      (VllmWorkerProcess pid=488)                      ^^^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=488)   File "/usr/lib64/python3.11/logging/__init__.py", line 377, in getMessage
      (VllmWorkerProcess pid=488)     msg = msg % self.args
      (VllmWorkerProcess pid=488)           ~~~~^~~~~~~~~~~
      (VllmWorkerProcess pid=488) TypeError: %d format: a real number is required, not NoneType

       

      This causes vLLM to fall back to the XFormers backend: INFO 08-01 10:42:42 [cuda.py:289] Using XFormers backend.
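
      The "--- Logging error ---" block above is itself a symptom of the same bad value: Python's logging module formats messages lazily, so a %d placeholder paired with a None argument only raises when the record is rendered. A minimal standalone sketch of that failure (the message text is copied from the vLLM log line; no vLLM is involved):

      import logging

      logging.basicConfig(level=logging.INFO)
      logger = logging.getLogger("repro")

      # logging defers "msg % args" until the handler formats the record,
      # so this call prints "--- Logging error ---" followed by
      # "TypeError: %d format: a real number is required, not NoneType"
      # instead of the intended message -- the same failure as in the worker log.
      logger.info("Cannot use FlashAttention-2 backend for head size %d.", None)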

      and subsequently, model loading fails under the XFormers path too, with the same None head size:

      INFO 2025-08-01 10:42:47,549 instructlab.model.backends.vllm:138: Waiting for the vLLM server to start at http://127.0.0.1:60237/v1, this might take a moment... Attempt: 15/1200
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238] Exception in worker VllmWorkerProcess while processing method load_model.
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238] Traceback (most recent call last):
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/executor/multiproc_worker_utils.py", line 232, in _run_worker_process
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     output = run_method(worker, method, args, kwargs)
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/utils.py", line 2378, in run_method
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     return func(*args, **kwargs)
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]            ^^^^^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/worker/worker.py", line 183, in load_model
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     self.model_runner.load_model()
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/worker/model_runner.py", line 1113, in load_model
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     self.model = get_model(vllm_config=self.vllm_config)
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     return loader.load_model(vllm_config=vllm_config)
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 452, in load_model
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     model = _initialize_model(vllm_config=vllm_config)
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 133, in _initialize_model
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     return model_class(vllm_config=vllm_config, prefix=prefix)
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/model_executor/models/mixtral.py", line 438, in __init__
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     self.model = MixtralModel(vllm_config=vllm_config,
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/compilation/decorators.py", line 151, in __init__
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/model_executor/models/mixtral.py", line 276, in __init__
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     self.start_layer, self.end_layer, self.layers = make_layers(
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]                                                     ^^^^^^^^^^^^
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/model_executor/models/utils.py", line 609, in make_layers
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     [PPMissingLayer() for _ in range(start_layer)] + [
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]                                                      ^
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/model_executor/models/utils.py", line 610, in <listcomp>
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/model_executor/models/mixtral.py", line 278, in <lambda>
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     lambda prefix: MixtralDecoderLayer(
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]                    ^^^^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/model_executor/models/mixtral.py", line 205, in __init__
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     self.self_attn = MixtralAttention(
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]                      ^^^^^^^^^^^^^^^^^
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/model_executor/models/mixtral.py", line 143, in __init__
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]     self.q_size = self.num_heads * self.head_dim
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]                   ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
      (VllmWorkerProcess pid=486) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238] TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'
      (VllmWorkerProcess pid=489) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238] Exception in worker VllmWorkerProcess while processing method load_model.
      (VllmWorkerProcess pid=489) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238] Traceback (most recent call last):
      (VllmWorkerProcess pid=489) ERROR 08-01 10:42:47 [multiproc_worker_utils.py:238]   File "/opt/app-root/lib64/python3.11/site-packages/vllm/executor/multiproc_worker_utils.py", line 232, in _run_worker_process

      Assignee/Reporter: František Zatloukal (fzatlouk@redhat.com)