Red Hat Enterprise Linux AI / RHELAI-2400

RuntimeError: operator torchvision::nms does not exist


      To Reproduce

      Steps to reproduce the behavior:

      1. Start the latest AMD image
      2. Initialize the config and download a model
      3. Run
        ilab model serve
      4. Observe the traceback:
         $ ilab model serve
        [sudo] password for tflink: 
        INFO 2024-11-26 17:13:13,106 instructlab.model.serve_backend:56: Using model '/home/tflink/.cache/instructlab/models/granite-8b-lab-v1' with -1 gpu-layers and 4096 max context size.
        INFO 2024-11-26 17:13:13,106 instructlab.model.serve_backend:88: '--gpus' flag used alongside '--tensor-parallel-size' in the vllm_args section of the config file. Using value of the --gpus flag.
        INFO 2024-11-26 17:13:13,107 instructlab.model.backends.vllm:313: vLLM starting up on pid 4 at http://127.0.0.1:8000/v1
        INFO 11-26 17:13:17 importing.py:10] Triton not installed; certain GPU-related functions will not be available.
        Traceback (most recent call last):
          File "<frozen runpy>", line 189, in _run_module_as_main
          File "<frozen runpy>", line 112, in _get_module_details
          File "/opt/app-root/lib64/python3.11/site-packages/vllm/__init__.py", line 3, in <module>
            from vllm.engine.arg_utils import AsyncEngineArgs, EngineArgs
          File "/opt/app-root/lib64/python3.11/site-packages/vllm/engine/arg_utils.py", line 11, in <module>
            from vllm.config import (CacheConfig, ConfigFormat, DecodingConfig,
          File "/opt/app-root/lib64/python3.11/site-packages/vllm/config.py", line 16, in <module>
            from vllm.transformers_utils.config import (ConfigFormat, get_config,
          File "/opt/app-root/lib64/python3.11/site-packages/vllm/transformers_utils/config.py", line 11, in <module>
            from transformers.models.auto.image_processing_auto import (
          File "/opt/app-root/lib64/python3.11/site-packages/transformers/models/auto/image_processing_auto.py", line 27, in <module>
            from ...image_processing_utils import BaseImageProcessor, ImageProcessingMixin
          File "/opt/app-root/lib64/python3.11/site-packages/transformers/image_processing_utils.py", line 21, in <module>
            from .image_transforms import center_crop, normalize, rescale
          File "/opt/app-root/lib64/python3.11/site-packages/transformers/image_transforms.py", line 22, in <module>
            from .image_utils import (
          File "/opt/app-root/lib64/python3.11/site-packages/transformers/image_utils.py", line 58, in <module>
            from torchvision.transforms import InterpolationMode
          File "/opt/app-root/lib64/python3.11/site-packages/torchvision/__init__.py", line 6, in <module>
            from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils
          File "/opt/app-root/lib64/python3.11/site-packages/torchvision/_meta_registrations.py", line 163, in <module>
            @torch._custom_ops.impl_abstract("torchvision::nms")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
          File "/opt/app-root/lib64/python3.11/site-packages/torch/library.py", line 654, in register
            use_lib._register_fake(op_name, func, _stacklevel=stacklevel + 1)
          File "/opt/app-root/lib64/python3.11/site-packages/torch/library.py", line 154, in _register_fake
            handle = entry.abstract_impl.register(func_to_register, source)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
          File "/opt/app-root/lib64/python3.11/site-packages/torch/_library/abstract_impl.py", line 31, in register
            if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        RuntimeError: operator torchvision::nms does not exist
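      For triage: this class of failure commonly appears when the installed torchvision wheel was built against a different torch release than the one present, so torchvision's custom-op registration runs against mismatched internals. A minimal, hedged first check is to compare the installed package versions (standard-library only, so it is safe to run even in an environment where torch itself cannot be imported):

```python
# Hedged diagnostic sketch: report installed torch/torchvision versions
# without importing either package (importing a broken torchvision is what
# triggers the RuntimeError above).
from importlib.metadata import version, PackageNotFoundError


def installed_version(pkg: str) -> str:
    """Return the installed distribution version, or a marker if absent."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return "not installed"


for pkg in ("torch", "torchvision"):
    print(f"{pkg}: {installed_version(pkg)}")
```

      If the reported versions do not come from a matching torch/torchvision release pair, that mismatch is a likely root cause; the exact compatible pairs would need to be checked against the image's intended package set.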

      Expected behavior

      • ilab model serve should start the vLLM backend and serve the downloaded model

      Device Info (please complete the following information):

      • Hardware Specs: x86_64, 8x AMD Instinct MI300X
      • OS Version: RHEL AI 1.3 pre-release
      • Python Version: Python 3.11.7
      • InstructLab Version: ilab, version 0.21.0

      Additional context

      • Tested with the instructlab-amd-rhel9:1.3-1732577202 and bootc-amd-rhel9:1.3-1732609273 images
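      When verifying a candidate fix inside the image, a small probe can confirm whether the torchvision::nms custom op actually registered, rather than re-running the full ilab stack. A hedged sketch (the helper name is hypothetical, not part of instructlab or vLLM; any import failure is treated as "not registered", which is the signal we want rather than a crash):

```python
# Hypothetical verification helper: True only if torchvision imports cleanly
# AND the torchvision::nms custom op resolved in torch's op registry.
def nms_op_registered() -> bool:
    try:
        import torch
        import torchvision  # custom-op registration happens at import time

        torch.ops.torchvision.nms  # attribute lookup fails if unregistered
        return True
    except Exception:
        # Covers ImportError from a missing package and the RuntimeError
        # seen in this bug, where registration itself blows up on import.
        return False


print("torchvision::nms registered:", nms_op_registered())
```

      On the broken image this should print False (the import itself raises); after a fixed torch/torchvision pairing it should print True.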

              Fabien Dupont (fdupont@redhat.com)
              Tim Flink (tflink)