Red Hat Enterprise Linux AI / RHELAI-2413

triton missing from instructlab-amd container image


    • Type: Bug
    • Resolution: Done
    • Priority: Undefined
    • RHELAI 1.3 GA
    • Component: Accelerators - AMD
    • Impediment
    • Approved

      To Reproduce

      Steps to reproduce the behavior (a command-level sketch of these steps follows the error output below):

      1. Start an instructlab-amd-rhel9:1.3-1732663637 container.
      2. Initialize the config and download the appropriate model.
      3. Run ilab model serve.
      4. See the error:
        $ ilab model serve
        INFO 2024-11-27 01:47:51,117 instructlab.model.serve_backend:56: Using model '/opt/app-root/src/.cache/instructlab/models/granite-8b-lab-v1' with -1 gpu-layers and 4096 max context size.
        INFO 2024-11-27 01:47:51,117 instructlab.model.serve_backend:88: '--gpus' flag used alongside '--tensor-parallel-size' in the vllm_args section of the config file. Using value of the --gpus flag.
        INFO 2024-11-27 01:47:51,118 instructlab.model.backends.vllm:313: vLLM starting up on pid 132 at http://127.0.0.1:8000/v1
        INFO 11-27 01:47:53 importing.py:10] Triton not installed; certain GPU-related functions will not be available.
        /opt/app-root/lib64/python3.11/site-packages/vllm/connections.py:8: RuntimeWarning: Failed to read commit hash:
        No module named 'vllm._version'
          from vllm.version import __version__ as VLLM_VERSION
        Traceback (most recent call last):
          File "<frozen runpy>", line 189, in _run_module_as_main
          File "<frozen runpy>", line 112, in _get_module_details
          File "/opt/app-root/lib64/python3.11/site-packages/vllm/__init__.py", line 6, in <module>
            from vllm.entrypoints.fast_sync_llm import FastSyncLLM
          File "/opt/app-root/lib64/python3.11/site-packages/vllm/entrypoints/fast_sync_llm.py", line 9, in <module>
            from vllm.executor.multiproc_gpu_executor import MultiprocessingGPUExecutor
          File "/opt/app-root/lib64/python3.11/site-packages/vllm/executor/multiproc_gpu_executor.py", line 16, in <module>
            from vllm.triton_utils import maybe_set_triton_cache_manager
        ImportError: cannot import name 'maybe_set_triton_cache_manager' from 'vllm.triton_utils' (/opt/app-root/lib64/python3.11/site-packages/vllm/triton_utils/__init__.py)
        
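      For reference, here is a minimal command-level sketch of the steps above. The registry path, cache volume mount, and GPU device flags are assumptions about the local setup and may differ from the exact invocation used:

        # Start the container (image path and device flags are assumed)
        $ podman run -it --rm \
            --device /dev/kfd --device /dev/dri \
            -v ~/.cache/instructlab:/opt/app-root/src/.cache/instructlab:Z \
            registry.redhat.io/rhelai1/instructlab-amd-rhel9:1.3-1732663637

        # Inside the container: initialize the config, download a model, then serve it
        $ ilab config init
        $ ilab model download
        $ ilab model serve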

      Expected behavior

      • vLLM should start serving the default model
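
      As a quick check of the failing import path (a diagnostic sketch, not product documentation), the import from the traceback can be exercised directly; it should succeed once the image ships a working triton:

        # Exits 0 with no output once triton is available to vLLM
        $ python3 -c "from vllm.triton_utils import maybe_set_triton_cache_manager"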

      Device Info (please complete the following information):

      • Hardware Specs: x86_64 with AMD GPU (reproduced with 4x MI210 and 8x MI300X)
      • OS Version: instructlab-amd-rhel9:1.3-1732663637 running on RHEL9
      • Python Version: Python 3.11.7
      • InstructLab Version: ilab, version 0.21.0
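
      The version and hardware details above can be re-collected with commands along these lines (availability of rocm-smi inside the container is an assumption):

        $ ilab --version               # ilab, version 0.21.0
        $ python3 --version            # Python 3.11.7
        $ rocm-smi --showproductname   # lists the AMD GPUs (e.g. MI210 / MI300X)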

      Additional context

      • This is the traceback that appears after applying the fix from RHELAI-2400
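
      For triage, the missing package can be confirmed from inside the container. Whether a manual pip install pulls a ROCm-compatible triton build depends on the index configured in the image, so this is only a diagnostic sketch; the actual fix is to ship triton in the instructlab-amd image:

        # Confirm triton is absent from the image
        $ pip show triton || echo "triton not installed"

        # Local diagnostic only, not the fix for the image
        $ pip install triton
        $ ilab model serve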

              People: Fabien Dupont (fdupont@redhat.com), Tim Flink (tflink), Joseph Groenenboom