- Bug
- Resolution: Done
- RHELAI 1.3 GA
- Impediment
- Approved
To Reproduce
Steps to reproduce the behavior:
- Start an instructlab-amd-rhel9:1.3-1732663637 container
- Initialize the config and download the appropriate model
- Run ilab model serve
- See the error:
$ ilab model serve
INFO 2024-11-27 01:47:51,117 instructlab.model.serve_backend:56: Using model '/opt/app-root/src/.cache/instructlab/models/granite-8b-lab-v1' with -1 gpu-layers and 4096 max context size.
INFO 2024-11-27 01:47:51,117 instructlab.model.serve_backend:88: '--gpus' flag used alongside '--tensor-parallel-size' in the vllm_args section of the config file. Using value of the --gpus flag.
INFO 2024-11-27 01:47:51,118 instructlab.model.backends.vllm:313: vLLM starting up on pid 132 at http://127.0.0.1:8000/v1
INFO 11-27 01:47:53 importing.py:10] Triton not installed; certain GPU-related functions will not be available.
/opt/app-root/lib64/python3.11/site-packages/vllm/connections.py:8: RuntimeWarning: Failed to read commit hash: No module named 'vllm._version'
  from vllm.version import __version__ as VLLM_VERSION
Traceback (most recent call last):
  File "<frozen runpy>", line 189, in _run_module_as_main
  File "<frozen runpy>", line 112, in _get_module_details
  File "/opt/app-root/lib64/python3.11/site-packages/vllm/__init__.py", line 6, in <module>
    from vllm.entrypoints.fast_sync_llm import FastSyncLLM
  File "/opt/app-root/lib64/python3.11/site-packages/vllm/entrypoints/fast_sync_llm.py", line 9, in <module>
    from vllm.executor.multiproc_gpu_executor import MultiprocessingGPUExecutor
  File "/opt/app-root/lib64/python3.11/site-packages/vllm/executor/multiproc_gpu_executor.py", line 16, in <module>
    from vllm.triton_utils import maybe_set_triton_cache_manager
ImportError: cannot import name 'maybe_set_triton_cache_manager' from 'vllm.triton_utils' (/opt/app-root/lib64/python3.11/site-packages/vllm/triton_utils/__init__.py)
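The import failure can be checked in isolation, without serving a model. A minimal reproducer, assuming the same container environment (the import path is taken directly from the traceback above):

# Minimal standalone reproducer; run inside the instructlab-amd-rhel9 container.
try:
    from vllm.triton_utils import maybe_set_triton_cache_manager  # noqa: F401
    print("import OK: Triton present and symbol exported")
except ImportError as exc:
    print(f"reproduced: {exc}")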
Expected behavior
- vLLM should start serving the default model.
Device Info (please complete the following information):
- Hardware Specs: x86_64 with AMD GPU (reproduced with 4x MI210 and 8x MI300X)
- OS Version: instructlab-amd-rhel9:1.3-1732663637 running on RHEL9
- Python Version: Python 3.11.7
- InstructLab Version: ilab, version 0.21.0
Additional context
- This is the traceback that occurs after the fix from RHELAI-2400.
- Is related to: RHELAI-2400 "RuntimeError: operator torchvision::nms does not exist" (Closed)
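- The earlier log line from importing.py ("Triton not installed; certain GPU-related functions will not be available") points at a likely mechanism: vllm.triton_utils appears to export maybe_set_triton_cache_manager only when Triton is importable, while multiproc_gpu_executor.py imports the name unconditionally. Below is a runnable illustration of that gated-export pattern using stand-in definitions; the actual vLLM internals here are an assumption, not verified against this build.

import importlib.util

# Stand-in for a HAS_TRITON gate (hypothetical reconstruction): True only
# when the triton package can be found in the environment.
HAS_TRITON = importlib.util.find_spec("triton") is not None

if HAS_TRITON:
    def maybe_set_triton_cache_manager() -> None:
        """Stand-in: the name only exists (and is exportable) when Triton is present."""

# A consumer that uses the name unconditionally, as the traceback shows
# multiproc_gpu_executor.py doing at import time, fails when Triton is absent:
try:
    maybe_set_triton_cache_manager  # noqa: B018
    print("symbol available: Triton is installed")
except NameError:
    print("symbol missing without Triton -> the ImportError seen above")

If that is the cause, a defensive fix on the consumer side would wrap the import in try/except ImportError and skip the cache-manager setup when Triton is unavailable (a sketch of one option, not necessarily the fix shipped in RHEL AI).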