Red Hat Enterprise Linux AI / RHELAI-2368

rpdTracerControl missing from instructlab-amd-rhel9:1.3-1732115458


    • Type: Bug
    • Resolution: Duplicate
    • Priority: Undefined
    • Affects Version: rhelai-1.3
    • Fix Version: rhelai-1.3
    • Component: Accelerators - AMD
    • Severity: Important
    • Approved

      To Reproduce

      Steps to reproduce the behavior (a command-level sketch follows the list):

      1. Start a container with instructlab-amd-rhel9:1.3-1732115458
      2. Create a config
      3. Run 'ilab model serve'
      4. See the traceback below
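
      A minimal reproduction sketch; the registry path and the plain podman invocation are assumptions (only the image tag comes from this report):

      $ podman run -it --rm \
            registry.redhat.io/rhelai1/instructlab-amd-rhel9:1.3-1732115458
      # inside the container:
      $ ilab config init     # step 2: accept the defaults when prompted
      $ ilab model serve     # step 3: fails with the traceback below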


      $ ilab model serve
      INFO 2024-11-22 00:06:37,230 instructlab.model.serve_backend:56: Using model '/opt/app-root/src/.cache/instructlab/models/granite-8b-lab-v1' with -1 gpu-layers and 4096 max context size.
      INFO 2024-11-22 00:06:37,231 instructlab.model.serve_backend:88: '--gpus' flag used alongside '--tensor-parallel-size' in the vllm_args section of the config file. Using value of the --gpus flag.
      INFO 2024-11-22 00:06:37,232 instructlab.model.backends.vllm:313: vLLM starting up on pid 210 at http://127.0.0.1:8000/v1
      Traceback (most recent call last):
        File "<frozen runpy>", line 189, in _run_module_as_main
        File "<frozen runpy>", line 112, in _get_module_details
        File "/opt/app-root/lib64/python3.11/site-packages/vllm/__init__.py", line 3, in <module>
          from vllm.engine.arg_utils import AsyncEngineArgs, EngineArgs
        File "/opt/app-root/lib64/python3.11/site-packages/vllm/engine/arg_utils.py", line 11, in <module>
          from vllm.config import (CacheConfig, ConfigFormat, DecodingConfig,
        File "/opt/app-root/lib64/python3.11/site-packages/vllm/config.py", line 12, in <module>
          from vllm.model_executor.layers.quantization import QUANTIZATION_METHODS
        File "/opt/app-root/lib64/python3.11/site-packages/vllm/model_executor/__init__.py", line 1, in <module>
          from vllm.model_executor.parameter import (BasevLLMParameter,
        File "/opt/app-root/lib64/python3.11/site-packages/vllm/model_executor/parameter.py", line 7, in <module>
          from vllm.distributed import get_tensor_model_parallel_rank
        File "/opt/app-root/lib64/python3.11/site-packages/vllm/distributed/__init__.py", line 1, in <module>
          from .communication_op import *
        File "/opt/app-root/lib64/python3.11/site-packages/vllm/distributed/communication_op.py", line 6, in <module>
          from .parallel_state import get_tp_group
        File "/opt/app-root/lib64/python3.11/site-packages/vllm/distributed/parallel_state.py", line 39, in <module>
          from vllm.utils import supports_custom_op
        File "/opt/app-root/lib64/python3.11/site-packages/vllm/utils.py", line 34, in <module>
          from rpdTracerControl import rpdTracerControl
      ModuleNotFoundError: No module named 'rpdTracerControl'
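
      Since the failure is an import-time error in vllm/utils.py, it should reproduce without ilab at all; a quick in-container check (interpreter path per the traceback above):

      $ python3 -c 'import rpdTracerControl'
      # expected: ModuleNotFoundError: No module named 'rpdTracerControl'
      $ pip list 2>/dev/null | grep -i rpd
      # expected: no output, i.e. no rpd-related package is installed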


      Expected behavior

      • The model is served without error

      Screenshots

      • Attached Image 

      Device Info (please complete the following information):

      • Hardware Specs: AMD MI300X x8
      • OS Version: RHELAI 1.3
      • Python Version: Python 3.11.7
      • InstructLab Version: ilab, version 0.21.0

      Additional context

      • I am still working to reproduce this on a bare-metal install and will update the issue once that is done; a possible workaround sketch follows.
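
      For context, the rpdTracerControl module appears to be provided by ROCm's rocmProfileData project (https://github.com/ROCm/rocmProfileData), which the ROCm build of vLLM imports in vllm/utils.py. A possible in-container workaround sketch; the build steps follow that project's README and are untested here:

      $ git clone https://github.com/ROCm/rocmProfileData
      $ cd rocmProfileData
      $ make && make install   # builds and installs rpd_tracer, which ships rpdTracerControl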

              Assignee: Prarit Bhargava (prarit@redhat.com)
              Reporter: Tim Flink (tflink)
              Votes: 0
              Watchers: 5
