Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: rhelai-1.3.1
Affects Version/s: RHELAI 1.3 GA
Component/s: Accelerators - Intel Gaudi, Engine/Runtime
Labels:
- Gaudi
- Intel

Blocked: False
Blocked Reason: None
Ready: False
Documentation Type: Release Notes
Flagged: Impediment
Release Note Type: Known Issue
Intelligence Requested:
Market:
Severity: Important
Release Blocker: Approved

SFDC Cases Links:
SFDC Cases Counter:
SFDC Cases Open:

To Reproduce

Steps to reproduce the behavior:

1. Boot the system with registry.stage.redhat.io/rhelai1/bootc-intel-rhel9:1.3-1732661719
2. Run `ilab config init`
3. Check the config.yaml

[root@dhcp-10-111-212-61 devcloud]# cat .config/instructlab/config.yaml  | grep gpus
  gpus: 8
      gpus: 8
    gpus: 8

If the config remains like this, `ilab model serve` will fail with:

AssertionError: GPUExecutor only supports single GPU.
Exception ignored in: <function HabanaExecutor.__del__ at 0x7f37f90eab60>
Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.11/site-packages/vllm/executor/habana_executor.py", line 197, in __del__
  File "/opt/app-root/lib64/python3.11/site-packages/vllm/executor/habana_executor.py", line 194, in shutdown
AttributeError: 'HabanaExecutorAsync' object has no attribute 'driver_worker'
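
The grep check from the reproduction steps can also be scripted. The following is a minimal sketch, assuming each GPU count appears in config.yaml as a plain `gpus: N` line, as in the grep output above. The function name `find_multi_gpu_settings` and the sample text are hypothetical and not part of ilab. The sketch only reports values and does not modify the file. Which entries other than the serve section need to change is not established in this report.

```python
import re

# Matches lines such as "    gpus: 8" and captures the integer value.
GPUS_LINE = re.compile(r"^\s*gpus:\s*(\d+)\s*$")


def find_multi_gpu_settings(config_text):
    """Return (line_number, value) for every `gpus:` entry greater than 1."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        match = GPUS_LINE.match(line)
        if match and int(match.group(1)) > 1:
            hits.append((number, int(match.group(1))))
    return hits


# Hypothetical sample that mimics the grep output in this report,
# with the last entry already set to 1.
sample = "  gpus: 8\n      gpus: 8\n    gpus: 1\n"

for number, value in find_multi_gpu_settings(sample):
    print(f"line {number}: gpus is {value}, expected 1")
```

To run it against a real file, read the text first, for example with `pathlib.Path(path).read_text()`, and pass the result to the function.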

According to cheimes@redhat.com, the config should be created with gpus set to 1 (Slack thread).

Also, after updating the gpus and tensor values in the serve section to 1, serve and chat work fine:

[root@dhcp-10-111-212-61 devcloud]# ILAB_HOME=/var/home/devcloud ilab chat
╭────────────────────────────────────────────────────────────────────────────────────────── system ──────────────────────────────────────────────────────────────────────────────────────────╮
│ Welcome to InstructLab Chat w/ GRANITE-7B-REDHAT-LAB (type /h for help)                                                                                                                    │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
>>> hello                                                                                                                                                                         [S][default]
╭────────────────────────────────────────────────────────────────────────────────── granite-7b-redhat-lab ───────────────────────────────────────────────────────────────────────────────────╮
│ Hello! How can I assist you today?                                                                                                                                                         │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── elapsed 0.207 seconds ─╯
>>> are you alive ?                                                                                                                                                               [S][default]
╭────────────────────────────────────────────────────────────────────────────────── granite-7b-redhat-lab ───────────────────────────────────────────────────────────────────────────────────╮
│ Yes, I am an AI language model designed to help answer your questions and provide information as best I can.                                                                               │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── elapsed 0.513 seconds ─╯
>>>

Expected behavior