Red Hat Enterprise Linux AI
RHELAI-3729

Granite model max_seq_len error on model serve


      To Reproduce

      Steps to reproduce the behavior:

      Using a 1xL4 GPU setup, change all GPU values in the config file to 1 (see the config sketch below), then run ilab model serve.

      IBM Cloud Instance
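
      For reference, the GPU change looks roughly like this. It is a minimal sketch of the serve section of config.yaml, assuming the default InstructLab config layout; exact key names can differ between versions, and the model path shown is hypothetical, not taken from the instance:

        serve:
          # hypothetical path; the actual Granite model path on the instance was used
          model_path: ~/.cache/instructlab/models/granite-8b
          vllm:
            gpus: 1              # all GPU counts in the config set to 1 for the single L4
            vllm_args:
              - --tensor-parallel-size
              - "1"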

      Expected behavior

      • ilab model serve starts the Granite model without error.

      Actual behavior

      • Error while running ilab model serve (Granite model max_seq_len error).

      Device Info:

      • Hardware Specs: Apple M3 Pro 36GB Memory
      • OS Version: macOS 15.3.1
      • InstructLab Version: ilab, version 0.23.3
      • Image: registry.stage.redhat.io/rhelai1/bootc-nvidia-rhel9:1.4.3-1741712137
      • Output of ilab system info:

      ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no

      ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no

      ggml_cuda_init: found 1 CUDA devices:

        Device 0: NVIDIA L4, compute capability 8.9, VMM: yes

      Platform:

        sys.version: 3.11.7 (main, Jan  8 2025, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)]

        sys.platform: linux

        os.name: posix

        platform.release: 5.14.0-427.55.1.el9_4.x86_64

        platform.machine: x86_64

        platform.node: ecosystem-qe-us-east3-l4

        platform.python_version: 3.11.7

        os-release.ID: rhel

        os-release.VERSION_ID: 9.4

        os-release.PRETTY_NAME: Red Hat Enterprise Linux 9.4 (Plow)

        memory.total: 78.54 GB

        memory.available: 77.18 GB

        memory.used: 0.58 GB

       

      InstructLab:

        instructlab.version: 0.23.3

        instructlab-dolomite.version: 0.2.0

        instructlab-eval.version: 0.5.1

        instructlab-quantize.version: 0.1.0

        instructlab-schema.version: 0.4.2

        instructlab-sdg.version: 0.7.1

        instructlab-training.version: 0.7.0

       

      Torch:

        torch.version: 2.5.1

        torch.backends.cpu.capability: AVX512

        torch.version.cuda: 12.4

        torch.version.hip: None

        torch.cuda.available: True

        torch.backends.cuda.is_built: True

        torch.backends.mps.is_built: False

        torch.backends.mps.is_available: False

        torch.cuda.bf16: True

        torch.cuda.current.device: 0

        torch.cuda.0.name: NVIDIA L4

        torch.cuda.0.free: 21.8 GB

        torch.cuda.0.total: 22.0 GB

        torch.cuda.0.capability: 8.9 (see https://developer.nvidia.com/cuda-gpus#compute)

       

      llama_cpp_python:

        llama_cpp_python.version: 0.3.2

        llama_cpp_python.supports_gpu_offload: True

      Additional context

      • config.yaml
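
      If the failure is vLLM's usual KV-cache check (the Granite model's maximum sequence length not fitting in the memory of a single L4), one thing worth verifying is whether capping the served context length avoids the error. The sketch below rests on that assumption only; --max-model-len is a standard vLLM argument, but the 4096 value is illustrative and not taken from this report:

        serve:
          vllm:
            gpus: 1
            vllm_args:
              - --tensor-parallel-size
              - "1"
              # cap the context length so the KV cache fits on the L4
              # (illustrative value, not from this report)
              - --max-model-len
              - "4096"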

People: Charles Doern (cdoern@redhat.com), Justin Larkin (rh-ee-jlarkin)