Type: Bug
Resolution: Done
Priority: Critical
AIPCC-1634 attempted to fix the problem
"Attempting to use wgmma.fence without CUTE_ARCH_MMA_SM90A_ENABLED"
by adding the env var "VLLM_FA_CMAKE_GPU_ARCHES" to the builder repo. However, the fix itself has a flaw that breaks it. The build argument was defined in the build args file as:
VLLM_FA_CMAKE_GPU_ARCHES='80-real;90-real'
However, Podman does not strip quotes from build-arg values, so the quotes are passed through literally. The line should be:
VLLM_FA_CMAKE_GPU_ARCHES=80-real;90-real
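A minimal sketch of the effect. This simulates (it does not use) Podman's handling: build-arg files are read as raw KEY=VALUE lines, with no shell-style quote removal, so the single quotes become part of the value. The file path and parsing loop here are illustrative only:

```shell
# Hypothetical build-args file with the flawed, quoted value
cat > /tmp/build-args.txt <<'EOF'
VLLM_FA_CMAKE_GPU_ARCHES='80-real;90-real'
EOF

# Split each line on the first '=' only; no quote removal happens,
# which is how Podman treats build-arg values.
while IFS= read -r line; do
  key="${line%%=*}"
  value="${line#*=}"
  printf '%s=[%s]\n' "$key" "$value"
done < /tmp/build-args.txt
# prints: VLLM_FA_CMAKE_GPU_ARCHES=['80-real;90-real']
```

The brackets in the output show that the single quotes survive into the value, which is what later corrupts the CMake architecture list.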
From the bootstrap.log of the most recent vLLM build for CUDA (https://gitlab.com/redhat/rhel-ai/rhaiis/pipeline/-/jobs/10415882372/artifacts/browse/mnt/):
2025-06-20 14:54:13,203 DEBUG:fromager.external_commands:73: vllm: running: _PYPROJECT_HOOKS_BUILD_BACKEND=setuptools.build_meta MAKEFLAGS=-j22 CMAKE_BUILD_PARALLEL_LEVEL=22 MAX_JOBS=22 CMAKE_BUILD_TYPE=Release VLLM_INSTALL_PUNICA_KERNELS=1 VLLM_CUTLASS_SRC_DIR=/work/git-repos/cutlass GPU_ARCHS='7.5 8.0 8.6 8.7 8.9 9.0 10.0 12.0+PTX' INCLUDES='-I/work/git-repos/cutlass/include -I/work/git-repos/cutlass/tools/util/include' NVCC_THREADS=2 VLLM_FLASH_ATTN_SRC_DIR=/work/git-repos/vllm_flash_attn FLASH_MLA_SRC_DIR=/work/git-repos/vllm_flash_mla VLLM_TARGET_DEVICE=cuda VLLM_FA_CMAKE_GPU_ARCHES=''"'"'80-real;90-real'"'"'' /opt/app-root/lib64/python3.12/site-packages/fromager/run_network_isolation.sh /opt/app-root/bin/python3.12 /opt/app-root/lib64/python3.12/site-packages/pyproject_hooks/_in_process/_in_process.py get_requires_for_build_wheel /tmp/tmp3g2sf29b in /mnt/work-dir/vllm-0.9.0.1/vllm-0.9.0.1
The quoted value
VLLM_FA_CMAKE_GPU_ARCHES=''"'"'80-real;90-real'"'"''
is what triggers the wgmma.fence error.
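The nested quoting in the log line decodes to a value that still contains literal single quotes. A quick check, assuming POSIX sh quoting rules, makes this visible:

```shell
# Decode the quoting exactly as it appears in the bootstrap.log:
# ''  +  "'"  +  '80-real;90-real'  +  "'"  +  ''
VLLM_FA_CMAKE_GPU_ARCHES=''"'"'80-real;90-real'"'"''

# Brackets delimit the value so the embedded quotes are visible.
printf '[%s]\n' "$VLLM_FA_CMAKE_GPU_ARCHES"
# prints: ['80-real;90-real']
```

The single quotes are part of the value itself, so CMake receives '80-real;90-real' (quotes included) instead of the architecture list 80-real;90-real, and the SM90a-gated kernels are not built correctly.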
Linked issue: AIPCC-2102 Add check / linter to prevent container build-arg values with quotes (Closed)