-
Story
-
Resolution: Done
-
Normal
The patch vllm-0.7.2/cuda-ubi9/0003-Remove-version-munging.patch disables the code in vLLM's setup.py that appends a local version number to the package version. With that code active, a CUDA 12.4 build gets +cu124, a CUDA 12.1 build gets no suffix (see MAIN_CUDA_VERSION), a CPU build gets +cpu, and so on.
For downstream builds, we don't want any local version numbers. Work with upstream to find out whether they are willing to accept a patch that lets us customize or disable the local version number with an env var.
Idea:
- VLLM_LOCAL_VERSION=none disables local version numbers completely
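A minimal sketch of how the proposed override could behave, to make the idea concrete for the upstream discussion. This is not the actual vLLM setup.py code: the function name apply_local_version and its parameters are hypothetical, and treating any value other than "none" as a custom suffix is an assumption based on the "customize" part of the request above.

```python
import os
from typing import Mapping, Optional


def apply_local_version(
    base_version: str,
    default_local: Optional[str],
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Return base_version with an optional local version suffix.

    Hypothetical helper, not part of vLLM. default_local is the suffix
    the build would normally get (e.g. "cu124" or "cpu"), or None when
    the build gets no suffix.
    """
    override = environ.get("VLLM_LOCAL_VERSION")
    if override is None:
        # Env var unset: keep the existing behaviour.
        local = default_local
    elif override.strip().lower() == "none":
        # VLLM_LOCAL_VERSION=none disables the local version completely.
        local = None
    else:
        # Assumption: any other value replaces the default suffix.
        local = override.strip()

    if not local:
        return base_version
    return f"{base_version}+{local}"
```

With this sketch, apply_local_version("0.7.2", "cu124", {"VLLM_LOCAL_VERSION": "none"}) returns "0.7.2", while leaving the variable unset returns "0.7.2+cu124". Downstream builds could then export the variable instead of carrying the patch.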