Story
Resolution: Done
I added TORCH_CUDA_ARCH_LIST and PYTORCH_ROCM_ARCH args and env variables in the hope that the information would be useful. I wanted to have one central place to configure and record supported GPU architectures.
The idea turned out to cause more problems than benefits:
- The wheel builder changes the CUDA arch list more often than base images are released.
- The presence of TORCH_CUDA_ARCH_LIST can slow down vLLM startup, see AIPCC-4016. Some operations and dependencies might compile kernels just-in-time. Without TORCH_CUDA_ARCH_LIST, only cubins for the current GPU arch are compiled. With TORCH_CUDA_ARCH_LIST present, code is also compiled for the additional archs it lists.
Let's remove these env vars.
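For anyone still running on an image that sets these variables, a minimal sketch of a workaround is to strip them from the environment before starting a process that may JIT-compile kernels. The helper name `without_arch_vars` is hypothetical and not part of any image or library; only the two variable names come from this ticket.

```python
import os

# The two variables named in this ticket.
ARCH_ENV_VARS = ("TORCH_CUDA_ARCH_LIST", "PYTORCH_ROCM_ARCH")


def without_arch_vars(env=None):
    """Return a copy of env (default: os.environ) without the arch list variables.

    Hypothetical helper for illustration; the input mapping is not modified.
    """
    source = os.environ if env is None else env
    return {key: value for key, value in source.items() if key not in ARCH_ENV_VARS}
```

The returned mapping can be passed as the `env` argument of `subprocess.run` so that the child process does not see either variable.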
- is triggered by AIPCC-4016 rhaiis: Remove TORCH_CUDA_ARCH_LIST env var (Closed)
- mentioned on