-
Initiative
-
Resolution: Unresolved
Feature title: Build flashinfer-cubin for RHAIIS
Feature Overview:
Right now, the flashinfer cubins are downloaded and installed manually in the RHAIIS Containerfile.
This is not ideal. AIPCC should build the wheel instead, and the wheel would then be shipped in the RHAIIS collection.
Product(s) associated:
RHAIIS: Yes
RHEL AI: No
RHOAI: No
Goals:
Build wheels for the project so that the RHAIIS pipelines can consume them and the hack in the RHAIIS Containerfile can be removed.
Requirements:
The version of the wheel must satisfy what vllm requires, so that vllm functions as it did when the cubins were downloaded manually.
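A minimal sketch of how a pipeline step could check this requirement, assuming the wheel is published under the distribution name `flashinfer-cubin` and that vllm expects one specific release. Both the name and the required version string are illustrative assumptions, not values taken from this ticket. The comparison only looks at the leading numeric release segments; a real check would more likely use the third-party `packaging` library.

```python
from importlib import metadata


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric segments, e.g. '0.2.8.post1' -> (0, 2, 8)."""
    parts = []
    # Drop any local version label ("+cu128") before splitting on dots.
    for piece in version.split("+")[0].split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric segment (post1, rc1, ...)
        parts.append(int(piece))
    return tuple(parts)


def cubin_wheel_matches(
    required: str,
    dist_name: str = "flashinfer-cubin",  # assumed distribution name
    lookup=metadata.version,  # injectable so the check can be tested offline
) -> bool:
    """True if the installed wheel has the same numeric release as `required`."""
    try:
        installed = lookup(dist_name)
    except metadata.PackageNotFoundError:
        return False  # wheel is not installed at all
    return parse_release(installed) == parse_release(required)
```

Calling `cubin_wheel_matches("<version vllm expects>")` inside the built image would return `False` if the wheel is missing or its release differs, which a pipeline could turn into a failed build.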
Done - Acceptance Criteria:
The cubins are installed by installing the new wheel, and the hack is removed from the Containerfile for CUDA. QE has tested that vllm works as expected.
Out of Scope:
Introducing unreleased cubins. We will take only what is released in flashinfer itself.
Documentation Considerations: N/A