• Initiative
    • Resolution: Unresolved
    • Accelerator Enablement
    • 100% To Do, 0% In Progress, 0% Done
    • M

      Feature title:  Build flashinfer-cubin for RHAIIS

      Feature Overview:

      Right now, the flashinfer cubins are downloaded and installed manually:

      https://gitlab.com/redhat/rhel-ai/rhaiis/containers/-/blob/main/Containerfile.cuda-ubi9?ref_type=heads#L35-39

      This is not ideal. AIPCC should build the flashinfer-cubin wheel instead and ship it in the RHAIIS collection.

      Product(s) associated:

      RHAIIS: Yes
      RHEL AI: No
      RHOAI: No

      Goals:

      Build wheels for the project so that RHAIIS pipelines can consume them and the hack in the RHAIIS Containerfile can be removed.

      Requirements:

      The wheel version must satisfy what vllm requires to function as expected, as it did when the cubins were downloaded manually.

      Done - Acceptance Criteria:
      The cubins are installed through the new wheel, and the hack is removed from the CUDA Containerfile. QE has tested that vllm works as expected.
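
      As a rough illustration of how the first acceptance criterion might be smoke-tested inside the built image, here is a minimal sketch. It assumes the wheel is published under the distribution name flashinfer-cubin and is expected to carry the same version as flashinfer-python; both the companion package name and the matching-version rule are assumptions to confirm against the vllm requirements. The helper names are hypothetical.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def cubin_wheel_problems(lookup=installed_version):
    """Return a list of problems; an empty list means the check passed.

    Assumption: "flashinfer-cubin" should be installed at the same version
    as "flashinfer-python". Adjust the names and the rule if the real
    vllm requirement differs.
    """
    cubin = lookup("flashinfer-cubin")
    core = lookup("flashinfer-python")
    problems = []
    if cubin is None:
        problems.append("flashinfer-cubin is not installed")
    if core is None:
        problems.append("flashinfer-python is not installed")
    if cubin is not None and core is not None and cubin != core:
        problems.append(
            f"version mismatch: flashinfer-cubin {cubin} "
            f"vs flashinfer-python {core}"
        )
    return problems


if __name__ == "__main__":
    # Print every problem found; no output means the check passed.
    for problem in cubin_wheel_problems():
        print("FAIL:", problem)
```

      The lookup parameter exists only so the check can be exercised without the packages installed. It does not replace the QE validation that vllm works as expected.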

      Out of Scope:
      Introducing unreleased cubins. We will take what is released in flashinfer itself.

      Documentation Considerations: N/A

              emacchi@redhat.com Emilien Macchi