AI Platform Core Components / AIPCC-4124

Stop building flash-attn on CUDA (only RHAIIS)

    • Type: Story
    • Resolution: Done
    • Priority: Undefined
    • Component: Accelerator Enablement
    • Sprint: AIPCC Accelerators 13, AIPCC Accelerators 14

      vLLM on CUDA does not use this flash-attn build; it uses vLLM's own fork of flash-attn:

      https://github.com/vllm-project/vllm/blob/ae87ddd040b793fd9f4f05cb660a4728c81d7670/cmake/external_projects/vllm_flash_attn.cmake#L13-L26

      Since we no longer need to build it for RHAIIS, we need to update collections/rhaiis/cuda-ubi9/requirements.txt. A sketch of the change is shown below.
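      A minimal sketch of the intended edit, assuming flash-attn currently appears as its own entry in that requirements file (the exact line, and any version pin it may carry, are illustrative and not copied from the repo):

          --- collections/rhaiis/cuda-ubi9/requirements.txt
          -flash-attn    # remove: vLLM on CUDA builds its own vllm-flash-attn fork during its CMake build

      No replacement entry is needed, since vLLM fetches and builds the fork itself via the cmake/external_projects/vllm_flash_attn.cmake file linked above.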

              Assignee: Emilien Macchi (emacchi@redhat.com)
              Reporter: Emilien Macchi (emacchi@redhat.com)
              Team: Frank's Team
