  1. AI Platform Core Components
  2. AIPCC-8563

llm-d: RHAIIS product build pulls latest nvshmem, causing instability for llm-d

    • Type: Feature
    • Resolution: Unresolved
    • Priority: Critical
    • None
    • rhoai-3.0
    • Accelerator Enablement
    • False
    • False

      EDIT[2026/01/19]

      Can we use this RFE to encompass the entire foundational stack (CUDA, NCCL, cuDNN, ROCm, etc.)?

      Problem Statement

      The main RHAIIS product build is configured to pull the latest available version of nvshmem. This is risky because the llm-d project, which runs on RHAIIS, conducts its performance testing and validation against specific versions of this library, so the build can pick up an nvshmem version that llm-d has not validated.

       

      We should enforce stability by pinning nvshmem to a specific version in the product build.
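
To make the pinning request concrete, here is a minimal sketch of a build-time check that rejects any dependency spec that is not an exact pin, or that is pinned to a version other than the validated one. Everything in it is an assumption for illustration: the `name==version` spec syntax, the `check_pins` helper, and the placeholder version strings are hypothetical and do not reflect how the RHAIIS build actually declares nvshmem.

```python
import re

# Hypothetical table of validated versions. The issue names no concrete
# version numbers, so these values are placeholders only.
VALIDATED_PINS = {
    "nvshmem": "0.0.0-example",
}

# Assumed spec syntax: an exact pin is written "name==version".
EXACT_PIN = re.compile(
    r"^(?P<name>[A-Za-z0-9_.-]+)==(?P<version>[A-Za-z0-9_.+-]+)$"
)


def check_pins(requested, validated):
    """Return a list of problems; an empty list means every spec is acceptable."""
    problems = []
    for spec in requested:
        match = EXACT_PIN.match(spec.strip())
        if match is None:
            # Covers bare names ("nvshmem") and ranges ("nvshmem>=1.0").
            problems.append(f"{spec!r} is not an exact pin")
            continue
        name, version = match["name"], match["version"]
        expected = validated.get(name)
        if expected is not None and version != expected:
            problems.append(
                f"{name} is pinned to {version}, but the validated version is {expected}"
            )
    return problems


if __name__ == "__main__":
    # A floating spec is reported as a problem; an exact, validated pin is not.
    for issue in check_pins(["nvshmem"], VALIDATED_PINS):
        print(issue)
```

The same table could carry entries for the other foundational components mentioned in the edit note (CUDA, NCCL, cuDNN, ROCm) if the RFE is widened, with one validated version per component.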

       

      cc rhn-support-weaton rh-ee-tysmith 

              fjansen@redhat.com Frank Jansen
              naisingh@redhat.com Naina Singh
              Votes: 0
              Watchers: 4
