AI Platform Core Components / AIPCC-4643

Transition torch build ownership from AE to torch team

    • 25% To Do, 50% In Progress, 25% Done
    • M

      2026-Feb-02 yellow

      • We are blocked by SPRHEL and the configuration of the 3.3 RPM repositories.
      • DP is working on an LLVM 14 RPM for the older llvmlite and numba versions. Support for llvmlite 0.43 and numba 0.60 should be available by Jan 30.
      • AIPCC-9427: Upstream tests are currently being run on the torch 2.9.0 package built by builder on the CUDA 12.9 base image. This tests whether the dependency versions that builder currently uses to build torch cause additional upstream CI test failures.
      • Created tickets for the versioned package dependencies for torch and tagged them to epic AIPCC-6999.
      • 10 of 12 dependency package tickets are closed. The numba==0.60.0 and ninja==1.11.1.4 packages have yet to be generated.
      • AIPCC-7777: Built PyTorch with the latest dependencies on RHEL to see whether we can nudge upstream torch to move to the latest dependency versions. However, we are observing additional test failures compared with the pinned versions. The issues are logged in AIPCC-8018, AIPCC-7994, and AIPCC-8017. Given these failures and the unknowns in other configurations, there may be friction upstream against moving to the latest dependency packages.
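
As an illustration of the pinned-versus-latest comparison described in the status above, the following is a minimal sketch of how a build or test environment could be checked against the pinned versions before a test run. It is not the team's actual tooling. The two pins are the ones quoted in the status update; the function names `installed_version` and `check_pins` are hypothetical.

```python
from importlib import metadata

# Pins quoted in the status update; anything beyond these would be an assumption.
PINS = {"numba": "0.60.0", "ninja": "1.11.1.4"}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Map each mismatched package name to an (expected, found) tuple."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # Report every package whose installed version differs from its pin.
    for name, (expected, found) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

The `lookup` parameter is injectable so the check can be exercised without installing the packages. A run that prints nothing means the environment matches the pins.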

      Feature Overview:

      We are moving the responsibility for building “torch wheels” from the Accelerator Enablement team to the PyTorch team to remove some friction in that build process and speed up delivery.

      Product(s) associated:

      All uses of torch.

      Goals:

      • Place responsibility for building the set of wheels from the torch community on the team with the most expertise.
      • Enable that team to use the best tool for the job.
      • Continue to build for all accelerator variants supported in the product.

      Requirements:

      1. Builds must be compatible with other wheels built by AE and DP using the wheel builder image. (Build for RHEL, link against library versions in the base images, etc.)
      2. A process must exist for the PyTorch team to recognize updated dependencies and notify the AE and DP teams so they can perform the updates.
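
One way the second requirement could be approached is to diff the exact pins between two torch releases and hand the result to the AE and DP teams. The sketch below assumes the pins are available as `name==version` lines. That format, the helper names `parse_pins` and `diff_pins`, and the sample "new" pins in the usage note are all hypothetical and not taken from the ticket.

```python
def parse_pins(lines):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            raise ValueError(f"not an exact pin: {raw!r}")
        pins[name.strip().lower()] = version.strip()
    return pins


def diff_pins(old, new):
    """Return the added, removed, and changed pins between two pin dicts."""
    added = {n: v for n, v in new.items() if n not in old}
    removed = {n: v for n, v in old.items() if n not in new}
    changed = {
        n: (old[n], new[n])
        for n in old.keys() & new.keys()
        if old[n] != new[n]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

For example, calling `diff_pins(parse_pins(["numba==0.60.0"]), parse_pins(["numba==0.61.0"]))` reports `numba` under `"changed"`; the second version here is made up for illustration only. A non-empty result would be the trigger for notifying the other teams.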

      Done - Acceptance Criteria:

      1. Builds for tagged releases of torch are published to a package index visible to the rest of the wheel build pipelines for use in image builds.
      2. Nightly build jobs try to build using the same tools and the HEAD of the most recent development branch of torch, with monitored notifications for failures.

      Use Cases - i.e. User Experience & Workflow:
      Include use case diagrams, main success scenarios, alternative flow scenarios.

      Out of Scope:

      • We will not move all accelerator-enabled builds to the torch team.
      • We will work on making the upstream build tools provide hermetic builds over time.

      Documentation Considerations:

      N/A

              sdharane Sudhir Dharanendraiah
              dhellman@redhat.com Doug Hellmann
              Andre Lustosa Cabral de Paula Motta, Emilien Macchi, Joseph Groenenboom, Rohan Devasthale