  1. AI Platform Core Components
  2. AIPCC-8510

[PyTorch][Upstream CI] Setup RHEL 9.6 Docker Build Environment

    • Type: Task
    • Resolution: Done
    • PyTorch

      Objective

      Build a RHEL 9.6 Docker environment for PyTorch compilation and testing.

      Work Completed

      • Created .ci/docker/rhel9/Dockerfile based on registry.redhat.io/ubi9/ubi:9.6
      • Installed Python 3.12 build environment
      • Configured GCC 11 compiler
      • Added CUDA 12.8 and cuDNN (build for CUDA 12.8) support
      • Set up conda environment with PyTorch dependencies
      • Added x86 compiler flags for FP16 conversion and AVX2 (-mf16c -mavx2)
      • Configured for H200 GPU (compute capability 9.0)
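      The compiler and GPU settings listed above could be collected into a build environment roughly as follows. This is a minimal sketch, not the contents of the actual Dockerfile: it assumes the flags are passed through CFLAGS/CXXFLAGS and the GPU target through PyTorch's TORCH_CUDA_ARCH_LIST build variable, and the helper name pytorch_build_env is hypothetical.

```python
import os

# Values taken from the ticket: FP16/AVX2 flags and the H200 compute capability.
FP16_FLAGS = ["-mf16c", "-mavx2"]
H200_COMPUTE_CAPABILITY = "9.0"


def pytorch_build_env(base_env=None):
    """Return a copy of base_env extended with compiler and CUDA settings.

    Assumption: flags go through CFLAGS/CXXFLAGS and the GPU target through
    TORCH_CUDA_ARCH_LIST; the real Dockerfile may set these differently.
    """
    env = dict(os.environ if base_env is None else base_env)
    for var in ("CFLAGS", "CXXFLAGS"):
        existing = env.get(var, "").split()
        # Append only the flags that are not already present, keeping order.
        merged = existing + [f for f in FP16_FLAGS if f not in existing]
        env[var] = " ".join(merged)
    env["TORCH_CUDA_ARCH_LIST"] = H200_COMPUTE_CAPABILITY
    return env
```

      The returned dictionary can be handed to subprocess.run(..., env=...) for whichever build command the workflow uses; existing flags in the caller's environment are preserved rather than overwritten.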

      Deliverables

      • [x] Docker image builds successfully
      • [x] Image size ~20 GB
      • [x] All dependencies installed
      • [x] Build time ~35 minutes
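      One way to back the "All dependencies installed" checkbox is to compare the tool versions reported inside the image with the versions named in this ticket. The sketch below only parses already-captured output; how that output is collected (for example via docker run) is left out, and the function name and regular expressions are illustrative assumptions rather than part of the existing workflow.

```python
import re

# Versions named in the ticket's "Work Completed" list.
EXPECTED = {"python": "3.12", "gcc": "11", "cuda": "12.8"}

# One pattern per tool, matching the version in typical `--version` output.
PATTERNS = {
    "python": re.compile(r"Python (\d+\.\d+)"),
    "gcc": re.compile(r"gcc \(GCC\) (\d+)"),
    "cuda": re.compile(r"release (\d+\.\d+)"),
}


def check_versions(outputs):
    """Map each tool to True/False depending on whether its version matches.

    `outputs` maps a tool name to the captured text of its version command
    (python3 --version, gcc --version, nvcc --version).
    """
    results = {}
    for tool, expected in EXPECTED.items():
        match = PATTERNS[tool].search(outputs.get(tool, ""))
        # A missing or unparsable output counts as a failed check.
        results[tool] = bool(match) and match.group(1) == expected
    return results
```

      A CI step could fail the job when any value in the returned dictionary is False, which makes a silently missing toolchain component visible.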

      References

      • Dockerfile: .ci/docker/rhel9/Dockerfile
      • Workflow: .github/workflows/rhel-build-test.yml

              rh-ee-sugeorge Subin George
              PyTorch Infrastructure