AI Platform Core Components / AIPCC-8331

Build fails with USE_NCCL=0: nccl_dev_cap.hpp tries to include nccl.h

    • Type: Story
    • Resolution: Done
    • Component: PyTorch
    • Sprint: PyTorch Sprint 23

      🐛 Describe the bug

      PyTorch compilation fails when building with USE_NCCL=0 because nccl_dev_cap.hpp unconditionally includes NCCL headers, even though NCCL is explicitly disabled.

      I will raise a PR to fix this shortly.
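      A minimal sketch of the usual fix for this class of bug: gate any NCCL include behind the macro the build system defines when NCCL is enabled, so that a USE_NCCL=0 build never reaches `#include <nccl.h>`. The macro name `USE_NCCL` and the helper below are illustrative assumptions, not the actual contents of nccl_dev_cap.hpp or the eventual PR.

      ```cpp
      #include <string>

      // Hypothetical guard pattern: NCCL headers are only pulled in when the
      // build system has defined USE_NCCL (assumed macro name for illustration).
      #ifdef USE_NCCL
      #include <nccl.h>
      #endif

      // Illustrative helper reporting, at compile time, whether this
      // translation unit was built with NCCL support.
      std::string nccl_build_mode() {
      #ifdef USE_NCCL
        return "with NCCL";
      #else
        return "without NCCL";
      #endif
      }
      ```

      Compiled without `-DUSE_NCCL`, the `#include <nccl.h>` line is never seen by the preprocessor, which is exactly the behavior a USE_NCCL=0 build expects.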

      Versions

      Collecting environment information...
      PyTorch version: 2.11.0a0+gitc22a1b4
      Is debug build: False
      CUDA used to build PyTorch: 13.0
      ROCM used to build PyTorch: N/A

      OS: Fedora Linux 42 (Cloud Edition) (x86_64)
      GCC version: (GCC) 15.2.1 20250808 (Red Hat 15.2.1-1)
      Clang version: 20.1.8 (Fedora 20.1.8-4.fc42)
      CMake version: version 4.2.1
      Libc version: glibc-2.41

      Python version: 3.14.2 | packaged by Anaconda, Inc. | (main, Dec 19 2025, 11:49:32) [GCC 14.3.0] (64-bit runtime)
      Python platform: Linux-6.16.7-200.fc42.x86_64-x86_64-with-glibc2.41
      Is CUDA available: True
      CUDA runtime version: 13.0.88
      CUDA_MODULE_LOADING set to:
      GPU models and configuration: GPU 0: NVIDIA L4
      Nvidia driver version: 580.82.09
      cuDNN version: Could not collect
      Is XPU available: False
      HIP runtime version: N/A
      MIOpen runtime version: N/A
      Is XNNPACK available: True
      Caching allocator config: N/A

      CPU:
      Architecture: x86_64
      CPU op-mode(s): 32-bit, 64-bit
      Address sizes: 48 bits physical, 48 bits virtual
      Byte Order: Little Endian
      CPU(s): 32
      On-line CPU(s) list: 0-31
      Vendor ID: AuthenticAMD
      Model name: AMD EPYC 7R13 Processor
      CPU family: 25
      Model: 1
      Thread(s) per core: 2
      Core(s) per socket: 16
      Socket(s): 1
      Stepping: 1
      BogoMIPS: 5299.99
      Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch topoext ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr rdpru wbnoinvd arat npt nrip_save vaes vpclmulqdq rdpid
      Hypervisor vendor: KVM
      Virtualization type: full
      L1d cache: 512 KiB (16 instances)
      L1i cache: 512 KiB (16 instances)
      L2 cache: 8 MiB (16 instances)
      L3 cache: 64 MiB (2 instances)
      NUMA node(s): 1
      NUMA node0 CPU(s): 0-31
      Vulnerability Gather data sampling: Not affected
      Vulnerability Ghostwrite: Not affected
      Vulnerability Indirect target selection: Not affected
      Vulnerability Itlb multihit: Not affected
      Vulnerability L1tf: Not affected
      Vulnerability Mds: Not affected
      Vulnerability Meltdown: Not affected
      Vulnerability Mmio stale data: Not affected
      Vulnerability Old microcode: Not affected
      Vulnerability Reg file data sampling: Not affected
      Vulnerability Retbleed: Not affected
      Vulnerability Spec rstack overflow: Mitigation; Safe RET
      Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
      Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
      Vulnerability Spectre v2: Mitigation; Retpolines; IBPB conditional; IBRS_FW; STIBP always-on; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
      Vulnerability Srbds: Not affected
      Vulnerability Tsa: Mitigation; Clear CPU buffers
      Vulnerability Tsx async abort: Not affected
      Vulnerability Vmscape: Not affected

      Versions of relevant libraries:
      [pip3] intel-cmplr-lib-ur==2025.3.1
      [pip3] intel-openmp==2025.3.1
      [pip3] mkl-include==2025.3.0
      [pip3] mkl-static==2025.3.0
      [pip3] numpy==2.4.0
      [pip3] onemkl-license==2025.3.0
      [pip3] optree==0.18.0
      [pip3] tbb==2022.3.0
      [pip3] tbb-devel==2022.3.0
      [pip3] tcmlib==1.4.1
      [pip3] torch==2.11.0a0+gitc22a1b4
      [pip3] umf==1.0.2
      [conda] blas 1.0 mkl
      [conda] intel-cmplr-lib-ur 2025.3.1 pypi_0 pypi
      [conda] intel-openmp 2025.3.1 pypi_0 pypi
      [conda] mkl 2025.0.0 hacee8c2_941
      [conda] mkl-devel 2025.0.0 h3a03a7a_941
      [conda] mkl-include 2025.3.0 pypi_0 pypi
      [conda] mkl-static 2025.3.0 pypi_0 pypi
      [conda] numpy 2.4.0 pypi_0 pypi
      [conda] onemkl-license 2025.3.0 pypi_0 pypi
      [conda] optree 0.18.0 pypi_0 pypi
      [conda] tbb 2022.3.0 pypi_0 pypi
      [conda] tbb-devel 2022.3.0 pypi_0 pypi
      [conda] tcmlib 1.4.1 pypi_0 pypi
      [conda] torch 2.11.0a0+gitc22a1b4 pypi_0 pypi
      [conda] umf 1.0.2 pypi_0 pypi

      cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta @msaroufim @dcci @malfet @seemethere

              Assignee: rh-ee-visgoyal Vishal Goyal
              Reporter: rh-ee-visgoyal Vishal Goyal
              Team: PyTorch Core