-
Bug
-
Resolution: Unresolved
-
Minor
Summary:
One test in TritonTensorDescriptorTestCUDA times out during PyTorch unit test execution on the CPU platform.
Test Class: inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA
Number of Failing Tests: 1
Platform: CPU
Test Type: Unit Test
Version Information:
- PyTorch Commit: 6bdd8c9
- Branch: main
- Test Date: 2026-01-14
- Sprint: Sprint 24
Failure Pattern:
Single root cause - test timeout (command exceeded 30 minutes)
Common Error:
inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_2d_reduction_multi_kernel_cuda
Command took >30min, returning 124
Got exit code 124
Failing Tests:
1. test_2d_reduction_multi_kernel_cuda
Steps to Reproduce:
1. Run the test commands:
TEST_CONFIG=cpu python3 test/run_test.py -i inductor/test_torchinductor_strided_blocks
TEST_CONFIG=cuda python3 test/run_test.py -i inductor/test_torchinductor_strided_blocks
TEST_CONFIG=inductor python3 test/run_test.py -i inductor/test_torchinductor_strided_blocks
2. Observe the test timeout after 30 minutes
Expected Result:
The test should complete within the timeout period, or be skipped if it does not apply to the platform
Actual Result:
Test hangs and times out after 30 minutes with exit code 124
Root Cause Analysis:
The test times out on the CPU platform. The likely causes are:
- The test is designed for CUDA/GPU but is being run on CPU
- Triton tensor descriptor operations are not optimized or not supported on CPU
- The test may be stuck in an infinite loop or in a very slow computation on CPU
Potential Solutions:
1. Skip this test on CPU platform (add platform check)
2. Investigate why CUDA-specific test is running on CPU
3. Add a shorter timeout for the CPU platform so that a hang fails fast instead of consuming the full 30 minutes
4. Fix test to properly detect and handle CPU environment
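Solution 1 (a platform check) could look roughly like the sketch below. It is a minimal illustration, not the actual fix: `cuda_available` and `ExampleCudaOnlyTest` are hypothetical names that do not exist in the PyTorch test suite, and the real test class may need a different guard. The only PyTorch call used is `torch.cuda.is_available()`.

```python
import unittest


def cuda_available() -> bool:
    """Hypothetical helper: True only if torch imports and reports a CUDA device."""
    try:
        import torch
    except ImportError:
        # No torch installed, so there is certainly no usable CUDA device.
        return False
    return torch.cuda.is_available()


# Hypothetical stand-in for a CUDA-only test class; skipped entirely on CPU-only hosts.
@unittest.skipUnless(cuda_available(), "requires a CUDA device")
class ExampleCudaOnlyTest(unittest.TestCase):
    def test_placeholder(self):
        # Only reached when a CUDA device is present.
        self.assertTrue(cuda_available())
```

On a CPU-only machine the class is reported as skipped rather than being executed, so it cannot run into the 30-minute limit.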
Additional Context:
- Note: sGPU ticket AIPCC-8264 exists for the same test class
- This is the CPU-specific failure
- Test class name includes "CUDA" suggesting it should only run on GPU
- Exit code 124 indicates timeout
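For reference, exit code 124 is the value GNU `timeout` returns when the wrapped command exceeds its limit. The sketch below shows one way a runner could produce the same code. It is an illustration under assumptions, not how test/run_test.py is implemented; `run_with_timeout` and `TIMEOUT_EXIT_CODE` are hypothetical names.

```python
import subprocess
import sys

# Same value GNU `timeout` uses to signal that the command timed out.
TIMEOUT_EXIT_CODE = 124


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or 124 if it exceeds timeout_s seconds."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running.
        return TIMEOUT_EXIT_CODE
    return completed.returncode


if __name__ == "__main__":
    # A command that sleeps longer than the limit is reported as a timeout.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow, timeout_s=0.5))
```

A caller can then distinguish a timeout (124) from an ordinary test failure (any other non-zero code) when triaging logs.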
Logs:
Test execution logs: /home/ktanmay/Downloads/Run 1-20260120T060019Z-1-001/Run 1/20260114_024940_commit_6bdd8c9/cpu_tests.log
Priority: P3
Labels: pytorch, unittest, cpu, inductor, triton, timeout