Bug
Resolution: Cannot Reproduce
Major
*Test Class:* test/test_foreach.py::TestForeachCUDA
*Failing Tests:* 1
*Error Pattern:* related_issues
Description
Summary:
1 test in TestForeachCUDA is failing during PyTorch unit test execution on the sGPU platform.
Test Class: test/test_foreach.py::TestForeachCUDA
Number of Failing Tests: 1
Platform: sGPU
Test Type: Unit Test
Version Information:
- PyTorch Commit: 4816fd9
- Test Date: 2025-12-22
- Pipeline ID: 2217097191
Failure Pattern:
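The failing test produces 3 related error patterns, which likely share a common root cause. Patterns 2 and 3 are the same CUDA out-of-memory error and differ in the reported free memory and the list of other processes; pattern 1 is a traceback frame from torch/testing/_comparison.py.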
Error Patterns:
1. File "/miniconda/envs/cuda_torch_build/lib/python3.12/site-packages/torch/testing/_comparison.py", line 1298, in not_close_error_metas
2. CUDA out of memory. Tried to allocate 8.00 GiB. GPU 0 has a total capacity of 139.80 GiB of which 96.00 GiB is free. Process 1190 has 524.00 MiB memory in use. Process 2903636 has 522.00 MiB memory in use. Process 2903703 has 1.92 GiB memory in use. Process 2921375 has 522.00 MiB memory in use. Including non-PyTorch memory, this process has 39.77 GiB memory in use. Process 2930046 has 522.00 MiB memory in use. 46.13 GiB allowed; Of the allocated memory 39.00 GiB is allocated by PyTorch, and 12.0
3. CUDA out of memory. Tried to allocate 8.00 GiB. GPU 0 has a total capacity of 139.80 GiB of which 96.52 GiB is free. Process 1190 has 524.00 MiB memory in use. Process 2903636 has 522.00 MiB memory in use. Process 2903703 has 1.92 GiB memory in use. Process 2921375 has 522.00 MiB memory in use. Including non-PyTorch memory, this process has 39.77 GiB memory in use. 46.13 GiB allowed; Of the allocated memory 39.00 GiB is allocated by PyTorch, and 12.00 MiB is reserved by PyTorch but unallocated.
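The figures in the out-of-memory messages can be read as a simple budget check. The sketch below is a triage aid, not the allocator's actual logic. It assumes that the "allowed" figure is a per-process memory cap and that the "in use" figure is a fair approximation of what counts against it. The helper name `parse_oom_figures` is hypothetical; the message text is copied from error pattern 3 with the per-process lines omitted.

```python
import re

# Figures copied from error pattern 3 (other-process lines omitted for brevity).
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 8.00 GiB. "
    "GPU 0 has a total capacity of 139.80 GiB of which 96.52 GiB is free. "
    "Including non-PyTorch memory, this process has 39.77 GiB memory in use. "
    "46.13 GiB allowed; Of the allocated memory 39.00 GiB is allocated by PyTorch, "
    "and 12.00 MiB is reserved by PyTorch but unallocated."
)


def parse_oom_figures(message: str) -> dict[str, float]:
    """Extract the GiB figures relevant to a budget check (hypothetical helper)."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) GiB",
        "free": r"of which ([\d.]+) GiB is free",
        "in_use": r"this process has ([\d.]+) GiB memory in use",
        "allowed": r"([\d.]+) GiB allowed",
    }
    figures = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            figures[name] = float(match.group(1))
    return figures


figures = parse_oom_figures(OOM_MESSAGE)

# Assumption: memory already in use plus the new request is compared to the cap.
needed = figures["in_use"] + figures["requested"]
over_cap = needed - figures["allowed"]

# The device itself reports far more free memory than the request needs.
fits_on_device = figures["requested"] <= figures["free"]

print(f"needed {needed:.2f} GiB vs allowed {figures['allowed']:.2f} GiB "
      f"(over by {over_cap:.2f} GiB); fits in free device memory: {fits_on_device}")
```

Under these assumptions the request exceeds the allowed budget even though the GPU reports more than 96 GiB free, which would point at the per-process cap rather than at device capacity.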
Failing Tests:
1. test_foreach_copy_with_multi_dtypes_large_input_cuda
Steps to Reproduce:
1. Pull the PyTorch test image
2. Run the test file that contains the failing class:
TEST_CONFIG=cuda python3 test/run_test.py -i test_foreach
3. Observe test failures
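Because only one test fails, it may be quicker to run that test alone rather than the whole file. The sketch below builds such a command using the `ClassName.test_name` selector that PyTorch test files accept when executed directly. The helper names and the `repo_root` parameter are hypothetical, and carrying `TEST_CONFIG=cuda` over from the step above is an assumption; it may not be needed when the file is run directly.

```python
import os
import subprocess
import sys

# Names taken from this ticket.
TEST_FILE = "test/test_foreach.py"
TEST_CLASS = "TestForeachCUDA"
TEST_NAME = "test_foreach_copy_with_multi_dtypes_large_input_cuda"


def single_test_command(python: str = sys.executable) -> list[str]:
    """Build the argv that runs only the failing test (hypothetical helper)."""
    return [python, TEST_FILE, f"{TEST_CLASS}.{TEST_NAME}"]


def run_single_test(repo_root: str) -> int:
    """Run the test from a PyTorch checkout and return the exit code.

    `repo_root` is a hypothetical path to the checkout inside the test image.
    """
    env = dict(os.environ, TEST_CONFIG="cuda")  # assumption: same config as CI
    return subprocess.run(single_test_command(), cwd=repo_root, env=env).returncode


# Print the equivalent shell command without running anything.
print(" ".join(single_test_command("python3")))
```

Running the test in isolation also helps separate a genuine failure from memory pressure caused by other tests or processes sharing the GPU.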
Expected Result:
All tests in TestForeachCUDA should pass
Actual Result:
1 test fails with the errors shown above
Logs:
Pipeline ID: 2217097191
CI Artifacts: Available in pipeline artifacts
Additional Context:
Test failures identified in automated PyTorch CI run.
Severity: Medium
Priority: P3