Bug
Resolution: Done
Major
Description
Summary: Two DataLoader persistent-worker tests fail with worker-process errors on the CPU and sGPU platforms.
Test Class: test/test_dataloader.py::TestDataLoaderPersistentWorkers
Number of Failing Tests: 2
Platform: CPU, sGPU (CUDA)
Test Type: Unit Test
Version Information:
- PyTorch Commit: 6bdd8c9
- Branch: main
- Test Date: 2026-01-14
- Sprint: Sprint 24
Failure Pattern:
Both tests fail with closely related errors, which likely share a root cause (the same one seen in TestDataLoader).
Common Error:
code
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/miniconda/envs/cuda_torch_build/lib/python3.12/site-packages/torch/utils/data/_utils/worker.py", line 358, in _worker_loop
    data = fetcher.fetch(index)
code
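To check that both failing tests really share the error shown above, the test log can be scanned for the header line that a DataLoader worker failure produces. The following is a minimal sketch, not part of the PyTorch test suite: the function name `summarize_worker_errors` is hypothetical, and the pattern is taken only from the error text quoted in this ticket.

```python
import re
from collections import Counter

# Pattern based on the header line quoted in this ticket:
# "Caught ValueError in DataLoader worker process 0."
WORKER_ERROR = re.compile(
    r"Caught (?P<exc>\w+) in DataLoader worker process (?P<worker>\d+)\."
)


def summarize_worker_errors(log_text):
    """Count (exception type, worker id) pairs found in a test log.

    Hypothetical triage helper; it only reads text and does not import torch.
    """
    return Counter(
        (match.group("exc"), int(match.group("worker")))
        for match in WORKER_ERROR.finditer(log_text)
    )


if __name__ == "__main__":
    sample = (
        "ValueError: Caught ValueError in DataLoader worker process 0.\n"
        "Original Traceback (most recent call last):\n"
    )
    for (exc_type, worker_id), count in summarize_worker_errors(sample).items():
        print(f"{exc_type} in worker {worker_id}: {count} occurrence(s)")
```

If every failure in the log maps to the same exception type, that supports the common-root-cause reading. A mix of types would suggest the tests are failing for different reasons.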
Failing Tests:
1. test_multiprocessing_iterdatapipe
2. test_segfault
Steps to Reproduce:
code
TEST_CONFIG=cpu python3 test/run_test.py -i test_dataloader
TEST_CONFIG=cuda python3 test/run_test.py -i test_dataloader
code
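The commands above run the whole test_dataloader module. To iterate faster, the run can be narrowed to the two failing tests in TestDataLoaderPersistentWorkers. This is a sketch under assumptions: it invokes pytest directly instead of test/run_test.py, it assumes it is run from the root of a PyTorch checkout, and the helper name `build_pytest_command` is hypothetical. Whether TEST_CONFIG changes anything when pytest is called directly has not been verified here; it is passed through only to mirror the original commands.

```python
import os
import subprocess
import sys

TEST_FILE = "test/test_dataloader.py"
TEST_CLASS = "TestDataLoaderPersistentWorkers"
FAILING_TESTS = ["test_multiprocessing_iterdatapipe", "test_segfault"]


def build_pytest_command(tests=FAILING_TESTS):
    """Build a pytest command that selects only the given tests in TEST_CLASS."""
    # pytest's -k expression matches class names as well as test names.
    selector = f"{TEST_CLASS} and ({' or '.join(tests)})"
    return [sys.executable, "-m", "pytest", TEST_FILE, "-k", selector, "-v"]


def run(test_config):
    """Run the selected tests with TEST_CONFIG set, as in the ticket's commands."""
    env = dict(os.environ, TEST_CONFIG=test_config)
    return subprocess.call(build_pytest_command(), env=env)


if __name__ == "__main__":
    # Assumption: the first argument is "cpu" or "cuda"; defaults to "cpu".
    sys.exit(run(sys.argv[1] if len(sys.argv) > 1 else "cpu"))
```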
Expected Result:
Both tests pass with no worker-process errors.
Actual Result:
Both tests fail with a ValueError raised in a DataLoader worker process when persistent workers are in use.
Root Cause Analysis:
Same root cause as the TestDataLoader failures: worker processes raise errors caused by a numpy/pandas binary incompatibility, and the persistent-worker variants of the tests hit the same errors.
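One way to confirm the binary-incompatibility diagnosis in the affected environment is to record the installed numpy and pandas versions and try importing each package. The following is a minimal sketch: the helper name `check_imports` is hypothetical, and the ticket does not record the exact incompatibility message, so the code only reports whatever exception the import raises.

```python
import importlib
from importlib import metadata


def check_imports(module_names=("numpy", "pandas")):
    """Return {name: (installed_version_or_None, import_error_or_None)}."""
    report = {}
    for name in module_names:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = None  # no installed distribution under this name
        try:
            importlib.import_module(name)
            error = None
        except Exception as exc:
            # A binary mismatch would surface here as an import-time exception.
            error = f"{type(exc).__name__}: {exc}"
        report[name] = (version, error)
    return report


if __name__ == "__main__":
    for name, (version, error) in check_imports().items():
        print(f"{name}: version={version} import_error={error}")
```

Run it with the same interpreter as the tests (the traceback points at the cuda_torch_build conda environment). If pandas fails to import while numpy imports cleanly, that is consistent with pandas having been built against a different numpy version.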
Priority: P3