AI Platform Core Components / AIPCC-8945

[QA][PyTorch UT][CPU, sGPU] test/test_dataloader.py - TestDataLoader failures


    • Bug
    • Resolution: Unresolved
    • Major
    • PyTorch

      Description

      Summary: DataLoader tests failing with worker process errors on CPU and sGPU platforms.

      Test Class: test/test_dataloader.py::TestDataLoader
      Number of Failing Tests: 2
      Platform: CPU, sGPU (CUDA)
      Test Type: Unit Test

      Version Information:

      • PyTorch Commit: 6bdd8c9
      • Branch: main
      • Test Date: 2026-01-14
      • Sprint: Sprint 24

      Failure Pattern:
      Both tests fail with related errors, which suggests a common root cause (likely a numpy incompatibility in the worker processes).

      Common Error:
      ValueError: Caught ValueError in DataLoader worker process 0.
      Original Traceback (most recent call last):
        File "/miniconda/envs/cuda_torch_build/lib/python3.12/site-packages/torch/utils/data/_utils/worker.py", line 358, in _worker_loop
          data = fetcher.fetch(index)
        File "/miniconda/envs/cuda_torch_build/lib/python3.12/site-packages/torch/utils/data/_utils/fetch.py", line 35, in fetch

      Failing Tests:
      1. test_multiprocessing_iterdatapipe
      2. test_segfault

      Steps to Reproduce:
        TEST_CONFIG=cpu python3 test/run_test.py -i test_dataloader
        TEST_CONFIG=cuda python3 test/run_test.py -i test_dataloader
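
To narrow the run to the two failing tests, something like the following sketch may help. It assumes it is launched from the root of a PyTorch checkout and that the test file accepts unittest-style `Class.test_name` selectors. The helper names `build_command` and `run_failing_tests` are hypothetical, and passing `TEST_CONFIG` to a direct test-file run is an assumption carried over from the commands above.

```python
import os
import subprocess
import sys

# The two failing tests listed in this issue.
FAILING_TESTS = ("test_multiprocessing_iterdatapipe", "test_segfault")


def build_command(test_name, test_class="TestDataLoader"):
    """Return the argv for running a single test method unittest-style."""
    return [sys.executable, "test/test_dataloader.py", f"{test_class}.{test_name}"]


def run_failing_tests(test_config="cpu"):
    """Run each failing test on its own and return {test_name: exit_code}."""
    # Assumption: TEST_CONFIG is passed through as in the run_test.py commands.
    env = dict(os.environ, TEST_CONFIG=test_config)
    results = {}
    for name in FAILING_TESTS:
        completed = subprocess.run(build_command(name), env=env)
        results[name] = completed.returncode
    return results
```

Calling `run_failing_tests("cpu")` or `run_failing_tests("cuda")` returns one exit code per test, which makes it easier to see whether both tests fail independently.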

      Expected Result:
      Both tests pass without worker process errors.

      Actual Result:
      Both tests fail with a ValueError raised in a DataLoader worker process.

      Root Cause Analysis:
      The DataLoader worker processes raise a ValueError while fetching data (the traceback passes through fetcher.fetch(index) in _worker_loop). The likely cause is the same numpy/pandas binary incompatibility that affects other datapipe tests.
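
One way to test the binary-incompatibility hypothesis is to import numpy and pandas in the same environment and record either the version or the error raised. The sketch below is a diagnostic aid, not part of the PyTorch test suite, and the function name `probe_imports` is hypothetical. An ABI mismatch between pandas and numpy typically surfaces at import time as a `ValueError` such as "numpy.dtype size changed, may indicate binary incompatibility". Whether that is the error hidden behind the truncated traceback above is an assumption.

```python
import importlib


def probe_imports(module_names=("numpy", "pandas")):
    """Try importing each module; record its version or the error raised."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # e.g. ImportError, or ValueError on an ABI mismatch
            report[name] = f"FAILED: {type(exc).__name__}: {exc}"
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, status in probe_imports().items():
        print(f"{name}: {status}")
```

Running this with the interpreter from the `cuda_torch_build` environment shows whether the two packages load cleanly there. A `FAILED: ValueError` entry would support the hypothesis, while clean imports would point to a different cause.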

      Priority: P3

              rh-ee-rpunia Riya Punia
              pytorch-engineering PyTorch Engineering
              PyTorch Distributed