  1. AI Platform Core Components
  2. AIPCC-7168

`torch.utils.data.default_collate` raises misleading warning for read-only NumPy arrays

    • Story
    • Resolution: Unresolved
    • PyTorch
    • PyTorch Sprint 18, PyTorch Sprint 19, PyTorch Sprint 20, PyTorch Sprint 21, PyTorch Sprint 22, PyTorch Sprint 23

          1. 🐛 Describe the bug

      Similar to issue #47160 (https://github.com/pytorch/pytorch/issues/47160), where `torch.tensor(a)` emits a warning when given a non-writeable NumPy array even though it copies the data, `torch.utils.data.default_collate` raises the same warning unnecessarily.

      The warning:
      ```
      UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor.
      ```

      However, `default_collate` copies the array data into a new tensor, so the result does not alias the original read-only buffer. The warning is therefore misleading and should not be triggered in this case.
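
Until this is addressed upstream, one possible workaround is to silence the warning locally around the collate call. The sketch below is only an illustration and not part of the original report: `collate_readonly_safe` is a hypothetical helper name, and the message prefix given to `warnings.filterwarnings` is an assumption chosen to match both spellings of the warning ("writeable" and "writable") quoted in this issue.

```python
import warnings

import numpy as np
import torch
from torch.utils.data import default_collate


def collate_readonly_safe(batch):
    """Hypothetical helper: run default_collate with the read-only warning ignored."""
    with warnings.catch_warnings():
        # The filter message is a regex matched against the start of the
        # warning text; the truncated prefix is an assumption that covers
        # both the "writeable" and "writable" wordings.
        warnings.filterwarnings(
            "ignore",
            message="The given NumPy array is not writ",
            category=UserWarning,
        )
        return default_collate(batch)


a = np.arange(5.0)
a.flags.writeable = False  # make the source array read-only

b = collate_readonly_safe([a])
b[0][0] = 99  # modifies the collated tensor only, not the array
```

An alternative that avoids touching warning filters is to pass writeable copies, for example `default_collate([np.array(x) for x in batch])`, at the cost of one extra copy per sample.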

          1. To Reproduce
      ```
      import torch
      import numpy as np
      import warnings
      from torch.utils.data import default_collate


      def test_copy():
          a = np.arange(5.0)
          print("old a:", a)
          batch = [a]
          b = default_collate(batch)
          b[0][0] = 99

          # new a != new b -> default_collate copies the data
          print("new a:", a)
          print("new b:", b)


      def test_bug():
          a = np.arange(5.0)
          a.flags.writeable = False

          # Create a batch containing the non-writeable NumPy array
          batch = [a]
          # Since default_collate copies the data, it should not raise the UserWarning
          _ = default_collate(batch)


      if __name__ == "__main__":
          test_copy()
          test_bug()
      ```

      1. Output
        ```
        old a: [0. 1. 2. 3. 4.]
        new a: [0. 1. 2. 3. 4.]
        new b: tensor([[99., 1., 2., 3., 4.]], dtype=torch.float64)
        /export/d2/secfuzz/anaconda3/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py:285: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:203.)
        return collate([torch.as_tensor(b) for b in batch], collate_fn_map=collate_fn_map)
        ```
          1. Versions

      PyTorch version: 2.7.0+cu126
      Is debug build: False
      CUDA used to build PyTorch: 12.6
      ROCM used to build PyTorch: N/A

      cc @andrewkho @divyanshk @VitalyFedyunin @dzhulgakov

              rh-ee-visgoyal Vishal Goyal
              PyTorch Core