Type: Story
Resolution: Done
Sprints: PyTorch Sprint 15, PyTorch Sprint 16
- 🐛 Describe the bug
-
I'm observing a memory leak when using a custom torch.autograd.Function with an in-place operation marked by ctx.mark_dirty(...). The following minimal repro highlights the issue:
```python
import torch
class ReluOps(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        x.relu_()                  # in-place modification of the input
        ctx.mark_dirty(x)          # declare the in-place mutation to autograd
        ctx.save_for_backward(x)
        return x

    @staticmethod
    def backward(ctx, grad):
        (x,) = ctx.saved_tensors
        return grad * x


def run():
    for i in range(100):
        x = torch.rand((100, 100, 1000), requires_grad=True)
        z = x + 1.0
        z_view = z[0]              # apply the Function to a view of z
        ReluOps.apply(z_view)


if __name__ == "__main__":
    run()
```
Memory usage increases over the iterations and is not released between invocations.
The reference cycle looks like this:
`CopySlices -> PyNode -> THPFunction -> SavedVariable -> AsStridedBackward -> CopySlices`
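For illustration only, here is a minimal sketch of how the per-iteration growth can be observed. It reuses the `ReluOps` Function from the repro above and assumes `psutil` is installed for reading the process RSS; neither the probe nor `psutil` is part of the original report.
```python
# Hypothetical observation harness, not part of the original repro.
# Requires the ReluOps Function defined above and the psutil package.
import gc

import psutil
import torch

proc = psutil.Process()

for i in range(10):
    # Same pattern as the repro: apply the in-place Function to a view.
    x = torch.rand((100, 100, 1000), requires_grad=True)
    z = x + 1.0
    z_view = z[0]
    ReluOps.apply(z_view)

    gc.collect()  # explicit collection; RSS can still grow if the cycle is not reclaimed
    print(f"iter {i}: rss = {proc.memory_info().rss / 1e6:.1f} MB")
```
On an affected build, the reported RSS should keep climbing even though each iteration's tensors go out of scope.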
- Versions
torch master
To reproduce, run the code above.
cc @ezyang @gchanan @zou3519 @kadeng @msaroufim @albanD @gqchen @nikitaved @soulitzer @Varal7 @xmfan @chauhang @penguinwu
- clones: AIPCC-4664 torch.inf.to(torch.int32) produces different values on CPU vs CUDA #154311 [Review]
- is cloned by: AIPCC-5681 BUG: `torch.pow(-0.0)` wrongly returns `-0.0` (should be +0.0) [In Progress]