Type: Story
Resolution: Done
Sprints: PyTorch Sprint 15, PyTorch Sprint 16
- 🐛 Describe the bug
-
I'm observing a memory leak when using a custom torch.autograd.Function with an in-place operation marked by ctx.mark_dirty(...). The following minimal repro highlights the issue:
```python
import torch
class ReluOps(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        x.relu_()                  # in-place modification of the input
        ctx.mark_dirty(x)          # declare the in-place mutation to autograd
        ctx.save_for_backward(x)
        return x

    @staticmethod
    def backward(ctx, grad):
        (x,) = ctx.saved_tensors
        return grad * x


def run():
    for i in range(100):
        x = torch.rand((100, 100, 1000), requires_grad=True)
        z = x + 1.0
        z_view = z[0]              # apply the Function to a view of z
        ReluOps.apply(z_view)


if __name__ == "__main__":
    run()
```
Memory usage increases over the iterations and is not released between invocations.
The reference cycle looks like this:
`CopySlices -> PyNode -> THPFunction -> SavedVariable -> AsStridedBackward -> CopySlices`
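For illustration only, here is a minimal sketch of how the per-iteration growth can be observed. It reuses the `ReluOps` Function from the repro above and assumes `psutil` is installed for reading the process RSS; neither the probe nor `psutil` is part of the original report.
```python
# Hypothetical observation harness, not part of the original repro.
# Requires the ReluOps Function defined above and the psutil package.
import gc

import psutil
import torch

proc = psutil.Process()

for i in range(10):
    # Same pattern as the repro: apply the in-place Function to a view.
    x = torch.rand((100, 100, 1000), requires_grad=True)
    z = x + 1.0
    z_view = z[0]
    ReluOps.apply(z_view)

    gc.collect()  # explicit collection; RSS can still grow if the cycle is not reclaimed
    print(f"iter {i}: rss = {proc.memory_info().rss / 1e6:.1f} MB")
```
On an affected build, the reported RSS should keep climbing even though each iteration's tensors go out of scope.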
- Versions
torch master
To reproduce, run the code above.
cc @ezyang @gchanan @zou3519 @kadeng @msaroufim @albanD @gqchen @nikitaved @soulitzer @Varal7 @xmfan @chauhang @penguinwu
- clones: AIPCC-4664 torch.inf.to(torch.int32) produces different values on CPU vs CUDA #154311 [Review]
- is cloned by: AIPCC-5681 BUG: `torch.pow(-0.0)` wrongly returns `-0.0` (should be +0.0) [In Progress]