- Story
- Resolution: Unresolved
- Normal
- None
- None
- 8
- False
- False
- AIPCC Accelerators 23, AIPCC Accelerators 24, AIPCC Accelerators 25, AIPCC Accelerators 26
- 🐛 Describe the bug
With flex attention, you're responsible for creating a block mask to pass into `flex_attention`. That mask can then be reused across subsequent flex attention calls. If you're trying to avoid recompilation of transformer blocks, the naive approach of creating and caching the mask inside the block results in at least two compiles: one where you compile and cache the block mask, and another where you retrieve it. You can manually hoist the block mask computation out of the transformer block (a sketch of that workaround follows), but this is somewhat annoying.
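A minimal sketch of the manual workaround, assuming a simple causal mask, illustrative shapes, and an available CUDA device (none of these names or sizes come from the issue): the block mask is built once outside the compiled region and passed into the compiled block as an argument.

```python
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask


def causal_mask(b, h, q_idx, kv_idx):
    # Standard causal mask_mod: a query position may only attend to earlier keys.
    return q_idx >= kv_idx


@torch.compile
def attention_block(q, k, v, block_mask):
    # Reusing the same precomputed block_mask avoids recompiling this block
    # just because the mask was created (and cached) inside it.
    return flex_attention(q, k, v, block_mask=block_mask)


B, H, S, D = 2, 8, 1024, 64
q = torch.randn(B, H, S, D, device="cuda")
k = torch.randn(B, H, S, D, device="cuda")
v = torch.randn(B, H, S, D, device="cuda")

# Manually hoisted out of the transformer block: computed once, reused every call.
block_mask = create_block_mask(causal_mask, B, H, S, S, device="cuda")

out = attention_block(q, k, v, block_mask)
```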
What would be nice is a way to ask Dynamo to hoist the block mask creation out, in the same way Python RNG calls are hoisted out. The idea is that you have a function that takes only constant arguments, and Dynamo treats it as a black box. The user asserts that it is safe for this function to be reordered with respect to the rest of the compute. Dynamo then emits prelude bytecode that calls this function and feeds its result into the Dynamo region (a hypothetical sketch of the user-facing API follows below).
This is different from `assume_constant_result`, since the result is not constant; the function must be rerun on every Dynamo invocation.
We could also choose not to do this and require people to manually hoist their code.
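For illustration only, here is roughly what the proposed user-facing API could look like. The decorator name below is hypothetical (no such API exists in PyTorch) and is stubbed as a no-op so the snippet is valid Python; the actual feature would have Dynamo emit prelude bytecode for the marked call rather than tracing it.

```python
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

SEQ_LEN = 1024


def hoist_out_of_graph(fn):
    # Hypothetical marker (not a real PyTorch API): the user asserts it is safe
    # to reorder this call with respect to the rest of the compute. Stubbed as
    # an identity decorator here so the sketch remains runnable today.
    return fn


@hoist_out_of_graph
def make_block_mask():
    # Takes only constant arguments. Unlike assume_constant_result, the result
    # is not baked into the graph as a constant: it would be recomputed on every
    # Dynamo invocation, via prelude bytecode run before the compiled region.
    return create_block_mask(
        lambda b, h, q_idx, kv_idx: q_idx >= kv_idx,
        None, None, SEQ_LEN, SEQ_LEN, device="cuda",
    )


@torch.compile
def attention_block(q, k, v):
    # With the proposed feature, Dynamo would hoist this call into the prelude
    # and feed its result into the compiled region instead of tracing it.
    block_mask = make_block_mask()
    return flex_attention(q, k, v, block_mask=block_mask)
```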
cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @kadeng @amjames @Lucaskabela @jataylo @anijain2305 @ailzhang @drisspg
- Versions
main
- clones
AIPCC-8113 Skipping combo kernel for large pointwise nodes
- In Progress