  AI Platform Core Components
  AIPCC-8520

[PyTorch][Upstream CI] Fix SDPA CPU Kernel Numerical Precision Issues

    • Type: Bug
    • Resolution: Unresolved
    • Component: PyTorch
      Problem

      Two scaled dot product attention (SDPA) CPU kernel tests fail on RHEL due to minor numerical precision differences.

      Root Cause

      The CPU-based fused attention kernel produces slightly different numerical results due to floating-point arithmetic differences and GCC 11 compiler optimizations.
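      A minimal reproduction sketch along these lines (not the actual failing test) makes the discrepancy visible: it runs the same SDPA call on CPU through the unfused math reference and through a fused backend, then reports the largest element-wise difference. The backend selection assumes a PyTorch release that exposes torch.nn.attention.sdpa_kernel; backend availability on CPU varies by version, and the tensor shapes below are illustrative only.

      import torch
      import torch.nn.functional as F
      from torch.nn.attention import SDPBackend, sdpa_kernel

      torch.manual_seed(0)
      # Illustrative shapes (batch, heads, seq_len, head_dim); the failing tests use their own.
      q, k, v = (torch.randn(1, 4, 128, 64, dtype=torch.float32) for _ in range(3))

      with sdpa_kernel(SDPBackend.MATH):              # unfused reference path
          ref = F.scaled_dot_product_attention(q, k, v)

      with sdpa_kernel(SDPBackend.FLASH_ATTENTION):   # fused CPU kernel
          fused = F.scaled_dot_product_attention(q, k, v)

      diff = (ref - fused).abs()
      print(f"max abs diff: {diff.max().item():.3e}")
      print(f"elements above an illustrative 1e-5 threshold: "
            f"{(diff > 1e-5).sum().item()} / {ref.numel()}")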

      Impact

      • Tests failing: 2
      • Severity: Very Low - numerical precision issue, not a functional failure
      • Production impact: None - the attention mechanism works correctly
      • Mismatched elements: 1 / 1,632 (0.1%)

      Proposed Solutions

      Option 1: Relax the test tolerance for RHEL builds (a sketch follows this list)
      Option 2: Exclude the specific failing test variants
      Option 3: Report the discrepancy to PyTorch upstream for investigation
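
      A minimal sketch of Option 1, assuming RHEL is detected from /etc/os-release and using illustrative tolerance values rather than the ones that would land in the real patch: the comparison is relaxed only on RHEL builds and stays at the PyTorch defaults everywhere else.

      import torch

      def on_rhel() -> bool:
          # Assumption: detect RHEL from /etc/os-release; real CI may use another signal.
          try:
              with open("/etc/os-release") as f:
                  return "Red Hat Enterprise Linux" in f.read()
          except OSError:
              return False

      def assert_sdpa_close(actual: torch.Tensor, expected: torch.Tensor) -> None:
          if on_rhel():
              # Relaxed bounds to absorb the single outlier element seen on RHEL.
              torch.testing.assert_close(actual, expected, rtol=1e-4, atol=1e-4)
          else:
              torch.testing.assert_close(actual, expected)

      Option 2 would instead skip the two affected test variants on RHEL, for example with a skip condition built on the same platform detection.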

      Acceptance Criteria

      • [ ] Root cause fully understood
      • [ ] Fix implemented (tolerance relaxation or exclusion)
      • [ ] Tests pass consistently on RHEL
      • [ ] No regression in actual SDPA functionality

      References

      • Analysis: TEST_FAILURE_REPORT.md
      • Test file: test/test_transformers.py

              Assignee: Subin George (rh-ee-sugeorge)
              Reporter: Subin George (rh-ee-sugeorge)
              Team: PyTorch Infrastructure