Story
Resolution: Unresolved
PyTorch Sprint 25, PyTorch Sprint 26, PyTorch Sprint 27
TorchTalk is an agentic tool that gives developers deep structural understanding of PyTorch's 3M+ line cross-language codebase (Python, C++, CUDA). It automatically detects and traces cross-language bindings, following code paths from Python APIs through pybind11 and TORCH_LIBRARY macros into C++ implementations and down to CUDA kernels. This helps new contributors and developers become familiar with PyTorch's complex internals faster and in significantly more detail, answering questions like "How does torch.matmul work?" or "What breaks if I modify the GEMM kernel?" with file:line references instead of hours of manual code archaeology.
GitHub Repository: https://github.com/adabeyta/torchtalk
Branches: Dev (New Feature Enablement), Chatbot (RAG-Based v1), Main (MCP v2)
Background — RAG-Based Approach (v1)
The initial implementation was a RAG-based chatbot built on LlamaIndex, vLLM (Llama 4 Maverick with a 1M-token context window), and ChromaDB. It used graph-enhanced retrieval to index cross-language bindings (pybind11), call graphs, and import graphs, with a Gradio UI for conversation. While this proved the concept was valuable, the RAG approach had limitations around retrieval accuracy for deeply nested cross-language traces and required significant GPU infrastructure to run the vLLM server.
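To illustrate the "graph-enhanced retrieval" idea, here is a minimal sketch: after a text-similarity hit, the retriever expands along call-graph edges so cross-language neighbors of the hit are also returned to the model. The graph contents and the `expand` helper are hypothetical illustrations, not TorchTalk's actual index.

```python
# Hypothetical sketch of graph-enhanced retrieval: expand a retrieval hit
# along call-graph edges so cross-language neighbors are included.
from collections import deque

# Toy call-graph edges (caller -> callees); illustrative names only.
call_graph = {
    "torch.matmul": ["aten::matmul"],
    "aten::matmul": ["at::native::matmul"],
    "at::native::matmul": ["gemm_kernel"],
}

def expand(seed: str, hops: int = 2) -> list[str]:
    """Return the seed plus every node reachable within `hops` edges."""
    seen, frontier = {seed}, deque([(seed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for callee in call_graph.get(node, []):
            if callee not in seen:
                seen.add(callee)
                frontier.append((callee, depth + 1))
    return sorted(seen)

print(expand("torch.matmul"))
# With 2 hops the Python API pulls in its ATen entry point and native impl.
```

The limitation noted above follows from this shape: if the similarity search misses the right seed node, or an edge is absent from the probabilistically built graph, a deeply nested trace never surfaces.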
TorchTalk v1 was presented as part of the Accelerating AI-First: AIPCC AI Tooling Demo Series; see: https://drive.google.com/file/d/1JwQPQae25aAoduNQk85oNSZRJp_pnBBv/view
Current Approach: MCP Server + Claude Code (v2)
The project was re-architected as an MCP (Model Context Protocol) server that integrates directly with Claude Code. Instead of retrieval-augmented generation, TorchTalk now uses deterministic static analysis via tree-sitter, libclang, and Python AST to build structured indexes of PyTorch's source. This provides exact tracing (not probabilistic retrieval) across the full Python → YAML → C++ → CUDA stack, with tools for call graph analysis, impact assessment, CUDA kernel discovery, and test finding. The MCP approach eliminates the need for a local LLM server.
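As a rough illustration of the deterministic indexing step, the sketch below scans C++ source text for TORCH_LIBRARY registration blocks and the `m.def(...)` operator schemas inside them, building a namespace → ops map. This is a simplified regex-based stand-in for what TorchTalk does with tree-sitter/libclang; the function and regexes are illustrative, not TorchTalk's code.

```python
# Simplified sketch: index TORCH_LIBRARY operator registrations in C++ source.
# Real parsing would use tree-sitter or libclang; regexes keep the sketch short.
import re

TORCH_LIBRARY_RE = re.compile(r"TORCH_LIBRARY(?:_IMPL|_FRAGMENT)?\s*\(\s*(\w+)")
DEF_RE = re.compile(r'm\.def\(\s*"([^"(]+)')

def index_registrations(cpp_source: str) -> dict[str, list[str]]:
    """Map each TORCH_LIBRARY namespace to the op names registered under it."""
    index: dict[str, list[str]] = {}
    current_ns = None
    for line in cpp_source.splitlines():
        ns_match = TORCH_LIBRARY_RE.search(line)
        if ns_match:
            current_ns = ns_match.group(1)
            index.setdefault(current_ns, [])
        def_match = DEF_RE.search(line)
        if def_match and current_ns:
            index[current_ns].append(def_match.group(1))
    return index

sample = '''
TORCH_LIBRARY(aten, m) {
  m.def("matmul(Tensor self, Tensor other) -> Tensor");
}
'''
print(index_registrations(sample))  # {'aten': ['matmul']}
```

Because the index is built by parsing the source directly, a query like "where is torch.matmul registered?" resolves to an exact file:line answer rather than a ranked retrieval guess.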
Integration is configured through agent.md (project context that tells Claude Code how to build, test, and navigate the project, along with the available MCP tools and their usage patterns) and skills.md (a reusable skill definition specifying when to invoke TorchTalk, its tool categories, and common workflows with examples).
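For context, Claude Code discovers project-scoped MCP servers through a `.mcp.json` file. A minimal registration for TorchTalk might look like the following; the `torchtalk.server` module path is an assumption for illustration, not the repository's actual entry point.

```json
{
  "mcpServers": {
    "torchtalk": {
      "command": "python",
      "args": ["-m", "torchtalk.server"]
    }
  }
}
```

With the server registered, agent.md and skills.md then govern when Claude Code actually reaches for TorchTalk's tools versus its general-purpose search.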
Current Usage & Adoption
The PyTorch team is actively using TorchTalk as part of their security AI agent pipeline (experimental). There are also ongoing discussions with the team around leveraging TorchTalk's impact analysis capabilities to assess how changes propagate through the codebase, potentially making it a key component of their security review workflow.
Next Steps
- Performance benchmarking: Compare TorchTalk's tracing accuracy and speed against standard Claude Code search to quantify the value of structured static analysis over general-purpose code search.
- Op-level tracing: Develop deeper operator-level tracing capabilities within PyTorch, enabling more granular tracking of how individual ops flow through dispatch, autograd, and backend execution.
- Continue collaborating with the PyTorch team on security agent pipeline integration and impact analysis workflows.
AIPCC-8477 🟢KR4-2: Each AIPCC function has implemented at least one AI-first workflow/process that significantly improves their efficiency and/or quality.
In Progress