Story
Resolution: Unresolved
NI&D Sprint 278, NI&D Sprint 279, NI&D Sprint 280, NI&D Sprint 281
What
An MCP (https://modelcontextprotocol.io/) server capable of answering the following questions:
- Why is my Ingress not working?
- Why is my Route not working?
- (optional) How can it be fixed? (Only say what can be fixed; do not apply the fix.)
The implementation / proposal should be kept simple for now, answering the question: "Can we use an LLM integrated with an MCP server to troubleshoot an Ingress or a Route resource?"
Why
As part of the Agentic Diagnostics Working Group (https://docs.google.com/document/d/1pn99HcaXCnClquYkWWgVFKYsD6ruyUoTvtx-7lednFA), the networking teams should put together a simple proposal for how an MCP server can be implemented to help, initially, with debugging problems caused by incorrect user actions.
How
A user may create an Ingress without setting the right TLS certificate structure. In that case, depending on which resource was created (Route or Ingress), a condition, a status, or an event will be raised on the user's resource. A simple MCP server can fetch that information from the Kubernetes API server (using mcp-k8s, https://github.com/containers/kubernetes-mcp-server), describe in a human-readable way what is happening, and eventually suggest how it should be fixed. If the existing mcp-k8s lacks a feature needed for the bare-minimum demo, this should be documented.
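As a rough illustration of the "describe in a human-readable way" step, the sketch below turns a Route-shaped object and its events into plain-language findings. It assumes the objects have already been fetched (for example through mcp-k8s) and are available as plain dictionaries. The function name diagnose_route and the sample reason and message strings are hypothetical and only serve the example; the fields read (status.ingress[].conditions[] on a Route, and type / reason / message / involvedObject on an event) follow the usual shape of those resources.

```python
from typing import Any


def diagnose_route(route: dict[str, Any], events: list[dict[str, Any]]) -> list[str]:
    """Return human-readable findings for a Route-shaped dict.

    Hypothetical helper: it only reads the data it is given and never
    calls the cluster or modifies anything.
    """
    name = route.get("metadata", {}).get("name", "<unknown>")
    findings: list[str] = []

    # A Route reports admission per router under status.ingress[].conditions[].
    for ingress in route.get("status", {}).get("ingress", []):
        router = ingress.get("routerName", "<unknown>")
        for cond in ingress.get("conditions", []):
            if cond.get("type") == "Admitted" and cond.get("status") != "True":
                findings.append(
                    f"Route {name} was not admitted by router {router}: "
                    f"{cond.get('reason', 'NoReason')}: {cond.get('message', '')}"
                )

    # Warning events that point at this object often carry the same cause.
    for ev in events:
        same_object = ev.get("involvedObject", {}).get("name") == name
        if ev.get("type") == "Warning" and same_object:
            findings.append(
                f"Warning event {ev.get('reason', 'NoReason')}: {ev.get('message', '')}"
            )

    if not findings:
        findings.append(f"No failing condition or warning event found for {name}.")
    return findings


# Sample input; the reason and message values are illustrative only.
sample_route = {
    "metadata": {"name": "my-route"},
    "status": {
        "ingress": [
            {
                "routerName": "default",
                "conditions": [
                    {
                        "type": "Admitted",
                        "status": "False",
                        "reason": "ExampleValidationFailed",
                        "message": "TLS certificate could not be parsed",
                    }
                ],
            }
        ]
    },
}
sample_events = [
    {
        "type": "Warning",
        "reason": "ExampleRejected",
        "message": "route rejected because of invalid TLS configuration",
        "involvedObject": {"kind": "Route", "name": "my-route"},
    }
]

findings = diagnose_route(sample_route, sample_events)
for line in findings:
    print(line)
```

In a real demo the LLM would probably do most of the wording itself; a helper like this mainly shows which fields carry the signal and would need to be exposed by the MCP server.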
The initial proposal is that the MCP server is read-only: it can read resources, but won't take any further action.
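One possible way to express the read-only constraint is an allowlist of the Kubernetes read verbs (get, list, watch) that is checked before any request is forwarded. The sketch below is an assumption about how such a guard could look; check_read_only and ReadOnlyViolation are hypothetical names, and whether mcp-k8s already provides an equivalent setting should be checked against its documentation. Restricting the service account with RBAC to the same three verbs would achieve a similar result on the cluster side.

```python
# Kubernetes API verbs that only read state.
READ_ONLY_VERBS = frozenset({"get", "list", "watch"})


class ReadOnlyViolation(Exception):
    """Raised when a request would use a verb outside the read-only allowlist."""


def check_read_only(verb: str) -> str:
    """Return the normalized verb if it only reads; raise otherwise.

    Hypothetical guard for illustration; not part of any existing MCP server.
    """
    normalized = verb.strip().lower()
    if normalized not in READ_ONLY_VERBS:
        raise ReadOnlyViolation(f"verb '{normalized}' is not allowed in read-only mode")
    return normalized


# Usage: reads pass through, writes are rejected before reaching the API server.
allowed = check_read_only("GET")
try:
    check_read_only("patch")
    rejected = False
except ReadOnlyViolation:
    rejected = True
```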
Prior art
To keep it simple, one suggestion is to watch the demos of ovn-kubernetes (https://docs.google.com/document/d/1nNBrxzLAxl3d3O9qxpKEtxdJL5HM-Fk7IF_Ld0FXan8 and https://www.youtube.com/watch?v=y31kRR9Jqno&t=5s) and Kiali (https://www.youtube.com/watch?v=1l9m1B5uEPw) and do something similar, relying on an existing MCP server (like mcp-k8s).
- relates to: OCPSTRAT-2811 Integrate Model Context Protocol for Agentic AI-driven Ingress and DNS Troubleshooting
- Refinement