Type: Feature
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: None
Component/s: mcp-server, Networking
Labels:

Activity Type:
Product / Portfolio Work
Parent Link:
None
Hierarchy Progress Bar:

0% To Do, 100% In Progress, 0% Done
Blocked:
False
Blocked Reason:

None
Ready:
False
Size:
None

Target Version:

openshift-4.21
Release Blocker:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
None
PX Priority Data:
None
PX Impact Score:
PX Technical Impact:
None
PX Impact Range:
None
PX Scheduling Request:
None
PX Technical Impact Notes:
None

Intelligence Requested:
Market:

Feature Overview (aka. Goal Summary)

Enable the use of Model Context Protocol (MCP) to support Agentic AI capabilities for automated, context-aware troubleshooting of OVN-Kubernetes networking issues in OpenShift.

Goals (aka. expected user outcomes)

The MCP will allow an intelligent agent to persist, retrieve, and reason over structured contextual information, including node states, flow rules, pod connectivity, and previous debugging outcomes. This persistent context lets the agent track problem-resolution attempts across sessions, maintain a coherent understanding of cluster state over time, and suggest or even execute diagnostic commands as part of an autonomous or semi-autonomous troubleshooting workflow.
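To make "persist and retrieve across sessions" concrete, here is a minimal sketch of a context store backed by SQLite. It is only an illustration under stated assumptions: the `DebugRecord` fields, the `ContextStore` class, and the table layout are hypothetical and are not taken from the ovn-kubernetes-mcp repository.

```python
import json
import sqlite3
from dataclasses import asdict, dataclass


@dataclass
class DebugRecord:
    """One troubleshooting attempt. All field names are illustrative."""
    cluster: str
    symptom: str
    command: str
    outcome: str


class ContextStore:
    """Persists debug records so that a later session can recall them."""

    def __init__(self, path: str = ":memory:") -> None:
        # A file path survives process restarts; ":memory:" does not.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS records "
            "(cluster TEXT, symptom TEXT, body TEXT)"
        )

    def save(self, rec: DebugRecord) -> None:
        self.db.execute(
            "INSERT INTO records VALUES (?, ?, ?)",
            (rec.cluster, rec.symptom, json.dumps(asdict(rec))),
        )
        self.db.commit()

    def recall(self, symptom: str) -> list[DebugRecord]:
        # Return earlier attempts for the same symptom, oldest first.
        rows = self.db.execute(
            "SELECT body FROM records WHERE symptom = ? ORDER BY rowid",
            (symptom,),
        )
        return [DebugRecord(**json.loads(body)) for (body,) in rows]
```

Opening a second `ContextStore` on the same file path and calling `recall()` with a previously saved symptom returns the earlier records, which is the cross-session behavior this section describes.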

This feature enhances the supportability of OpenShift networking by enabling AI agents to:

Contextualize OVN-K-specific anomalies (e.g., dropped flows, NB/SB DB inconsistencies)

Learn from historical cluster issues

Persist recommended or executed steps with outcomes

Interact with users through guided root cause exploration

The goals are to:

Reduce time-to-resolution for common and complex OVN-Kubernetes networking problems

Provide guided diagnostics via AI agents informed by persistent and evolving model context

Enable scalable troubleshooting automation across large or multi-cluster OpenShift environments

Requirements (aka. Acceptance Criteria):

MCP stores and retrieves OVN-Kubernetes context data reliably across AI sessions
Agent can recall relevant historical troubleshooting steps and apply them to current incidents

At least 3 predefined troubleshooting scenarios are supported by the AI agent

Agent suggestions are validated to reduce false positives and irrelevant actions

Documentation is available for extending MCP schemas and training new workflows
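One possible reading of "agent suggestions are validated" is a conservative allow-list check applied before any suggested command is shown or executed. The sketch below assumes a hypothetical allow-list of read-only OVN/OVS subcommands; the real validation rules are not defined in this Feature.

```python
import shlex

# Hypothetical allow-list: binary -> read-only subcommands an agent may run.
READ_ONLY = {
    "ovn-nbctl": {"show", "list"},
    "ovn-sbctl": {"show", "list"},
    "ovs-ofctl": {"dump-flows"},
    "ovs-vsctl": {"show"},
}

# Characters that could chain or redirect commands in a shell.
SHELL_META = set(";|&$`<>\n")


def is_safe_suggestion(command: str) -> bool:
    """Return True only for allow-listed, read-only diagnostic commands."""
    if any(ch in SHELL_META for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    if len(argv) < 2:
        return False
    binary, subcommand = argv[0], argv[1]
    return subcommand in READ_ONLY.get(binary, set())
```

The check is deliberately strict: it expects the subcommand directly after the binary, so a command with leading options is rejected rather than guessed at. For example, `is_safe_suggestion("ovs-ofctl dump-flows br-int")` passes, while `is_safe_suggestion("ovn-nbctl lsp-del port1")` does not.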

Anyone reviewing this Feature needs to know which deployment configurations the Feature will apply to (or not) once it has been completed. Describe specific needs (or indicate N/A) for each of the following deployment scenarios. For specific configurations that are out of scope for a given release, ensure you also provide the OCPSTRAT for the configuration to be supported in the future.

Deployment considerations	List applicable specific needs (N/A = not applicable)
Self-managed, managed, or both
Classic (standalone cluster)
Hosted control planes
Multi node, Compact (three node), or Single node (SNO), or all
Connected / Restricted Network
Architectures, e.g. x86_64, ARM (aarch64), IBM Power (ppc64le), and IBM Z (s390x)
Operator compatibility
Backport needed (list applicable versions)
UI need (e.g. OpenShift Console, dynamic plugin, OCM)
Other (please specify)

Deliverables:

MCP schema definition tailored for OVN-Kubernetes context (flows, endpoints, OVS states, DB sync states, etc.)

Initial Agentic AI workflows for common OVN-K issues (e.g., pod connectivity failures, gateway misconfigurations)

Integration of MCP with troubleshooting logs and telemetry inputs (e.g., must-gather, OVN traceflows)

User-facing interface for interacting with the AI agent (CLI, web console plugin, or API)
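As a rough illustration of the first deliverable, the schema for OVN-Kubernetes context could be modeled as typed records that serialize to JSON. Everything below is an assumption made for the sake of the example: the class names, the field names, and the choice of JSON are hypothetical, and the sketch covers only flows, endpoints, and DB sync state.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Flow:
    bridge: str
    table: int
    match: str
    actions: str


@dataclass
class Endpoint:
    namespace: str
    pod: str
    ip: str
    node: str


@dataclass
class DBSyncState:
    database: str  # e.g. "nbdb" or "sbdb"
    in_sync: bool


@dataclass
class OVNKContext:
    """Snapshot of cluster networking context (illustrative fields only)."""
    cluster: str
    flows: list[Flow] = field(default_factory=list)
    endpoints: list[Endpoint] = field(default_factory=list)
    db_sync: list[DBSyncState] = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "OVNKContext":
        data = json.loads(raw)
        return cls(
            cluster=data["cluster"],
            flows=[Flow(**f) for f in data["flows"]],
            endpoints=[Endpoint(**e) for e in data["endpoints"]],
            db_sync=[DBSyncState(**d) for d in data["db_sync"]],
        )
```

A round trip through `to_json()` and `from_json()` yields an equal object, which gives schema extensions a simple property to test against.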

Use Cases (Optional):

Include use case diagrams, main success scenarios, alternative flow scenarios. Initial completion during Refinement status.

EgressIPs troubleshooting
UDNs troubleshooting
BGP troubleshooting

Questions to Answer (Optional):

Out of Scope

Real-time packet analysis or active traffic interception

Replacing manual network debugging for advanced edge cases outside current agent capabilities

Background

https://docs.google.com/presentation/d/1glNUCcA8zpNY-ckwLIRXyedyY1WPUW-tkI17Jtjo3sg/edit?slide=id.g36b12eb63d6_0_0#slide=id.g36b12eb63d6_0_0
The OCP Networking team already did some spike work during shift week. The results showed promise, which is why we are pursuing this Feature.

Documentation Considerations

Documentation must be available for extending MCP schemas and training new workflows

Interoperability Considerations

links to

https://github.com/ovn-kubernetes/ovn-kubernetes-mcp/pull/2

OKEP for ovn-kubernetes-mcp-server repo

Upstream PR

Assignee: Marc Curry

Reporter: Marc Curry

Need Info From: None

Contributors: Aniket Bhat, Ben Bennett

Architect: Surya Seetharaman

QA Contact: None

Doc Contact: Ashley Hardin

Product Operations Engineering Contact: Chris Fields

Votes: 1

Watchers: 6

Created: 2025/07/31 7:34 PM

Updated: 2025/11/19 10:16 PM
