must-gather / MG-108

Add eval set for plan_mustgather ServerPrompt


    • Type: Story
    • Resolution: Unresolved
    • Product / Portfolio Work
    • 5
    • OAPE Sprint 284, OAPE Sprint 285
    • 2

      We recently added the openshift toolset to the downstream openshift-mcp-server (https://github.com/openshift/openshift-mcp-server/pull/51). This toolset contains the plan_mustgather MCP ServerPrompt, which helps LLMs plan the Pod, RBAC, and related specs that a user needs to collect a must-gather on a cluster. It also makes the agent's behaviour deterministic: user queries are converted into arguments for the prompt, and the same arguments always produce the same spec(s).

      Before any new toolset ships to our customers, we require evaluation rules that run on various agents (Claude Code, Codex, etc.). These rules test that the prompt performs the correct task when a user tries to collect a must-gather, and they verify the results of the process.

      https://github.com/mcpchecker/mcpchecker is the framework used to perform such evals.

       

      Acceptance criteria: add new eval rules for user queries covering the various types of must-gather collection to https://github.com/openshift/openshift-mcp-server/tree/main/evals, and verify the results by running them on at least one agent, such as Claude Code.
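
As a rough illustration of what an eval rule needs to capture, the sketch below pairs a user query with the prompt arguments we would expect an agent to derive from it, then compares them against what was observed. This is not the mcpchecker rule format. The `EvalCase` type, the `check_case` helper, and the `since` argument name are all hypothetical and would need to be replaced with the real plan_mustgather arguments and the framework's own schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EvalCase:
    """Hypothetical eval rule: a user query plus the prompt arguments we expect."""
    query: str
    expected_args: dict = field(default_factory=dict)


def check_case(case, observed_args):
    """Return a list of mismatches between expected and observed prompt arguments."""
    problems = []
    for key, want in case.expected_args.items():
        got = observed_args.get(key)
        if got != want:
            problems.append(f"{key}: expected {want!r}, got {got!r}")
    return problems


# Illustrative cases only; "since" is an assumed argument name.
CASES = [
    EvalCase("Collect a must-gather from my cluster", {}),
    EvalCase("Collect a must-gather for the last 2 hours", {"since": "2h"}),
]
```

Because the prompt is meant to be deterministic, comparing arguments (rather than free-form agent output) keeps each rule small: if the arguments match, the generated spec(s) should match too.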

              swghosh@redhat.com Swarup Ghosh