  1. OpenShift Pipelines
  2. SRVKP-8865

UI - Pipeline failure analysis

    • Type: Story
    • Resolution: Won't Do
    • Priority: Undefined
    • UI
    • Ranked Issues

      Story (Required)

      As a pipeline user troubleshooting a failed TaskRun
      I want an “Explain Error” option on each failed TaskRun
      So that I can quickly understand the possible cause of failure and get suggested remedies without digging through raw logs

      Background (Required)

      Currently, when a TaskRun fails, the pipeline UI shows the user only raw log output. Interpreting these logs requires expertise and slows down issue resolution. An “Explain Error” action would let the system call a backend API with the failed TaskRun details, analyze the failure using an LLM, and return a structured explanation of the cause and a potential fix.

      Out of scope

      • Automatic remediation of failed TaskRuns (only explanations are provided).
      • Predictive analysis before failures occur.
      • Error explanations for successful TaskRuns (open question).
      • Changes to Tekton controller behavior; only UI + API integration is covered (open question).

      Approach (Required)

      • UI Changes
        • In the TaskRun list or details view, for every failed TaskRun, render an “Explain Error” button.
        • On click, trigger a request to a backend API endpoint (e.g., /api/explain-error).
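      • API Call: send the failed TaskRun's details to the backend, which analyzes the failure and returns a structured explanation.
      • Display the returned cause and possible remedies to the user.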
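
The click-to-request step above could be sketched roughly as follows. This is a minimal illustration and not the team's implementation: the path `/api/explain-error` is the example given in this story, while the POST method, the JSON field names, and the helpers `buildExplainErrorRequest` and `explainError` are assumptions made for the sketch.

```typescript
// Hypothetical request/response shapes. The story names only taskRunName,
// namespace, cause and remedy; the exact field names are assumptions.
interface ExplainErrorRequest {
  taskRunName: string;
  namespace: string;
}

interface ExplainErrorResponse {
  cause: string;
  remedy: string;
}

// Minimal subset of the Fetch API used here, so a stub can be injected in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Example path taken from the story; the real endpoint may differ.
const EXPLAIN_ERROR_PATH = "/api/explain-error";

// Builds the request body and rejects empty identifiers early.
function buildExplainErrorRequest(
  taskRunName: string,
  namespace: string,
): ExplainErrorRequest {
  if (!taskRunName || !namespace) {
    throw new Error("taskRunName and namespace are both required");
  }
  return { taskRunName, namespace };
}

// Sends the failed TaskRun's identifiers to the backend and returns the
// explanation. Using POST with a JSON body is an assumption of this sketch.
async function explainError(
  doFetch: FetchLike,
  taskRunName: string,
  namespace: string,
): Promise<ExplainErrorResponse> {
  const body = buildExplainErrorRequest(taskRunName, namespace);
  const res = await doFetch(EXPLAIN_ERROR_PATH, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`explain-error request failed with status ${res.status}`);
  }
  return (await res.json()) as ExplainErrorResponse;
}
```

In a browser, the button's click handler might call `explainError(fetch, name, namespace)` and pass the result to whatever component renders the cause and remedy; injecting `doFetch` keeps the function easy to unit test with a stub.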

      Dependencies

      • Backend API capable of analyzing a failed TaskRun and returning its cause and possible remedies
      • UX design, and the PatternFly components that follow from it

      Acceptance Criteria (Mandatory)

      • UI: Every failed TaskRun has an “Explain Error” button.
      • API Integration: On click, the frontend sends taskRunName + namespace to the backend.
      • Response Handling: Cause + Remedy are displayed to the user in the console.
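
Two of these criteria lend themselves to small, testable helpers: deciding which TaskRuns get the button, and checking that a response actually carries a cause and a remedy before it is displayed. The sketch below is only illustrative. It relies on Tekton's convention of reporting the outcome in the `Succeeded` condition (status `"False"` for a failure); the names `isFailedTaskRun`, `parseExplanation`, `TaskRunLike`, and the `cause`/`remedy` field names are hypothetical.

```typescript
// Minimal slice of a TaskRun; only the fields this check needs.
interface Condition {
  type: string;
  status: string; // "True" | "False" | "Unknown"
}

interface TaskRunLike {
  metadata: { name: string; namespace: string };
  status?: { conditions?: Condition[] };
}

// A TaskRun counts as failed when its "Succeeded" condition has status "False".
// Runs that are still in progress ("Unknown") or have no status are not failed.
function isFailedTaskRun(taskRun: TaskRunLike): boolean {
  const conditions = taskRun.status?.conditions ?? [];
  return conditions.some((c) => c.type === "Succeeded" && c.status === "False");
}

// Assumed response shape: the story only says "Cause + Remedy".
interface Explanation {
  cause: string;
  remedy: string;
}

// Returns a usable Explanation, or null when the payload is malformed or empty,
// so the UI can show a fallback message instead of a blank panel.
function parseExplanation(payload: unknown): Explanation | null {
  if (typeof payload !== "object" || payload === null) {
    return null;
  }
  const { cause, remedy } = payload as Record<string, unknown>;
  if (typeof cause !== "string" || typeof remedy !== "string") {
    return null;
  }
  if (cause.trim() === "" || remedy.trim() === "") {
    return null;
  }
  return { cause, remedy };
}
```

Keeping both checks as pure functions means the first and third criteria can be covered by unit tests without rendering any UI.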

      INVEST Checklist

      Dependencies identified

      Blockers noted and expected delivery timelines set

      Design is implementable

      Acceptance criteria agreed upon

      Story estimated

      Legend

      Unknown

      Verified

      Unsatisfied

      Done Checklist
      Code is completed, reviewed, documented and checked in
      Unit and integration test automation has been delivered and is running cleanly in the continuous integration/staging/canary environment
      Continuous Delivery pipeline(s) are able to proceed with new code included
      Customer-facing documentation, API docs, etc. are produced/updated, reviewed, and published
      Acceptance criteria are met

              rh-ee-apalit Anwesha Palit