  1. RH Developer Hub Planning
  2. RHDHPLAN-352

RHDH-specific must-gather container image for streamlined support


    • Type: Outcome
    • Resolution: Unresolved
    • Priority: Major
    • 50% To Do, 50% In Progress, 0% Done
    • XL
    • False

      Description:

      Provide a quick overview of the goal and key background or context.

      This feature introduces an RHDH-specific must-gather container image. The primary goal is to provide Support and Engineering teams with a consistent, reliable, and efficient tool for collecting comprehensive, targeted diagnostic data from RHDH installations.
      This tool builds upon the existing OpenShift must-gather framework but is scoped specifically to RHDH artifacts. It aims to reduce troubleshooting time and the back-and-forth with customers needed to collect the data required to resolve their issues.

      Benefits/Value:

      What are the benefits or value for our customers if we do this?

      • Faster Case Resolution: By providing Support and Engineering with all necessary, targeted data in one consistent package, we significantly reduce the time required to diagnose customer issues.
      • Reduced Customer Burden: Customers get a simple, single-command process for data collection, avoiding complex, manual steps or the need to download multiple tools from the Customer Portal.
      • Trust and Transparency: Customers maintain control over their data through clear communication about what is collected and the option to sanitize or exclude sensitive configuration details.
      • Support for Disconnected Environments: The tool is designed to work reliably even in air-gapped or disconnected installation setups.
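      The "sanitize or exclude sensitive configuration details" option mentioned above could work along the following lines. This is only a minimal sketch: the key pattern, the mask string, and the function name are assumptions made for illustration and do not describe what the RHDH must-gather tool actually does.

```python
import re
from typing import Any

# Assumption: keys whose names match this pattern are considered sensitive.
SENSITIVE_KEY = re.compile(r"(password|secret|token|api[-_]?key)", re.IGNORECASE)
MASK = "<redacted>"


def sanitize(value: Any, key: str = "") -> Any:
    """Return a copy of `value` with everything under a sensitive key masked."""
    if key and SENSITIVE_KEY.search(key):
        # Mask the whole subtree, whatever its type.
        return MASK
    if isinstance(value, dict):
        return {k: sanitize(v, str(k)) for k, v in value.items()}
    if isinstance(value, list):
        return [sanitize(item) for item in value]
    return value


if __name__ == "__main__":
    config = {"app": {"title": "RHDH"}, "auth": {"token": "abc123"}}
    print(sanitize(config))
```

      Masking whole subtrees under a matching key is the conservative choice here. A real implementation would also need to handle secrets embedded in free-form strings, which this sketch does not attempt.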

      Acceptance Criteria:

      What must be true for this to be considered complete?

      • Image Availability & Versioning: An officially supported RHDH-specific must-gather image that customers can use is released in registry.redhat.io, and its versioning is tied to the RHDH version it supports.
      • Comprehensive RHDH Data Collection: The tool successfully collects key RHDH-specific data (to be defined jointly by Support and Engineering and clearly documented).
      • Deployment Method Coverage: The tool can collect data from both Helm and Operator-based installations (including Operator metadata/logs).
      • Platform Compatibility: The tool functions correctly across all supported platforms (OpenShift, AKS, EKS, GKE).
      • Workflow Integration: The tool integrates seamlessly with existing Support analysis tools (like OMC against a Namespace's inspect bundle).
      • Performance: The data collection process runs in a timely manner (target of 5 minutes at most).
      • Scope Filtering: The tool supports specifying a namespace (or list of namespaces) to limit the scope of the analysis.
      • Security
      • Clear documentation about the data collected and why
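      To make the versioning and scope-filtering criteria more concrete, the sketch below derives an image reference from an RHDH version, validates a comma-separated namespace list, and assembles an `oc adm must-gather` invocation. The repository path `rhdh/rhdh-must-gather`, the "tag equals RHDH version" rule, and all function names are assumptions for illustration only; this issue does not define them. The `--image` and `--dest-dir` flags are existing `oc adm must-gather` options, and the command applies to OpenShift only; how the tool would be invoked on AKS, EKS, or GKE is not covered here.

```python
import re
from typing import List

REGISTRY = "registry.redhat.io"
# Hypothetical repository path; the real one is not stated in this issue.
REPOSITORY = "rhdh/rhdh-must-gather"

# Kubernetes namespace names must be DNS-1123 labels (at most 63 characters).
DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?")


def image_ref(rhdh_version: str) -> str:
    """Build an image reference whose tag is the RHDH version (assumed rule)."""
    if not re.fullmatch(r"\d+\.\d+(\.\d+)?", rhdh_version):
        raise ValueError(f"unexpected RHDH version: {rhdh_version!r}")
    return f"{REGISTRY}/{REPOSITORY}:{rhdh_version}"


def parse_namespaces(raw: str) -> List[str]:
    """Split a comma-separated list, validate each name, drop duplicates."""
    names = [part.strip() for part in raw.split(",") if part.strip()]
    for name in names:
        if not DNS_LABEL.fullmatch(name):
            raise ValueError(f"invalid namespace name: {name!r}")
    return list(dict.fromkeys(names))  # keeps first-seen order


def must_gather_argv(rhdh_version: str, dest_dir: str) -> List[str]:
    """Assemble the command line without executing it."""
    return [
        "oc", "adm", "must-gather",
        f"--image={image_ref(rhdh_version)}",
        f"--dest-dir={dest_dir}",
    ]


if __name__ == "__main__":
    print(" ".join(must_gather_argv("1.8", "./rhdh-must-gather-out")))
    print(parse_namespaces("rhdh, rhdh-operator"))
```

      How the validated namespace list is handed to the gather script is left open, since the issue does not specify that interface.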

      Out of Scope:

      If there are significant scope constraints, note them so goals and non-goals are clear.

      • Collecting broad, general cluster data beyond what is strictly necessary for RHDH troubleshooting.
      • Developing features to display the collected diagnostic information to RHDH end-users/admins via the RHDH UI.
      • Creating a user interface (UI) within RHDH or the web console to trigger the diagnostic data collection.
      • Integrating metrics or dashboards to track case resolution time based on data attachment (potential future enhancement).

      Metrics:

      What metrics and telemetry data influence this and either help inform this work, or could help us understand its impact?

      • Support Case Metrics: Reduction in the average number of back-and-forth interactions required with the customer to acquire sufficient diagnostic data.
      • Customer Effort Score (Inferred): Improvement in perceived ease of providing diagnostic data.
      • Tool Performance: Execution time of the must-gather tool (should be under 5 minutes).
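      The Tool Performance metric could be tracked with a small timing wrapper around the collection step. The sketch below is illustrative only: `timed_run` and the `collect` callable are hypothetical names, and the 5-minute budget is the target stated in this issue.

```python
import time
from typing import Callable, Tuple

BUDGET_SECONDS = 5 * 60  # target from this issue: 5 minutes at most


def timed_run(
    collect: Callable[[], None],
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[float, bool]:
    """Run `collect` and return (elapsed seconds, whether it met the budget)."""
    start = clock()
    collect()
    elapsed = clock() - start
    return elapsed, elapsed <= BUDGET_SECONDS


if __name__ == "__main__":
    # Stand-in for the real data collection step.
    elapsed, ok = timed_run(lambda: time.sleep(0.1))
    print(f"collection took {elapsed:.2f}s, within budget: {ok}")
```

      The clock is injectable so the budget check can be tested without waiting; `time.monotonic` is used by default because it is not affected by system clock adjustments.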

      Dependencies:

      Note any major team or technology dependencies outside of our direct control that we may need to plan around.

      • Support Team: Review and verification of the tool's output format and integration with their existing analysis tools (e.g., OMC).
      • Understanding the must-gather framework: Done - POC available at https://github.com/rm3l/rhdh-must-gather

              rh-ee-asoro Armel Soro