-
Outcome
-
Resolution: Unresolved
-
Major
-
50% To Do, 50% In Progress, 0% Done
-
XL
-
False
-
Description:
Provide a quick overview of the goal and key background or context.
This feature introduces an RHDH-specific must-gather container image. The primary goal is to provide Support and Engineering teams with a consistent, reliable, and efficient tool for collecting comprehensive, targeted diagnostic data from RHDH installations.
This tool builds upon the existing OpenShift must-gather framework but is scoped specifically to RHDH artifacts. It aims to reduce troubleshooting time and the back-and-forth with customers needed to collect the data required to diagnose and resolve their issues.
Benefits/Value:
What are the benefits or value for our customers if we do this?
- Faster Case Resolution: By providing Support and Engineering with all necessary, targeted data in one consistent package, we significantly reduce the time required to diagnose customer issues.
- Reduced Customer Burden: Customers get a simple, single-command process for data collection, avoiding complex, manual steps or the need to download multiple tools from the Customer Portal.
- Trust and Transparency: Customers maintain control over their data through clear communication about what is collected and the option to sanitize or exclude sensitive configuration details.
- Support for Disconnected Environments: The tool is designed to work reliably even in air-gapped or disconnected installation setups.
Acceptance Criteria:
What must be true for this to be considered complete?
- Image Availability & Versioning: An official, supported RHDH-specific must-gather image is released in registry.redhat.io for customers to use, and its versioning is tied to the RHDH version it supports.
- Comprehensive RHDH Data Collection: The tool successfully collects key RHDH-specific data (to be defined jointly by Support and Engineering and documented clearly).
- Deployment Method Coverage: The tool can collect data from both Helm and Operator-based installations (including Operator metadata/logs).
- Platform Compatibility: The tool functions correctly across all supported platforms (OpenShift, AKS, EKS, GKE).
- Workflow Integration: The tool integrates seamlessly with existing Support analysis tools (like OMC against a Namespace's inspect bundle).
- Performance: The data collection process runs in a timely manner (target of 5 minutes at most).
- Scope Filtering: The tool supports specifying a namespace (or list of namespaces) to limit the scope of the analysis.
- Security
- Clear documentation about the data collected and why
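As an illustration of the single-command, namespace-scoped collection described in the criteria above, here is a minimal sketch that only assembles the command line and does not run it. It covers the OpenShift path only, using the existing `oc adm must-gather` command with its `--image` and `--dest-dir` flags. The image path, the `--namespaces` argument passed to the gather script, and the helper names are all hypothetical: this ticket does not define them, and the other platforms (AKS, EKS, GKE) would need a different invocation.

```python
import shlex

# Hypothetical image reference: the ticket only states that the image lives in
# registry.redhat.io and is versioned with RHDH; the repository path is assumed.
HYPOTHETICAL_IMAGE = "registry.redhat.io/rhdh/rhdh-must-gather:<rhdh-version>"

# Performance target taken from the acceptance criteria (5 minutes at most).
TIME_BUDGET_SECONDS = 5 * 60


def build_must_gather_command(image, namespaces=(), dest_dir="./must-gather"):
    """Return the argv for an RHDH-scoped `oc adm must-gather` run."""
    argv = [
        "oc", "adm", "must-gather",
        f"--image={image}",
        f"--dest-dir={dest_dir}",
    ]
    if namespaces:
        # Everything after `--` is the command run inside the image.
        # `--namespaces` is a hypothetical option of the gather script.
        argv += ["--", "/usr/bin/gather", "--namespaces", ",".join(namespaces)]
    return argv


def within_time_budget(elapsed_seconds):
    """Check a measured collection time against the 5-minute target."""
    return elapsed_seconds <= TIME_BUDGET_SECONDS


if __name__ == "__main__":
    cmd = build_must_gather_command(HYPOTHETICAL_IMAGE, ["rhdh", "rhdh-operator"])
    print(shlex.join(cmd))
```

Keeping the command construction separate from execution makes it easy to review exactly what would be run against a customer cluster before anything is collected.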
Out of Scope:
If there are significant scope constraints, note them so goals and non-goals are clear.
- Collecting broad, general cluster data beyond what is strictly necessary for RHDH troubleshooting.
- Developing features to display the collected diagnostic information to RHDH end-users/admins via the RHDH UI.
- Creating a user interface (UI) within RHDH or the web console to trigger the diagnostic data collection.
- Integrating metrics or dashboards to track case resolution time based on data attachment (potential future enhancement).
Metrics:
What metrics and telemetry data influence this and either help inform this work, or could help us understand its impact?
- Support Case Metrics: Reduction in the average number of back-and-forth interactions required with the customer to acquire sufficient diagnostic data.
- Customer Effort Score (Inferred): Improvement in perceived ease of providing diagnostic data.
- Tool Performance: Execution time of the must-gather tool (should be under 5 minutes).
Dependencies:
Note any major team or technology dependencies outside of our direct control that we may need to plan around.
- Support Team: Review and verification of the tool's output format and integration with their existing analysis tools (e.g., OMC).
- Understanding the must-gather framework: Done. A POC is available at https://github.com/rm3l/rhdh-must-gather