Next-Gen OSSM Release Process
Type: Epic
Status: To Do
Resolution: Unresolved
Progress: 100% To Do, 0% In Progress, 0% Done
1. Goal
This epic aims to evolve our release process by leveraging Artificial Intelligence (AI) to create a data-driven "release confidence score." By building on the foundational work of faster CI pipelines and centralized test reporting, we will automate release decisions, reduce manual intervention, and significantly accelerate our time-to-release while simultaneously enhancing product quality.
2. Dependencies & Context
This initiative is the next logical step following the successful completion of two key epics. It directly consumes their outputs:
- Speed up CI - prow jobs (https://issues.redhat.com/browse/OSSM-5980): The reduced job execution time from this epic provides the rapid feedback loop necessary for an agile, data-driven process.
- OSSM Report Portal (https://issues.redhat.com/browse/OSSM-4620): The centralized and structured test data aggregated in Report Portal will serve as the primary data source for our AI model. This provides a single source of truth for all test results across upstream, midstream, and downstream (see the data-access sketch at the end of this section).
Note: We need to evaluate whether Polarion data should also be included in the project, in case Polarion provides information that is not available in Report Portal.
More information about this internal project can be found in this document: https://docs.google.com/document/d/1swcaNb9ZRaXWz-_mby8Zi_6PrqWOK6KM0Qtn4H5N6_I/edit?tab=t.0#heading=h.ajmyso7fkcty
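To illustrate how this data source might be consumed, the sketch below pulls recent launch statistics from a Report Portal instance over its REST API. The host, project name, and token are placeholders, and the endpoint and response shape should be verified against our actual Report Portal deployment; this is a sketch under those assumptions, not a final integration.

```python
# Sketch: fetch recent launch statistics from Report Portal's REST API.
# RP_HOST, RP_PROJECT, and RP_TOKEN are placeholders; verify the endpoint
# and response shape against our actual Report Portal deployment.
import requests

RP_HOST = "https://reportportal.example.com"  # placeholder instance URL
RP_PROJECT = "ossm"                           # placeholder project name
RP_TOKEN = "changeme"                         # API token; keep in a secret store

def fetch_recent_launches(page_size: int = 20) -> list[dict]:
    """Return the most recent test launches with their execution statistics."""
    resp = requests.get(
        f"{RP_HOST}/api/v1/{RP_PROJECT}/launch",
        headers={"Authorization": f"Bearer {RP_TOKEN}"},
        params={"page.size": page_size, "page.sort": "startTime,desc"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("content", [])

for launch in fetch_recent_launches():
    stats = launch.get("statistics", {}).get("executions", {})
    print(launch.get("name"), stats.get("passed", 0), "passed of",
          stats.get("total", 0))
```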
3. Problem Statement
While we have improved CI speed and data collection, our release process still contains inefficiencies:
- Time-Consuming Steps: Key decisions and triggers in the release process (e.g., triggering downstream testing, release approval) require manual intervention, consuming valuable engineering time.
- High Testing Cycle Time: The average feature testing time is 3-4 days. We lack a data-driven way to determine whether this constitutes "over-testing" or whether testing could be more targeted based on code changes and risk profiles.
- Uncertain Release Confidence: Release approval is based on the latest test runs, but confidence is subjective. We need an objective, quantifiable score that summarizes the quality and risk of a build based on historical and current data.
4. Proposed Solution
We will design and implement a system that analyzes the aggregated test data from Report Portal to generate a "release confidence score" for every build. This score will be integrated into our release pipeline to power automated quality gates, enabling a shift from manual approval to a data-driven, automated workflow.
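As a minimal sketch of what such a score could look like, the snippet below combines a few normalized quality signals into a weighted 0-100 score. The signal set and weights are illustrative assumptions; defining the real ones is part of this epic.

```python
# Sketch: a weighted "release confidence score" in [0, 100].
# The signals and weights are illustrative, not a final design.
from dataclasses import dataclass

@dataclass
class BuildSignals:
    pass_rate: float        # fraction of tests passing in the latest run (0..1)
    flaky_rate: float       # fraction of failures from known-flaky tests (0..1)
    coverage_delta: float   # code-coverage change vs. previous release (-1..1)
    perf_regression: float  # normalized performance-regression severity (0..1)

WEIGHTS = {"pass_rate": 0.5, "flaky_rate": 0.2,
           "coverage_delta": 0.1, "perf_regression": 0.2}

def confidence_score(s: BuildSignals) -> float:
    """Combine the normalized signals into a single 0-100 confidence score."""
    score = (
        WEIGHTS["pass_rate"] * s.pass_rate
        + WEIGHTS["flaky_rate"] * (1.0 - s.flaky_rate)              # fewer flakes, higher score
        + WEIGHTS["coverage_delta"] * (0.5 + s.coverage_delta / 2)  # map -1..1 to 0..1
        + WEIGHTS["perf_regression"] * (1.0 - s.perf_regression)
    )
    return round(100 * score, 1)

print(confidence_score(BuildSignals(0.98, 0.05, 0.01, 0.0)))  # ~93
```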
Key Objectives:
- Establish Data-Driven Quality Gates: Define and automate objective criteria for what constitutes a "releasable build" using the AI-generated confidence score.
- Accelerate Time-to-Release: Drastically reduce the lead time from code freeze to release by minimizing manual testing cycles and decision-making delays.
- Maximize Automation Efficiency: Use AI to enable smarter, more targeted testing, reducing our reliance on full regression runs for every change (a change-based selection sketch follows this list).
- Reduce Manual Intervention: Systematically identify and automate manual steps in the release pipeline, from triggering tests to deployment approvals.
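As a hypothetical illustration of the targeted-testing objective, a change-based selector could map modified paths to the test suites they affect, so small changes need not trigger a full regression run. The path-to-suite mapping below is invented for illustration only.

```python
# Sketch: naive change-based test selection. The mapping from source
# paths to test suites is illustrative, not an agreed design.
CHANGE_MAP = {
    "pkg/gateway/": ["gateway-e2e", "smoke"],
    "pkg/security/": ["mtls-e2e", "cert-rotation", "smoke"],
    "docs/": [],  # documentation-only changes need no regression suites
}

def select_suites(changed_files: list[str]) -> set[str]:
    """Pick the test suites whose mapped paths overlap the change set."""
    suites: set[str] = set()
    for path in changed_files:
        matched = [m for prefix, m in CHANGE_MAP.items() if path.startswith(prefix)]
        if not matched:
            return {"full-regression"}  # unknown path: fall back to everything
        for mapped in matched:
            suites.update(mapped)
    return suites

print(select_suites(["pkg/gateway/router.go"]))  # {'gateway-e2e', 'smoke'}
```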
5. Scope
In Scope:
- Automating the generation of a "release confidence score" by analyzing data from Report Portal.
- Researching and selecting a suitable AI/ML framework (evaluating existing Red Hat projects and open-source options).
- Researching and selecting other tools that can help us gather additional information for building the confidence score.
- Defining the metrics and criteria that contribute to the confidence score (e.g., test pass rates, flaky test trends, performance metrics, code coverage changes); see the flakiness sketch after this section.
- Integrating the confidence score into our CI/CD pipeline to create automated quality gates.
Out of Scope:
- Replacing our current CI/CD platforms (Prow, GitHub Actions, etc.). This project will integrate with them, not replace them.
- Building a custom AI/ML platform from scratch if a suitable existing tool can be leveraged.
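As a hypothetical example of one in-scope metric, flaky-test trends could be estimated from historical outcomes: a test that both passed and failed on the same revision is counted as flaky. The history format below is an assumption; real data would come from Report Portal's launch history.

```python
# Sketch: estimate per-test flakiness from historical outcomes. The
# (test, revision, outcome) format is an assumption for illustration.
from collections import defaultdict

history = [
    ("test_mtls_enabled", "abc123", "passed"),
    ("test_mtls_enabled", "abc123", "failed"),   # same revision, both outcomes: flaky
    ("test_gateway_routes", "abc123", "passed"),
    ("test_gateway_routes", "def456", "passed"),
]

outcomes = defaultdict(set)
for test, revision, result in history:
    outcomes[(test, revision)].add(result)

flaky = {test for (test, _), results in outcomes.items() if len(results) > 1}
flaky_rate = len(flaky) / len({test for test, _, _ in history})
print(f"flaky tests: {sorted(flaky)}, flaky rate: {flaky_rate:.0%}")
```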
6. Acceptance Criteria
- A "release confidence score" is automatically generated and visible for each significant build.
- Manual steps in the release process (e.g., triggering downstream tests for release candidates) are reduced by at least 75%. This requires confirming that processes that are currently manual in the Konflux release flow can actually be automated.
- The time from "code freeze" to "release candidate ready" is measurably reduced.
- Clear documentation exists defining the metrics that constitute the confidence score and how it is calculated.
- The confidence score is integrated into the release pipeline as an automated quality gate (a minimal gate sketch follows this list).
- Confidence-score data is published in a centralized location visible to the whole team, so that no release-quality information is hidden from anyone.
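To make the quality-gate criterion concrete, here is a minimal sketch of a gate step that the release pipeline could run after the score is computed. The threshold and the way the score is obtained are placeholders to be defined during this epic.

```python
# Sketch: a pipeline quality-gate step. Exits non-zero when the build's
# confidence score is below the threshold, failing that pipeline stage.
# The threshold and the score source are placeholders.
import sys

CONFIDENCE_THRESHOLD = 85.0  # placeholder; to be tuned against historical data

def gate(score: float, threshold: float = CONFIDENCE_THRESHOLD) -> int:
    if score >= threshold:
        print(f"PASS: confidence {score:.1f} >= {threshold:.1f}; build is releasable")
        return 0
    print(f"FAIL: confidence {score:.1f} < {threshold:.1f}; manual review required")
    return 1

if __name__ == "__main__":
    # In a real pipeline the score would come from the scoring service;
    # here it is passed as a CLI argument for illustration.
    sys.exit(gate(float(sys.argv[1])))
```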