Type: Story
Resolution: Done
Priority: Major
Fix Version/s: None
Affects Version/s: None
Labels:
None

Activity Type:
Product / Portfolio Work
Blocked:
False
Blocked Reason:
None
Ready:
False
Epic Link:
None
Story Points:
None

Target Version:
None
Release Blocker:
None
Sprint:
None

I've been thinking about how we could enable intra-job test retries without tooling changes, i.e. how we could "rescue" from flakes on presubmits now and avoid the retest struggle, without confusing our tools. I came up with https://github.com/openshift/origin/pull/30223

Adds retry strategies to origin:

* none (no retries)
* once (current behavior)
* aggressive

In aggressive:

* If the test passes, we are done.
* If it fails, and the first run was short enough (currently 2 min), we retry up to 10 times:
- If there are 4 or more failures, we produce a single failure artifact with all the outputs, and it's considered a true failure.
- If there are fewer than 4, we leave the 10 results and it gets considered a flake by spyglass, sippy, etc.
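
The decision logic above can be sketched roughly as follows. This is an illustration only, not the code from the PR (origin itself is written in Go). The names `Attempt`, `Outcome`, and `run_aggressive` are hypothetical. The sketch also assumes a few details the description leaves open: that the first failure counts toward the threshold of 4, that retries stop early once the threshold is reached, and that a failing first run longer than 2 minutes is simply reported as a failure without retries.

```python
from dataclasses import dataclass
from typing import Callable, List

SHORT_RUN_SECONDS = 120  # "currently 2 min" per the description
MAX_RETRIES = 10         # "retry up to 10 times"
FAILURE_THRESHOLD = 4    # "4 or more failures" is a true failure


@dataclass
class Attempt:
    """One execution of a test (hypothetical type for this sketch)."""
    passed: bool
    duration_seconds: float
    output: str = ""


@dataclass
class Outcome:
    """Final classification: "pass", "flake", or "fail"."""
    verdict: str
    attempts: List[Attempt]


def run_aggressive(run_test: Callable[[], Attempt]) -> Outcome:
    first = run_test()
    attempts = [first]
    if first.passed:
        return Outcome("pass", attempts)
    if first.duration_seconds > SHORT_RUN_SECONDS:
        # Assumption: a long-running failure is not retried at all.
        return Outcome("fail", attempts)

    failures = 1  # Assumption: the first failure counts toward the threshold.
    for _ in range(MAX_RETRIES):
        if failures >= FAILURE_THRESHOLD:
            break  # Assumption: stop early once it is a true failure.
        attempt = run_test()
        attempts.append(attempt)
        if not attempt.passed:
            failures += 1

    if failures >= FAILURE_THRESHOLD:
        # The real change merges all outputs into a single failure artifact.
        return Outcome("fail", attempts)
    # The individual results are kept, so downstream tools see a flake.
    return Outcome("flake", attempts)
```

With this shape, a test that fails once and then passes on every retry ends up as a flake, while one that keeps failing is reported as a single true failure.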

30223 is running through jobs on the latest version, but earlier versions were very successful. They rescued many presubmit jobs from failure, like this one

For now, we could leave periodics as "once" so as not to change the kind of data we're giving to tools like CR. Eventually we could make CR, sippy, etc. aware of this and analyze tests as discrete results, but as Justin's doc notes, that is a lot of work.

However, we could enable this on presubmits pretty safely. The potential for reducing retests is pretty high here, with only a marginal chance that we introduce a regression.

is incorporated by

SHIPSTRAT-7 CI Flake Reduction

links to

openshift/origin#30223: TRT-2288: Add configurable retry strategy for test failures

openshift/release#69061: TRT-2288: Enable aggressive retry strategy for origin

openshift/release#69924: TRT-2288: Enable aggressive intra-run retries on all presubmit jobs

Assignee: Stephen Benjamin

Reporter: Stephen Benjamin

Need Info From: None

Contributors: None

QA Contact: None

Doc Contact: None

Votes: 0

Watchers: 1

Created: 2025/09/10 2:12 AM

Updated: 2025/11/24 6:45 PM

Resolved: 2025/10/08 1:08 PM
