Epic Goal
- Deflake existing upstream jobs to improve their stability
- Fix failing upstream jobs
Why is this important?
- Upstream CRI-O test coverage is especially important for increasing the runtime's footprint within the community.
- Promoting jobs to release-blocking requires a decent amount of stability.
- Understanding the CI signals and actively working with them helps the team find runtime issues that surface in conjunction with upstream Kubernetes earlier.
Acceptance Criteria
- All existing jobs should be green on the testgrid dashboard: https://testgrid.k8s.io/sig-node-cri-o
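The acceptance criterion above could be checked from a script rather than by eye. The following is a minimal sketch, assuming that TestGrid serves a JSON summary for the dashboard at the `/summary` path and that each entry carries an `overall_status` field with `PASSING` meaning green. The function names `non_green_jobs` and `fetch_summary` are hypothetical helpers, not part of any existing tooling.

```python
import json
import urllib.request

# Assumption: TestGrid exposes a per-dashboard JSON summary at this path.
SUMMARY_URL = "https://testgrid.k8s.io/sig-node-cri-o/summary"


def non_green_jobs(summary, prefix=""):
    """Return {job name: status} for jobs that are not PASSING.

    `summary` maps job (tab) names to dicts that are assumed to hold an
    `overall_status` key. Jobs without that key are reported as UNKNOWN.
    `prefix` optionally limits the check to matching job names.
    """
    return {
        job: info.get("overall_status", "UNKNOWN")
        for job, info in sorted(summary.items())
        if job.startswith(prefix) and info.get("overall_status") != "PASSING"
    }


def fetch_summary(url=SUMMARY_URL):
    """Download and decode the dashboard summary (requires network access)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)
```

Calling `non_green_jobs(fetch_summary(), prefix="ci-crio-")` would list the periodic jobs that still need attention. An empty result would mean the criterion is met, provided the assumptions about the summary format hold.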
Current state
For now, the focus is on the ci-crio-* suites on the testgrid dashboard for CRI-O jobs, which run periodically rather than per PR. The aggregated test results are also a good indicator of how often flakes happen, but the "exit status 255" error message makes some test results opaque: https://storage.googleapis.com/k8s-triage/index.html?job=ci-crio-
The most important flakes and failures as of the time of writing: https://docs.google.com/spreadsheets/d/1Hj7Z6nejmOEewm5HkhcQKgpAnynLV9UYICKiSWY9sFY/edit
- is caused by: OCPNODE-3600 Support outlining a plan for upstream test deflaking (Closed)