  1. OpenShift Virtualization
  2. CNV-54680

Platform failures counted and <2%

    • Type: Epic
    • Resolution: Unresolved
    • Priority: Critical
    • None
    • None
    • Upstream CI Platform
    • us-ci-platform-minimal-failures
    • 0.42
    • To Do

      Goal

      CI jobs sometimes fail due to external circumstances, e.g. Quay is down or GitHub has issues. We call these "platform failures". They confuse PR authors because the external cause is usually not obvious. We therefore want to count platform failures so that we can measure, and then improve, the resilience of our CI environment.

      User Stories

      • As a CI maintainer, I want to count all failures caused by external sources so that I can relate this count to the total number of lane failures
      • As a CI maintainer, I want an upper target boundary of 2% for platform failures so that the CI environment is resilient enough for users to trust it
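
      The two stories above amount to a ratio and a threshold check. A minimal sketch, assuming the denominator is the number of failed lane runs (if the 2% is meant relative to all lane runs, swap the denominator). The LaneResult type and its platform_failure flag are hypothetical; nothing in this epic defines how a failure gets classified.

```python
from dataclasses import dataclass

# The 2% upper target boundary from the user story.
PLATFORM_FAILURE_TARGET = 0.02


@dataclass
class LaneResult:
    """One CI lane run (hypothetical record, not an existing API)."""
    lane: str
    failed: bool
    platform_failure: bool = False  # assumed to be set by some classifier


def platform_failure_ratio(results):
    """Share of failed lane runs that were platform failures."""
    failures = [r for r in results if r.failed]
    if not failures:
        return 0.0  # no failures at all, so nothing to attribute
    platform = sum(1 for r in failures if r.platform_failure)
    return platform / len(failures)


def within_target(results, target=PLATFORM_FAILURE_TARGET):
    """True if the platform failure share is at or below the target."""
    return platform_failure_ratio(results) <= target
```

      With 100 failed runs, two platform failures sit exactly on the boundary and a third one exceeds it.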

      Notes

      • There are some opportunities here:
        • PR authors might get feedback on the PR if a platform failure is detected
        • PRs suffering from platform failures that are recoverable could be either stopped or re-run automatically (depending on the case)
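
      The opportunities listed above could be combined into a single decision step. This is only a sketch under stated assumptions: the epic does not say which cases lead to a stop and which to a re-run, so the outage_ongoing input and the rule built on it are hypothetical placeholders for whatever "depending on the case" turns out to mean.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"                            # not a platform failure
    COMMENT = "comment"                      # tell the PR author only
    COMMENT_AND_RERUN = "comment_and_rerun"  # tell the author, re-run the job
    COMMENT_AND_STOP = "comment_and_stop"    # tell the author, stop the job


def choose_action(platform_failure, recoverable, outage_ongoing):
    """Pick a follow-up for a failed job.

    Assumption (not from the epic): a recoverable failure is re-run once
    the external service is back, and stopped while the outage lasts.
    """
    if not platform_failure:
        return Action.NONE
    if not recoverable:
        return Action.COMMENT
    if outage_ongoing:
        return Action.COMMENT_AND_STOP
    return Action.COMMENT_AND_RERUN
```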

              Assignee: Unassigned
              Reporter: Daniel Hiller (dhiller72)