OpenShift Bugs / OCPBUGS-43928

CI shows ~10% installation failures post-bootstrap across SNO jobs



      Description of problem:

          This ticket catalogs a series of installation failures that have produced a consistent pattern of 2-day installation alerts in the #single-node-ci channel.

      Across our jobs, we've seen a consistent failure rate of 10 to 15% on certain releases. Specifically, we see 4.13, 4.14, 4.15, and 4.18 warnings in the #single-node-ci channel.

      Here's a breakdown with some recent examples:

      • 4.18 examples
        • [techpreview] 1, 2; [upgrade] 1, 2; [upgrade-ovn-single-node] 1
      • 4.15 examples
        • [agent-single-node-ipv6] 1, 2, 3; [arm64-single-node] 1; [single-node-serial] 1
      • 4.14 examples
        • [ovn-single-node] 1, 2, 3; [serial] 1, 2
        • Agent IPv6 appears to be a culprit for the 4.14 jobs (see Sippy)
      • You can safely ignore the 4.13 warnings in the #single-node-ci channel.
        • 4.13 goes EOL on November 17th, 2024, and the 4.13 failures are likely tied to a race condition that was fixed in 4.16

      These likely need to be reproduced with brute-force installations to see whether we can trigger an installation failure that ends up “stuck” after bootstrap. For now, I filed this as a generic SNO bug until we can gather more information about these failures and determine whether they're related.
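
As a rough guide for sizing a brute-force reproduction, the sketch below estimates how many installations would be needed to observe at least one failure with a given confidence. It assumes each installation fails independently at a constant rate, which may not hold if the failures are tied to specific infrastructure or timing. The function name `installs_needed` and the 95% confidence default are illustrative choices, not part of any CI tooling.

```python
import math


def installs_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Smallest n such that 1 - (1 - failure_rate)**n >= confidence.

    Assumes independent runs with a constant per-install failure rate
    (a simplifying assumption, not something the ticket establishes).
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(no failure in n runs) = (1 - failure_rate)**n; solve for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # The ticket reports a failure rate of roughly 10 to 15%.
    for rate in (0.10, 0.15):
        print(f"failure rate {rate:.0%}: {installs_needed(rate)} installs")
```

Under these assumptions, a few dozen installations per job variant should be enough to hit at least one failure; a run with no failures at that scale would suggest the failure rate in the reproduction environment is lower than in CI.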

              sakbas@redhat.com Suleyman Akbas
              jpoulin Jeremy Poulin
              Neil Hamza Neil Hamza