Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-30208

[release-4.15] excessive Back-off restarting failed container console

XMLWordPrintable

    • No
    • False
    • Hide

      None

      Show
      None
    • * Previously, the {product-title} web console terminated unexpectedly if authentication discovery failed on the first attempt. With this update, authentication initialization was updated to retry up to 5 minutes before failing.
    • Bug Fix
    • In Progress

      This is a clone of issue OCPBUGS-29479. The following is the description of the original issue:

      Component Readiness has found a potential regression in [sig-cluster-lifecycle] pathological event should not see excessive Back-off restarting failed containers.

      Probability of significant regression: 100.00%

      Sample (being evaluated) Release: 4.15
      Start Time: 2024-02-08T00:00:00Z
      End Time: 2024-02-14T23:59:59Z
      Success Rate: 91.30%
      Successes: 63
      Failures: 6
      Flakes: 0

      Base (historical) Release: 4.14
      Start Time: 2023-10-04T00:00:00Z
      End Time: 2023-10-31T23:59:59Z
      Success Rate: 100.00%
      Successes: 735
      Failures: 0
      Flakes: 0

      View the test details report at https://sippy.dptools.openshift.org/sippy-ng/component_readiness/test_details?arch=amd64&arch=amd64&baseEndTime=2023-10-31%2023%3A59%3A59&baseRelease=4.14&baseStartTime=2023-10-04%2000%3A00%3A00&capability=Other&component=Unknown&confidence=95&environment=ovn%20upgrade-micro%20amd64%20azure%20standard&excludeArches=arm64%2Cheterogeneous%2Cppc64le%2Cs390x&excludeClouds=openstack%2Cibmcloud%2Clibvirt%2Covirt%2Cunknown&excludeVariants=hypershift%2Cosd%2Cmicroshift%2Ctechpreview%2Csingle-node%2Cassisted%2Ccompact&groupBy=cloud%2Carch%2Cnetwork&ignoreDisruption=true&ignoreMissing=false&minFail=3&network=ovn&network=ovn&pity=5&platform=azure&platform=azure&sampleEndTime=2024-02-14%2023%3A59%3A59&sampleRelease=4.15&sampleStartTime=2024-02-08%2000%3A00%3A00&testId=openshift-tests-upgrade%3A37f1600d4f8d75c47fc5f575025068d2&testName=%5Bsig-cluster-lifecycle%5D%20pathological%20event%20should%20not%20see%20excessive%20Back-off%20restarting%20failed%20containers&upgrade=upgrade-micro&upgrade=upgrade-micro&variant=standard&variant=standard

      Note: When you look at the link above you will notice some of the failures mention the bare metal operator.  That's being investigated as part of https://issues.redhat.com/browse/OCPBUGS-27760.  There have been 3 cases in the last week where the console was in a fail loop.  Here's an example:

      https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.15-e2e-azure-ovn-upgrade/1757637415561859072

       

      We need help understanding why this is happening and what needs to be done to avoid it.

              rh-ee-jonjacks Jon Jackson
              openshift-crt-jira-prow OpenShift Prow Bot
              YaDan Pei YaDan Pei
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

                Created:
                Updated:
                Resolved: