MicroShift / USHIFT-2254

Greenboot restarts fail in Network smoke tests

    • Quality / Stability / Reliability
    • 3
    • uShift Sprint 246, uShift Sprint 247, uShift Sprint 248, uShift Sprint 249

      Description of problem:

      Greenboot restarts fail in standard test suites
      Some tests restart MicroShift frequently, which takes down the apiserver (among other components) but not the pods. The topolvm-controller pod uses leader election with hardcoded parameters. These are too short to withstand a MicroShift restart (15s), so one container in the pod goes down 15s after the apiserver goes offline.
      When the controller container restarts, it tries to reach the apiserver. If that takes too long, the container enters a crash loop. The restart backoff then kicks in, doubling on every restart and capping at 5min.
      Greenboot waits only 5min. Sometimes the backoff is badly offset relative to greenboot's wait window, and greenboot signals the system as unhealthy.
      The controller would eventually recover by itself, once everything is stable and the container is restarted after the backoff.
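
The timing interaction described above can be illustrated with a rough sketch. It models only the doubling backoff capped at 5min and the 5min greenboot wait from the description. The 10s initial delay, the assumption that both clocks start at the same moment, and the function names are all hypothetical and chosen for illustration; the real offset between the crash loop and greenboot varies from run to run.

```python
GREENBOOT_WAIT = 300  # seconds; the 5min greenboot wait from the description


def backoff_delays(initial=10, cap=300, restarts=8):
    """Delay (seconds) before each restart: doubling, capped at `cap`.

    `initial=10` is an assumed starting delay, not taken from the report.
    """
    delays = []
    delay = initial
    for _ in range(restarts):
        delays.append(min(delay, cap))
        delay *= 2
    return delays


def restart_times(crash_duration=0, **kwargs):
    """Offsets (seconds) at which each restart attempt begins.

    `crash_duration` is a hypothetical time a failed attempt runs before exiting.
    """
    times = []
    now = 0
    for delay in backoff_delays(**kwargs):
        now += delay
        times.append(now)
        now += crash_duration
    return times


def first_restart_after(ready_at, **kwargs):
    """First restart attempt at or after the apiserver becomes reachable."""
    for t in restart_times(**kwargs):
        if t >= ready_at:
            return t
    return None


def greenboot_healthy(ready_at, **kwargs):
    """True if a restart lands inside the greenboot wait window."""
    t = first_restart_after(ready_at, **kwargs)
    return t is not None and t <= GREENBOOT_WAIT


if __name__ == "__main__":
    print(restart_times()[:5])     # [10, 30, 70, 150, 310]
    print(greenboot_healthy(140))  # True: next restart at 150s
    print(greenboot_healthy(160))  # False: next restart at 310s
```

Under these assumptions, an apiserver that becomes reachable 160s in is only retried at 310s, just past the greenboot window, even though the controller would start cleanly on that attempt. This is the "bad offset" case: the system is healthy shortly after, but greenboot has already given up.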

      Version-Release number of selected component (if applicable):

      main

      How reproducible:

      Occasionally in CI

      Steps to Reproduce:

      1. CI
      

      Additional info:

      https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-microshift-main-ocp-metal-nightly-arm/1737457876722520064/artifacts/ocp-metal-nightly-arm/openshift-microshift-e2e-metal-tests/artifacts/scenario-info/el93-src@standard-suite/log.html 

              pacevedo@redhat.com Pablo Acevedo Montserrat
              ggiguash@redhat.com Gregory Giguashvili