OCPBUGS-7766 (OpenShift Bugs)

admin ack test sometimes fails because upgradable=false shows up too slowly after upgrade

    • Bug
    • Resolution: Unresolved
    • Normal
    • None
    • 4.13.0
    • None
    • Moderate
    • No
    • False

      Description of problem:

      Even after the fixes for OCPBUGS-5505 and OCPBUGS-6503, the admin ack test, which waits 4 minutes for Upgradeable=False to appear after a cluster is upgraded, still occasionally fails.

      Example interleaved CVO and E2E logs:

      I0218 01:33:14.667456       1 upgradeable.go:122] Cluster current version=4.10.52
      I0218 01:33:14.678918       1 upgradeable.go:42] Upgradeable conditions were recently checked, will try later.
      I0218 01:33:29.668132       1 upgradeable.go:42] Upgradeable conditions were recently checked, will try later.
      Feb 18 01:33:36.474: INFO: Completed upgrade to registry.build05.ci.openshift.org/ci-op-inzgh27t/release@sha256:b1a4d94c1c7e2ce227135b6e3abc532cd017f0aec63dbf23e4371897cf33a1a5
      Feb 18 01:33:37.134: INFO: Waiting for Upgradeable to be AdminAckRequired...
      I0218 01:34:47.412472       1 upgradeable.go:42] Upgradeable conditions were recently checked, will try later.
      Feb 18 01:37:42.267: FAIL: Error while waiting for Upgradeable to complain about AdminAckRequired ...
      I0218 01:38:20.818466       1 upgradeable.go:122] Cluster current version=4.11.0-0.ci-2023-02-17-233449
      

      Version-Release number of selected component (if applicable):
      Latest 4.11, but we will likely see this in 4.13 jobs after OTA-899.

      How reproducible:

      Rare; search.ci says 3-8% impact

      Steps to Reproduce:
      search.ci query above

      Actual results:
      disruption_tests: [bz-Cluster Version Operator] Verify presence of admin ack gate blocks upgrade until acknowledged 1h2m36s
      {Feb 18 01:37:42.267: Error while waiting for Upgradeable to complain about AdminAckRequired with message "Kubernetes 1.25 and therefore OpenShift 4.12 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/6955381 for details and instructions.": timed out waiting for the condition

      Expected results:
      no failures

      Additional info:

      • The failure shows that the assumption behind the OCPBUGS-6503 fix, namely that the CVO guarantees the update happens within 4 minutes at most, needs to be reinvestigated.
      • We can either tighten the CVO code that performs this check (OTA-860 could be a part of this), or simply relax the criteria in the check further.

              lmohanty@redhat.com Lalatendu Mohanty
              afri@afri.cz Petr Muller
              Evgeni Vakhonin