  1. OpenShift Bugs
  2. OCPBUGS-45305

Frequent delays in operator lease acquisition, potentially due to server or etcd unavailability, require investigation to identify root causes.

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Undefined
    • None
    • 4.18.0
    • kube-apiserver
    • None
    • Quality / Stability / Reliability
    • False

      Description of problem:

      Some time ago, a test ([sig-arch] all leases in ns/%s must gracefully release) was added to openshift/origin to check that leases are released gracefully.

      The test verifies whether components/operators release their leases gracefully so that the next instance/replica can quickly acquire a new lease and start its work.

      However, based on this query, it appears that the test frequently 'fails' for various operators.

      Here is an example of a failed CI run where the 'openshift-kube-apiserver-operator' waited 144 seconds before acquiring a lease.

      For operators using library-go, the worst non-graceful lease acquisition time is 2m43s, while the worst graceful lease acquisition time is 26s.

      I'm also attaching a timeline from the same run, which shows that lease acquisition took longer for other components around the same time.

      We should investigate why, in some cases, components/operators take longer to acquire their leases.

      The working theory is that leases are not released gracefully because the server is unavailable, possibly as a result of etcd unavailability. However, I don't have data to support this theory.

      Note: This issue might be connected to https://issues.redhat.com/browse/OCPBUGS-42087 and a few other failures. It is possible that all of them share the same root cause.

              Unassigned
              Lukasz Szaszkiewicz (lszaszki@redhat.com)
              Ke Wang