OpenShift Storage / STOR-1714

Release leader election on operator shutdown

Details

    • Story
    • Resolution: Done
    • Sprint 246

    Description

      As an OCP user, I want storage operators to restart quickly, and a newly started operator to start leading immediately, without a ~3 minute wait.

      This means that the old operator should release its leadership after it receives SIGTERM and before it exits. Right now, storage operators fail to release the leadership in ~50% of cases.

      Steps to reproduce:

      1. Delete an operator Pod (`oc delete pod xyz`).
      2. Wait for a replacement Pod to be created.
      3. Check the logs of the replacement Pod. They should contain "successfully acquired lease XYZ" shortly after the Pod starts (within roughly 1 second?).
      4. Go back to step 1 and repeat a few times.

       

      This is a hack'n'hustle "work" item, not tied to any Epic. I'm using it just to get proper QE and to track which operators are being updated (see the linked GitHub PRs).

          People

            rhn-engineering-jsafrane Jan Safranek