  1. OpenShift Storage
  2. STOR-2277

[Upstream Cycle] Techdebt: Restart CSI sidecars faster

    • Upstream techdebt: Restart CSI sidecars faster
    • Upstream
    • 2
    • False
    • None
    • False
    • Green
    • To Do

      Epic Goal*

      Release the leader election lease in all CSI sidecars when they receive SIGTERM, so that their replacements can start immediately instead of waiting for the old lease to expire (typically 2-3 minutes).

       
      Why is this important? (mandatory)

      This will dramatically reduce the time needed to start a replacement CSI sidecar, for example during a cluster upgrade. Right now, the new sidecar has to wait for the old lease to expire (2-3 minutes), although it could start working instantly if the lease were released.

      Tracked upstream as: https://github.com/kubernetes-csi/external-attacher/issues/609

      Scenarios (mandatory) 

      1. As an OCP / Kubernetes developer / user, I can restart a CSI sidecar, e.g. with a new log level, and see the sidecar start working immediately, without waiting 2 minutes for leader election, so I can debug my cluster faster.
      2. As an OCP cluster admin, I can upgrade my cluster without a 2-3 minute gap in volume provisioning / attachment during the upgrade, caused by the attacher / provisioner sidecars waiting for the old leader lease to expire.

       
      Dependencies (internal and external) (mandatory)

      All upstream CSI sidecars need to be updated and then rebased in OCP.

      Contributing Teams (and contacts) (mandatory)

      • Development - 
      • QE - 

      Acceptance Criteria (optional)

      • Update a CSI driver log level (clustercsidriver spec.logLevel) and check that the restarted sidecars start working immediately.

      Drawbacks or Risk (optional)

      Done - Checklist (mandatory)

      The following points apply to all epics and are what the OpenShift team believes is the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.

      • CI Testing - Basic e2e automation tests are merged and completing successfully
      • Documentation - Content development is complete.
      • QE - Test scenarios are written and executed successfully.
      • Technical Enablement - Slides are complete (if requested by PLM)
      • Engineering Stories Merged
      • All associated work items with the Epic are closed
      • Epic status should be “Release Pending” 

              rh-ee-rhrmo Richard Hrmo
              rhn-engineering-jsafrane Jan Safranek
              Votes: 0
              Watchers: 2

                Created:
                Updated:

                  Estimated:
                  Original Estimate - 2 weeks
                  Remaining:
                  Remaining Estimate - 1 week, 3 days
                  Logged:
                  Time Spent - Not Specified