OCPBUGS-10816

Volume unmount repeats after successful unmount, preventing pod delete


    • Moderate
    • No
    • False
    • N/A
    • Release Note Not Required

      Description of problem:

      We have observed the following:
      - A workload that mounts multiple EBS volumes gets stuck in the Terminating state when it finishes.
      - The node the workload ran on eventually gets stuck draining: it cannot finish unmounting one of the workload's volumes, even though no containers from the workload are still running on the node.
      
      The node logs show the volume apparently unmounting successfully, followed by a second unmount attempt that fails. The failed unmount attempt then keeps repeating and holds up the node.
      
      Specific examples from the node's logs to illustrate this will be included in a private comment. 
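
      To check other nodes for the same pattern, the log sequence described above (a successful unmount followed by failed attempts for the same volume) can be searched for with a short script. This is only a sketch: the marker strings and the way the volume name is quoted are assumptions about the log format, not taken from the logs in this report, and should be adjusted to match the actual node logs.

```python
import re
from collections import defaultdict

# Assumed message fragments; adjust them to match the actual node logs.
SUCCESS_MARKER = "UnmountVolume.TearDown succeeded"
FAILURE_MARKER = "UnmountVolume.TearDown failed"

# Assumed: the volume name follows the word "volume" in double quotes,
# which may be backslash-escaped in journal output.
VOLUME_RE = re.compile(r'volume \\?"([^"\\]+)\\?"')


def repeated_unmounts(lines):
    """Count failed unmounts logged after a successful unmount, per volume."""
    succeeded = set()
    failures_after_success = defaultdict(int)
    for line in lines:
        match = VOLUME_RE.search(line)
        if match is None:
            continue  # line does not mention a volume
        volume = match.group(1)
        if SUCCESS_MARKER in line:
            succeeded.add(volume)
        elif FAILURE_MARKER in line and volume in succeeded:
            failures_after_success[volume] += 1
    return dict(failures_after_success)
```

      The function takes any iterable of log lines, for example an open file containing the node's kubelet log, and returns a mapping from volume name to the number of failed unmount attempts seen after that volume's first successful unmount. Volumes that only ever fail, or only ever succeed, are not reported.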

      Version-Release number of selected component (if applicable):

      4.11.5

      How reproducible:

      Has occurred on four separate nodes on one specific cluster, but the mechanism to reproduce it is not known.

      Steps to Reproduce:

      Not known (see "How reproducible" above).
      

      Actual results:

      A volume gets stuck unmounting, which blocks both removal of the node and completion of the pod deletion.

      Expected results:

      The volume should not get stuck unmounting.

      Additional info:

       

            jdobson@redhat.com Jonathan Dobson
            mbargenq Matt Bargenquast (Inactive)
            Wei Duan Wei Duan
            Votes: 0
            Watchers: 13
