  1. Data Foundation Bugs
  2. DFBUGS-381

[2322671] [Stretch cluster] RWX storage issue on surviving zone : MountVolume.MountDevice failed


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • odf-4.18
    • odf-4.17
    • ceph

      Description of problem (please be as detailed as possible and provide log
      snippets):

      Stretch cluster testing with a "DR" scenario: the entire application was running on zone-1, zone-1 was taken down, and recovery of the application to zone-2 was monitored. The application relocated to zone-2 and a portion of it was up and running in approximately 20 minutes, but several pods requiring RWX storage were stuck in ContainerCreating with a message similar to the one below:

      Events:
        Type     Reason       Age    From               Message
        ----     ------       ----   ----               -------
        Normal   Scheduled    3m48s  ibm-cpd-scheduler  Successfully assigned cpd-ins/asset-files-api-5f46c7f599-hvfdz to dahorak-ibmcloud-bwvbz-worker-2-6dzcw
        Warning  FailedMount  38s    kubelet            MountVolume.MountDevice failed for volume "pvc-a45db048-7f1e-4be6-8ca2-2f47ea09046e" : rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0011-openshift-storage-0000000000000001-a9a70a11-eca5-4377-aad4-7f276bfb1d46 already exists
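
When triaging many stuck pods, it can help to pull the PV name, gRPC code, and volume ID out of each FailedMount message and group pods by volume. The following is a minimal sketch, not part of the original report. The function and type names are hypothetical, and the volume ID layout (4 hex digits of version, 4 hex digits giving the cluster ID length, the cluster ID, 16 hex digits of pool ID, then a UUID) is an assumption about how Ceph-CSI composes its IDs; it happens to be consistent with the ID quoted above, where 0x0011 = 17 is the length of "openshift-storage".

```python
import re
from typing import NamedTuple, Optional


class MountFailure(NamedTuple):
    pv_name: str
    grpc_code: str
    volume_id: str


# Pattern written against the kubelet FailedMount message quoted in this report.
_MESSAGE_RE = re.compile(
    r'MountVolume\.MountDevice failed for volume "(?P<pv>[^"]+)"\s*:\s*'
    r"rpc error: code = (?P<code>\w+) desc = an operation with the given "
    r"Volume ID (?P<vol>\S+) already exists"
)


def parse_mount_failure(message: str) -> Optional[MountFailure]:
    """Return the PV name, gRPC code and volume ID, or None if no match."""
    match = _MESSAGE_RE.search(message)
    if match is None:
        return None
    return MountFailure(match.group("pv"), match.group("code"), match.group("vol"))


def split_volume_id(volume_id: str) -> dict:
    """Split a volume ID under the ASSUMED layout:
    <version:4 hex>-<cluster ID length:4 hex>-<cluster ID>-<pool ID:16 hex>-<UUID>.
    """
    version_hex, length_hex, rest = volume_id.split("-", 2)
    cluster_len = int(length_hex, 16)
    cluster_id = rest[:cluster_len]
    # Skip the separator that follows the cluster ID.
    pool_hex, object_uuid = rest[cluster_len + 1:].split("-", 1)
    return {
        "version": int(version_hex, 16),
        "cluster_id": cluster_id,
        "pool_id": int(pool_hex, 16),
        "object_uuid": object_uuid,
    }
```

Applied to the warning above, `parse_mount_failure` yields the code `Aborted`, and `split_volume_id` yields the cluster ID `openshift-storage`. Under the stated assumption, several pods reporting the same `object_uuid` are waiting on the same backing volume.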

      Version of all relevant components (if applicable):

      Does this issue impact your ability to continue to work with the product
      (please explain in detail what is the user impact)? Yes - unable to recover application on surviving zone

      Is there any workaround available to the best of your knowledge? No

      Rate from 1 - 5 the complexity of the scenario you performed that caused this
      bug (1 - very simple, 5 - very complex)? 4 - custom install of application

      Is this issue reproducible? likely - may be timing based

      Can this issue be reproduced from the UI? no

      If this is a regression, please provide more details to justify this:

      Steps to Reproduce:
      1. Install application with multiple pods using RWX storage on zone-1
      2. Shut down zone-1 and force delete the pods once they enter the Terminating state
      3. Monitor application relocation to surviving zone
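
Step 3 can be partly automated by checking which pods are still waiting in ContainerCreating. The sketch below is illustrative only and not from the original report. The function name is hypothetical, and it assumes the input is the parsed JSON pod list that `kubectl get pods -o json` returns, where each container status carries a `state.waiting.reason` field.

```python
from typing import Any, Dict, List


def stuck_in_container_creating(pod_list: Dict[str, Any]) -> List[str]:
    """Return names of pods with at least one container waiting in ContainerCreating."""
    stuck = []
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for container_status in statuses:
            # "waiting" is absent (or None) when the container is running or terminated.
            waiting = container_status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "ContainerCreating":
                stuck.append(pod["metadata"]["name"])
                break  # one stuck container is enough to report the pod
    return stuck
```

One way to feed it is to run `kubectl get pods -n cpd-ins -o json` (the namespace is taken from the event shown earlier in this report), load the output with `json.loads`, and call the function repeatedly while the application relocates.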

      Actual results:
      Pods with RWX storage were NOT able to mount drives

      Expected results:
      Pods with RWX storage are able to mount drives

      Additional info:

              lflores@redhat.com Laura Flores
              morstad Nancy Heinz
              Venky Shankar
              Neha Berry Neha Berry
              Votes: 0
              Watchers: 11
