OpenShift Bugs / OCPBUGS-30204

Inactive LVs cause PV mount to fail with "failed to stage volume" error


    • Bug
    • Resolution: Won't Do
    • Minor
    • None
    • 4.14.z
    • RHCOS
    • No
    • False

      Description of problem:

      Staging (mounting) the device that backs a persistent volume fails. The affected PVs are provided by Dell. Dell has isolated the cause to inactive logical volumes (LVs).
      
      Mount failure reported at the OCP level:
      ~~~
      Mar 02 15:43:56 c1-esx04.rackk13.local kubenswrapper[10759]: E0302 15:43:56.023135   10759 csi_attacher.go:364] kubernetes.io/csi: attacher.MountDevice failed: rpc error: code = Internal desc = failed to stage volume: unable to mount /dev/cabd8341-1dc3-41fd-beba-fb5c6bfb4a33/pvc-47de9b6c-02fa-4e4e-b7a7-0b4545cb44d6 to /var/lib/kubelet/plugins/kubernetes.io/csi/csi-baremetal/38b03bf352349fbfab662c4c71e2cbaad31f3e66f473222295570042e9f0e7ea/globalmount/dev: <nil>
      ~~~
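To see which devices are affected across a whole node log, the device path can be pulled out of each matching error line. The following is a minimal sketch, not part of the original report; the function name `failed_stage_devices` is hypothetical, and it assumes the log lines have the same "unable to mount DEVICE to TARGET" wording as the error above.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and print each distinct
# device path that appears in a "failed to stage volume" error.
# Assumes the message wording shown in the error above.
failed_stage_devices() {
    sed -n 's/.*failed to stage volume: unable to mount \([^ ]*\) to .*/\1/p' | sort -u
}

# Possible usage on a node (assumes the kubelet logs to the journal):
#   journalctl -u kubelet --no-pager | failed_stage_devices
```

Each printed path can then be compared against the `lvscan` output to confirm that the failing volume is one of the inactive LVs.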
      
      The Dell team found that the root cause is inactive logical volumes:
      ~~~
      # lvscan
        inactive          '/dev/cabd8341-1dc3-41fd-beba-fb5c6bfb4a33/pvc-07fbf5b2-4406-4079-bec3-fdc82f025771' [16.00 GiB] inherit
        inactive          '/dev/cabd8341-1dc3-41fd-beba-fb5c6bfb4a33/pvc-2860d8e4-ceba-48d4-9422-06150f501cae' [3.00 GiB] inherit
        inactive          '/dev/cabd8341-1dc3-41fd-beba-fb5c6bfb4a33/pvc-e66bad6a-af19-4a0f-86e0-0e55872f21c8' [200.00 GiB] inherit
        inactive          '/dev/cabd8341-1dc3-41fd-beba-fb5c6bfb4a33/pvc-12bdc137-6aa5-4abc-ab7b-7ebeb938d652' [10.00 GiB] inherit
        inactive          '/dev/cabd8341-1dc3-41fd-beba-fb5c6bfb4a33/pvc-dbcf3239-9f89-4768-aa09-e8165e963470' [20.00 GiB] inherit
        inactive          '/dev/cabd8341-1dc3-41fd-beba-fb5c6bfb4a33/pvc-47de9b6c-02fa-4e4e-b7a7-0b4545cb44d6' [10.00 GiB] inherit
        inactive          '/dev/cabd8341-1dc3-41fd-beba-fb5c6bfb4a33/pvc-6dde3ae7-6278-414a-8382-24b362513637' [20.00 GiB] inherit
        inactive          '/dev/cabd8341-1dc3-41fd-beba-fb5c6bfb4a33/pvc-7a607cc5-6a32-48fe-81a4-e3359dd6dd72' [128.00 GiB] inherit
        inactive          '/dev/cabd8341-1dc3-41fd-beba-fb5c6bfb4a33/pvc-a22a5025-d136-45e2-8e70-8c15cf226698' [20.00 GiB] inherit
        inactive          '/dev/cabd8341-1dc3-41fd-beba-fb5c6bfb4a33/pvc-1a9064ac-b980-4c5b-939a-63927f532df8' [2.00 GiB] inherit
        inactive          '/dev/cabd8341-1dc3-41fd-beba-fb5c6bfb4a33/pvc-4827027e-b0bf-4c6e-a61d-f94d0828680e' [20.00 GiB] inherit
        inactive          '/dev/cabd8341-1dc3-41fd-beba-fb5c6bfb4a33/pvc-6086b007-3420-4fcd-84d6-040cf397aba7' [10.00 GiB] inherit
      ~~~
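For diagnosis, the inactive LVs can be listed directly from `lvscan` output. This is a minimal sketch, not something from the original report; the function name `list_inactive_lvs` is hypothetical, and it assumes the `lvscan` line format shown above (state, quoted path, size, allocation policy).

```shell
#!/bin/sh
# Hypothetical helper: read lvscan output on stdin and print the path of
# every LV whose state column is "inactive", with the quotes stripped.
list_inactive_lvs() {
    awk -v q="'" '$1 == "inactive" { gsub(q, "", $2); print $2 }'
}

# Possible usage on a node (as root):
#   lvscan | list_inactive_lvs
# Manual activation of one LV, as a diagnostic step only (an assumption,
# not part of the reported workaround):
#   lvchange -ay /dev/<vg>/<lv>
```

Manual activation does not address why the LVs come up inactive after a restart, so it is only useful for confirming the diagnosis.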
      
      The following workaround has been applied:
      
      ~~~
      ====workaround steps===
      This fix is for the CoreOS LVM implementation. It enables allowlist population after a system restart.
      
      Copy coreos-populate-lvmdevices-fix.service to this location: /etc/systemd/system/coreos-populate-lvmdevices-fix.service
      Create symlink /etc/systemd/system/default.target.wants/coreos-populate-lvmdevices-fix.service -> /etc/systemd/system/coreos-populate-lvmdevices-fix.service
      Update systemd:
      systemctl daemon-reload
      Enable the new service:
      systemctl enable --now coreos-populate-lvmdevices-fix
      =====
      
      ==== service file contents coreos-populate-lvmdevices-fix.service====
      # File location: /etc/systemd/system/coreos-populate-lvmdevices-fix.service
      # Symlink: /etc/systemd/system/default.target.wants/coreos-populate-lvmdevices-fix.service -> /etc/systemd/system/coreos-populate-lvmdevices-fix.service
      [Unit]
      Description=CoreOS Populate LVM Devices File Fix
      # Don't add default dependencies so we can run early enough to populate
      # the devices file before any LVM devices are used.
      DefaultDependencies=false
      RequiresMountsFor=/var/lib
      Before=coreos-populate-lvmdevices.service
      
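      [Service]
      Type=oneshot
      RemainAfterExit=yes
      # Copy the devices file from /usr/etc over the one in /etc.
      ExecStartPre=cp /usr/etc/lvm/devices/system.devices /etc/lvm/devices/system.devices
      # Remove the stamp file, intended to let coreos-populate-lvmdevices.service
      # run again; the leading "-" ignores a failure if the stamp is absent.
      ExecStart=-rm /var/lib/coreos-populate-lvmdevices.stamp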
      
      [Install]
      WantedBy=default.target
      ====
      
      ~~~
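Since the report asks for the workaround to be verified, a first step is to check that the files it touches are in the expected places. The sketch below only reports presence or absence and makes no claim about what the correct state is. The function name `check_workaround_files` and its optional root-prefix argument are hypothetical (the prefix exists so the check can be tried against a scratch directory); the paths are the ones named in the workaround above.

```shell
#!/bin/sh
# Hypothetical helper: report whether the files referenced by the
# workaround exist. An optional first argument is a root prefix
# (empty means the live system). Returns non-zero if the unit file or
# the devices file is missing; the stamp file is reported only.
check_workaround_files() {
    root="${1:-}"
    unit="$root/etc/systemd/system/coreos-populate-lvmdevices-fix.service"
    devices="$root/etc/lvm/devices/system.devices"
    stamp="$root/var/lib/coreos-populate-lvmdevices.stamp"
    status=0

    if [ -f "$unit" ]; then
        echo "unit file: present"
    else
        echo "unit file: missing"
        status=1
    fi

    if [ -f "$devices" ]; then
        echo "devices file: present"
    else
        echo "devices file: missing"
        status=1
    fi

    if [ -e "$stamp" ]; then
        echo "stamp file: present"
    else
        echo "stamp file: absent"
    fi

    return "$status"
}
```

On a node this could be combined with `systemctl is-enabled coreos-populate-lvmdevices-fix.service` and `lvmdevices` (which lists the entries in the devices file) to see whether the unit is enabled and whether the allowlist was populated after a reboot.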

      Version-Release number of selected component (if applicable):

      Red Hat Enterprise Linux CoreOS 414.92.202401121330-0

      How reproducible:

      Always

      Steps to Reproduce:

          1.
          2.
          3.
          

      Actual results:

      The LVs are in an inactive state.

      Expected results:

      A proper fix that does not rely on the workaround is needed. The applied workaround should also be verified for correctness.

      Additional info:

          

            rhn-gps-dmabe Dusty Mabe
            rhn-support-adeshpan Aditya Deshpande
            Michael Nguyen Michael Nguyen
            Timothée Ravier