RHEL-7638

LVM-activate: Monitor operation succeeds when PV has been removed


    • Important
    • rhel-ha
    • ssg_filesystems_storage_and_HA

      Description of problem:

      If the managed volume group's underlying PV has been removed, the LVM-activate monitor operation still succeeds.

      [root@fastvm-rhel-8-0-24 ~]# pcs resource config lvm
      Resource: lvm (class=ocf provider=heartbeat type=LVM-activate)
      Attributes: activation_mode=exclusive vg_access_mode=system_id vgname=test_vg1

      [root@fastvm-rhel-8-0-24 ~]# pcs resource debug-start lvm
      Operation start for lvm (ocf:heartbeat:LVM-activate) returned: 'ok' (0)

      [root@fastvm-rhel-8-0-24 ~]# vgs test_vg1
      VG       #PV #LV #SN Attr VSize   VFree
      test_vg1   1   1   0 wz-n 992.00m 696.00m

      [root@fastvm-rhel-8-0-24 ~]# /usr/sbin/iscsiadm -m node --logoutall=all
      Logging out of session [sid: 1, target: iqn.2003-01.org.linux-iscsi.fastvm-rhel-7-6-51.x8664:sn.9677dbd9a870, portal: 192.168.22.51,3260]
      Logout of [sid: 1, target: iqn.2003-01.org.linux-iscsi.fastvm-rhel-7-6-51.x8664:sn.9677dbd9a870, portal: 192.168.22.51,3260] successful.

      [root@fastvm-rhel-8-0-24 ~]# vgs test_vg1
      Volume group "test_vg1" not found.
      Cannot process volume group test_vg1

      [root@fastvm-rhel-8-0-24 ~]# pcs resource debug-monitor lvm
      Operation monitor for lvm (ocf:heartbeat:LVM-activate) returned: 'ok' (0)

      This is because the monitor operation checks `dmsetup info` for the VG. When the PV is abruptly removed, the mapping does not get removed from device-mapper, so the VG still looks active in the dmsetup output.

      ~~~
          else
              dm_count=$(dmsetup info --noheadings --noflush -c -S "vg_name=${VG}" | grep -c -v '^No devices found')
          fi

          if [ $dm_count -eq 0 ]; then
              return $OCF_NOT_RUNNING
          fi

          return $OCF_SUCCESS
      }
      ~~~

      We need to add a more reliable test for the existence of the VG, either in addition to or in place of `dmsetup info`.

      Alternatively, if this is considered a bug in dmsetup info, then we should get that fixed. (I assume it's behaving as expected and that there's no mechanism to remove the mapping in this case.)
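
As a rough illustration of the "in addition to `dmsetup info`" option, the sketch below keeps the existing device-mapper check and then asks LVM itself whether the VG can still be found. This is only a sketch under assumptions, not the agent's code: the function name `vg_status_sketch` is hypothetical, the VG name is passed as an argument instead of read from `${VG}`, and the choice of `$OCF_NOT_RUNNING` for the new failure path is an assumption (a generic error may be more appropriate).

```shell
# Standard OCF return codes, defined here so the sketch is self-contained.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# Hypothetical helper: returns OCF_SUCCESS only if device-mapper still has
# mappings for the VG *and* LVM can still find the VG.
vg_status_sketch() {
    vg="$1"

    # Same check the agent performs today: count device-mapper entries.
    dm_count=$(dmsetup info --noheadings --noflush -c -S "vg_name=${vg}" 2>/dev/null \
        | grep -c -v '^No devices found')
    if [ "$dm_count" -eq 0 ]; then
        return $OCF_NOT_RUNNING
    fi

    # Additional check (assumption): vgs exits non-zero when the VG cannot
    # be found, as in the transcript above after the iSCSI logout.
    if ! vgs --noheadings -o vg_name "$vg" >/dev/null 2>&1; then
        return $OCF_NOT_RUNNING
    fi

    return $OCF_SUCCESS
}
```

One trade-off to keep in mind: `vgs` scans devices, so it is heavier than `dmsetup info` and may be slow when storage is misbehaving. Whether that cost is acceptable in every monitor interval is an open question.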


      Version-Release number of selected component (if applicable):

      resource-agents-4.1.1-68.el8.x86_64


      How reproducible:

      Always


      Steps to Reproduce:
      1. Create and start an LVM-activate resource.
      2. Remove the managed VG's underlying PV (e.g., log out of the iSCSI session if it's presented via iSCSI).
      3. Run the resource's monitor operation.
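
The steps above can be scripted roughly as follows, using only the commands shown in the description. This is a sketch under assumptions: the function name `reproduce_sketch` is hypothetical, the resource must already be configured, the PV is assumed to be presented via iSCSI, and the result is judged by looking for the `returned: 'ok' (0)` text in the `pcs` output. Note that `--logoutall=all` disconnects every iSCSI session on the node, so this should only be run on a test system.

```shell
# Hypothetical reproducer; the resource name is passed as $1.
reproduce_sketch() {
    res="$1"

    # Step 1: start the (already configured) LVM-activate resource.
    pcs resource debug-start "$res" || return 1

    # Step 2: remove the PV by logging out of all iSCSI sessions.
    iscsiadm -m node --logoutall=all || return 1

    # Step 3: run the monitor operation and keep its output.
    out=$(pcs resource debug-monitor "$res" 2>&1)
    printf '%s\n' "$out"

    # Judge the result from the output text shown in the description.
    case "$out" in
        *"returned: 'ok' (0)"*)
            echo "BUG REPRODUCED: monitor succeeded after PV removal"
            return 0
            ;;
        *)
            echo "monitor reported a failure, as expected"
            return 2
            ;;
    esac
}
```

It would be invoked as `reproduce_sketch lvm` for the resource in the description.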


      Actual results:

      Monitor operation succeeds.


      Expected results:

      Monitor operation fails.


      Additional info:

      If we make the monitor operation fail, then the stop operation is also likely to fail because the missing VG can't be deactivated (see also BZ 1902208). This probably isn't desirable behavior. If the volume group fails to deactivate *because it doesn't exist*, this should probably be considered a successful stop. However, I could see that point as debatable, since the VG doesn't get to stop cleanly.
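
To make the stop-side idea concrete, here is a minimal sketch of a stop path that treats a VG that LVM cannot find as already stopped. It is an illustration only, not the agent's code: `vg_stop_sketch` is a hypothetical name, the VG name is passed as an argument, and using a failed `vgs` lookup as the test for "doesn't exist" is an assumption (it would also match other `vgs` failures, which is part of why the point is debatable).

```shell
# Standard OCF return codes, defined here so the sketch is self-contained.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

# Hypothetical stop helper.
vg_stop_sketch() {
    vg="$1"

    # If LVM cannot find the VG at all, there is nothing left to deactivate.
    # Assumption: report this as a successful stop. Stale device-mapper
    # mappings may still exist and are not cleaned up here.
    if ! vgs "$vg" >/dev/null 2>&1; then
        return $OCF_SUCCESS
    fi

    # Normal path: the VG exists, so deactivate it.
    if vgchange -an "$vg" >/dev/null 2>&1; then
        return $OCF_SUCCESS
    fi

    return $OCF_ERR_GENERIC
}
```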

              rhn-engineering-oalbrigt Oyvind Albrigtsen
              rhn-support-nwahl Reid Wahl
              Oyvind Albrigtsen
              Cluster QE