RHEL / RHEL-4995

RHEL8 LVM2 raid1 LV's with --raidintegrity set become unpredictable if a leg fails.

    • Bug
    • Resolution: Obsolete
    • Major
    • rhel-8.8.0.z
    • Important
    • rhel-sst-logical-storage
    • ssg_filesystems_storage_and_HA
    • Red Hat Enterprise Linux
    • All

      What were you trying to do that didn't work?

      When an LVM raid1 LV has --raidintegrity set, it can mask the fact that one leg of the raid is out of sync.

      In addition, with integrity enabled the LV does not recover from a transient loss of its in-sync paths. Without integrity, if you lose both in-sync paths due to a transient problem, you can unmount and remount the filesystem on the LV to recover it. With --raidintegrity y you instead see the "kernel: device-mapper: integrity: Error on reading data: -5" message, and the only way to recover the filesystem over that LV is to turn raid integrity off.
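
      For reference, a minimal sketch of how the read error can be spotted in the kernel log; the message text is taken from this report and the grep patterns are only illustrative:

      journalctl -k | grep -i 'device-mapper: integrity'
      # or, if persistent journal logs are not available:
      dmesg | grep -i 'Error on reading data'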

       Please provide the package NVR for which the bug is seen:

      How reproducible:

       100%

      Steps to reproduce

      1. Create an LVM raid LV:
      2. lvconvert -m 1 [vgname]/[lvname]
      3. Enable raid integrity:
      4. lvconvert [vgname]/[lvname] --raidintegrity y
      5. Fail a leg of the raid.
      6. Bring that leg back and run pvscan --cache.
      7. The failed leg is not back in sync, but no display shows this (a command sketch follows below).
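
      A minimal command sketch of the steps above, assuming a VG named mvg built on the two PVs that appear in the listings below (/dev/mapper/block31 and /dev/mapper/block33); the LV name and size are only illustrative:

      lvcreate -n mlv -L 500m mvg /dev/mapper/block31
      lvconvert -m 1 mvg/mlv /dev/mapper/block33
      lvconvert mvg/mlv --raidintegrity y
      # fail /dev/mapper/block33 (for example by disabling its underlying path), then restore it
      pvscan --cache
      lvs -a -o +devices mvg    # the stale leg is not reported as out of sync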

         In simple terms, this lvmraid has two legs: /dev/mapper/block31 is active and /dev/mapper/block33 is out of sync.

        LV             VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert LV Tags Devices                         KRahead Rahead #Str Stripe
        mlv            mvg rwi-a-r-r- 500.00m                                    100.00                   mlv_rimage_0(0),mlv_rimage_1(0) 128.00k   auto    2     0 
        [mlv_rimage_0] mvg Iwi-aor-r- 500.00m                                                             /dev/mapper/block33(0)          128.00k   auto    1     0 
        [mlv_rimage_1] mvg iwi-aor--- 500.00m                                                             /dev/mapper/block31(1)          128.00k   auto    1     0 
        [mlv_rmeta_0]  mvg ewi-aor-r-   4.00m                                                             /dev/mapper/block33(125)        128.00k   auto    1     0 
        [mlv_rmeta_1]  mvg ewi-aor---   4.00m                                                             /dev/mapper/block31(0)            4.00m   auto    1     0 
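
        As a side note, a hedged way to make the stale state more visible is to ask lvs for the health and sync fields explicitly (field names from lvs -o help; exact output varies by version):

        lvs -a -o lv_name,lv_attr,lv_health_status,copy_percent,devices mvg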

        Turn on integrity (lvconvert mvg/mlv --raidintegrity y) and now the failure is masked:

       ** After filing this bug I see that the lv_attr Volume Health bit for the out-of-sync leg is "(r)efresh needed"! But we have lost the lv_attr Volume type bit, which becomes "g" for both legs instead of (i)mage or mirror/raid (I)mage out-of-sync! AND both legs report Cpy%Sync at 100.00, which may be true for integrity but is confusing for the Cpy%Sync function!

      lvconvert mvg/mlv --raidintegrity y
          Creating integrity metadata LV mlv_rimage_0_imeta with size 12.00 MiB.
          Logical volume "mlv_rimage_0_imeta" created.
          Creating integrity metadata LV mlv_rimage_1_imeta with size 12.00 MiB.
          Logical volume "mlv_rimage_1_imeta" created.
          Using integrity block size 4096 for file system block size 4096.
          Logical volume mvg/mlv has added integrity.

          LV                   VG  Attr       LSize   Pool Origin               Data%  Meta%  Move Log Cpy%Sync Convert LV Tags Devices                         KRahead Rahead #Str Stripe
        mlv                  mvg rwi-aor-r- 500.00m                                                  100.00                   mlv_rimage_0(0),mlv_rimage_1(0) 128.00k   auto    2     0 
        [mlv_rimage_0]       mvg gwi-aor-r- 500.00m      [mlv_rimage_0_iorig]                        100.00                   mlv_rimage_0_iorig(0)           128.00k   auto    1     0 
        [mlv_rimage_0_imeta] mvg ewi-ao----  12.00m                                                                           /dev/mapper/block33(126)             0      0     1     0 
        [mlv_rimage_0_iorig] mvg wi-ao--- 500.00m                                                                           /dev/mapper/block33(0)            4.00m   auto    1     0 
        [mlv_rimage_1]       mvg gwi-aor--- 500.00m      [mlv_rimage_1_iorig]                        100.00                   mlv_rimage_1_iorig(0)           128.00k   auto    1     0 
        [mlv_rimage_1_imeta] mvg ewi-ao----  12.00m                                                                           /dev/mapper/block31(126)             0      0     1     0 
        [mlv_rimage_1_iorig] mvg wi-ao--- 500.00m                                                                           /dev/mapper/block31(1)            4.00m   auto    1     0 
        [mlv_rmeta_0]        mvg ewi-aor-r-   4.00m                                                                           /dev/mapper/block33(125)        128.00k   auto    1     0 
        [mlv_rmeta_1]        mvg ewi-aor---   4.00m                                                                           /dev/mapper/block31(0)            4.00m   auto    1     0 

       The same is true if a leg is lost and comes back while the lvmraid is already running raidintegrity. If the loss is transient, the leg remains out of sync, but this is not obvious to any administrator.
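
       For completeness, a sketch of the manual step once the stale leg is actually noticed: lvchange --refresh is the usual way to resynchronize a raid LV after a transient device failure, assuming the underlying device is reachable again.

       lvchange --refresh mvg/mlv
       lvs -a -o lv_name,lv_attr,lv_health_status,copy_percent mvg    # watch the leg resynchronize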

       This problem becomes serious or critical if, at some point in the future, the only in-sync path is lost and comes back. With raidintegrity on you will get the error

      "kernel: device-mapper: integrity: Error on reading data: -5"

      and the filesystem that used this LV cannot be remounted until you turn raid integrity off.

       Without raidintegrity, when the in-sync path returns the LV can be reused, i.e. the filesystem unmounted and mounted again.

       Expected results

       Raid integrity does not mask from lvs that a raid leg is out of sync, and it allows the LV to be reused if the in-sync leg is temporarily lost and recovers.

      Actual results

       Using --raidintegrity masks failed raid legs in lvs output, and can cause an LV to not recover from a transient leg loss until raid integrity is removed.

      I.e., after both legs were lost and returned quickly:

      [root@RHEL8 /]# umount /mnt
      [root@RHEL8 /]# mount /dev/mvg/mlv /mnt 
      mount: /mnt: can't read superblock on /dev/mapper/mvg-mlv.

      [root@RHEL8 ~]# lvconvert mvg/mlv --raidintegrity n
      [root@RHEL8 /]# mount /dev/mvg/mlv /mnt 

      [root@RHEL8 /]
