RHEL-41238

Online reencryption may run with the block device file descriptor opened without the (essential) O_DIRECT flag.
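
One way to check whether a running process holds a file descriptor without O_DIRECT is to read the octal `flags:` field in `/proc/<pid>/fdinfo/<fd>`. The following is a minimal diagnostic sketch, not the fix shipped for this issue. The function names are hypothetical, and the bit value 040000 is the x86_64 value of O_DIRECT (the architecture of this report); it differs on some other architectures.

```shell
# fd_flags PID FD: print the octal open flags of one open file descriptor,
# as reported by the kernel in /proc/PID/fdinfo/FD.
fd_flags() {
    awk '$1 == "flags:" { print $2 }' "/proc/$1/fdinfo/$2"
}

# has_o_direct FLAGS: succeed if the O_DIRECT bit is set in octal FLAGS.
# Assumption: 040000 is O_DIRECT on x86_64 only.
has_o_direct() {
    [ $(( $1 & 040000 )) -ne 0 ]
}

# list_fd_modes PID: print "FD TARGET direct|buffered" for each open fd.
list_fd_modes() {
    for fd in /proc/"$1"/fd/*; do
        [ -L "$fd" ] || continue
        n=${fd##*/}
        flags=$(fd_flags "$1" "$n" 2>/dev/null) || continue
        [ -n "$flags" ] || continue
        if has_o_direct "$flags"; then state=direct; else state=buffered; fi
        echo "$n $(readlink "$fd") $state"
    done
}
```

For example, `list_fd_modes "$(pgrep -x cryptsetup)"` run as root while a single online reencryption is in progress would list each descriptor that process holds, so a block device opened in buffered mode stands out.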

    • Bug
    • Resolution: Done-Errata
    • Critical
    • rhel-9.5
    • rhel-9.5
    • cryptsetup
    • None
    • cryptsetup-2.7.2-3.el9_5
    • Yes
    • Important
    • rhel-sst-logical-storage
    • ssg_filesystems_storage_and_HA
    • 27
    • 28
    • 5
    • False
    • None
    • None
    • Approved Blocker
    • x86_64
    • None

      I'm not positive that this belongs as a bug against LVM, but I figured I'd start here. This happens more often when running with a LUKS header. When not using a header, this test scenario can go many iterations without hitting this issue. I have gone back and reproduced this issue on RHEL 9.4 as well.
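
Because the failure is intermittent, a reproducer generally has to be looped until the first bad iteration. A minimal sketch, assuming the scenario is wrapped in a command of your own; `run_until_failure` and `./run_scenario.sh` are hypothetical names and are not part of the test suite quoted in this report.

```shell
# run_until_failure MAX CMD [ARGS...]
# Run CMD up to MAX times. On the first failing iteration, print its
# number and return 1; return 0 if every iteration passes.
run_until_failure() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@"; then
            echo "$i"
            return 1
        fi
        i=$((i + 1))
    done
    return 0
}
```

A call such as `run_until_failure 100 ./run_scenario.sh` would stop at the first iteration where the scenario exits non-zero, leaving the devices in the failing state for inspection.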

      kernel-5.14.0-457.el9    BUILT: Thu May 30 07:38:48 PM EDT 2024
      lvm2-2.03.24-1.el9    BUILT: Fri Jun  7 10:32:04 AM EDT 2024
      lvm2-libs-2.03.24-1.el9    BUILT: Fri Jun  7 10:32:04 AM EDT 2024
        
       
      SCENARIO - [cache_pool_resize_in_between_luks_encryption_operations]
      Create snapshots of encrypted luks cache origin with fs data, and then extend and (attempt to) reduce both the cache pool volume in between re-encryption stack operations
       
      *** Cache info for this scenario ***
      *  origin (slow):  /dev/sdf1
      *  pool (fast):    /dev/sdc1
      ************************************
       
      Adding "slow" and "fast" tags to corresponding pvs
      pvchange --addtag slow /dev/sdf1
      pvchange --addtag fast /dev/sdc1
      Create origin (slow) volume
      lvcreate --yes --wipesignatures y  -L 4G -n corigin cache_sanity @slow
       
      Create cache data and cache metadata (fast) volumes
      lvcreate --yes  -L 4G -n resize cache_sanity @fast
      lvcreate --yes  -L 12M -n resize_meta cache_sanity @fast
       
      Create cache pool volume by combining the cache data and cache metadata (fast) volumes
      lvconvert --yes --type cache-pool --poolmetadata cache_sanity/resize_meta cache_sanity/resize
        WARNING: Converting cache_sanity/resize and cache_sanity/resize_meta to cache pool's data and metadata volumes with metadata wiping.
        THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
       
      Create cached volume by combining the cache pool (fast) and origin (slow) volumes
      lvconvert --yes --type cache --cachepool cache_sanity/resize cache_sanity/corigin
       
      Encrypting corigin volume
      cryptsetup reencrypt --encrypt --init-only /dev/cache_sanity/corigin --header /tmp/cache_pool_resize_luks_header.3901450
      cryptsetup reencrypt /dev/cache_sanity/corigin --header /tmp/cache_pool_resize_luks_header.3901450
      cryptsetup luksOpen /dev/cache_sanity/corigin luks_corigin --header /tmp/cache_pool_resize_luks_header.3901450
       
      Placing an xfs filesystem on origin volume
      Mounting origin volume
      Writing files to /mnt/corigin
      Checking files on /mnt/corigin
       
      syncing before snap creation...
      Hack until live pool resize is supported: uncache and recreate pool w/ a larger size (lvconvert --yes --uncache) cache_sanity/corigin
       
      Create cache data and cache metadata (fast) volumes
      lvcreate --yes  -L 5G -n resize cache_sanity @fast
      lvcreate --yes  -L 12M -n resize_meta cache_sanity @fast
       
      Create cache pool volume by combining the cache data and cache metadata (fast) volumes
      lvconvert --yes --type cache-pool --poolmetadata cache_sanity/resize_meta cache_sanity/resize
        WARNING: Converting cache_sanity/resize and cache_sanity/resize_meta to cache pool's data and metadata volumes with metadata wiping.
        THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
      Create cached volume by combining the cache pool (fast) and origin (slow) volumes
      lvconvert --yes --type cache --cachepool cache_sanity/resize cache_sanity/corigin
       
      Writing files to /mnt/corigin
      (ONLINE) Re-encrypting corigin volume
      cryptsetup reencrypt --resilience journal --active-name luks_corigin --header /tmp/cache_pool_resize_luks_header.3901450
       
      syncing before snap creation...
      Checking files on /mnt/corigin
       
      Making 2nd snapshot of origin volume
      lvcreate --yes  -s /dev/cache_sanity/corigin -c 128 -n snap2 -L 4296704
      cryptsetup luksOpen /dev/cache_sanity/snap2 luks_snap2 --header /tmp/cache_pool_resize_luks_header.3901450
       
      [root@grant-03 ~]# lvs -a -o +devices,segtype
        LV                   VG            Attr       LSize    Pool           Origin          Data%  Meta%  Move Log Cpy%Sync Convert Devices               Type      
        corigin              cache_sanity  owi-aoC---    4.00g [resize_cpool] [corigin_corig] 3.82   8.27            0.00             corigin_corig(0)      cache     
        [corigin_corig]      cache_sanity  owi-aoC---    4.00g                                                                        /dev/sdf1(0)          linear    
        [lvol0_pmspare]      cache_sanity  ewi-------   12.00m                                                                        /dev/sdb1(0)          linear    
        [resize_cpool]       cache_sanity  Cwi---C---    5.00g                                3.82   8.27            0.00             resize_cpool_cdata(0) cache-pool
        [resize_cpool_cdata] cache_sanity  Cwi-ao----    5.00g                                                                        /dev/sdc1(0)          linear    
        [resize_cpool_cmeta] cache_sanity  ewi-ao----   12.00m                                                                        /dev/sdc1(1280)       linear    
        snap2                cache_sanity  swi-aos---    4.00g                corigin         0.06                                    /dev/sdb1(3)          linear    
       
      Mounting snap volume
      mount: /mnt/snap2: mount(2) system call failed: Structure needs cleaning.
      couldn't mount fs snap on /mnt/snap2
       
       
      Jun 14 14:37:19 grant-03 qarshd[226107]: Running cmdline: mount -o nouuid /dev/mapper/luks_snap2 /mnt/snap2
      Jun 14 14:37:19 grant-03 kernel: XFS (dm-11): Mounting V5 Filesystem 646bb303-3591-4a2b-82ee-17b75ea33629
      Jun 14 14:37:19 grant-03 kernel: XFS (dm-11): Starting recovery (logdev: internal)
      Jun 14 14:37:19 grant-03 kernel: XFS (dm-11): Metadata corruption detected at xfs_buf_ioend+0x101/0x220 [xfs], xfs_inode block 0x80 xfs_inode_buf_verify
      Jun 14 14:37:19 grant-03 kernel: XFS (dm-11): Unmount and run xfs_repair
      Jun 14 14:37:19 grant-03 kernel: XFS (dm-11): First 128 bytes of corrupted metadata buffer:
      Jun 14 14:37:19 grant-03 kernel: 00000000: 75 82 75 2a 1f 8c 1d 84 ca cc d4 24 91 ae 96 11  u.u*.......$....
      Jun 14 14:37:19 grant-03 kernel: 00000010: e2 84 54 b8 3b 0f 3e 57 75 5f 32 f9 4c 06 90 1c  ..T.;.>Wu_2.L...
      Jun 14 14:37:19 grant-03 kernel: 00000020: 72 92 a3 3a cf 0f 16 da c9 29 bb 83 27 cd a7 bf  r..:.....)..'...
      Jun 14 14:37:19 grant-03 kernel: 00000030: 2a 04 6f f6 4d f7 5d 09 70 56 65 c3 20 00 ba e9  *.o.M.].pVe. ...
      Jun 14 14:37:19 grant-03 kernel: 00000040: 86 30 90 8f 76 2c 45 9d 34 65 e2 60 b1 e5 ac d4  .0..v,E.4e.`....
      Jun 14 14:37:19 grant-03 kernel: 00000050: 61 84 f2 a8 5e 82 4e 8f 07 31 3d 30 2c dc b2 32  a...^.N..1=0,..2
      Jun 14 14:37:19 grant-03 kernel: 00000060: 99 73 94 93 70 e4 32 64 ee c4 df 3e 67 d0 b8 7d  .s..p.2d...>g..}
      Jun 14 14:37:19 grant-03 kernel: 00000070: 36 dc 79 d1 0d 3c 0e 99 67 c9 bb 25 f5 5d b6 ca  6.y..<..g..%.]..
      Jun 14 14:37:19 grant-03 kernel: XFS (dm-11): metadata I/O error in "xlog_recover_items_pass2+0x51/0xd0 [xfs]" at daddr 0x80 len 32 error 117
      Jun 14 14:37:19 grant-03 kernel: XFS (dm-11): Metadata corruption detected at xfs_agi_verify+0x34/0x170 [xfs], xfs_agi block 0x10
      Jun 14 14:37:19 grant-03 kernel: XFS (dm-11): Unmount and run xfs_repair
      Jun 14 14:37:19 grant-03 kernel: XFS (dm-11): First 128 bytes of corrupted metadata buffer:
      Jun 14 14:37:19 grant-03 kernel: 00000000: 58 41 47 49 00 00 00 01 00 00 00 00 00 02 00 00  XAGI............
      Jun 14 14:37:19 grant-03 kernel: 00000010: 00 00 09 80 00 00 00 06 00 00 00 01 00 00 00 1d  ................
      Jun 14 14:37:19 grant-03 kernel: 00000020: 00 07 53 40 ff ff ff ff ff ff ff ff ff ff ff ff  ..S@............
      Jun 14 14:37:19 grant-03 kernel: 00000030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
      Jun 14 14:37:19 grant-03 kernel: 00000040: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
      Jun 14 14:37:19 grant-03 kernel: 00000050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
      Jun 14 14:37:19 grant-03 kernel: 00000060: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
      Jun 14 14:37:19 grant-03 kernel: 00000070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
      Jun 14 14:37:19 grant-03 kernel: XFS (dm-11): Corruption of in-memory data (0x8) detected at __xfs_buf_submit+0x6e/0x1e0 [xfs] (fs/xfs/xfs_buf.c:1551).  Shutting down filesystem.
      Jun 14 14:37:19 grant-03 kernel: XFS (dm-11): Please unmount the filesystem and rectify the problem(s)
      Jun 14 14:37:19 grant-03 kernel: XFS (dm-11): log mount/recovery failed: error -117
      Jun 14 14:37:19 grant-03 kernel: XFS (dm-11): log mount failed
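
The kernel messages above report two metadata verifier failures on dm-11 before log recovery gives up with error -117 (EUCLEAN, "Structure needs cleaning", which matches the mount error). A minimal sketch for pulling those failures out of a saved log excerpt; the function name is hypothetical, and the pattern is only assumed to fit the message layout shown in this report.

```shell
# summarize_xfs_corruption: read kernel log lines on stdin and print one
# "DEVICE BUFFER_TYPE BLOCK" line per "Metadata corruption detected" message.
summarize_xfs_corruption() {
    sed -n 's/.*XFS (\([^)]*\)): Metadata corruption detected at [^,]*, \([a-z_]*\) block \(0x[0-9a-f]*\).*/\1 \2 \3/p'
}
```

It can be fed from a file or from the kernel journal, for example `journalctl -k | summarize_xfs_corruption`.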
      

              okozina@redhat.com Ondrej Kozina
              cmarthal@redhat.com Corey Marthaler
              Ondrej Kozina
              Guangwu Zhang