  1. RHEL
  2. RHEL-54673

Leapp filesystem check and repair during upgrade

    • Type: Story
    • Resolution: Can't Do
    • leapp-repository
    • rhel-sst-upgrades

      Goal: Provide a mechanism to check and repair local filesystem problems during upgrade

      Example: XFS AGFL corruption detected after upgrading from RHEL7 to RHEL8

      Customers upgrading from RHEL 7 to RHEL 8 may be exposed to AGFL corruption issues due to a bug in the XFS v5 filesystem format on RHEL 7. This can result in a warning being logged to dmesg and a small number of filesystem blocks being leaked [1].

      XFS (dm-6): WARNING: Reset corrupted AGFL on AG 24. 6 blocks leaked. Please unmount and run xfs_repair 

      When this message is printed, the filesystem corrects the problem itself, leaking a small number of blocks. However, customers who see this message worry that it indicates a real problem, which leads them to open cases with support.
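For triage, the warning can be recognised mechanically. The following is a minimal sketch, assuming the message always has the shape of the single sample line quoted above; the pattern and the names `AGFL_RESET`, `AgflReset` and `parse_agfl_reset` are hypothetical and not part of Leapp or the kernel.

```python
import re
from typing import NamedTuple, Optional

# Pattern derived from the one sample line in this ticket. Other kernel
# versions may word the message differently (assumption).
AGFL_RESET = re.compile(
    r"XFS \((?P<device>[^)]+)\): WARNING: Reset corrupted AGFL on AG "
    r"(?P<ag>\d+)\. (?P<blocks>\d+) blocks leaked"
)


class AgflReset(NamedTuple):
    device: str         # block device name as printed by the kernel, e.g. "dm-6"
    ag: int             # allocation group whose free list was reset
    blocks_leaked: int  # number of blocks the kernel reports as leaked


def parse_agfl_reset(line: str) -> Optional[AgflReset]:
    """Return the parsed warning, or None if the line is not an AGFL reset."""
    match = AGFL_RESET.search(line)
    if match is None:
        return None
    return AgflReset(
        device=match.group("device"),
        ag=int(match.group("ag")),
        blocks_leaked=int(match.group("blocks")),
    )
```

Fed the sample line, `parse_agfl_reset` yields the device `dm-6`, allocation group 24 and 6 leaked blocks; any other line yields `None`.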

      There is a KCS article [2] for this issue. However, the existence of these warnings is problematic, and I don't think we can say definitively that this issue can never cause a problem, even though the probability seems extremely low and no issues have yet been seen in the field.

      [1] xfs: detect agfl count corruption and reset agfl 
      https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a27ba2607e60312554cbcd43fc660b2c7f29dc9c
      [2] "WARNING: Reset corrupted AGFL" seen in logs on RHEL
      https://access.redhat.com/solutions/7049129

      Notes: The XFS AGFL problem was not found early enough in the Leapp lifecycle for RHEL 7 to RHEL 8 to be addressed during normal development, but it has recently begun causing a support burden, possibly due to RHEL 7 transitioning to EOL. A broader meta-story may be what options Red Hat support or customers have for addressing specific issues outside of the main Leapp development cycle.

      Risks: Block layer problems should always be resolved before running fsck or repair, for example when a volume is only partially assembled by LVM or dmraid. Traditionally LVM refuses to activate a partial volume, but this case should still be taken into account.
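One way to guard against the LVM case is to inspect the `lv_attr` field reported by `lvs` before any check or repair is attempted. This is a minimal sketch, assuming the ninth attribute character is the volume health flag and that `p` there marks a partial volume, as described in lvs(8); the helper names and the sample volume names are hypothetical, and dmraid is not covered.

```python
from typing import Dict, List


def lv_is_partial(lv_attr: str) -> bool:
    """True if the lv_attr string marks the logical volume as partial.

    Assumption: position 9 (index 8) of lv_attr is the volume health
    character and "p" means one or more physical volumes are missing.
    """
    return len(lv_attr) >= 9 and lv_attr[8] == "p"


def partial_volumes(attrs_by_lv: Dict[str, str]) -> List[str]:
    """Return the names of logical volumes that must not be checked or repaired yet."""
    return sorted(name for name, attr in attrs_by_lv.items() if lv_is_partial(attr))
```

A caller would collect the attribute strings from `lvs` output and skip, or block the upgrade for, any volume returned by `partial_volumes`.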

      Acceptance Criteria

      A list of verification conditions, successful functional tests, or expected outcomes in order to declare this story/task successfully completed.

      • Upgrade can deal with small problems on local filesystems, including XFS, ext3, ext4 and FAT
        • Upgrade can repair the AGFL example above (see the KCS solution for a synthetic reproducer).
        • Upgrade can repair EFI FAT problems
        • Upgrade can repair ext4 problems
      • Upgrade can detect and report on problems (fsck -n, xfs_repair -n).
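The detect-and-report criterion could start from a mapping of filesystem type to a no-modify check command. The sketch below only builds the command line and runs nothing. It uses the two invocations named above (`xfs_repair -n` and `fsck -n`); the function name, the set of accepted type strings and the device paths are hypothetical. The kernel warning quoted earlier asks for the filesystem to be unmounted before `xfs_repair` is run, so a real implementation would have to check that first.

```python
from typing import List

# Filesystem types that the sketch hands to the generic fsck front end
# (assumption: these are the type strings the caller reports).
FSCK_TYPES = ("ext3", "ext4", "vfat")


def readonly_check_command(fstype: str, device: str) -> List[str]:
    """Build a check command that reports problems without modifying the device."""
    fstype = fstype.lower()
    if fstype == "xfs":
        # -n: no-modify mode, xfs_repair only reports what it would fix.
        return ["xfs_repair", "-n", device]
    if fstype in FSCK_TYPES:
        # -n is passed through to the filesystem-specific checker.
        return ["fsck", "-n", device]
    raise ValueError(f"no read-only check known for filesystem type {fstype!r}")
```

Returning the argument list rather than executing it keeps the decision logic testable without root access or real block devices.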

       

              leapp-notifications
              rhn-support-ddouwsma Donald Douwsma
              RHEL Upgrades QE Team
              Miriam Portman