
    • Story
    • Resolution: Unresolved
    • Major
    • 2.11.0
    • None
    • Documentation
    • None
    • Quality / Stability / Reliability
    • False
    • False
    • Not Selected

      Data Integrity Enhancements in MTV (RHEL 10.1)

      This feature ticket outlines initiatives to strengthen data integrity and consistency during VM migration with the Migration Toolkit for Virtualization (MTV), building on features and code hardening introduced in RHEL 10.1.

      The core philosophy driving these changes is to detect and report potential data issues as early as possible, before the VM is handed over to the customer.

      1. Filesystem-Level Consistency

      With the move to RHEL 10.1, MTV adds a consistency check at the filesystem layer:

      • Pre- and Post-Conversion fsck: We are adding fsck (filesystem check) runs both before the conversion starts and after it completes, so that filesystem-level consistency is validated on both sides of the migration boundary.
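
The pre- and post-conversion check described above can be sketched as follows. This is a minimal illustration, not MTV's implementation: the function names, the `convert` callable, and the use of `fsck -n` (a read-only check, honored by some filesystem-specific checkers) are assumptions. The exit-status bits are the ones documented in fsck(8).

```python
import subprocess

# Exit-status bits documented in fsck(8); the status is a bitwise OR of these.
FSCK_BITS = {
    1: "errors corrected",
    2: "system should be rebooted",
    4: "errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "cancelled by user request",
    128: "shared-library error",
}


def describe_fsck_status(status: int) -> list[str]:
    """Translate an fsck exit status into human-readable findings."""
    return [msg for bit, msg in FSCK_BITS.items() if status & bit]


def check_filesystem(device: str) -> int:
    """Run a read-only fsck on the device and return its exit status.

    Assumption: '-n' asks the filesystem-specific checker not to repair.
    """
    result = subprocess.run(["fsck", "-n", device], capture_output=True, text=True)
    return result.returncode


def convert_with_checks(device: str, convert, check=check_filesystem) -> None:
    """Hypothetical wrapper: check, convert, then check again.

    'convert' stands in for the real conversion step and is not an MTV API.
    """
    pre = check(device)
    if pre != 0:
        raise RuntimeError(f"pre-conversion fsck failed: {describe_fsck_status(pre)}")
    convert(device)
    post = check(device)
    if post != 0:
        raise RuntimeError(f"post-conversion fsck failed: {describe_fsck_status(post)}")
```

Treating any non-zero status as a failure is a deliberately strict choice for this sketch; a real workflow might accept status 1 (errors corrected) after a repair pass.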

      2. Enhanced System Error Handling

      A comprehensive review of the codebase was conducted to eliminate previously mistaken assumptions about system call behavior:

      • Mandatory Return Code Checks: The return code of every system call is now checked, and the operation errors out on failure. This removes earlier, incorrect logic in which some system calls were treated as merely advisory, and makes the migration more robust against transient system or storage errors.
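
As an illustration of the principle above, the sketch below copies a file without treating any system call as advisory. It is not taken from the MTV codebase; `write_all` and `copy_and_sync` are hypothetical names. In Python a failed system call raises `OSError`, so the equivalent of checking return codes is to never swallow that exception and to handle short writes from `os.write` explicitly.

```python
import os


def write_all(fd: int, data: bytes) -> None:
    """Write every byte of data, looping because os.write may write less."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)  # raises OSError on failure
        view = view[written:]


def copy_and_sync(src_path: str, dst_path: str, chunk_size: int = 1 << 20) -> int:
    """Copy src to dst, flush to stable storage, and return bytes copied.

    Any failing call (open, read, write, fsync, close) propagates as OSError
    instead of being ignored.
    """
    total = 0
    with open(src_path, "rb") as src:
        fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            while chunk := src.read(chunk_size):
                write_all(fd, chunk)
                total += len(chunk)
            os.fsync(fd)  # a storage error surfacing here must fail the copy
        finally:
            os.close(fd)
    return total
```

The `fsync` call matters here: write errors on some storage backends are only reported when data is flushed, so skipping the check at that point could hide a failed copy.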

      3. Internal Diagnostics and Auditing

      While we maintain the assumption that underlying network and storage layers manage their own integrity, we are adding an internal audit mechanism:

      • Blkhash Checksum: We compute a blkhash (a type of checksum) during the disk copying phase. This is primarily an internal diagnostic tool, crucial for our support team to efficiently identify and troubleshoot cases where customers suspect storage corruption during or after migration.
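
The idea of computing a checksum while the disk is being copied can be sketched as below. This is a simplified block-based digest built on `hashlib`, written only to illustrate the concept; it is not the actual blkhash algorithm and does not produce blkhash-compatible output. The function names and the 64 KiB block size are assumptions.

```python
import hashlib
import io

BLOCK_SIZE = 64 * 1024  # assumed block size for this sketch


def copy_with_block_hash(src, dst, block_size: int = BLOCK_SIZE) -> str:
    """Copy src to dst block by block and return a digest of the block digests.

    Assumes src.read() returns full blocks until end of stream, which holds
    for buffered file objects and io.BytesIO.
    """
    outer = hashlib.sha256()
    while True:
        block = src.read(block_size)
        if not block:
            break
        dst.write(block)
        # Hash each block, then fold the block digest into the outer hash.
        outer.update(hashlib.sha256(block).digest())
    return outer.hexdigest()


def block_hash(stream, block_size: int = BLOCK_SIZE) -> str:
    """Compute the same digest for an existing image without keeping a copy."""
    return copy_with_block_hash(stream, io.BytesIO(), block_size)
```

A support engineer could compare the digest recorded during the copy with one recomputed over the destination disk: a mismatch points at corruption after the copy, while a match suggests the copied data arrived intact.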

      4. Downtime Minimization (Contextual Features)

      While this ticket focuses on data integrity, MTV continues to support methods for minimizing the impact of the migration cutover:

      • Warm Migration: We allow warm migrations, which utilize snapshots to migrate underlying disks while the source VM remains operational, significantly reducing final downtime.
      • Storage Offload: The Storage Offload feature (available via Transfer Provider, TP) facilitates highly efficient disk migration by performing the data transfer directly within the storage array itself.
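
Why a snapshot-based warm migration shortens downtime can be shown with a small in-memory simulation. This is a conceptual sketch only: it assumes two equal-sized snapshots held as byte strings and finds differing blocks by direct comparison, which is not how MTV or any hypervisor tracks changes. All names are hypothetical.

```python
def changed_offsets(old: bytes, new: bytes, block_size: int) -> list[int]:
    """Return the offsets of blocks that differ between two snapshots."""
    if len(old) != len(new):
        raise ValueError("snapshots must have the same size")
    return [
        off
        for off in range(0, len(new), block_size)
        if old[off:off + block_size] != new[off:off + block_size]
    ]


def apply_delta(target: bytearray, snapshot: bytes, offsets: list[int], block_size: int) -> int:
    """Copy only the listed blocks from snapshot into target; return bytes copied."""
    copied = 0
    for off in offsets:
        block = snapshot[off:off + block_size]
        target[off:off + len(block)] = block
        copied += len(block)
    return copied


def warm_migrate(first_snapshot: bytes, final_snapshot: bytes, block_size: int = 4096):
    """Simulate a warm migration: full copy while running, delta at cutover."""
    # Phase 1: the source VM keeps running while the first snapshot is copied.
    target = bytearray(first_snapshot)
    # Phase 2: at cutover, only blocks changed since the first snapshot move.
    offsets = changed_offsets(first_snapshot, final_snapshot, block_size)
    cutover_bytes = apply_delta(target, final_snapshot, offsets, block_size)
    return bytes(target), cutover_bytes
```

In this simulation the amount of data moved during cutover depends on how much the disk changed between snapshots, not on the disk size, which is the property that reduces final downtime.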

      Documentation Note

      As of the creation of this ticket, dedicated documentation detailing these specific data integrity practices and improvements is not available in the official product guides (e.g., in MTV 2.10 documentation). The output of this work should inform new or updated documentation sections.

      JTBD statement 

      When I am migrating critical virtual machines using MTV and RHEL 10.1 features, I want to leverage built-in filesystem consistency checks, diagnostic tools like blkhash checksums, and mandatory system call checks, so I can be confident in the integrity and consistency of my workload data while minimizing downtime through robust and reliable migration procedures.

      Personas

      1. As a Virtualization Administrator, I want to leverage filesystem-level consistency checks (pre- and post-conversion fsck) on my migrating workloads to achieve maximum confidence and verifiable data consistency for all migrated virtual machines.
      2. As a Migration Troubleshooting Specialist, I want to access internal diagnostic tools, such as blkhash checksums and mandatory return code checks, to quickly and accurately identify and resolve the root cause of any storage corruption or reliability issues during transfer.
      3. As a Service Owner, I want to migrate critical applications that require high data integrity assurance to minimize service downtime through robust warm migrations while still guaranteeing the safety and consistency of the application data.

       

              rhn-support-anarnold A Arnold
              rhn-support-anarnold A Arnold