OADP-7057 (project: OpenShift API for Data Protection)

[DOC] VMFR (VM Single File Restore)


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major
    • OADP 1.6.0
    • None
    • Documentation
    • None
    • Quality / Stability / Reliability

      The Virtual Machine File Restore (VMFR) project is a new capability under the OADP umbrella, designed to provide file-level recovery for OpenShift Virtual Machines (VMs).

      VMFR's primary goal is to enable fast, simple, and efficient data recovery of specific files directly from VM backups without needing to restore the entire VM. This recovery process can even be performed without starting the original VM.

      Key Features and Components

      Declarative Recovery

      VMFR takes a cloud-native, declarative approach: users search for and recover specific files across multiple backups using only custom resources defined by Custom Resource Definitions (CRDs).

      Process Overview

      The VMFR process has two phases, discovery and restore, each driven by its own CRD:

      • VM Backup Discovery (VMBBD) CRD: Used to query available VM backups within a specified namespace and time frame. The controller updates this object with a list of discovered backups.
      • VM File Restore (VMFR) CRD: Uses the discovered backup list to specify how the desired files will be served to the user.
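
      The discovery phase described above can be pictured as a filter over known backups by namespace and time frame. The following Python sketch is illustrative only: `BackupRecord`, `discover_backups`, and all sample names are hypothetical and do not reflect the actual VMBBD schema or controller code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class BackupRecord:
    """Hypothetical, simplified view of one VM backup (not the real Velero schema)."""
    name: str
    namespace: str
    completed_at: datetime


def discover_backups(backups, namespace, start, end):
    """Return the names of backups for `namespace` completed within [start, end].

    Results are ordered oldest first, mimicking the list that the controller
    is described as writing back to the discovery object.
    """
    matches = [
        b for b in backups
        if b.namespace == namespace and start <= b.completed_at <= end
    ]
    return [b.name for b in sorted(matches, key=lambda b: b.completed_at)]


# Hypothetical sample data for illustration.
UTC = timezone.utc
SAMPLE_BACKUPS = [
    BackupRecord("nightly-2025-01-10", "vms-prod", datetime(2025, 1, 10, tzinfo=UTC)),
    BackupRecord("nightly-2025-01-01", "vms-prod", datetime(2025, 1, 1, tzinfo=UTC)),
    BackupRecord("nightly-2025-01-05", "vms-test", datetime(2025, 1, 5, tzinfo=UTC)),
]
```

      Calling `discover_backups(SAMPLE_BACKUPS, "vms-prod", start, end)` with a January 2025 window would return the two `vms-prod` backups, oldest first. A user would then reference one of those names when requesting the restore.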

      The Restore Phase ("The Magic")

      The architectural steps ensure the original VM remains unaffected and secure:

      1. Isolation: A temporary isolated namespace is created for the restore request and file serving.
      2. Velero Restore: Velero restores the Persistent Volume Claim (PVC) from S3 into this temporary namespace, where the OpenShift Velero plug-in renames the PVC to prevent collisions.
      3. Deployment: VMFR deploys a three-container pod, where the Main Container uses libguestfs and guestmount to create FUSE mounts from the raw block devices.
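
      As a rough illustration of the mount step, the sketch below builds a read-only `guestmount` command line for a raw block device. The flags (`--format`, `--add`, `--inspector`, `--ro`) are standard guestmount options, but the device path, the mount point, and the helper name `guestmount_command` are assumptions; the exact invocation used by the VMFR main container is not stated in this ticket.

```python
import shlex


def guestmount_command(device: str, mountpoint: str) -> list[str]:
    """Build (but do not run) a read-only guestmount invocation.

    `device` is a raw block device exposed to the pod and `mountpoint` is
    where the FUSE mount appears; both values are hypothetical here.
    """
    return [
        "guestmount",
        "--format=raw",   # must precede --add: it applies to the drives that follow
        "--add", device,  # attach the restored disk
        "--inspector",    # let libguestfs detect and mount the guest filesystems
        "--ro",           # mount read-only so restored data cannot be modified
        mountpoint,
    ]


def as_shell_string(command: list[str]) -> str:
    """Render the argument list as a safely quoted shell string for logging."""
    return shlex.join(command)
```

      For example, `as_shell_string(guestmount_command("/dev/vm-disk-0", "/mnt/restore"))` yields a single command string suitable for logging. Building the command as an argument list, rather than a concatenated string, avoids shell quoting problems if it is later passed to `subprocess.run`.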

      File Serving Options

      VMFR offers two secure ways for users to access the read-only mounted data:

      • Web-based File Browser (HTTPS): Best for interactive browsing, previewing, and zip downloads. Access is protected by a strong password that the controller generates.
      • Limited SSH Mode: Supports automation, bulk downloads, and power users via SCP, rsync, or SFTP operations. Access is restricted to internal cluster networking and uses key-based authentication (public keys only).
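
      The two authentication styles above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the controller's actual logic: `generate_browser_password` and `looks_like_ssh_public_key` are hypothetical helpers, and the ticket does not describe how VMFR generates passwords or validates keys.

```python
import base64
import secrets
import struct


def generate_browser_password(num_bytes: int = 24) -> str:
    """Return a random URL-safe password drawn from a cryptographic RNG."""
    if num_bytes < 16:
        raise ValueError("use at least 16 random bytes for a strong password")
    return secrets.token_urlsafe(num_bytes)


def looks_like_ssh_public_key(line: str) -> bool:
    """Check that `line` has the OpenSSH public key shape: '<type> <base64> [comment]'.

    The decoded blob must start with a length-prefixed copy of the key type,
    which rejects private keys and arbitrary text.
    """
    parts = line.split()
    if len(parts) < 2:
        return False
    key_type, blob_b64 = parts[0], parts[1]
    try:
        blob = base64.b64decode(blob_b64, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    if len(blob) < 4:
        return False
    (type_len,) = struct.unpack(">I", blob[:4])
    return blob[4:4 + type_len] == key_type.encode("ascii", errors="replace")
```

      A check like `looks_like_ssh_public_key` only validates the format of a submitted key; it says nothing about who owns the key, which is why access control still relies on the other security layers.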

      Security Layers

      VMFR follows a defense-in-depth approach with four security layers:

      • Isolation: Achieved through a temporary namespace per restore request and an SSH chroot jail mechanism.
      • Read-Only: All file serving mounts are set as read-only at the FUSE level.
      • Authentication: Uses public keys for SSH and strong passwords for the web browser.
      • Authorization: Leverages Kubernetes Role-Based Access Control (RBAC).

       

      1. Discover (Awareness and Introduction) 

      This stage introduces the problem VMFR solves and its core value proposition.

      Focus | Content Goal | Key VMFR Topics
      Why VMFR? | Define the limitations of traditional VM backup recovery and introduce VMFR as the solution. | Headline Benefit: Cloud-native, file-level recovery directly from VM backups.
      Core Value | Summarize the key user benefits for data recovery. | Recovery is faster, simpler, and more efficient. Recovery is possible without starting the VM, which prevents accidental file changes.
      Declarative Approach | Highlight the modern, cloud-native method of operation. | VMFR uses only Custom Resource Definitions (CRDs) to search for and recover specific files.

      2. Learn (Prerequisites and Principles) 

      This stage details the architecture, security models, and required components for the recovery process.

      Focus | Content Goal | Key VMFR Topics
      CRD-Driven Process | Explain the two main components that drive the process. | Introduce the two CRDs: VM Backup Discovery (VMBBD) CRD (for querying backups) and VM File Restore (VMFR) CRD (for specifying file serving).
      VMFR Architecture | Detail the isolation and restoration mechanisms. | Explain the use of PVC UID labels to find VM disks in S3 and the creation of a temporary isolated namespace for recovery.
      Core Recovery Engine | Explain the "magic" behind mounting the volume. | Detail the three-container pod deployment and the use of libguestfs and guestmount in the main privileged container to create read-only FUSE mounts.
      Security Principles | Introduce the robust defense-in-depth security model. | Cover the four layers: Authentication, Authorization (RBAC), Read-Only access, and Isolation (namespace, chroot jail).

      3. Try (Initial File Recovery) 

      This stage provides the end-to-end, hands-on user experience for recovering a file using the simplest method.

      Focus | Content Goal | Key VMFR Topics
      File Discovery Walkthrough | Guide the user through finding the specific backup they need. | Instructions and YAML examples for using the VMBBD CRD to specify a time frame and list discovered backups.
      Initiating the Restore | Guide the user on creating the recovery environment. | Walkthrough of creating the VMFR CRD to specify which backup to serve files from.
      Web Browser Access | Guide the user through the easiest method for file retrieval. | Instructions for accessing the Web-based File Browser Interface (HTTPS), including how to use the controller-generated strong password for interactive browsing and downloads.
      Cleaning Up | Simple instructions for tearing down the temporary recovery environment. | Steps to delete the VMFR and VMBBD resources, which removes the temporary namespace and stops file serving.

      4. Adopt (Advanced Usage and Security) 

      This stage explores the power-user features, automation, and security controls for file retrieval.

      Focus | Content Goal | Key VMFR Topics
      Advanced Discovery | Detail options for more specific backup querying. | Using the VMBBD CRD to include the very first backup or to select backups across multiple time frames.
      Bulk Recovery/Automation | Detail the methods for non-interactive, bulk data retrieval. | Instructions for setting up and using the Limited SSH Mode via SFTP, SCP, or rsync.
      SSH Security | Stress the security restrictions on the command-line access. | Emphasize that only key-based authentication (public keys) is allowed, password access is disabled, and access is restricted to internal cluster networking. The SSH sidecar is chrooted.
      File Browser Features | Document the benefits of the web interface. | Detail features like interactive browsing, file preview, and zip downloads.

       

      5. Expand (Management and Maintenance) 

      This stage provides guidance on the components involved in the lifecycle and troubleshooting.

      Focus | Content Goal | Key VMFR Topics
      Understanding PVC Management | Detail what happens to the VM's data during the restore phase. | Explain how Velero restores the PVC into the temporary namespace and how the plug-in renames the PVC to prevent naming collisions (the original VM is completely unaffected).
      CRD Reference | Provide a full reference for the spec and status fields of both CRDs. | Full detail on VMBBD and VMFR spec fields and how to interpret the status output.
      Troubleshooting & Debugging | Solutions for issues during the discovery or serving phase. | Troubleshooting guides for Velero restore errors, deployment failures in the temporary namespace, and access issues with the SSH/Web interfaces.

       
       
      JTBD statement: "As a VM Application Owner, I want to encrypt and back up all custom configurations and network share data (e.g., Ceph or NFS) using my own S3 credentials, so that I have full, independent control and ownership of my data and encryption keys, preventing administrator access and removing reliance on administrators for fast restoration."

      Personas

      The Virtual Machine File Restore (VMFR) feature enables faster, simpler, and more efficient file-level recovery directly from VM backups. Based on its capabilities and security controls, here are three distinct personas and their corresponding Jobs to Be Done (JTBD) statements:

      1. The Interactive User

      This persona prioritizes ease of use and quick validation for small, urgent recoveries.

      • As a VM Application User, I want to browse and preview the contents of a specific backup using a familiar web interface, so that I can quickly find and securely download a few lost or corrupted files without needing complex command-line tools.

      2. The Power User / Automation Specialist

      This persona focuses on speed, automation, and bulk operations for large recovery tasks.

      • As a DevOps Engineer, I want to transfer a large directory of configuration files from a historical backup using a tool like rsync or SFTP, so that I can automate the bulk recovery process quickly and reliably via the restricted, key-based SSH access channel.

      3. The Recovery / Forensic Specialist

      This persona values security, isolation, and ensuring the integrity of the recovered data.

      • As a Data Recovery Specialist, I want to access the backed-up VM files without starting the original VM or affecting any current cluster resources, so that the files I recover are tamper-free and the recovery operation is completely isolated to a temporary namespace, maintaining data integrity and security.

       

              rhn-support-anarnold A Arnold