OpenShift API for Data Protection / OADP-7058

[DOC] VMDP (Virtual Machine Data Protection) / Kopia client on VM (CNV)


    • Icon: Task Task
    • Resolution: Unresolved
    • Icon: Major Major
    • OADP 1.6.0
    • None
    • Documentation
    • None
    • Quality / Stability / Reliability

      OADP Virtual Machine Data Protection (VMDP) Design Implementation

      Based on OADP Operator PR: #1845

      Replaces Previous Designs: #1827, #1830

      Overview

      The Virtual Machine Data Protection (VMDP) feature introduces an OpenShift-native, client-server architecture that enables file-level backup and restore operations initiated from within OpenShift Virtual Machines. The system integrates with OpenShift APIs and the existing OADP infrastructure, while preserving a clear separation of responsibilities between cluster administrators and VM users.

      Architecture Overview

      VMDP implements a client-server model where:
      • Backup Server: Deployed and managed by the OADP Operator using Kopia server technology
      • VM Clients: File-level backup operations initiated from within VMs
      • Repository Management: Integrated with existing OADP backup storage locations
      • Authentication: User-based access control with repository-level security

      Key Features

      File-Level Backup and Restore

      • User-initiated backup operations from within VMs
      • Selective file and directory backup capabilities
      • Point-in-time restore functionality
      • Integration with Kopia backup technology

      OpenShift Native Integration

      • Seamless integration with OADP infrastructure
      • BackupStorageLocationServer (BSLS) CRD for configuration
      • OpenShift API compatibility
      • Service-based architecture within OADP namespace

      Security and Access Control

      • User-based authentication (username/password)
      • Repository-level access control
      • TLS encryption for client-server communication
      • Secure credential management

      Technical Implementation

      BackupStorageLocationServer CRD

      • New custom resource for backup server configuration
      • Integration with existing DataProtectionApplication (DPA)
      • Automated deployment and lifecycle management
      • TLS certificate and fingerprint management
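
The fingerprint management item above suggests that clients pin the backup server's TLS certificate by its fingerprint. The following is a minimal sketch of how such a fingerprint could be computed and compared. It assumes the fingerprint is the SHA-256 digest of the DER-encoded certificate, written as lowercase hex; the source does not state the exact format VMDP uses. The function names `cert_fingerprint` and `fingerprint_matches` are hypothetical.

```python
import hashlib
import hmac
import ssl


def cert_fingerprint(pem_cert: str) -> str:
    """Return the SHA-256 fingerprint (lowercase hex) of a PEM certificate."""
    # Convert the PEM text to raw DER bytes, then hash those bytes.
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_cert)
    return hashlib.sha256(der_bytes).hexdigest()


def fingerprint_matches(pem_cert: str, expected: str) -> bool:
    """Compare a certificate against an expected fingerprint.

    Accepts fingerprints written with or without ':' separators (assumption).
    """
    normalized = expected.replace(":", "").strip().lower()
    return hmac.compare_digest(cert_fingerprint(pem_cert), normalized)
```

A client could call `fingerprint_matches` with the certificate presented by the server and the fingerprint published by the administrator, and refuse to connect on a mismatch.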

      Client-Server Communication

      • Kopia server deployed as part of OADP operator
      • Network connectivity requirements for VM-to-backup-server communication
      • Authentication flow: VM User -> [username/password] -> Backup Server -> [repo password] -> Repository
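
The authentication flow above can be modelled in a few lines. This is a toy sketch, not the Kopia server implementation: it assumes the server stores salted password hashes per user and keeps the repository password to itself, so VM users only ever present their own username and password. The class `BackupServer` and its methods are hypothetical names used for illustration.

```python
import hashlib
import hmac
import os


def _hash_password(password: str, salt: bytes) -> bytes:
    # Salted PBKDF2 hash so plain-text user passwords are never stored.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


class BackupServer:
    """Toy stand-in for the backup server; not the real Kopia server."""

    def __init__(self, repo_password: str) -> None:
        self._repo_password = repo_password  # known only to the server
        self._users: dict[str, tuple[bytes, bytes]] = {}

    def add_user(self, username: str, password: str) -> None:
        salt = os.urandom(16)
        self._users[username] = (salt, _hash_password(password, salt))

    def authenticate(self, username: str, password: str) -> bool:
        record = self._users.get(username)
        if record is None:
            return False
        salt, expected = record
        return hmac.compare_digest(_hash_password(password, salt), expected)

    def open_repository(self, username: str, password: str) -> str:
        if not self.authenticate(username, password):
            raise PermissionError("invalid username or password")
        # The server, not the VM user, presents the repository password here.
        return f"repository opened for {username}"
```

The point of the split is that a VM user's credentials never unlock the repository directly; the server mediates every access.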

      Repository Management

      • Integration with existing OADP backup storage locations
      • Multi-user repository access
      • Backup retention and lifecycle policies
      • Storage backend compatibility (S3, etc.)

      Design Considerations

      Prerequisites

      • Internal OpenShift networking must allow VMs to connect to Backup Service in OADP namespace
      • Compatible container image for backup server (official Kopia or custom OADP extended image)
      • Image override capabilities via DPA configuration

      User Experience

      • Command-line interface for VM users
      • Preflight connectivity checks for troubleshooting
      • Network connectivity validation
      • Service availability verification
      • Performance and bandwidth checks
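
A preflight connectivity check of the kind listed above could be as simple as a timed TCP connection attempt from the VM to the backup service. The sketch below is an assumption about how such a check might work, not the VMDP client's actual behaviour; `can_connect` is a hypothetical helper, and the host and port of the backup service are not specified in the source.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and opens the socket in one step.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

A failing check points to network policy or service availability rather than credentials, which narrows troubleshooting before any backup is attempted.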

      Administrator Controls

      • Integration with DPA configuration or standalone CRD management
      • Repository quota and storage cap considerations
      • Monitoring and observability features

      Scope and Limitations

      In Scope

      • File-level backup and restore from within VMs
      • User-initiated backup operations
      • Integration with OADP infrastructure
      • Kopia server deployment and management
      • OpenShift-native API integration

      Out of Scope

      • Full VM Protection: Does not modify or replace OADP's snapshot-based backup for entire VMs
      • Application Quiescing: No application-level consistency mechanisms (user responsibility)
      • Block-Level Operations: No support for raw block devices or unmounted partitions
      • Graphical User Interface: No OpenShift Console integration (security and complexity considerations)
      • Storage Quotas: No native mechanisms for enforcing storage quotas at repository level (future enhancement)

      Security Considerations

      Attack Surface Reduction

      • No web UI to minimize CVE impact
      • Limited network exposure
      • Secure authentication mechanisms
      • TLS encryption for all communications

      Multi-User Repository Management

      • Repository-level access control
      • Prevention of cross-user interference
      • Secure credential isolation
      • Configuration protection mechanisms

      Testing Strategy

      • Unit tests for VMDP components
      • Integration tests with OpenShift VMs
      • End-to-end backup and restore validation
      • Network connectivity and security testing
      • Performance benchmarking
      • Multi-user scenario validation

      Target Release

      • Implementation target: OADP 1.6.0
      • Technology Preview phase initially
      • Future enhancements for storage quotas and UI integration

      Benefits

      • Enhanced backup capabilities for VM workloads
      • User-driven backup operations without cluster admin involvement
      • Seamless integration with existing OADP infrastructure
      • Improved data protection for virtualized applications
      • Flexible file-level restore capabilities

       

      The Virtual Machine Data Protection (VMDP) feature implements a true Zero Trust model for VM data protection, shifting the responsibility and control of backup operations directly to the VM user.

      Core Principles

      • Zero Trust Architecture: The fundamental design principle is zero trust: administrators cannot access or restore the backup data. The VM user owns the data, the backups, and the encryption keys required for restoration, and controls the full lifecycle of that data.
      • User Autonomy: The user is solely responsible for choosing what data is backed up and restored. This includes the critical capability to protect data accessible over network file systems, such as Ceph or NFS shares, which are typically excluded by standard OADP backups.

      Implementation and Workflow

      • Client Application: The feature provides a pre-built, statically linked OADP VMDP client. This client is a reworked version of the Kopia CLI, built specifically for VMDP operations.
      • Client Access: The client can be easily and securely downloaded directly to the VM via an in-cluster HTTP service.
      • Personal Repository: To ensure zero trust, the user must provide their own credentials to create and manage a personal, encrypted repository within the designated S3 storage backend.
      • Simplified Backup: Backing up is streamlined to a single command, allowing users to protect any folders and files they can access. This includes local data and external network shares (Ceph/NFS).
      • Efficiency: The client uses Kopia's deduplication. Data is stored efficiently, and subsequent backup runs stay fast even after large file changes or copies, because only new or changed data blocks are transmitted.
      • Restoration Flexibility: Restoring files to a VM is a fast operation. The user has control over the restore location, allowing them to restore to a different folder or location, and can choose to restore from a specific, historical backup point.
      • Architecture: Based on Zero Trust Architecture principles, meaning the VM user is fully responsible for backup and restore operations and owns their data and keys.
      • Client: OADP provides a pre-built, statically linked VMDP client, which is a reworked version of the Kopia CLI.
      • Documentation: Expected to follow the style of the oadp-cli documentation.
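
The efficiency claim in the list above rests on deduplication: data is split into chunks, each chunk is identified by its hash, and a chunk that the repository already holds is not sent again. The sketch below illustrates the idea under simplifying assumptions: it uses fixed-size chunks and an in-memory dictionary, whereas Kopia's actual splitter and storage format differ. `ChunkStore` and `CHUNK_SIZE` are hypothetical names.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose so the example is easy to follow


class ChunkStore:
    """Toy content-addressed store: chunk ID (SHA-256 hex) -> chunk bytes."""

    def __init__(self) -> None:
        self.chunks: dict[str, bytes] = {}

    def backup(self, data: bytes) -> tuple[list[str], int]:
        """Store data; return (manifest of chunk IDs, number of new chunks)."""
        manifest = []
        uploaded = 0
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            chunk_id = hashlib.sha256(chunk).hexdigest()
            if chunk_id not in self.chunks:
                # Only chunks the store has never seen are "transmitted".
                self.chunks[chunk_id] = chunk
                uploaded += 1
            manifest.append(chunk_id)
        return manifest, uploaded

    def restore(self, manifest: list[str]) -> bytes:
        """Rebuild the original data from its manifest."""
        return b"".join(self.chunks[chunk_id] for chunk_id in manifest)
```

Backing up `b"AAAABBBBCCCC"` and then `b"AAAABBBBDDDD"` uploads three chunks the first time and one the second, and backing up an unchanged copy uploads none.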

       

      1. Discover (Awareness and Introduction)

      Focus | Content Goal | Key VMDP Topics
      Why VMDP? | Introduce the feature and address common user pain points related to internal VM data protection. | Headline Benefit: Secure, user-controlled data protection under a Zero Trust model.
      Conceptual Overview | Briefly explain what VMDP is and its core difference from standard platform backups. | Emphasize that the VM user, not the administrator, owns the backup and the keys.
      High-Level Features | List the key advantages to capture the user's interest. | Deduplication (Efficiency), single-command simplicity, and ability to protect Ceph/NFS shares.

      2. Learn (Prerequisites and Principles)

      Focus | Content Goal | Key VMDP Topics
      Zero Trust Deep Dive | Detail the security architecture and the shift in data ownership/responsibility. | Core Principle: The user takes full responsibility. Explain why administrators cannot access or restore the data.
      Prerequisites | Outline everything required before the user can begin, from both the administrative and VM perspective. | Cluster Admin setup (S3 backend enablement), VM network access, and the user's requirement for personal S3 credentials.
      Client Mechanics | Introduce the tool the user will interact with. | Define the OADP VMDP Client as the reworked Kopia CLI. Explain it is statically linked and downloaded via the in-cluster HTTP service.

      3. Try (Initial Setup and First Backup)

      Focus | Content Goal | Key VMDP Topics
      Client Access & Setup | Provide the exact commands and steps for getting started. | Step-by-step guide for downloading the client to the VM (e.g., using curl).
      Repository Creation | Guide the user through setting up their private, encrypted storage. | Detailed steps for using user-provided credentials to create and initialize the personal, encrypted repository in S3 storage.
      First Backup Walkthrough | A simple, end-to-end example of protecting local data. | The specific single command example for running the first backup of a local folder (e.g., /home/user/app_data).

      4. Adopt (Advanced Usage and Security)

      Focus | Content Goal | Key VMDP Topics
      Protecting External Data | Show the feature's unique capability to secure complex data sources. | Crucial Example: Specific command examples for backing up data on mounted network file systems like Ceph and NFS shares.
      Backup Efficiency | Explain how subsequent backups work to build user confidence in performance. | Explain Kopia's deduplication and why subsequent runs are very fast, only transferring changed blocks.
      Restoration Workflow | Detail the process of retrieving data, focusing on user choice. | Step-by-step guide for restoring: basic restore, restoring to a different location, and selecting a specific backup point in time.
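
Selecting a specific backup point in time, as the restoration workflow above describes, comes down to picking the newest backup taken at or before the moment the user wants to return to. A minimal sketch of that selection rule, assuming backups are identified by their creation timestamps; `pick_snapshot` is a hypothetical helper and not part of the VMDP client.

```python
from datetime import datetime
from typing import Optional


def pick_snapshot(
    snapshot_times: list[datetime], point_in_time: datetime
) -> Optional[datetime]:
    """Return the newest snapshot taken at or before point_in_time, or None."""
    candidates = [t for t in snapshot_times if t <= point_in_time]
    # default=None covers the case where every snapshot is too recent.
    return max(candidates, default=None)
```

If no backup exists at or before the requested time, the function returns `None`, which a restore guide would surface as "no matching backup point".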

      5. Expand (Management and Maintenance)

      Focus | Content Goal | Key VMDP Topics
      Repository Management | Instructions for managing the personal S3 repository over time. | Key commands for checking repository status, listing existing backups, and potentially removing old/unneeded backups (data lifecycle).
      Troubleshooting | Provide solutions for common issues encountered by VM users. | Addressing client download failures, credential errors, S3 connectivity problems, and large file exclusions.
      Integration/Scripting | Ideas for automating VMDP backups in a production environment. | Guidance on integrating the single-command backup into VM startup/shutdown scripts or cron jobs for scheduled protection.
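
For the scripting guidance above, a scheduled job usually wraps the backup command so that each run leaves a timestamped log line and a meaningful exit code. The sketch below shows one possible wrapper. It is deliberately generic: the VMDP client's command name and arguments are not given in the source, so the command is passed in by the caller, and `run_backup` is a hypothetical helper.

```python
import subprocess
import sys
from datetime import datetime, timezone


def run_backup(command: list[str]) -> int:
    """Run a backup command, log the outcome, and return its exit code."""
    started = datetime.now(timezone.utc).isoformat(timespec="seconds")
    # capture_output keeps the client's output out of the log unless it fails.
    result = subprocess.run(command, capture_output=True, text=True)
    status = "ok" if result.returncode == 0 else "FAILED"
    print(f"{started} backup {status} (exit {result.returncode})")
    if result.returncode != 0:
        print(result.stderr.strip(), file=sys.stderr)
    return result.returncode
```

A cron job or a startup/shutdown script could call `run_backup` with the actual backup invocation and pass the returned code to `sys.exit`, so that the scheduler sees failures.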

      JTBD statement

      "As a VM Application Owner, I want to encrypt and back up all custom configurations and network share data (e.g., Ceph or NFS) using my own S3 credentials, so that I keep full, independent control and ownership of my data and encryption keys, prevent administrator access, and can restore quickly without relying on an administrator."

       

      Personas 
      1. The Primary User (Application Owner)

      This persona focuses on the core benefit of control and independence offered by the Zero Trust model.

      • As a VM Application Owner, I want to encrypt and back up all custom configurations and critical network share data (like Ceph or NFS) using my own S3 credentials, so that I keep full, independent control and ownership of my data and encryption keys, prevent administrator access, and can restore quickly without relying on an administrator.

      2. The Efficiency-Focused User

      This persona prioritizes operational speed and minimal resource usage.

      • As a Production Engineer, I want to run subsequent data protection jobs with a single command, even after copying large numbers of files, so that my daily backups complete quickly and efficiently, with deduplication ensuring I don't waste time or storage space on redundant data.

      3. The Security & Recovery Specialist

      This persona focuses on reliable, immediate recovery that is protected from unauthorized access.

      • As a Recovery Specialist, I want to quickly restore specific files or folders from a chosen backup point to a different location within the VM, so that I can ensure immediate business continuity and verify that files are recoverable only by authorized users, in line with Zero Trust recovery principles.

       

              rhn-support-anarnold A Arnold