  1. Virtualization Strategy
  2. VIRTSTRAT-369

Suspend and resume virtual machine


      Feature Overview

      The Suspend and Resume with Memory Snapshots feature captures the complete execution state of a virtual machine (VM), including its memory contents, CPU state, and device states, so that the VM can be paused and later resumed from exactly that point. Unlike traditional VM snapshots, which primarily capture disk state, this feature preserves the live execution state for debugging, testing, and operational efficiency.

      Goals

      High-level Goal Statement

      Enable VM suspend and resume functionality with full memory snapshots to support debugging, troubleshooting, testing, and rapid state recovery without requiring guest OS reboot or reconfiguration.

      Who Benefits and How?

      • Developers: Gain access to in-memory states at the time of failure for deeper debugging and root cause analysis.
      • QA/Test Engineers: Can freeze execution at breakpoints, share exact program states across teams, and reproduce transient issues reliably.
      • Operations/Support Teams: Avoid repeated reboots when restoring a system, reducing downtime and accelerating issue resolution.
      • Customers: Save time and resources by transferring full VM states (not just disk images), preserving the working environment.

      Difference from Today’s State

      • Today: Snapshots primarily capture disk and system configuration. Memory and execution state are lost, requiring a reboot. Debugging transient issues is difficult and time-consuming.
      • With This Feature: Memory, CPU, and device states are preserved. VMs resume directly from the paused state without a reboot, enabling rapid recovery, reproducible debugging, and efficient collaboration.

      Requirements

      Requirement | Notes | MVP?
      Ability to capture full VM memory snapshot (RAM + CPU + device state) | Required to restore exact execution state | Yes
      Suspend VM execution and store memory snapshot on disk | Triggered via API or UI | Yes
      Resume VM from stored memory snapshot | Must restore to identical state (no reboots) | Yes
      Transfer snapshots across clusters/teams | Enables debugging/testing handoffs | Yes
      Integrate with snapshot management (list, delete, restore) | Align with existing snapshot tools | Yes
      Support differential/incremental memory snapshots | Optimizes storage footprint | No
      Encryption/compression of memory snapshots | Security and efficiency | No
      Integration with automated test pipelines | Streamline debugging workflows | No
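
The first three MVP requirements (capture, suspend, resume) can be pictured with a small in-memory model. This is only an illustrative sketch: `VirtualMachine`, `MemorySnapshot`, and their fields are hypothetical names invented here, not part of any existing API, and a real implementation would delegate to the hypervisor instead of copying Python objects.

```python
from dataclasses import dataclass, field
from enum import Enum


class VMState(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"


@dataclass(frozen=True)
class MemorySnapshot:
    """Hypothetical container for the three kinds of state named in the requirements."""
    ram: bytes
    cpu_state: dict
    device_state: dict


@dataclass
class VirtualMachine:
    """Toy stand-in for a VM; all fields are assumptions for illustration."""
    name: str
    ram: bytes = b""
    cpu_state: dict = field(default_factory=dict)
    device_state: dict = field(default_factory=dict)
    state: VMState = VMState.RUNNING

    def suspend(self) -> MemorySnapshot:
        """Stop execution and return a copy of RAM, CPU, and device state."""
        if self.state is not VMState.RUNNING:
            raise RuntimeError(f"{self.name} is not running")
        self.state = VMState.SUSPENDED
        return MemorySnapshot(self.ram, dict(self.cpu_state), dict(self.device_state))

    def resume(self, snapshot: MemorySnapshot) -> None:
        """Restore the captured state and continue without a reboot."""
        if self.state is not VMState.SUSPENDED:
            raise RuntimeError(f"{self.name} is not suspended")
        self.ram = snapshot.ram
        self.cpu_state = dict(snapshot.cpu_state)
        self.device_state = dict(snapshot.device_state)
        self.state = VMState.RUNNING
```

In this model the "identical state" requirement reduces to the restored fields comparing equal to the ones captured at suspend time, which is the property an acceptance test for the real feature would also need to check in some form.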

      (Optional) Use Cases

      Use Case 1 – Customer-provided VM snapshot setup

      • Scenario: Customer provides a snapshot; engineering team sets up VM directly in the captured state.
      • Main Success Scenario: VM starts from paused state without OS reboot.
      • Alternative Flow: If snapshot is incompatible, fall back to boot from disk image.
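
The alternative flow above implies a compatibility check before resuming. The sketch below shows one possible shape for that decision. The metadata fields (`cpu_model`, `machine_type`, `memory_bytes`) and the exact-match comparison are assumptions made for illustration; a real hypervisor would likely compare CPU feature sets and device models in more detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostProfile:
    """Hypothetical description of the target host."""
    cpu_model: str
    machine_type: str
    free_memory_bytes: int


@dataclass(frozen=True)
class SnapshotMetadata:
    """Hypothetical metadata stored alongside a memory snapshot."""
    cpu_model: str
    machine_type: str
    memory_bytes: int


def incompatibilities(snap: SnapshotMetadata, host: HostProfile) -> list[str]:
    """Return human-readable reasons why the snapshot cannot be resumed on this host."""
    problems = []
    if snap.cpu_model != host.cpu_model:
        problems.append(f"CPU model mismatch: {snap.cpu_model} vs {host.cpu_model}")
    if snap.machine_type != host.machine_type:
        problems.append(f"machine type mismatch: {snap.machine_type} vs {host.machine_type}")
    if snap.memory_bytes > host.free_memory_bytes:
        problems.append("host does not have enough free memory")
    return problems


def choose_start_mode(snap: SnapshotMetadata, host: HostProfile) -> str:
    """Resume from the snapshot when compatible, otherwise fall back to a disk boot."""
    return "boot_from_disk" if incompatibilities(snap, host) else "resume_from_snapshot"
```

Returning the list of reasons separately from the decision lets a UI or API explain to the user why the fallback to the disk image was taken.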

      Use Case 2 – Debugging product crashes

      • Scenario: Product crashes inside VM; memory snapshot taken and shared with developer.
      • Main Success Scenario: Developer loads the VM snapshot and inspects the state at the time of the crash.
      • Alternative Flow: Snapshot is too large or corrupted → requires traditional logs/debugging.
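
One way to detect the "corrupted" case before attempting a resume is to ship a checksum with the snapshot and verify it on the receiving side. The sketch below assumes the sender publishes a SHA-256 digest next to the snapshot file; the function names are hypothetical, and only Python's standard `hashlib` module is used.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size chunks so large snapshots never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot_is_intact(path: str, expected_sha256: str) -> bool:
    """Compare the file's digest with the digest published by the sender."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

Reading in chunks matters here because memory snapshots can be far larger than the RAM available to the tool that verifies them.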

      Use Case 3 – Software testing & team transfer

      • Scenario: QA pauses VM at a breakpoint, creates memory snapshot, and shares it with another team/cluster.
      • Main Success Scenario: Recipient resumes VM at the exact paused state for continuation.

      Use Case 4 – Faster recovery after issue

      • Scenario: A transient system/application issue is captured in a snapshot, and the VM is repeatedly reverted to that snapshot for testing/debugging.
      • Main Success Scenario: Engineers can reproduce and analyze without waiting for issue to reoccur.

      Questions to Answer

      • What are the storage requirements for large memory snapshots (e.g., VMs with 32 GB–4 TB of memory)?
      • Should snapshot creation be blocking (freeze VM until complete) or asynchronous (allow continued execution)?
      • How will snapshot transfer work securely across environments?
      • How will hardware/driver mismatches be handled when moving snapshots between clusters?
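
For the storage and transfer questions, a back-of-the-envelope estimate helps frame the range involved. The sketch below assumes an uncompressed snapshot is at least as large as the VM's configured RAM (CPU and device state add comparatively little), interprets the sizes as binary units (GiB/TiB), and uses a 10 Gbit/s link as an arbitrary example. None of these figures come from the feature request itself.

```python
GIB = 2**30
TIB = 2**40


def min_snapshot_bytes(ram_bytes: int) -> int:
    """Lower bound on an uncompressed snapshot: the VM's full RAM contents."""
    return ram_bytes


def transfer_seconds(size_bytes: int, link_bits_per_second: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and storage throughput."""
    return size_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    link = 10e9  # assumed 10 Gbit/s link
    for label, ram in (("32 GiB", 32 * GIB), ("4 TiB", 4 * TIB)):
        secs = transfer_seconds(min_snapshot_bytes(ram), link)
        print(f"{label}: at least {secs:.0f} s ({secs / 60:.1f} min) to transfer")
```

Under these assumptions the small end of the range moves in under half a minute, while the large end takes close to an hour, which suggests the blocking-versus-asynchronous question may need different answers at different VM sizes.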

      Out of Scope

      • Disk-only snapshots (already supported by existing tools).
      • Incremental/differential snapshots in MVP.
      • Automated deduplication or compression beyond basic file handling.

      Background and Strategic Fit

      VM-based environments are widely used for software development, debugging, and testing. Traditional disk snapshots fail to capture transient in-memory issues, leaving engineers with limited visibility into real-time states. By supporting full memory snapshots with suspend/resume functionality, this feature enhances reliability, collaboration, and developer velocity, while reducing downtime and improving user experience.

      Assumptions

      • Users already have VM snapshot management workflows.
      • Snapshot files can be large; infrastructure must support adequate storage and transfer speeds.
      • Security and compliance (e.g., sensitive data in memory) will be handled via access controls.

              rhn-support-mtessun Martin Tessun
              fdeutsch@redhat.com Fabian Deutsch