-
Epic
-
Resolution: Unresolved
-
Major
-
Pre-Migration Capacity Guidance for CCLM
-
Product / Portfolio Work
-
To Do
-
VIRTSTRAT-54 - RHACM Cross Cluster Live VM Migration
Epic Goal
Implement checks within the CCLM workflow at the management layer (ACM) to assess and provide visibility into the target cluster's resource capacity, thereby preventing migrations from stalling indefinitely and significantly improving the user experience.
Why is this important?
Currently, when a Cross-Cluster Live Migration (CCLM) is initiated (via the management layer/ACM), no proactive check for resource capacity (storage, CPU, memory) on the target cluster is performed. If the target cluster lacks sufficient resources, the migration never starts, because the necessary receiver pod remains in a pending state indefinitely. This lack of feedback at the initiation stage leads to confusion, troubleshooting overhead, and a poor user experience.
This effort is distinct from, but related to, the existing work on "hard" pre-flight checks (CNV-71000). This Epic focuses specifically on the "soft" capacity requirements and enhanced user guidance from the management layer.
Scenarios
Current Behavior (KubeVirt/MTV Level)
- CCLM is initiated.
- The target cluster attempts to create the required Virtual Machine and Persistent Volumes.
- If resources are insufficient, the receiving virt-launcher pod enters a pending state.
- The actual live migration never begins, as the receiver component isn't running.
- This behavior is consistent regardless of whether the CCLM is triggered directly or through the management layer.
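The stalled state described above can be recognized from the receiver pod's status. The following is a minimal sketch, assuming the pod object is already available as a plain dictionary in the standard Kubernetes Pod shape; the function name `receiver_pod_is_stalled` is hypothetical, and how the management layer would fetch the pod is out of scope here.

```python
def receiver_pod_is_stalled(pod: dict) -> bool:
    """Return True if the pod is Pending because the scheduler found no node for it.

    `pod` is assumed to be a Kubernetes Pod object decoded into a dict.
    The check relies on the standard `status.phase` field and the
    `PodScheduled` condition with reason `Unschedulable`.
    """
    status = pod.get("status", {})
    if status.get("phase") != "Pending":
        return False
    for cond in status.get("conditions", []):
        if (
            cond.get("type") == "PodScheduled"
            and cond.get("status") == "False"
            and cond.get("reason") == "Unschedulable"
        ):
            return True
    return False
```

A pod that is Pending for other reasons (for example, still pulling images) is not reported as stalled by this sketch, which keeps the signal specific to the insufficient-resources case.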
Acceptance Criteria
Desired Outcome (ACM/Management Layer)
The management layer, which has a holistic view of all clusters, must provide proactive guidance to the user. Before CCLM is permitted or executed, the system should:
- Check resource capacity (storage, CPU, and memory) on the selected target cluster.
- Treat capacity shortages as "soft" requirements (unlike "hard" requirements such as network connectivity).
- If a capacity issue is detected, alert the user about the shortage or suggest a more suitable cluster where the workload will fit.
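The criteria above could be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed design: the names `Resources`, `Guidance`, `find_shortages`, and `capacity_guidance` are hypothetical, and the free capacity per cluster is assumed to be already aggregated by the management layer. Cluster-wide totals do not guarantee that any single node can host the VM, so a real check would likely need per-node data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource triple; units are assumptions for this sketch."""
    cpu_millicores: int
    memory_bytes: int
    storage_bytes: int


@dataclass(frozen=True)
class Guidance:
    """Advisory result: shortages are reported, not enforced (soft requirement)."""
    target: str
    shortages: dict[str, int]
    suggested_cluster: Optional[str]


def find_shortages(required: Resources, free: Resources) -> dict[str, int]:
    """Map each short resource to the missing amount; empty means the VM fits."""
    shortages = {}
    for name in ("cpu_millicores", "memory_bytes", "storage_bytes"):
        missing = getattr(required, name) - getattr(free, name)
        if missing > 0:
            shortages[name] = missing
    return shortages


def capacity_guidance(
    required: Resources, target: str, free_by_cluster: dict[str, Resources]
) -> Guidance:
    """Check the target cluster and, on a shortage, suggest a cluster that fits."""
    shortages = find_shortages(required, free_by_cluster[target])
    suggested = None
    if shortages:
        # Pick the first alternative (in name order) with no shortage at all.
        for name in sorted(free_by_cluster):
            if name != target and not find_shortages(required, free_by_cluster[name]):
                suggested = name
                break
    return Guidance(target, shortages, suggested)


# Example with made-up cluster names and capacities.
GIB = 1024 ** 3
vm = Resources(cpu_millicores=4000, memory_bytes=8 * GIB, storage_bytes=50 * GIB)
clusters = {
    "cluster-a": Resources(2000, 16 * GIB, 100 * GIB),
    "cluster-b": Resources(8000, 32 * GIB, 200 * GIB),
}
guidance = capacity_guidance(vm, "cluster-a", clusters)
print(guidance.shortages)          # {'cpu_millicores': 2000}
print(guidance.suggested_cluster)  # cluster-b
```

Returning a `Guidance` value instead of raising an error reflects the "soft" nature of the requirement: the caller can warn the user or offer the suggested cluster without blocking the migration.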
Dependencies (internal and external)
- ...
Previous Work (Optional):
- ...
Open questions:
- ...
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - Downstream build attached to advisory: <link to errata>
- QE - Test plans in Polarion: <link or reference to Polarion>
- QE - Automated tests merged: <link or reference to automated tests>
- DOC - Doc issue opened with a completed template. Separate doc issue opened for any deprecation, removal, or any current known issue/troubleshooting removal from the doc, if applicable.
- Considerations were made for Extended Update Support (EUS)