Type: Story
Resolution: Done-Errata
Priority: Undefined
Fix Version/s: rhel-10.1
Affects Version/s: None
Component/s: libvirt / Live Migration
Labels:

Fixed in Build:
libvirt-11.1.0-1.el10
Severity:
None

AssignedTeam:
rhel-virt-core-libvirt-1
Sub-System Group:

ssg_virtualization

Internal Target Milestone:
9
Story Points:
8
ACKs Check:

Dev ack
Blocked:
False
Ready:
False
Blocked Reason:

None
Product Documentation Required:
Yes
Products:

Red Hat Enterprise Linux
Sprint:
None

Preliminary Testing:
Pass
Errata Link:
https://errata.engineering.redhat.com/advisory/148139
Test Coverage:

Manual

Release Note Type:
Feature
Release Note Text:

.New option for VM live migration: `--available-switchover-bandwidth`

When live-migrating a virtual machine (VM) by using the `virsh migrate --live` command, you can now add the `--available-switchover-bandwidth` option to specify the bandwidth that the hypervisor assumes is available when it decides when to switch over to the destination host in the pre-copy process. By default, the hypervisor estimates the available bandwidth automatically. If that estimate is too low for the live migration to finish successfully, specifying `--available-switchover-bandwidth` can fix the issue.

Release Note Status:
Done

Experience:
Target Upstream Version:
11.1.0

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Planning:
None

Description

QEMU 8.2 introduced a new live migration parameter, avail-switchover-bandwidth, which lets users specify the switchover bandwidth based on their network capacity and so improves the chance that a live migration converges.

The qemu commit for the feature:

commit 8b2395970aa3beab91b98dda89c7ed471e65ad25
Author: Peter Xu <peterx@redhat.com>
Date:   Tue Oct 10 18:19:22 2023 -0400

    migration: Allow user to specify available switchover bandwidth
    
    Migration bandwidth is a very important value to live migration.  It's
    because it's one of the major factors that we'll make decision on when to
    switchover to destination in a precopy process.
    
    This value is currently estimated by QEMU during the whole live migration
    process by monitoring how fast we were sending the data.  This can be the
    most accurate bandwidth if in the ideal world, where we're always feeding
    unlimited data to the migration channel, and then it'll be limited to the
    bandwidth that is available.
    
    However in reality it may be very different, e.g., over a 10Gbps network we
    can see query-migrate showing migration bandwidth of only a few tens of
    MB/s just because there are plenty of other things the migration thread
    might be doing.  For example, the migration thread can be busy scanning
    zero pages, or it can be fetching dirty bitmap from other external dirty
    sources (like vhost or KVM).  It means we may not be pushing data as much
    as possible to migration channel, so the bandwidth estimated from "how many
    data we sent in the channel" can be dramatically inaccurate sometimes.
    
    With that, the decision to switchover will be affected, by assuming that we
    may not be able to switchover at all with such a low bandwidth, but in
    reality we can.
    
    The migration may not even converge at all with the downtime specified,
    with that wrong estimation of bandwidth, keeping iterations forever with a
    low estimation of bandwidth.
    
    The issue is QEMU itself may not be able to avoid those uncertainties on
    measuing the real "available migration bandwidth".  At least not something
    I can think of so far.
    
    One way to fix this is when the user is fully aware of the available
    bandwidth, then we can allow the user to help providing an accurate value.
    
    For example, if the user has a dedicated channel of 10Gbps for migration
    for this specific VM, the user can specify this bandwidth so QEMU can
    always do the calculation based on this fact, trusting the user as long as
    specified.  It may not be the exact bandwidth when switching over (in which
    case qemu will push migration data as fast as possible), but much better
    than QEMU trying to wildly guess, especially when very wrong.      
    
    A new parameter "avail-switchover-bandwidth" is introduced just for this.
    So when the user specified this parameter, instead of trusting the
    estimated value from QEMU itself (based on the QEMUFile send speed), it
    trusts the user more by using this value to decide when to switchover,
    assuming that we'll have such bandwidth available then.
    
    Note that specifying this value will not throttle the bandwidth for
    switchover yet, so QEMU will always use the full bandwidth possible for
    sending switchover data, assuming that should always be the most important
    way to use the network at that time.
    
    This can resolve issues like "unconvergence migration" which is caused by
    hilarious low "migration bandwidth" detected for whatever reason.
    
    Reported-by: Zhiyi Guo <zhguo@redhat.com>
    Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Signed-off-by: Peter Xu <peterx@redhat.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>
    Message-ID: <20231010221922.40638-1-peterx@redhat.com>

libvirt therefore needs to support this parameter as well.
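
The effect the commit message describes can be illustrated with a small sketch. This is a simplified model, not QEMU's actual code: the helper name `can_switchover` and its arguments are hypothetical, and it assumes the decision reduces to whether the remaining dirty data can be transferred within the allowed downtime at the assumed bandwidth. The 300 ms downtime limit and 64 MiB of pending data are example values only.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of QEMU or libvirt.
# Arguments: pending bytes, bandwidth in bytes/s, downtime limit in ms.
# Prints "yes" if the pending data fits into the downtime budget.
can_switchover() {
    pending_bytes=$1
    bandwidth_bps=$2
    downtime_ms=$3
    # Bytes that can be sent within the downtime limit at this bandwidth.
    threshold=$(( bandwidth_bps * downtime_ms / 1000 ))
    if [ "$pending_bytes" -le "$threshold" ]; then
        echo yes
    else
        echo no
    fi
}

# 64 MiB of dirty memory left, 300 ms downtime limit (example values).
can_switchover 67108864 31457280 300     # estimated 30 MiB/s: prints "no"
can_switchover 67108864 1250000000 300   # user-specified 10 Gbps: prints "yes"
```

With the underestimated bandwidth the model never reaches the switchover condition for this amount of pending data, while the user-supplied figure allows it immediately, which is the non-convergence case the commit targets.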

Goal

As a system admin, I would like to specify the switchover bandwidth based on the network capacity (for example, 10GbE), and so improve the chance that a live migration converges. Note that enabling this does not guarantee that the downtime will decrease. Whether the downtime decreases depends on the VM dirty page rate and on the threshold that triggers the live migration switchover.

Acceptance Criteria


On a 1000M network, live-migrate an idle RHEL virtual machine without setting avail-switchover-bandwidth. The live migration should complete. Then set avail-switchover-bandwidth to 1000M:
```
virsh qemu-monitor-command $VM --pretty '{ "execute": "migrate-set-parameters" , "arguments": { "avail-switchover-bandwidth": 104857600 } }'
```
The live migration should also complete.
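
As a rough aid for choosing the value, the sketch below converts between a nominal link speed and a byte rate. It assumes that the QMP parameter is expressed in bytes per second, which should be verified against the QEMU QAPI documentation. The helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of virsh or QEMU.

# Convert a nominal link speed in Mbit/s to bytes per second.
mbit_to_bytes_per_sec() {
    echo $(( $1 * 1000000 / 8 ))
}

# Convert bytes per second to whole MiB/s.
bytes_per_sec_to_mib() {
    echo $(( $1 / 1048576 ))
}

mbit_to_bytes_per_sec 1000       # prints 125000000
bytes_per_sec_to_mib 104857600   # prints 100
```

Under that assumption, the 104857600 used in the command above corresponds to 100 MiB/s, which is close to the throughput measured on the 1000M link rather than its nominal line rate of 125000000 bytes per second.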

Compare the downtime of a live migration without avail-switchover-bandwidth set against one with avail-switchover-bandwidth set to 1000M. The downtime should improve:

Live migration without avail-switchover-bandwidth set, as reported by virsh domjobinfo $VM --completed:
...
Total downtime:   119          ms
Memory bandwidth: 99.139 MiB/s

Live migration with avail-switchover-bandwidth set to 1000M, as reported by virsh domjobinfo $VM --completed:
...
Total downtime:   102          ms
Memory bandwidth: 98.586 MiB/s
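
The comparison can be scripted along these lines. This is a minimal sketch: `total_downtime_ms` is a hypothetical helper, and it assumes the `Total downtime:` line keeps the format shown in the output above. The two sample inputs are the values from this ticket; on a real host the helper would read from `virsh domjobinfo "$VM" --completed` instead.

```shell
#!/bin/sh
# Hypothetical helper: read domjobinfo-style output on stdin and print
# the "Total downtime" value in milliseconds.
total_downtime_ms() {
    awk -F: '/^Total downtime/ { split($2, f, " "); print f[1] }'
}

# Sample output copied from the two runs recorded in this ticket.
baseline=$(printf 'Total downtime:   119          ms\nMemory bandwidth: 99.139 MiB/s\n' | total_downtime_ms)
tuned=$(printf 'Total downtime:   102          ms\nMemory bandwidth: 98.586 MiB/s\n' | total_downtime_ms)

# A negative number means the downtime decreased.
echo "downtime change: $(( tuned - baseline )) ms"
```

For the recorded runs this reports a change of -17 ms. A single pair of runs is a small sample, so repeated migrations would give a more reliable comparison.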

is blocked by

RHEL-71662 Rebase libvirt in RHEL-10.1

Closed

links to

Resolved upstream as of commit v11.0.0-40-gd9fca42e40

RHBA-2025:148139 libvirt update

Assignee: Jiri Denemark
Reporter: Zhiyi Guo
Contributors: Liping Cheng
Developer: virt-maint
QA Contact: Liping Cheng
Doc Contact: Jiří Herrmann
Votes: 0
Watchers: 15
Created: 2023/12/30 12:51 AM
Updated: 2025/11/11 1:46 PM
Resolved: 2025/11/11 9:37 AM
Target end: 2025/04/28
Next Planned Release Date: 2025/11/11
Release Date: 2025/11/11
