- Bug
- Resolution: Done
- Normal
- rhel-9.2.0
- Moderate
- rhel-virt-core
- ssg_virtualization
- 5
- Automated
- All
- 57,005
Description of problem:
When migrating with zerocopy and multifd enabled, cancelling the migration while it is active sometimes fails: the migration hangs in the "cancelling" status.
Version-Release number of selected component (if applicable):
Host info: kernel-5.14.0-226.el9.aarch64 && qemu-kvm-7.2.0-2.el9.aarch64
Guest info: kernel-4.18.0-447.el8.aarch64
How reproducible:
1/50
Steps to Reproduce:
1. Boot a guest on the source host with the qemu command line [1].
2. Boot the guest on the destination host with the same qemu command line [1], appending '-incoming defer'.
3. Enable multifd on both the src and dst hosts, enable zero copy on the src host, and set the multifd channel count to 4 on both hosts.
4. Set the migration incoming address on the dst host, then start the migration from the src host to the dst host.
5. While the migration is active, cancel it.
The automation log is below; 10.19.241.172 is the src host and 10.19.241.174 is the dst host:
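The QMP sequence behind steps 3 to 5 can be sketched roughly as follows. This is only an illustration: the command arguments are elided in the log, so the URIs below are assumptions built from the dst address and port 4000 that appear in it, and the helper names (`qmp`, `capabilities`, `reproduction_commands`) are hypothetical. It only builds the command objects; sending them over a QMP socket is left out.

```python
import json

# Assumptions: the log elides the command arguments, so these values are
# guesses based on the dst host address and the port it reports listening on.
DST_HOST = "10.19.241.174"
PORT = 4000


def qmp(execute, **arguments):
    """Build a QMP command object (hypothetical helper)."""
    cmd = {"execute": execute}
    if arguments:
        cmd["arguments"] = arguments
    return cmd


def capabilities(*names):
    """Build a migrate-set-capabilities command enabling the given capabilities."""
    return qmp(
        "migrate-set-capabilities",
        capabilities=[{"capability": name, "state": True} for name in names],
    )


def reproduction_commands():
    """Return (host, command) pairs in the order the steps describe."""
    channels = qmp("migrate-set-parameters", **{"multifd-channels": 4})
    return [
        # Step 3: multifd on both hosts, zero copy on the src host only.
        ("src", capabilities("multifd", "zero-copy-send")),
        ("dst", capabilities("multifd")),
        ("src", channels),
        ("dst", channels),
        # Step 4: listen on the dst host, then start migrating from the src host.
        ("dst", qmp("migrate-incoming", uri=f"tcp:[::]:{PORT}")),
        ("src", qmp("migrate", uri=f"tcp:{DST_HOST}:{PORT}")),
        # Step 5: cancel while the migration is active, then check the status.
        ("src", qmp("migrate_cancel")),
        ("src", qmp("query-migrate")),
    ]


if __name__ == "__main__":
    for host, cmd in reproduction_commands():
        print(host, json.dumps(cmd))
```

Keeping the commands in a single ordered list makes the cross-host ordering explicit: `migrate-incoming` on the dst host has to be issued before `migrate` on the src host.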
2023-01-13-06:00:23: Host(10.19.241.174) Sending qmp command : {"execute": "migrate-incoming", "arguments":
, "id": "qBJcl14d"}
2023-01-13-06:00:24: Host(10.19.241.174) Responding qmp command: {"return": {}, "id": "qBJcl14d"}
2023-01-13-06:00:24: Host(10.19.241.172) Sending qmp command : {"execute": "migrate", "arguments":
, "id": "IQA7fhXl"}
2023-01-13-06:00:24: Host(10.19.241.172) Responding qmp command: {"return": {}, "id": "IQA7fhXl"}
2023-01-13-06:00:24: Host(10.19.241.172) Sending qmp command :
2023-01-13-06:00:24: Host(10.19.241.172) Responding qmp command: {"return": {"status": "setup"}, "id": "iVo7Bktc"}
2023-01-13-06:00:29: Host(10.19.241.172) Sending qmp command :
2023-01-13-06:00:29: Host(10.19.241.172) Responding qmp command: {"return": {"expected-downtime": 300, "status": "active", "setup-time": 4, "total-time": 5012, "ram": {"total": 4429328384, "postcopy-requests": 0, "dirty-sync-count": 1, "multifd-bytes": 571058432, "pages-per-second": 773219, "downtime-bytes": 0, "page-size": 4096, "remaining": 2806267904, "postcopy-bytes": 0, "mbps": 615.55226016260156, "transferred": 573374799, "dirty-sync-missed-zero-copy": 0, "precopy-bytes": 2316367, "duplicate": 257305, "dirty-pages-rate": 0, "skipped": 0, "normal-bytes": 569139200, "normal": 138950}}, "id": "NeeVUFkq"}
2023-01-13-06:00:29: Host(10.19.241.174) Sending qmp command :
2023-01-13-06:00:29: Host(10.19.241.174) Responding qmp command: {"return": {"status": "active", "socket-address": [{"port": "4000", "ipv6": true, "host": "::", "type": "inet"}]}, "id": "pZnBPkIF"}
2023-01-13-06:00:29: ======= Step 6. During migration, cancel it =======
2023-01-13-06:00:29: ----- 6.1 Cancel migration during it is active -----
2023-01-13-06:00:29: Host(10.19.241.172) Sending qmp command :
2023-01-13-06:00:35: Host(10.19.241.172) Responding qmp command: {"return": {}, "id": "Lu4AfMCD"}
2023-01-13-06:00:35: Host(10.19.241.172) Sending qmp command :
2023-01-13-06:00:35: Host(10.19.241.172) Responding qmp command: {"return": {"expected-downtime": 300, "status": "cancelling", "setup-time": 4, "total-time": 11031, "ram": {"total": 4429328384, "postcopy-requests": 0, "dirty-sync-count": 1, "multifd-bytes": 571062464, "pages-per-second": 773219, "downtime-bytes": 0, "page-size": 4096, "remaining": 2806267904, "postcopy-bytes": 0, "mbps": 615.55226016260156, "transferred": 573378831, "dirty-sync-missed-zero-copy": 0, "precopy-bytes": 2316367, "duplicate": 257305, "dirty-pages-rate": 0, "skipped": 0, "normal-bytes": 569139200, "normal": 138950}}, "id": "96lzw0jZ"}
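Comparing the two query-migrate responses from the src host in the log above (06:00:29 and 06:00:35) gives a rough picture of what happened around the cancel. The snippet below only subtracts the counters copied from those two log lines; the helper name `deltas` is hypothetical.

```python
# Counters copied from the two query-migrate responses of the src host.
before_cancel = {  # 06:00:29, status "active"
    "total-time": 5012,
    "transferred": 573374799,
    "multifd-bytes": 571058432,
    "remaining": 2806267904,
}
after_cancel = {  # 06:00:35, status "cancelling"
    "total-time": 11031,
    "transferred": 573378831,
    "multifd-bytes": 571062464,
    "remaining": 2806267904,
}


def deltas(first, second):
    """Return second minus first for every counter present in first."""
    return {key: second[key] - first[key] for key in first}


if __name__ == "__main__":
    for name, diff in deltas(before_cancel, after_cancel).items():
        print(f"{name}: {diff:+d}")
```

In these two samples the byte counters barely move and "remaining" does not change while "total-time" keeps growing, which seems consistent with the transfer having stopped while the status stays at "cancelling".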
Actual results:
At step 5 of Steps to Reproduce, the cancel hangs in the "cancelling" status and the migration can no longer be cancelled.
Expected results:
Cancel migration successfully.
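A test could tell the expected result from the hang by polling the migration status after `migrate_cancel` and treating a long stay in "cancelling" as a failure. The sketch below is a minimal example under assumptions: `query_status` is a hypothetical callable that returns the "status" field of a query-migrate response, and the timeout value is arbitrary rather than taken from the report.

```python
import time


def wait_for_cancelled(query_status, timeout=60.0, interval=1.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until the status leaves "cancelling" and return the final status.

    query_status is a hypothetical callable returning the "status" string
    of a query-migrate response. Raises TimeoutError if the status is still
    "cancelling" once the timeout has expired.
    """
    deadline = clock() + timeout
    while True:
        status = query_status()
        if status != "cancelling":
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"migration still in 'cancelling' after {timeout} seconds")
        sleep(interval)


if __name__ == "__main__":
    # Simulated status sequence standing in for real query-migrate calls.
    simulated = iter(["cancelling", "cancelling", "cancelled"])
    print(wait_for_cancelled(lambda: next(simulated), interval=0.0))
```

With the behaviour described in this report, such a check would end in a `TimeoutError` instead of returning "cancelled".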
Additional info:
1. Ran plain migration (zerocopy and multifd both disabled) 300 times; cancelling the migration always succeeded.
2. Ran migration with only multifd enabled (multifd channel count set to 4) 100 times; cancelling the migration also always succeeded.
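To judge how much weight these clean runs carry, one can estimate the chance of seeing no failure at all if the bug were still present at the reported 1/50 rate. The calculation below is a rough sketch that assumes each attempt fails independently with the same probability, which the report does not establish; `prob_no_failure` is a hypothetical helper.

```python
def prob_no_failure(failure_rate, attempts):
    """Probability of zero failures in `attempts` independent tries."""
    return (1.0 - failure_rate) ** attempts


if __name__ == "__main__":
    rate = 1 / 50  # reproduction rate given under "How reproducible"
    for attempts in (100, 300):
        chance = prob_no_failure(rate, attempts)
        print(f"{attempts} attempts: {chance:.4f}")
```

Under that assumption, 100 clean runs would still occur roughly 13% of the time and 300 clean runs roughly 0.2% of the time, so the 300-run result is the stronger of the two.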