RHEL-143350: Guest hit call trace after postcopy migration


      What were you trying to do that didn't work?
      Guest hit call trace after postcopy migration

      Please provide the package NVR for which the bug is seen:
      hosts: kernel-6.12.0-184.el10.x86_64 && qemu-kvm-10.1.0-9.el10.x86_64
      guest: kernel-6.12.0-184.el10.x86_64

      How reproducible:
      1/10

      Steps to reproduce
      1. Boot VM on src host
      2. Run stress in VM:
         /home/stress-0.18.9/bin/stress --cpu 4 --vm 4 --vm-bytes 256M --timeout 60
      3. Boot VM on dst host
      4. Enable postcopy and postcopy-preempt capabilities on src and dst hosts
      5. On src qemu, set max-postcopy-bandwidth to 20M
      6. Migrate VM from src to dst host
      7. Switch into postcopy mode once migration status is active (i.e. start postcopy migration while dirty-sync-count is 1)
      8. During postcopy migration, set max-postcopy-bandwidth to 80M
      9. After postcopy migration, check guest status
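Steps 4 and 5 correspond to QMP commands on the qemu monitor. A minimal sketch of the message payloads, assuming QMP over a socket (the helper function names are illustrative, not part of the report; the QMP command names and fields are the upstream ones):

```python
import json


def qmp_enable_postcopy_caps():
    # Step 4: enable the postcopy-ram and postcopy-preempt
    # migration capabilities via migrate-set-capabilities.
    return {
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [
                {"capability": "postcopy-ram", "state": True},
                {"capability": "postcopy-preempt", "state": True},
            ]
        },
    }


def qmp_set_postcopy_bandwidth(bytes_per_sec):
    # Step 5/8: max-postcopy-bandwidth is expressed in bytes/second.
    return {
        "execute": "migrate-set-parameters",
        "arguments": {"max-postcopy-bandwidth": bytes_per_sec},
    }


if __name__ == "__main__":
    print(json.dumps(qmp_enable_postcopy_caps()))
    # 20M as in step 5, taken as 20 MiB/s here
    print(json.dumps(qmp_set_postcopy_bandwidth(20 * 1024 * 1024)))
```

The same `migrate-set-parameters` payload with a larger value covers step 8 (raising the bandwidth to 80M mid-postcopy).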

      Expected results
      Guest works well

      Actual results
      Guest hit a call trace after postcopy migration; the call trace was found via the "dmesg" command.

      Only the call trace is observed; the guest still works after migration.

      2026-01-20-11:39:29: [ 79.402176] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [khugepaged:58]
      2026-01-20-11:39:29: [ 79.402306] Modules linked in: rfkill nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr vfat fat intel_rapl_msr intel_rapl_common kvm_intel kvm iTCO_wdt irqbypass iTCO_vendor_support rapl virtio_balloon lpc_ich pcspkr i2c_i801 i2c_smbus sg joydev fuse loop vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci xfs ahci sd_mod libahci nvme_tcp virtio_net libata crct10dif_pclmul crc32_pclmul net_failover crc32c_intel nvme_fabrics ghash_clmulni_intel virtio_scsi failover nvme_core bochs nvme_keyring nvme_auth serio_raw dm_mirror dm_region_hash dm_log be2iscsi iscsi_boot_sysfs cxgb4i cxgb4 tls libcxgbi libcxgb iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_multipath dm_mod nfnetlink
      2026-01-20-11:39:29: [ 79.516658] CPU: 1 UID: 0 PID: 58 Comm: khugepaged Kdump: loaded Not tainted 6.12.0-184.el10.x86_64 #1 PREEMPT(voluntary) 
      2026-01-20-11:39:29: [ 79.516681] Hardware name: Red Hat KVM/RHEL, BIOS edk2-20251114-2.el10 11/14/2025
      2026-01-20-11:39:29: [ 79.601367] RIP: 0010:copy_mc_enhanced_fast_string+0xa/0x13
      2026-01-20-11:39:29: [ 79.601504] Code: 89 ca e9 b9 fe ff ff 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 48 89 f8 48 89 d1 <f3> a4 31 c0 e9 0d 9d 01 00 48 89 c8 e9 05 9d 01 00 0f 1f 44 00 00
      2026-01-20-11:39:29: [ 79.601509] RSP: 0018:ffffd162401ffbe8 EFLAGS: 00010206
      2026-01-20-11:39:29: [ 79.601516] RAX: ffff88d9c1f08000 RBX: fffff83a44895dc0 RCX: 0000000000001000
      2026-01-20-11:39:29: [ 79.601519] RDX: 0000000000001000 RSI: ffff88dae2577000 RDI: ffff88d9c1f08000
      2026-01-20-11:39:29: [ 79.601523] RBP: fffff83a40080000 R08: ffff88dad83a1370 R09: fffff83a444ddc68
      2026-01-20-11:39:29: [ 79.601525] R10: 00000000000390c0 R11: 0000000000000007 R12: 0000000000000000
      2026-01-20-11:39:29: [ 79.601528] R13: 0000000113771067 R14: ffff88dad3771840 R15: fffff83a4007c200
      2026-01-20-11:39:29: [ 79.601532] FS: 0000000000000000(0000) GS:ffff88db3bc80000(0000) knlGS:0000000000000000
      2026-01-20-11:39:29: [ 79.601541] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      2026-01-20-11:39:29: [ 79.601544] CR2: 000000000015a001 CR3: 00000001033a8002 CR4: 0000000000772ef0
      2026-01-20-11:39:29: [ 79.601548] PKRU: 55555554
      2026-01-20-11:39:29: [ 79.601550] Call Trace:
      2026-01-20-11:39:29: [ 79.601553] <IRQ>
      2026-01-20-11:39:29: [ 79.601559] ? show_trace_log_lvl+0x1b0/0x2f0
      2026-01-20-11:39:29: [ 79.601629] ? show_trace_log_lvl+0x1b0/0x2f0
      2026-01-20-11:39:29: [ 79.601637] ? __collapse_huge_page_copy.isra.0+0x74/0x1f0
      2026-01-20-11:39:29: [ 79.601688] ? watchdog_timer_fn.cold+0x3d/0xa0
      2026-01-20-11:39:29: [ 79.601700] ? __pfx_watchdog_timer_fn+0x10/0x10
      2026-01-20-11:39:29: [ 79.601741] ? __hrtimer_run_queues+0x139/0x2a0
      2026-01-20-11:39:29: [ 79.601762] ? hrtimer_interrupt+0xff/0x230
      2026-01-20-11:39:29: [ 79.601775] ? __sysvec_apic_timer_interrupt+0x52/0x100
      2026-01-20-11:39:29: [ 79.601804] ? sysvec_apic_timer_interrupt+0x6c/0x90
      2026-01-20-11:39:29: [ 79.601816] </IRQ>
      2026-01-20-11:39:29: [ 79.601818] <TASK>
      2026-01-20-11:39:29: [ 79.601820] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
      2026-01-20-11:39:29: [ 79.601937] ? copy_mc_enhanced_fast_string+0xa/0x13
      2026-01-20-11:39:29: [ 79.601945] __collapse_huge_page_copy.isra.0+0x74/0x1f0
      2026-01-20-11:39:29: [ 79.601980] collapse_huge_page+0x546/0x830
      2026-01-20-11:39:29: [ 79.601992] hpage_collapse_scan_pmd+0x643/0x700
      2026-01-20-11:39:29: [ 79.602031] khugepaged_scan_mm_slot.constprop.0+0x3c3/0x570
      2026-01-20-11:39:29: [ 79.602039] khugepaged+0xce/0x210
      2026-01-20-11:39:29: [ 79.602045] ? __pfx_khugepaged+0x10/0x10
      2026-01-20-11:39:29: [ 79.602068] kthread+0xfa/0x240
      2026-01-20-11:39:29: [ 79.602096] ? __pfx_kthread+0x10/0x10
      2026-01-20-11:39:29: [ 79.602100] ret_from_fork+0x34/0x50
      2026-01-20-11:39:29: [ 79.602136] ? __pfx_kthread+0x10/0x10
      2026-01-20-11:39:29: [ 79.602139] ret_from_fork_asm+0x1a/0x30
      2026-01-20-11:39:29: [ 79.602174] </TASK>


      Additional info:

      The issue did not reproduce when switching to postcopy mode with dirty-sync-count >= 2.
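The trigger condition (step 7 switching to postcopy while dirty-sync-count is still 1) can be expressed as a small check on the `query-migrate` reply. A hedged sketch, with a hypothetical reply excerpt; the `status` and `ram`/`dirty-sync-count` fields are the ones QMP actually returns:

```python
def should_start_postcopy(query_migrate_reply, max_sync_count=1):
    """Decide whether to issue migrate-start-postcopy now.

    The bug only reproduced when postcopy began while
    ram/dirty-sync-count was 1; waiting for >= 2 avoided it.
    """
    info = query_migrate_reply.get("return", {})
    if info.get("status") != "active":
        return False
    return info.get("ram", {}).get("dirty-sync-count", 0) <= max_sync_count


# Hypothetical excerpt of a {"execute": "query-migrate"} reply:
reply = {"return": {"status": "active", "ram": {"dirty-sync-count": 1}}}
```

With `max_sync_count=1` this reproduces the failing timing; raising it to 2 matches the workaround noted above.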

      Assignee: virt-maint
      Reporter: Xiaohui Li (rhn-support-xiaohli)