RHEL-90312

[aarch64] crash on target host during resume postcopy migration

    • rhel-virt-core-libvirt-1
    • ssg_virtualization
    • aarch64

      What were you trying to do that didn't work?

      libvirtd/virtqemud crashed on the target host while resuming a postcopy migration.

      What is the impact of this issue to you?

      It causes the migration.pause_postcopy_migration_and_recover.pause_and_io_error_and_recover test to fail.

      Please provide the package NVR for which the bug is seen:

      RHEL-10.1-20250507.1
      libvirt-11.2.0-1.el10.aarch64
      qemu-kvm-10.0.0-1.el10.aarch64
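
To cross-check these versions against a core dump, the `Module ... from rpm ...` lines that coredumpctl prints can be filtered. The following is a minimal sketch that assumes that line format; the function name `libvirt_module_rpms` is hypothetical. For the virtqemud dump shown in step 9 it should print libvirt-11.3.0-1.el10.aarch64, which differs from the 11.2.0-1 build listed here, so it may be worth confirming which build was installed for each crash.

```shell
# Hypothetical helper: read coredumpctl output on stdin and print the
# distinct rpm builds that provided libvirt modules.
# Assumes lines of the form: "Module <name> from rpm <package-build>"
libvirt_module_rpms() {
    awk '$1 == "Module" && $3 == "from" && $4 == "rpm" && $5 ~ /^libvirt-[0-9]/ { print $5 }' | sort -u
}
```

Usage, assuming the dump is still present on the target host: `coredumpctl info 75562 | libvirt_module_rpms`.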

      How reproducible is this bug?:

      100%

      Steps to reproduce

      1. Complete the basic migration setup between RHEL 10.1 hosts, with NFS storage and SSH access
      2. Define the VM (avocado-vt-vm1)
      3. # virsh migrate avocado-vt-vm1 --desturi qemu+ssh://10.6.12.65/system --live --verbose --timeout 10 --timeout-postcopy --postcopy --bandwidth 15 --postcopy-bandwidth 15
      4. In another terminal, abort the postcopy migration:
        1. # virsh domjobabort avocado-vt-vm1 --postcopy
      5. The following error appears in the first terminal:
        1. Migration: [25.73 %]error: operation failed: job 'migration in' failed in post-copy phase
      6. Resume the postcopy migration:
        1. # virsh migrate avocado-vt-vm1 --desturi qemu+ssh://10.6.12.65/system --live --verbose --timeout 10 --timeout-postcopy --postcopy --bandwidth 15 --postcopy-bandwidth 15 --postcopy-resume
          error: End of file while reading data: Warning: Permanently added '10.6.12.65' (ED25519) to the list of known hosts.
          virt-ssh-helper: could not proxy traffic: End of file while reading data: Input/output error: Input/output error
      7. Check the status of the VM:
        1. On source: 2    avocado-vt-vm1    paused
        2. On target: 4    avocado-vt-vm1    running
      8. Check the virtqemud/libvirtd core dump files on the target host:
        1. # coredumpctl
          TIME                          PID UID GID SIG     COREFILE EXE                 SIZE
          Thu 2025-05-08 10:59:28 EDT 43463   0   0 SIGSEGV present  /usr/sbin/virtqemud   3M
          Thu 2025-05-08 12:19:46 EDT 53571   0   0 SIGSEGV present  /usr/sbin/virtqemud   3M
          Thu 2025-05-08 12:51:38 EDT 75562   0   0 SIGSEGV present  /usr/sbin/virtqemud 2.9M
          Thu 2025-05-08 19:39:16 EDT 90156   0   0 SIGSEGV present  /usr/sbin/libvirtd  5.2M
        2. # coredumpctl dump 90156
                     PID: 90156 (libvirtd)
                     UID: 0 (root)
                     GID: 0 (root)
                  Signal: 11 (SEGV)
               Timestamp: Thu 2025-05-08 19:39:16 EDT (24min ago)
            Command Line: /usr/sbin/libvirtd --timeout 120
              Executable: /usr/sbin/libvirtd
           Control Group: /system.slice/libvirtd.service
                    Unit: libvirtd.service
                   Slice: system.slice
                 Boot ID: b86ae612ceab4dbeb0f789a46dd81115
              Machine ID: 3bc95fe2e2cd4e22bca2d385b320d97a
                Hostname: nvidia-grace-hopper-03.khw.eng.rdu2.dc.redhat.com
                 Storage: /var/lib/systemd/coredump/core.libvirtd.0.b86ae612ceab4dbeb0f789a46dd81115.90156.1746747556000000.zst (present)
            Size on Disk: 5.2M
                 Message: Process 90156 (libvirtd) of user 0 dumped core.
        3. [Note: the /usr/sbin/virtqemud core dumps came from hitting this same input/output error while running the libvirt test case. Following the manual steps above produces only the /usr/sbin/libvirtd core dump.]
      9. # coredumpctl dump 75562
                   PID: 75562 (virtqemud)
                   UID: 0 (root)
                   GID: 0 (root)
                Signal: 11 (SEGV)
             Timestamp: Thu 2025-05-08 12:51:38 EDT (7h ago)
          Command Line: /usr/sbin/virtqemud --timeout 120
            Executable: /usr/sbin/virtqemud
         Control Group: /system.slice/virtqemud.service
                  Unit: virtqemud.service
                 Slice: system.slice
               Boot ID: b86ae612ceab4dbeb0f789a46dd81115
            Machine ID: 3bc95fe2e2cd4e22bca2d385b320d97a
              Hostname: nvidia-grace-hopper-03.khw.eng.rdu2.dc.redhat.com
               Storage: /var/lib/systemd/coredump/core.virtqemud.0.b86ae612ceab4dbeb0f789a46dd81115.75562.1746723098000000.zst (present)
          Size on Disk: 2.9M
               Message: Process 75562 (virtqemud) of user 0 dumped core.
                        
                        Module [dso] from rpm libssh-0.11.1-1.el10.aarch64
                        Module libcap.so.2 from rpm libcap-2.69-7.el10.aarch64
                        Module libnss_systemd.so.2 from rpm systemd-257-11.el10.aarch64
                        Module libnbd.so.0 from rpm libnbd-1.22.2-1.el10.aarch64
                        Module libvirt_driver_qemu.so from rpm libvirt-11.3.0-1.el10.aarch64
                        Module libbrotlicommon.so.1 from rpm brotli-1.1.0-6.el10.aarch64
                        Module libevent-2.1.so.7 from rpm libevent-2.1.12-16.el10.aarch64
                        Module libkeyutils.so.1 from rpm keyutils-1.6.3-5.el10.aarch64
                        Module libkrb5support.so.0 from rpm krb5-1.21.3-7.el10.aarch64
                        Module libblkid.so.1 from rpm util-linux-2.40.2-10.el10.aarch64
                        Module libbrotlidec.so.1 from rpm brotli-1.1.0-6.el10.aarch64
                        Module libssl.so.3 from rpm openssl-3.5.0-2.el10.aarch64
                        Module libpsl.so.5 from rpm libpsl-0.21.5-6.el10.aarch64
                        Module libnghttp2.so.14 from rpm nghttp2-1.64.0-2.el10.aarch64
                        Module libcrypt.so.2 from rpm libxcrypt-4.4.36-10.el10.aarch64
                        Module libcrypto.so.3 from rpm openssl-3.5.0-2.el10.aarch64
                        Module libtasn1.so.6 from rpm libtasn1-4.20.0-1.el10.aarch64
                        Module libunistring.so.5 from rpm libunistring-1.1-10.el10.aarch64
                        Module libidn2.so.0 from rpm libidn2-2.3.7-3.el10.aarch64
                        Module libp11-kit.so.0 from rpm p11-kit-0.25.5-7.el10.aarch64
                        Module libattr.so.1 from rpm attr-2.5.2-5.el10.aarch64
                        Module liblzma.so.5 from rpm xz-5.6.2-3.el10.aarch64
                        Module libcom_err.so.2 from rpm e2fsprogs-1.47.1-3.el10.aarch64
                        Module libk5crypto.so.3 from rpm krb5-1.21.3-7.el10.aarch64
                        Module libkrb5.so.3 from rpm krb5-1.21.3-7.el10.aarch64
                        Module libgssapi_krb5.so.2 from rpm krb5-1.21.3-7.el10.aarch64
                        Module libmount.so.1 from rpm util-linux-2.40.2-10.el10.aarch64
                        Module libz.so.1 from rpm zlib-ng-2.2.3-2.el10.aarch64
                        Module libgmodule-2.0.so.0 from rpm glib2-2.80.4-4.el10.aarch64
                        Module libffi.so.8 from rpm libffi-3.4.4-9.el10.aarch64
                        Module libpcre2-8.so.0 from rpm pcre2-10.44-1.el10.3.aarch64
                        Module libcurl.so.4 from rpm curl-8.12.1-2.el10.aarch64
                        Module libsasl2.so.3 from rpm cyrus-sasl-2.1.28-27.el10.aarch64
                        Module libssh.so.4 from rpm libssh-0.11.1-1.el10.aarch64
                        Module libselinux.so.1 from rpm libselinux-3.8-1.el10.aarch64
                        Module libnuma.so.1 from rpm numactl-2.0.19-1.el10.aarch64
                        Module libnl-3.so.200 from rpm libnl3-3.11.0-1.el10.aarch64
                        Module libjson-c.so.5 from rpm json-c-0.18-3.el10.aarch64
                        Module libgnutls.so.30 from rpm gnutls-3.8.9-14.el10.aarch64
                        Module libcap-ng.so.0 from rpm libcap-ng-0.8.4-6.el10.aarch64
                        Module libaudit.so.1 from rpm audit-4.0.3-4.el10.aarch64
                        Module libacl.so.1 from rpm acl-2.3.2-4.el10.aarch64
                        Module libxml2.so.2 from rpm libxml2-2.12.5-5.el10_0.aarch64
                        Module libtirpc.so.3 from rpm libtirpc-1.3.5-1.el10.aarch64
                        Module libgio-2.0.so.0 from rpm glib2-2.80.4-4.el10.aarch64
                        Module libgobject-2.0.so.0 from rpm glib2-2.80.4-4.el10.aarch64
                        Module libglib-2.0.so.0 from rpm glib2-2.80.4-4.el10.aarch64
                        Module libvirt-qemu.so.0 from rpm libvirt-11.3.0-1.el10.aarch64
                        Module libvirt-lxc.so.0 from rpm libvirt-11.3.0-1.el10.aarch64
                        Module libvirt.so.0 from rpm libvirt-11.3.0-1.el10.aarch64
                        Stack trace of thread 75566:
                        #0  0x0000ffff305c8964 qemuProcessIncomingDefNew (libvirt_driver_qemu.so + 0x138964)
                        #1  0x0000ffff3057ea14 qemuMigrationDstPrepare (libvirt_driver_qemu.so + 0xeea14)
                        #2  0x0000ffff3057fb54 qemuMigrationDstPrepareAny (libvirt_driver_qemu.so + 0xefb54)
                        #3  0x0000ffff305811b4 qemuMigrationDstPrepareDirect (libvirt_driver_qemu.so + 0xf11b4)
                        #4  0x0000ffff30552870 qemuDomainMigratePrepare3Params.lto_priv.0 (libvirt_driver_qemu.so + 0xc2870)
                        #5  0x0000ffff355b2ca8 virDomainMigratePrepare3Params (libvirt.so.0 + 0x302ca8)
                        #6  0x0000aaada8f5e14c n/a (n/a + 0x0)
                        #7  0x0000aaada8f5e14c n/a (n/a + 0x0)
                        #8  0x0057ffff3549d1b4 n/a (n/a + 0x0)
                        #9  0x007cffff3549d718 n/a (n/a + 0x0)
                        #10 0x006effff3549d858 n/a (n/a + 0x0)
                        #11 0x000dffff353cd9cc n/a (n/a + 0x0)
                        #12 0x0021ffff353cc270 n/a (n/a + 0x0)
                        #13 0x004cffff34c79ea8 n/a (n/a + 0x0)
                        #14 0x006dffff34ce360c n/a (n/a + 0x0)
                        ELF object binary architecture: AARCH64
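
The manual steps above can be sketched as a small script. This is only an illustration under stated assumptions: the helper names (`run`, `migrate`, `abort_postcopy`, `reproduce`) and the `DRY_RUN` and `ABORT_DELAY` variables are hypothetical, and the delay before the abort is a guess that the migration has switched to post-copy by then (the command itself uses `--timeout 10 --timeout-postcopy`). By default the script only prints the virsh commands.

```shell
#!/usr/bin/env bash
# Sketch of the manual reproduction steps. All names below are hypothetical
# helpers, not part of libvirt or the test suite.

VM="${VM:-avocado-vt-vm1}"
DEST_URI="${DEST_URI:-qemu+ssh://10.6.12.65/system}"
DRY_RUN="${DRY_RUN:-1}"           # 1 = only print the commands
ABORT_DELAY="${ABORT_DELAY:-15}"  # assumed wait before aborting, in seconds

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Steps 3 and 6 use the same migrate command; extra flags are appended.
migrate() {
    run virsh migrate "$VM" --desturi "$DEST_URI" --live --verbose \
        --timeout 10 --timeout-postcopy --postcopy \
        --bandwidth 15 --postcopy-bandwidth 15 "$@"
}

# Step 4: abort the migration while it is in the post-copy phase.
abort_postcopy() {
    run virsh domjobabort "$VM" --postcopy
}

reproduce() {
    migrate &                                   # step 3, first "terminal"
    [ "$DRY_RUN" = 1 ] || sleep "$ABORT_DELAY"  # assumed time to reach post-copy
    abort_postcopy                              # step 4, second "terminal"
    wait                                        # step 5: the first migrate is expected to fail
    migrate --postcopy-resume                   # step 6: resume attempt
}
```

Running `reproduce` with `DRY_RUN=0` on the source host would execute the commands for real; the VM states and core dumps from steps 7 to 9 still need to be checked by hand.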

      Expected results

      Postcopy migration can be completed successfully

      Actual results

      libvirtd/virtqemud crashed on the target host while resuming a postcopy migration.
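
One way to confirm the crash on the target host is to filter the coredumpctl output. The helpers below are a minimal sketch; their names are hypothetical, and they assume the column layout and stack-trace line format shown in this report.

```shell
# Hypothetical helpers for triaging the crash on the target host.

# Read a coredumpctl listing on stdin and print "PID EXE" for every SIGSEGV
# entry whose executable is virtqemud or libvirtd.
# Assumed columns: day date time tz PID UID GID SIG COREFILE EXE SIZE
libvirt_segvs() {
    awk '$8 == "SIGSEGV" && $10 ~ /\/(virtqemud|libvirtd)$/ { print $5, $10 }'
}

# Read a coredumpctl stack trace on stdin and print "symbol module+offset"
# for each symbolized frame; frames shown as "n/a" are skipped.
# Assumed frame format: "#N 0xADDR symbol (module + 0xOFFSET)"
symbolized_frames() {
    awk '$1 ~ /^#[0-9]+$/ && $3 != "n/a" {
        gsub(/[()]/, "", $4)
        gsub(/[()]/, "", $6)
        print $3, $4 "+" $6
    }'
}
```

Usage: `coredumpctl | libvirt_segvs` lists the relevant dumps, and `coredumpctl info 75562 | symbolized_frames` reduces the trace to the frames that resolved to a symbol.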

       

      Note: this is a clone of RHEL-90160, which tracks the same issue on x86_64. Although both architectures hit the same crash, the migration.pause_postcopy_migration_and_recover.pause_and_io_error_and_recover test fails only on aarch64; it does not fail on x86_64.

        1. avocado-vt-vm1.log-target (7 kB)
        2. libvirtd.log-target (207 kB)
        3. test-failure-debug.log (303 kB)
        4. test-failure-libvirtd.log (45.10 MB)

              virt-maint virt-maint
              rh-ee-jugraham Julia Graham
              virt-maint virt-maint
              virt-bugs virt-bugs