  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-18339

qemu-kvm crashes on RHEL 9.2 EUS when live migrating guests that are under heavy load

    • Bug
    • Resolution: Unresolved
    • Major
    • rhos-17.1.11
    • None
    • tripleo-ansible
    • None
    • False
    • None
    • False
    • ?
    • openstack-tripleo-heat-templates-14.3.1-17.1.20250821170814.e7c7ce3.el9osttrunk
    • rhos-workloads-compute
    • None

      Cause:
      A change in behavior in GnuTLS causes live migration to fail: TLS 1.3 automatic rekeying can corrupt the TLS session state.
      https://issues.redhat.com/browse/RHEL-98672

      Consequence:
      qemu-kvm crashes while live migrating guests that are under heavy load if rekeying of the TLS session is triggered.

      Workaround:
      A new `UseCustomGnuTlsQemuCryptoPolicy` TripleO Heat template parameter has been added. It enables a custom security policy that disables the AES cipher affected by the GnuTLS regression.

      Result:
      Live migration is possible without triggering the re-keying bug; however, the OpenStack deployment is no longer FIPS compliant.
    • Release Note Not Required
    • Proposed
    • Regression Only
    • Moderate
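
The workaround in the release note text above could be applied through a custom environment file. The fragment below is a minimal sketch, not a confirmed configuration: it assumes `UseCustomGnuTlsQemuCryptoPolicy` is a boolean parameter set under `parameter_defaults`, and the file name in the comment is hypothetical. Check the parameter definition in the installed openstack-tripleo-heat-templates package before relying on it.

```yaml
# custom-gnutls-qemu-policy.yaml (hypothetical file name)
# Assumption: the parameter is a boolean; the source only states that it
# "allows enabling" the custom policy, not its type or default value.
parameter_defaults:
  UseCustomGnuTlsQemuCryptoPolicy: true
```

A file like this would normally be passed to the overcloud deployment command with an additional `-e` argument, alongside the existing environment files. As the release note states, enabling this policy means the deployment is no longer FIPS compliant.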

      qemu-kvm crashes on RHEL 9.2 EUS when live migrating guests that are under heavy load. The crash produces the following backtrace:

      #0 get_total_headers2 (params=0x55e0378b9b90, session=0x55e03738d600)
      at ./algorithms.h:270
      #1 recv_headers (ms=<optimized out>, record=0x7f9c63ffe320, htype=4294967295, 
      type=GNUTLS_APPLICATION_DATA, record_params=0x55e0378b9b90, session=0x55e03738d600)
      at /usr/src/debug/gnutls-3.7.6-21.el9_2.3.x86_64/lib/record.c:1198
      #2 _gnutls_recv_in_buffers (session=session@entry=0x55e03738d600, 
      type=type@entry=GNUTLS_APPLICATION_DATA, htype=htype@entry=4294967295, 
      ms=<optimized out>, ms@entry=0)
      at /usr/src/debug/gnutls-3.7.6-21.el9_2.3.x86_64/lib/record.c:1319
      #3 0x00007fa52493520f in _gnutls_recv_int (session=0x55e03738d600, 
      type=type@entry=GNUTLS_APPLICATION_DATA, data=0x55e038930788 "", data_size=32768, 
      seq=seq@entry=0x0, ms=0)
      at /usr/src/debug/gnutls-3.7.6-21.el9_2.3.x86_64/lib/record.c:1785
      #4 0x00007fa5249354f1 in gnutls_record_recv (session=<optimized out>, 
      data=<optimized out>, data_size=<optimized out>)
      at /usr/src/debug/gnutls-3.7.6-21.el9_2.3.x86_64/lib/record.c:2408
      #5 0x000055e034a23274 in qcrypto_tls_session_read (session=0xc2a6a1334db08f4f, 
      buf=0x5 <error: Cannot access memory at address 0x5>, len=0)
      at ../crypto/tlssession.c:472
      #6 qio_channel_tls_readv (ioc=0x55e036d728e0, iov=<optimized out>, niov=<optimized out>, fds=<optimized out>, nfds=<optimized out>, flags=<optimized out>, 
      errp=0x7f9c70ff9dd0) at ../io/channel-tls.c:273
      #7 0x000055e03477c1a0 in qio_channel_readv_full (ioc=0x55e036d728e0, niov=1, fds=0x0, nfds=0x0, flags=0, iov=<optimized out>, errp=<optimized out>)
      at ../io/channel.c:74
      #8 qio_channel_read (ioc=0x55e036d728e0, buf=0x55e038930788 "", buflen=32768, errp=<optimized out>) at ../io/channel.c:314
      #9 qemu_fill_buffer (f=0x55e038930750) at ../migration/qemu-file.c:415
      #10 0x000055e03477c9d5 in qemu_peek_byte (f=0x55e038930750, offset=0) at ../migration/qemu-file.c:707
      #11 qemu_get_byte (f=0x55e038930750) at ../migration/qemu-file.c:720
      #12 qemu_get_be16 (f=0x55e038930750) at ../migration/qemu-file.c:800
      #13 0x000055e03478ab62 in source_return_path_thread (opaque=0x55e036eff0b0) at ../migration/migration.c:2912
      #14 0x000055e034c40e9a in qemu_thread_start (args=0x55e037ed1ee0) at ../util/qemu-thread-posix.c:534
      #15 0x00007fa524421802 in start_thread (arg=<optimized out>) at pthread_create.c:443
      #16 0x00007fa5243c1314 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100 

              smooney@redhat.com Sean Mooney
              rhn-support-dhill Dave Hill
              rhos-workloads-evolution