Red Hat OpenStack Services on OpenShift: OSPRH-11267

Block live migration does not work for a VM attached to geneve net when nmstate provider is set

      The following VM is attached to a DPDK geneve network in a fresh RHOS 18 deployment:

      sh-5.1$ openstack network show dpdk-mgmt
      +---------------------------+--------------------------------------+
      | Field                     | Value                                |
      +---------------------------+--------------------------------------+
      | admin_state_up            | UP                                   |
      | availability_zone_hints   |                                      |
      | availability_zones        |                                      |
      | created_at                | 2024-11-06T10:56:31Z                 |
      | description               |                                      |
      | dns_domain                |                                      |
      | id                        | 1ff4eb77-7979-4500-a320-419ce1340b58 |
      | ipv4_address_scope        | None                                 |
      | ipv6_address_scope        | None                                 |
      | is_default                | None                                 |
      | is_vlan_transparent       | None                                 |
      | l2_adjacency              | True                                 |
      | mtu                       | 8942                                 |
      | name                      | dpdk-mgmt                            |
      | port_security_enabled     | False                                |
      | project_id                | 3400c5f3f43d47c6af8208212f5a6a88     |
      | provider:network_type     | geneve                               |
      | provider:physical_network | None                                 |
      | provider:segmentation_id  | 62127                                |
      | qos_policy_id             | None                                 |
      | revision_number           | 2                                    |
      | router:external           | Internal                             |
      | segments                  | None                                 |
      | shared                    | False                                |
      | status                    | ACTIVE                               |
      | subnets                   | 4a404373-57b6-4aac-a2b1-3d78df3efc25 |
      | tags                      |                                      |
      | tenant_id                 | 3400c5f3f43d47c6af8208212f5a6a88     |
      | updated_at                | 2024-11-06T10:56:36Z                 |
      +---------------------------+--------------------------------------+
      sh-5.1$ openstack server list --all --long 
      +--------------------------------------+-----------+--------+------------+-------------+------------------------+-------------------------------------------+--------------------------------------+--------------------+-------------------+--------------------------------+------------+-------------+
      | ID                                   | Name      | Status | Task State | Power State | Networks               | Image Name                                | Image ID                             | Flavor             | Availability Zone | Host                           | Properties | Host Status |
      +--------------------------------------+-----------+--------+------------+-------------+------------------------+-------------------------------------------+--------------------------------------+--------------------+-------------------+--------------------------------+------------+-------------+
      | 011daec6-3d03-4beb-959f-61471ce5d13d | instance1 | ACTIVE | None       | Running     | dpdk-mgmt=10.10.10.175 | rhel-guest-image-8.4-1245-nfv3.x86_64.img | 64918ef8-f665-4824-a152-939cdbc5f6ec | nfv_qe_base_flavor | nova              | compute-0.ctlplane.example.com |            | UP          |
      +--------------------------------------+-----------+--------+------------+-------------+------------------------+-------------------------------------------+--------------------------------------+--------------------+-------------------+--------------------------------+------------+-------------+
      

      The live migration is not performed (note that the VM is still hosted on compute-0):

      sh-5.1$ openstack server migrate --block --live instance1
      sh-5.1$ openstack server list --all --long 
      +--------------------------------------+-----------+--------+------------+-------------+------------------------+-------------------------------------------+--------------------------------------+--------------------+-------------------+--------------------------------+------------+-------------+
      | ID                                   | Name      | Status | Task State | Power State | Networks               | Image Name                                | Image ID                             | Flavor             | Availability Zone | Host                           | Properties | Host Status |
      +--------------------------------------+-----------+--------+------------+-------------+------------------------+-------------------------------------------+--------------------------------------+--------------------+-------------------+--------------------------------+------------+-------------+
      | 011daec6-3d03-4beb-959f-61471ce5d13d | instance1 | ACTIVE | None       | Running     | dpdk-mgmt=10.10.10.175 | rhel-guest-image-8.4-1245-nfv3.x86_64.img | 64918ef8-f665-4824-a152-939cdbc5f6ec | nfv_qe_base_flavor | nova              | compute-0.ctlplane.example.com |            | UP          |
      +--------------------------------------+-----------+--------+------------+-------------+------------------------+-------------------------------------------+--------------------------------------+--------------------+-------------------+--------------------------------+------------+-------------+
      

      On the source compute (compute-0), the nova_compute pod logs show the following error:

      2024-11-06 11:28:32.593 2 ERROR nova.virt.libvirt.driver [None req-3cd8bad3-60fd-4d1f-85a8-d18c41a5e625 0bae092bd95b44aab53ad60d88c4b018 3400c5f3f43d47c6af8208212f5a6a88 - - default default] [instance: 011daec6-3d03-4beb-959f-61471ce5d13d] Live Migration failure: internal error: unable to execute QEMU command 'migrate-set-capabilities': Postcopy is not supported: vhost-user backend not capable of postcopy: libvirt.libvirtError: internal error: unable to execute QEMU command 'migrate-set-capabilities': Postcopy is not supported: vhost-user backend not capable of postcopy
      2024-11-06 11:28:32.593 2 DEBUG nova.virt.libvirt.driver [None req-3cd8bad3-60fd-4d1f-85a8-d18c41a5e625 0bae092bd95b44aab53ad60d88c4b018 3400c5f3f43d47c6af8208212f5a6a88 - - default default] [instance: 011daec6-3d03-4beb-959f-61471ce5d13d] Migration operation thread notification thread_finished /usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py:10630
      Traceback (most recent call last):
        File "/usr/lib/python3.9/site-packages/eventlet/hubs/hub.py", line 476, in fire_timers
          timer()
        File "/usr/lib/python3.9/site-packages/eventlet/hubs/timer.py", line 59, in __call__
          cb(*args, **kw)
        File "/usr/lib/python3.9/site-packages/eventlet/event.py", line 175, in _do_send
          waiter.switch(result)
        File "/usr/lib/python3.9/site-packages/eventlet/greenthread.py", line 221, in main
          result = function(*args, **kwargs)
        File "/usr/lib/python3.9/site-packages/nova/utils.py", line 654, in context_wrapper
          return func(*args, **kwargs)
        File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 10285, in _live_migration_operation
          LOG.error("Live Migration failure: %s", e, instance=instance)
        File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
          self.force_reraise()
        File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
          raise self.value
        File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 10273, in _live_migration_operation
          guest.migrate(self._live_migration_uri(dest),
        File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/guest.py", line 642, in migrate
          self._domain.migrateToURI3(
        File "/usr/lib/python3.9/site-packages/eventlet/tpool.py", line 193, in doit
          result = proxy_call(self._autowrap, f, *args, **kwargs)
        File "/usr/lib/python3.9/site-packages/eventlet/tpool.py", line 151, in proxy_call
          rv = execute(f, *args, **kwargs)
        File "/usr/lib/python3.9/site-packages/eventlet/tpool.py", line 132, in execute
          six.reraise(c, e, tb)
        File "/usr/lib/python3.9/site-packages/six.py", line 709, in reraise
          raise value
        File "/usr/lib/python3.9/site-packages/eventlet/tpool.py", line 86, in tworker
          rv = meth(*args, **kwargs)
        File "/usr/lib64/python3.9/site-packages/libvirt.py", line 2174, in migrateToURI3
          raise libvirtError('virDomainMigrateToURI3() failed')
      libvirt.libvirtError: internal error: unable to execute QEMU command 'migrate-set-capabilities': Postcopy is not supported: vhost-user backend not capable of postcopy
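
To locate this failure in a longer nova_compute log, the ERROR line above can be matched with a small script. This is a minimal sketch: the regular expression is an assumption derived from the single sample line in this report, not a documented nova log format, and `find_live_migration_failures` is a hypothetical helper name.

```python
import re

# Pattern derived from the one ERROR line quoted in this report (assumption:
# other nova-compute builds may format the prefix differently).
FAILURE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \d+ ERROR "
    r"nova\.virt\.libvirt\.driver .*?"
    r"\[instance: (?P<instance>[0-9a-f-]{36})\] "
    r"Live Migration failure: (?P<reason>.*)$"
)

# Colour reset codes, either as a real ESC byte or as the literal "^[" that
# some terminals and pagers print in its place.
ANSI_RE = re.compile(r"(?:\x1b|\^\[)\[[0-9;]*m")


def find_live_migration_failures(lines):
    """Return one dict (timestamp, instance, reason) per failure line."""
    failures = []
    for line in lines:
        clean = ANSI_RE.sub("", line).strip()
        match = FAILURE_RE.match(clean)
        if match:
            failures.append(match.groupdict())
    return failures


if __name__ == "__main__":
    import sys

    # Usage: pipe the pod log into the script on stdin.
    for failure in find_live_migration_failures(sys.stdin):
        print(failure["timestamp"], failure["instance"], failure["reason"])
```

The script reads the log on standard input, so it can be fed from whatever command is used to dump the nova_compute pod logs.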

      The migration succeeds if the server is created without network attachments (using the --no-network option).

      Rebooting the destination compute (compute-1) seems to solve the problem:

      sh-5.1$ openstack server migrate --block --live instance1
      sh-5.1$ openstack server list --all --long 
      +--------------------------------------+-----------+--------+------------+-------------+------------------------+-------------------------------------------+--------------------------------------+--------------------+-------------------+--------------------------------+------------+-------------+
      | ID                                   | Name      | Status | Task State | Power State | Networks               | Image Name                                | Image ID                             | Flavor             | Availability Zone | Host                           | Properties | Host Status |
      +--------------------------------------+-----------+--------+------------+-------------+------------------------+-------------------------------------------+--------------------------------------+--------------------+-------------------+--------------------------------+------------+-------------+
      | 011daec6-3d03-4beb-959f-61471ce5d13d | instance1 | ACTIVE | None       | Running     | dpdk-mgmt=10.10.10.175 | rhel-guest-image-8.4-1245-nfv3.x86_64.img | 64918ef8-f665-4824-a152-939cdbc5f6ec | nfv_qe_base_flavor | nova              | compute-1.ctlplane.example.com |            | UP          |
      +--------------------------------------+-----------+--------+------------+-------------+------------------------+-------------------------------------------+--------------------------------------+--------------------+-------------------+--------------------------------+------------+-------------+

      (Rebooting the source compute-0 and restarting the nova_compute service did not work.)
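
The host comparison done by eye in the `openstack server list` outputs above can also be scripted. The following is a rough sketch that parses the ASCII table as printed in this report. `parse_cli_table` and `host_of` are hypothetical helper names, and the parser assumes no cell contains a `|` character.

```python
def parse_cli_table(text):
    """Parse an ASCII table (as printed by the CLI above) into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip "+---+" borders, shell prompts and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first "|" row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def host_of(text, server_name):
    """Return the Host column for the named server, or None if not listed."""
    for row in parse_cli_table(text):
        if row.get("Name") == server_name:
            return row.get("Host")
    return None


def migrated(before_text, after_text, server_name):
    """True when the server is listed on a different host after the migration."""
    before = host_of(before_text, server_name)
    after = host_of(after_text, server_name)
    return before is not None and after is not None and before != after
```

Given the listings captured before and after the `openstack server migrate` call, `migrated(before, after, "instance1")` would report whether the host changed.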

              Unassigned Unassigned
              rdiazcam@redhat.com Ricardo Diaz Campos
              rhos-dfg-nfv