Bug
Resolution: Done
Normal
rhos-18.0.0
2
False
False
openstack-nova-27.1.1-0.20230518074959.b9089ac.el9ost
None
Moderate
+++ This bug was initially created as a clone of Bug #2088677 +++
+++ This bug was initially created as a clone of Bug #2088676 +++
+++ This bug was initially created as a clone of Bug #2074219 +++
+++ This bug was initially created as a clone of Bug #2074205 +++
Description of problem:
While live-migrating many instances concurrently, libvirt sometimes returns "internal error: migration was active, but no RAM info was set":
~~~
2022-03-30 06:08:37.197 7 WARNING nova.virt.libvirt.driver [req-5c3296cf-88ee-4af6-ae6a-ddba99935e23 - - - - -] [instance: af339c99-1182-4489-b15c-21e52f50f724] Error monitoring migration: internal error: migration was active, but no RAM info was set: libvirt.libvirtError: internal error: migration was active, but no RAM info was set
~~~
Version-Release number of selected component (if applicable):
libvirt-daemon-6.0.0-25.6.module+el8.2.1+12457+868e9540.ppc64le
How reproducible:
Random
Steps to Reproduce:
1. Live evacuate a compute host.
Actual results:
Live migration fails and leaves the database info in an inconsistent state.
Expected results:
Live migration completes successfully.
Additional info:
— Additional comment from David Hill on 2022-04-11 19:17:40 UTC —
https://review.opendev.org/c/openstack/nova/+/837320
— Additional comment from David Hill on 2022-04-11 19:22:57 UTC —
This is a clone of the libvirtd bug, but I've found a commit in master that would avoid the customer's main issue of VMs being stuck between two computes in the database. We need that fix in nova-compute.
— Additional comment from Artom Lifshitz on 2022-04-13 16:33:52 UTC —
Triage notes: Backporting https://review.opendev.org/c/openstack/nova/+/837320 does add value and is better than what we have now. We can also add a complementary patch that lets the migration continue in the specific case where we get the 'migration was active, but no RAM info was set' error, since that error does not actually indicate a migration failure, and Nova should be more forgiving of it.
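The "more forgiving" behavior described in the triage notes can be sketched as follows. This is a hedged illustration, not the actual Nova patch: `get_job_stats_forgiving` and its parameters are hypothetical names introduced here; in Nova the exception type would be `libvirt.libvirtError` and the wrapped call would be the domain job-stats query inside the live-migration monitor loop.

```python
# Hypothetical sketch: treat the specific "no RAM info was set" error as
# transient while polling migration job stats, instead of failing the
# whole migration monitor. Names here are illustrative, not Nova's.

ERR_NO_RAM_INFO = "migration was active, but no RAM info was set"


def get_job_stats_forgiving(get_job_stats, errors=(Exception,)):
    """Call get_job_stats() and tolerate the known transient error.

    get_job_stats: a callable returning the domain's job stats.
    errors: exception types to inspect (libvirt.libvirtError in Nova).
    Returns None when the known transient error is seen, so the caller
    can simply skip this polling iteration and try again.
    """
    try:
        return get_job_stats()
    except errors as e:
        if ERR_NO_RAM_INFO in str(e):
            # The migration is still active; libvirt just had no RAM
            # stats to report yet. Not a failure, so do not abort.
            return None
        raise
```

A caller in a monitoring loop would retry on `None` rather than marking the migration as failed, which is what leaves the database pointing at two computes.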