-
Bug
-
Resolution: Unresolved
-
Blocker
-
None
-
None
-
2
-
False
-
-
False
-
?
-
openstack-nova-23.2.3-17.1.20250305081012.2ace99d.el9osttrunk
-
None
-
-
-
Critical
This is a regression caused by https://issues.redhat.com/browse/OSPRH-12217 for deployments using vGPUs.
The patch meant to fix https://bugs.launchpad.net/nova/+bug/2091033 unfortunately has broken the nova-compute update_available_resource periodic task when mdevs are involved with a traceback like shown below.
We are working on a partial revert that will be backported to the same branches where the original patch was merged.
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager Traceback (most recent call last):
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 10530, in _update_available_resource_for_node
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager self.rt.update_available_resource(context, nodename,
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/nova/compute/resource_tracker.py", line 889, in update_available_resource
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager self._update_available_resource(context, resources, startup=startup)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/oslo_concurrency/lockutils.py", line 414, in inner
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager return f(*args, **kwargs)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/nova/compute/resource_tracker.py", line 994, in _update_available_resource
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager self._update(context, cn, startup=startup)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/nova/compute/resource_tracker.py", line 1303, in _update
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager self._update_to_placement(context, compute_node, startup)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/retrying.py", line 49, in wrapped_f
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager return Retrying(*dargs, **dkw).call(f, *args, **kw)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/retrying.py", line 206, in call
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager return attempt.get(self._wrap_exception)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/retrying.py", line 247, in get
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager six.reraise(self.value[0], self.value[1], self.value[2])
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/six.py", line 709, in reraise
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager raise value
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/retrying.py", line 200, in call
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/nova/compute/resource_tracker.py", line 1216, in _update_to_placement
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager self.driver.update_provider_tree(prov_tree, nodename)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 9120, in update_provider_tree
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager self._update_provider_tree_for_vgpu(
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 9529, in _update_provider_tree_for_vgpu
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager inventories_dict = self._get_gpu_inventories()
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 8199, in _get_gpu_inventories
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager count_per_dev = self._count_mdev_capable_devices(enabled_mdev_types)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 8150, in _count_mdev_capable_devices
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager mdev_capable_devices = self._get_mdev_capable_devices(
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 8392, in _get_mdev_capable_devices
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager device = self._get_mdev_capabilities_for_dev(name, types)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 8361, in _get_mdev_capabilities_for_dev
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager virtdev = self._host.device_lookup_by_name(devname)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/host.py", line 1253, in device_lookup_by_name
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager return self.get_connection().nodeDeviceLookupByName(name)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/eventlet/tpool.py", line 193, in doit
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager result = proxy_call(self._autowrap, f, *args, **kwargs)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/eventlet/tpool.py", line 151, in proxy_call
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager rv = execute(f, *args, **kwargs)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/eventlet/tpool.py", line 132, in execute
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager six.reraise(c, e, tb)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/six.py", line 709, in reraise
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager raise value
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib/python3.9/site-packages/eventlet/tpool.py", line 86, in tworker
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager rv = meth(*args, **kwargs)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager File "/usr/lib64/python3.9/site-packages/libvirt.py", line 5201, in nodeDeviceLookupByName
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager ret = libvirtmod.virNodeDeviceLookupByName(self._o, name)
2025-02-18 20:43:33.462 2 ERROR nova.compute.manager TypeError: virNodeDeviceLookupByName() argument 2 must be str or None, not Proxy
- clones
-
OSPRH-14436 "TypeError: virNodeDeviceLookupByName() argument 2 must be str or None, not Proxy" from update_available_resource with mdevs
-
- Closed
-
- mentioned on