Red Hat OpenStack Services on OpenShift / OSPRH-16149

Race condition when creating VM in environment with HF for OSPRH-14377


    • Bug
    • Resolution: Done
    • Major
    • rhos-17.1.8
    • rhos-17.1.z
    • openstack-neutron
    • None
    • 3
    • False
    • False
    • ?
    • openstack-neutron-18.6.1-17.1.20250529181015.85ff760.el9osttrunk
    • None
    • Neutron Sprint 13, Neutron Sprint 14, Neutron Sprint 15, Neutron Sprint 16
    • 4
    • Important

      To Reproduce
      It is unclear whether this problem is directly related to the hot fix (HF) for https://issues.redhat.com/browse/OSPRH-14377 or is a separate problem. Note, however, that it occurs in an RHOSP 17.1 environment with the hot fix RPMs installed inside the neutron-server container.

      When a large number of VMs are created, one of them may fail because the Neutron server returns the following error for the port binding request:

      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers [req-0972e1f6-47a6-4850-828c-b8d3b0ec4561 ID ID - default default] Mechanism driver 'ovn' failed in update_port_postcommit: ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Logical_Switch_Port with name=PORT_UUID
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers Traceback (most recent call last):
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.9/site-packages/neutron/plugins/ml2/managers.py", line 493, in _call_on_drivers
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers     getattr(driver.obj, method_name)(context)
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.9/site-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py", line 873, in update_port_postcommit
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers     self._ovn_update_port(context._plugin_context, port, original_port,
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.9/site-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py", line 755, in _ovn_update_port
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers     self._ovn_client.update_port(plugin_context, port,
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.9/site-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py", line 809, in update_port
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers     ovn_port = self._nb_idl.lookup('Logical_Switch_Port', port['id'])
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 208, in lookup
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers     return self._lookup(table, record)
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 268, in _lookup
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers     row = idlutils.row_by_value(self, rl.table, rl.column, record)
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 114, in row_by_value
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers     raise RowNotFound(table=table, col=column, match=match)
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Logical_Switch_Port with name=PORT_UUID
      2025-04-17 04:52:01.005 28 ERROR neutron.plugins.ml2.managers
      

      Around the time the traceback is reported, I can see multiple INFO messages indicating connection problems with the OVN NB DB server:

      2025-04-17 04:52:01.004 26 INFO ovsdbapp.backend.ovs_idl.vlog [req-efa26f5d-533d-438b-aac4-01d4a94f24fb - - - - -] tcp:IP:6641: waiting 2 seconds before reconnect
      2025-04-17 04:52:01.006 30 INFO ovsdbapp.backend.ovs_idl.vlog [req-dc88b48c-ea5a-4c10-ba24-41008df1ca16 - - - - -] tcp:IP:6641: connection closed by client
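
      To check whether other RowNotFound failures coincide with NB DB reconnects in the same way, the two kinds of log lines can be correlated by timestamp. The following is a minimal sketch, assuming the log line layout shown in the excerpts above; the function names, the 5-second window, and the list of message hints are illustrative choices, not part of Neutron or ovsdbapp.

```python
import re
from datetime import datetime, timedelta

# Layout assumed from the excerpts: "<date> <time.ms> <pid> <LEVEL> <logger> <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<logger>\S+) (?P<msg>.*)$"
)

# Substrings taken from the INFO messages quoted in this report (assumed list).
CONNECTION_HINTS = ("before reconnect", "connection closed")
VLOG_LOGGER = "ovsdbapp.backend.ovs_idl.vlog"


def parse(line):
    """Return (timestamp, level, logger, message) or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
    return ts, m.group("level"), m.group("logger"), m.group("msg")


def correlate(lines, window=timedelta(seconds=5)):
    """Pair each RowNotFound ERROR line with connection messages logged within `window`."""
    events = [e for e in map(parse, lines) if e is not None]
    conn_events = [
        (ts, msg)
        for ts, _level, logger, msg in events
        if logger == VLOG_LOGGER and any(h in msg for h in CONNECTION_HINTS)
    ]
    hits = []
    for ts, level, _logger, msg in events:
        if level == "ERROR" and "RowNotFound" in msg:
            nearby = [c for c in conn_events if abs(ts - c[0]) <= window]
            hits.append((ts, msg, nearby))
    return hits
```

      Passing the lines of a neutron-server log file to `correlate` returns one entry per RowNotFound line together with any nearby reconnect messages. A single traceback logs RowNotFound more than once, so entries with identical timestamps belong to the same failure.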
      

      The affected port was successfully created shortly before on another controller, so I am not sure whether the Neutron server did its job properly here.

      Expected behavior
      The actual solution may take different forms, but in the end, if the OVN NB DB is not available, the Neutron server should probably report that instead of raising RowNotFound. We may also consider changing the failover logic, if possible.
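
      One way to picture the expected behavior is a lookup wrapper that treats RowNotFound differently depending on whether the NB DB connection is up. This is only a sketch under stated assumptions and not the fix that was shipped: `RowNotFound` below is a local stand-in for the ovsdbapp exception, while `NbDbUnavailable`, `lookup_port`, and the `is_connected` callable are hypothetical names introduced for illustration. Only the `lookup('Logical_Switch_Port', port_id)` call shape is taken from the traceback above.

```python
import time


class RowNotFound(Exception):
    """Local stand-in for ovsdbapp's RowNotFound (not imported from the real library)."""


class NbDbUnavailable(Exception):
    """Hypothetical error: the NB DB connection stayed down for every lookup attempt."""


def lookup_port(nb, port_id, is_connected, retries=3, delay=0.5, sleep=time.sleep):
    """Look up a Logical_Switch_Port, retrying while the NB DB connection is down.

    nb           -- object exposing lookup(table, record), as seen in the traceback
    is_connected -- hypothetical callable returning True when the NB connection is up
    """
    for attempt in range(retries + 1):
        try:
            return nb.lookup("Logical_Switch_Port", port_id)
        except RowNotFound:
            if is_connected():
                # The connection is healthy, so the row is treated as genuinely missing.
                raise
            if attempt == retries:
                # Report the real cause instead of a misleading RowNotFound.
                raise NbDbUnavailable(
                    f"OVN NB DB unavailable while looking up port {port_id}"
                ) from None
            sleep(delay)
```

      With this shape, a caller sees `NbDbUnavailable` when the database could not be reached and `RowNotFound` only when the connection is up and the row is absent. The sketch does not cover a connection that has been re-established before the local copy of the database has caught up.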

      Bug impact
      The customer must implement their own logic to handle failovers.

      Known workaround
      None

              twilson@redhat.com Terry Wilson
              rhn-support-astupnik Alex Stupnikov
              rhos-dfg-networking-squad-neutron