Red Hat OpenStack Services on OpenShift / OSPRH-23875

Neutron agents flapping blocked VM creation

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • None
    • rhos-18.0.10 FR 3
    • None
    • Critical

      This is a continuation of https://issues.redhat.com/browse/OSPRH-20412: the environment is the same, and the underlying problem may be the same as well. We have collected additional data and investigated further this time.

      From time to time the customer observes Neutron agent flapping: Neutron agents are intermittently reported as dead. During these periods it is impossible to start a VM, because Neutron refuses to bind ports to dead agents:

      2025-12-19 08:35:31.288 12 DEBUG neutron.plugins.ml2.managers [req-e414952c-7c7c-4e16-8d26-9c4e483feafd req-64d5157d-a5cd-4946-a5ac-a7aca85ad007 858a82de6c9a4a60a13691ee4434ea81 cd27a2ea108a4abeafba13c60eb83dcf - - default default] Attempting to bind port 77aa3abe-b177-482e-8521-cefd86c3f6ca by drivers ovn,sriovnicswitch on host REMOVED at level 0 using segments [{'id': 'REMOVED', 'network_type': 'geneve', 'physical_network': None, 'segmentation_id': 41682, 'network_id': 'REMOVED'}] _bind_port_level /usr/lib/python3.9/site-packages/neutron/plugins/ml2/managers.py:835
      2025-12-19 08:35:31.325 12 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.mech_driver [req-e414952c-7c7c-4e16-8d26-9c4e483feafd req-64d5157d-a5cd-4946-a5ac-a7aca85ad007 858a82de6c9a4a60a13691ee4434ea81 cd27a2ea108a4abeafba13c60eb83dcf - - default default] Refusing to bind port 77aa3abe-b177-482e-8521-cefd86c3f6ca to dead agent:  <neutron.plugins.ml2.drivers.ovn.agent.neutron_agent.ControllerAgent object at 0x7f75b6da5700>
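
      To locate the periods in which binding was refused, the WARNING lines above can be extracted from the Neutron server log and grouped by time. The following is a minimal sketch, not part of the collected data: it assumes the log lines follow the format shown above, and the function names and the 60-second grouping gap are arbitrary, hypothetical choices.

```python
import re
from datetime import datetime, timedelta

# Matches the WARNING format shown in the excerpt above:
# "<timestamp> ... Refusing to bind port <uuid> to dead agent"
REFUSAL_RE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"Refusing to bind port (?P<port>[0-9a-f-]{36}) to dead agent"
)


def find_refusals(lines):
    """Yield (timestamp, port_id) for each 'dead agent' bind refusal."""
    for line in lines:
        match = REFUSAL_RE.search(line)
        if match:
            ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S.%f")
            yield ts, match["port"]


def group_windows(events, max_gap=timedelta(seconds=60)):
    """Merge refusals separated by at most max_gap into (start, end, count).

    max_gap is an assumed value; tune it to the observed flapping pattern.
    """
    windows = []
    for ts, _port in sorted(events):
        if windows and ts - windows[-1][1] <= max_gap:
            start, _end, count = windows[-1]
            windows[-1] = (start, ts, count + 1)
        else:
            windows.append((ts, ts, 1))
    return windows
```

      For example, passing an open log file to find_refusals() and the result to group_windows() gives one tuple per suspected flapping period, which can then be compared against timestamps in the OVN logs.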

      I tried to understand the reasons behind the flapping agents and also collected the data requested in the older issue. Unfortunately, there is no clear problem in the logs:

      • there are some disparities between Neutron and OVN reported by neutron-ovn-db-sync-util, but they are reported regardless of agent state
      • ovn_controller periodically drops tens of thousands of log messages, but this happens both when the error is present and when it is not
      • there are no clear error indicators in the OVN control plane
      • no errors are reported by other services while the OVN agent is down
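
      The observations above rest on comparing how often an event occurs while agents are reported dead versus while they are alive. A minimal sketch of that comparison follows; the function names are hypothetical, the dead-agent windows are assumed to be non-overlapping (start, end) pairs inside the observation period, and the event timestamps are assumed to have been parsed from the relevant log already.

```python
from datetime import datetime


def split_by_windows(event_times, windows):
    """Count events inside vs. outside any (start, end) dead-agent window."""
    inside = sum(
        any(start <= t <= end for start, end in windows) for t in event_times
    )
    return inside, len(event_times) - inside


def rate_per_minute(count, seconds):
    """Events per minute; 0.0 when the interval is empty."""
    return 60.0 * count / seconds if seconds > 0 else 0.0


def compare_rates(event_times, windows, period_start, period_end):
    """Return (rate while agents are dead, rate while agents are alive)."""
    inside, outside = split_by_windows(event_times, windows)
    dead_seconds = sum((end - start).total_seconds() for start, end in windows)
    total_seconds = (period_end - period_start).total_seconds()
    return (
        rate_per_minute(inside, dead_seconds),
        rate_per_minute(outside, total_seconds - dead_seconds),
    )
```

      Similar rates in both states suggest that the event is unrelated to the flapping, which is the reasoning applied to the first two bullets.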

      Bug impact
      Significant impact while the problem is occurring: it affects the customer's own customers.

      Known workaround
      None

      Additional context
      To be provided privately

              rodolfo_alonso Rodolfo Alonso
              rhn-support-astupnik Alex Stupnikov
              rhos-dfg-networking-squad-neutron