-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
rhos-18.0.10 FR 3
-
None
-
0
-
False
-
-
False
-
?
-
rhos-connectivity-neutron
-
None
-
-
-
-
Critical
This is a continuation of https://issues.redhat.com/browse/OSPRH-20412: the environment is the same, and the underlying problem may be the same as well. We have collected extra data and gone a bit deeper this time.
From time to time the customer observes Neutron agent flapping: Neutron agents are reported as dead. During these periods it is impossible to start a VM, because Neutron refuses to bind ports to dead agents:
2025-12-19 08:35:31.288 12 DEBUG neutron.plugins.ml2.managers [req-e414952c-7c7c-4e16-8d26-9c4e483feafd req-64d5157d-a5cd-4946-a5ac-a7aca85ad007 858a82de6c9a4a60a13691ee4434ea81 cd27a2ea108a4abeafba13c60eb83dcf - - default default] Attempting to bind port 77aa3abe-b177-482e-8521-cefd86c3f6ca by drivers ovn,sriovnicswitch on host REMOVED at level 0 using segments [{'id': 'REMOVED', 'network_type': 'geneve', 'physical_network': None, 'segmentation_id': 41682, 'network_id': 'REMOVED'}] _bind_port_level /usr/lib/python3.9/site-packages/neutron/plugins/ml2/managers.py:835
2025-12-19 08:35:31.325 12 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.mech_driver [req-e414952c-7c7c-4e16-8d26-9c4e483feafd req-64d5157d-a5cd-4946-a5ac-a7aca85ad007 858a82de6c9a4a60a13691ee4434ea81 cd27a2ea108a4abeafba13c60eb83dcf - - default default] Refusing to bind port 77aa3abe-b177-482e-8521-cefd86c3f6ca to dead agent: <neutron.plugins.ml2.drivers.ovn.agent.neutron_agent.ControllerAgent object at 0x7f75b6da5700>
I tried to understand why the agents are flapping and also collected the data requested in the older issue. Unfortunately, the logs do not point to a clear problem:
- neutron-ovn-db-sync-util reports some disparities between Neutron and OVN, but they are reported regardless of agent state
- ovn_controller periodically drops tens of thousands of log messages, but this happens both with and without the error present
- there are no clear error pointers in the OVN control plane
- no other service reports errors while the OVN agent is down
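To compare these observations against the exact periods in which agents were considered dead, the "Refusing to bind port" warnings quoted above can be grouped into time windows. The following is only a minimal sketch, not part of the collected data: the helper names and the 60-second merge gap are assumptions chosen for illustration, and the regular expression assumes the log line format shown in this report.

```python
import re
from datetime import datetime, timedelta

# Matches the WARNING from the OVN mechanism driver as quoted in this report.
# Assumption: each log line starts with a "YYYY-MM-DD HH:MM:SS.mmm" timestamp.
DEAD_AGENT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"Refusing to bind port (?P<port>[0-9a-f-]{36}) to dead agent"
)


def dead_agent_events(lines):
    """Yield (timestamp, port_id) for every refused port binding."""
    for line in lines:
        match = DEAD_AGENT_RE.match(line)
        if match:
            ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S.%f")
            yield ts, match["port"]


def group_windows(events, gap=timedelta(seconds=60)):
    """Merge events no more than `gap` apart into (start, end, count) windows.

    The 60-second default is an arbitrary illustrative value.
    """
    windows = []
    for ts, _port in sorted(events):
        if windows and ts - windows[-1][1] <= gap:
            start, _end, count = windows[-1]
            windows[-1] = (start, ts, count + 1)
        else:
            windows.append((ts, ts, 1))
    return windows
```

The resulting windows could then be lined up with ovn_controller and OVN control plane logs from the same time ranges, for example by feeding the function the lines of a Neutron server log file.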
Bug impact
Significant impact while the problem is occurring: it affects the customer's own customers.
Known workaround
None
Additional context
To be provided privately