- Bug
- Resolution: Unresolved
- Critical
- None
- 3
- False
- False
- Committed
- python-os-ken-1.4.1-17.1.20241205090937.018d755.el9osttrunk
- Committed
- Committed
- None
- Important
Description of problem:
Running the SR-IOV FFU job with ML2/OVS [1][2], upgrading from 16.2 to 17.1.
In this scenario:
computesriov-0 runs RHEL 9.2
computesriov-1 runs RHEL 8.4
The Open vSwitch agent on computesriov-0 is reported as not alive, which causes many test failures.
The problem appears to have started after the FFU process; see the agent log [3]:
2024-11-20 18:24:38.797 6144 WARNING neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-a8faba15-3478-4761-8c73-1a8bd95b8685 - - - - -] OVS is dead. OVSNeutronAgent will keep running and checking OVS status periodically.
2024-11-20 18:24:38.797 6144 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-a8faba15-3478-4761-8c73-1a8bd95b8685 - - - - -] Agent rpc_loop - iteration:6 completed. Processed ports statistics: {'regular': {'added': 0, 'updated': 0, 'removed': 0}}. Elapsed:300.002
2024-11-20 18:24:38.798 6144 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-a8faba15-3478-4761-8c73-1a8bd95b8685 - - - - -] Agent rpc_loop - iteration:7 started
2024-11-20 18:29:37.481 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [-] OVS is down, not reporting state to server
2024-11-20 18:29:38.798 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [req-a8faba15-3478-4761-8c73-1a8bd95b8685 - - - - -] Switch connection timeout
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int [req-a8faba15-3478-4761-8c73-1a8bd95b8685 - - - - -] Failed to communicate with the switch: RuntimeError: Switch connection timeout
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int Traceback (most recent call last):
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3.9/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/br_int.py", line 66, in check_canary_table
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int flows = self.dump_flows(constants.CANARY_TABLE)
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3.9/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 156, in dump_flows
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int (dp, ofp, ofpp) = self._get_dp()
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3.9/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_bridge.py", line 71, in _get_dp
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int self._cached_dpid = new_dpid
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int self.force_reraise()
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int raise self.value
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3.9/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_bridge.py", line 54, in _get_dp
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int dp = self._get_dp_by_dpid(self._cached_dpid)
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3.9/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 79, in _get_dp_by_dpid
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int raise RuntimeError(m)
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int RuntimeError: Switch connection timeout
2024-11-20 18:29:38.799 6144 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int
2024-11-20 18:29:38.799 6144 WARNING neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-a8faba15-3478-4761-8c73-1a8bd95b8685 - - - - -] OVS is dead. OVSNeutronAgent will keep running and checking OVS status periodically.
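To see when the agent first lost contact with OVS and how long the outage lasted, the log excerpt above can be scanned for the failure messages. The following is a minimal sketch, assuming the line layout shown in the excerpt (timestamp, PID, level, logger, message); the function name `find_ovs_failures` and the marker list are hypothetical helpers, not part of Neutron.

```python
import re
from datetime import datetime

# Timestamp, PID, level, logger, and message, as laid out in the excerpt above.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<logger>\S+) (?P<msg>.*)$"
)

# Substrings taken from the messages in this report (illustrative selection).
MARKERS = ("OVS is dead", "OVS is down", "Switch connection timeout")


def find_ovs_failures(lines):
    """Return (timestamp, level, marker) tuples for lines mentioning an OVS failure."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the expected layout
        for marker in MARKERS:
            if marker in match.group("msg"):
                ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
                hits.append((ts, match.group("level"), marker))
                break  # record each line at most once
    return hits
```

Traceback lines that repeat the exception text also match, so a single event can yield several hits; grouping by timestamp would collapse them if needed.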
Version-Release number of selected component (if applicable):
RHOS-17.1-RHEL-9-20241030.n.1
How reproducible:
Running the job
Actual results:
The Open vSwitch agent on computesriov-0 is reported as not alive.
Expected results:
All network agents are reported as alive.
Slack thread: https://redhat-internal.slack.com/archives/C046JULBVJ7/p1732186604009099
Additional info:
(overcloud) [stack@undercloud-0 ~]$ openstack network agent list
+--------------------------------------+--------------------+-----------------------------+-------------------+-------+-------+---------------------------+
| ID                                   | Agent Type         | Host                        | Availability Zone | Alive | State | Binary                    |
+--------------------------------------+--------------------+-----------------------------+-------------------+-------+-------+---------------------------+
| 0ebf230c-968a-4476-819a-3ccabc5db378 | Open vSwitch agent | computesriov-0.redhat.local | None              | XXX   | UP    | neutron-openvswitch-agent |
| 30a0dc8c-e411-4fc1-825d-313588d7e875 | Metadata agent     | controller-0.redhat.local   | None              | :-)   | UP    | neutron-metadata-agent    |
| 3a621f5a-1660-47d0-8d3e-70306800e1f3 | DHCP agent         | controller-1.redhat.local   | nova              | :-)   | UP    | neutron-dhcp-agent        |
| 6a0f4d6d-162e-4d87-849d-43839f02327f | Metadata agent     | controller-1.redhat.local   | None              | :-)   | UP    | neutron-metadata-agent    |
| 7afb0a4c-debb-49e1-899b-f83c6a89bda0 | L3 agent           | controller-0.redhat.local   | nova              | :-)   | UP    | neutron-l3-agent          |
| 82a50c8c-e85c-4179-8da4-62953f058426 | Open vSwitch agent | controller-1.redhat.local   | None              | :-)   | UP    | neutron-openvswitch-agent |
| 8427d9da-cd01-488c-9dcc-6a8151d001a5 | DHCP agent         | controller-0.redhat.local   | nova              | :-)   | UP    | neutron-dhcp-agent        |
| 9575d2cb-5294-45cc-b5d6-2b21c60e76c7 | NIC Switch agent   | computesriov-0.redhat.local | None              | :-)   | UP    | neutron-sriov-nic-agent   |
| 9a041508-671a-4db0-bd30-478613d7e63a | L3 agent           | controller-2.redhat.local   | nova              | :-)   | UP    | neutron-l3-agent          |
| 9e324516-bb5a-4690-9483-166fc25d1bd7 | NIC Switch agent   | computesriov-1.redhat.local | None              | :-)   | UP    | neutron-sriov-nic-agent   |
| b23b902e-dca0-4986-81fb-540ced78fc59 | Open vSwitch agent | controller-2.redhat.local   | None              | :-)   | UP    | neutron-openvswitch-agent |
| b9749684-022c-42ab-bf46-603da9fa4d09 | Open vSwitch agent | computesriov-1.redhat.local | None              | :-)   | UP    | neutron-openvswitch-agent |
| c39682c0-c71e-4d31-8e1f-09c017d9328a | Open vSwitch agent | controller-0.redhat.local   | None              | :-)   | UP    | neutron-openvswitch-agent |
| cf05f4f2-3d23-4eb2-8d97-2fcd351a3fed | L3 agent           | controller-1.redhat.local   | nova              | :-)   | UP    | neutron-l3-agent          |
| d363ff5d-21a6-4288-b0b3-0c848c214eb8 | DHCP agent         | controller-2.redhat.local   | nova              | :-)   | UP    | neutron-dhcp-agent        |
| e2c5cfd5-0055-42e9-a6a5-95cd0256a2cc | Metadata agent     | controller-2.redhat.local   | None              | :-)   | UP    | neutron-metadata-agent    |
+--------------------------------------+--------------------+-----------------------------+-------------------+-------+-------+---------------------------+
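For repeated job runs, the check for non-alive agents in the listing above could be automated. The sketch below is a rough illustration, assuming the listing is fetched with `openstack network agent list -f json` and that the JSON keys mirror the column headers; the `dead_agents` helper and the sample data are hypothetical, and the handling of both boolean and `:-)`/`XXX` values for `Alive` is an assumption about the output format.

```python
import json


def dead_agents(agents):
    """Return (agent type, host) pairs for agents not reported as alive."""
    dead = []
    for agent in agents:
        alive = agent.get("Alive")
        # Assumption: JSON output may carry a boolean, while table-style
        # output uses ":-)" for alive and "XXX" for not alive.
        is_alive = alive is True or alive == ":-)"
        if not is_alive:
            dead.append((agent.get("Agent Type"), agent.get("Host")))
    return dead


# Sample shaped like the listing above (two rows only, for illustration).
sample = json.loads("""
[
  {"Agent Type": "Open vSwitch agent", "Host": "computesriov-0.redhat.local",
   "Alive": false, "State": true, "Binary": "neutron-openvswitch-agent"},
  {"Agent Type": "Open vSwitch agent", "Host": "computesriov-1.redhat.local",
   "Alive": true, "State": true, "Binary": "neutron-openvswitch-agent"}
]
""")

print(dead_agents(sample))
```

With the two sample rows, only the Open vSwitch agent on computesriov-0 is returned, matching the failure described in this report.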
[1] https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-all-unified-ffu-upgrade-16.2-17.1_director-rhel-virthost-3cont_2comp-ipv4-vlan-ml2ovs-sriov-multirhel/
[2] https://rhos-ci-staging-jenkins.lab.eng.tlv2.redhat.com/job/DFG-all-unified-ffu-upgrade-16.2-17.1_director-rhel-virthost-3cont_2comp-ipv4-vlan-ml2ovs-sriov-multirhel-fyanac/
[3] https://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-all-unified-ffu-upgrade-16.2-17.1_director-rhel-virthost-3cont_2comp-ipv4-vlan-ml2ovs-sriov-multirhel/9/computesriov-0/var/log/containers/neutron/openvswitch-agent.log.gz