-
Bug
-
Resolution: Duplicate
-
Critical
-
None
-
4.9.z
-
-
-
Critical
-
None
-
Rejected
-
False
-
-
Customer Escalated
-
Description of problem:
Pods on other worker nodes cannot connect to pods on one particular node, although node-to-node connectivity works fine. I checked the OpenFlow flows on the problematic node and they look correct for all scenarios: pod to pod (same node and different nodes) and tun0 to a pod on the same node.

The vxlan_sys_4789 interface on the problematic node is in state DOWN.

The ovs-vswitchd log contains a couple of concerning warning messages:
~~~
ovs|113756|vconn|WARN|unix#347363: version negotiation failed (we support version 0x04, peer supports version 0x01)
ovs|113757|rconn|WARN|br0<->unix#347363: connection dropped (Protocol error)
~~~
~~~
[arghosh@supportshell-1 openvswitch]$ cat ovs-vsctl_-t_5_list_bridge_br0
name : br0
protocols : [OpenFlow13]
~~~
The dmesg log shows that veth interfaces are flapping:
~~~
[arghosh@supportshell-1]$ cat ./sos_commands/kernel/dmesg |grep veth|grep 'not ready'|wc -l
166
~~~
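To help read the vconn warning above, here is a minimal sketch that decodes the two version bytes. The mapping follows the OpenFlow wire-protocol numbering (0x01 is OpenFlow 1.0, 0x04 is OpenFlow 1.3). The function names `of_version_name` and `explain_negotiation` are hypothetical helpers, not part of Open vSwitch. One common cause of this message is a client such as `ovs-ofctl` connecting without `-O OpenFlow13` to a bridge restricted to `OpenFlow13`. Whether that explains the connectivity failure reported here is not established.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading ovs-vswitchd "version negotiation failed" warnings.

# Map an OpenFlow wire-protocol version byte to the Open vSwitch protocol name.
of_version_name() {
  case "$1" in
    0x01) echo "OpenFlow10" ;;
    0x02) echo "OpenFlow11" ;;
    0x03) echo "OpenFlow12" ;;
    0x04) echo "OpenFlow13" ;;
    0x05) echo "OpenFlow14" ;;
    0x06) echo "OpenFlow15" ;;
    *)    echo "unknown" ;;
  esac
}

# Extract the "we support" and "peer supports" versions from one warning line
# and print them as protocol names. Returns 1 if the line does not match.
explain_negotiation() {
  local line="$1" ours peer
  ours=$(printf '%s\n' "$line" | sed -n 's/.*we support version \(0x[0-9a-fA-F]*\).*/\1/p')
  peer=$(printf '%s\n' "$line" | sed -n 's/.*peer supports version \(0x[0-9a-fA-F]*\).*/\1/p')
  [ -n "$ours" ] && [ -n "$peer" ] || return 1
  echo "local=$(of_version_name "$ours") peer=$(of_version_name "$peer")"
}
```

Applied to the first warning line in this report, the helper reports `local=OpenFlow13 peer=OpenFlow10`, which is consistent with the `protocols : [OpenFlow13]` setting on br0.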
Version-Release number of selected component (if applicable):
4.9.43
How reproducible:
Not sure
Steps to Reproduce:
1. 2. 3.
Actual results:
Pods on other worker nodes cannot connect to pods on one particular node, although node-to-node connectivity works fine.
Expected results:
Pod-to-pod connectivity should work across all nodes.
Additional info:
I checked the ofproto/trace command output for the different scenarios and the OpenFlow flows were correct. Restarting the problematic node fixes the issue.
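The total of 166 "not ready" messages in the description does not show whether a few interfaces are flapping repeatedly or many pods were simply being created. A per-interface breakdown can help tell these apart. The sketch below assumes the dmesg lines contain the interface name in the form `veth<alphanumeric>`; the function name `veth_not_ready_by_iface` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count "not ready" dmesg messages per veth interface.
# Usage: veth_not_ready_by_iface ./sos_commands/kernel/dmesg
veth_not_ready_by_iface() {
  # Keep only "not ready" lines, pull out each veth name,
  # then count occurrences and list the busiest interfaces first.
  grep 'not ready' "$1" \
    | grep -o 'veth[[:alnum:]]*' \
    | sort \
    | uniq -c \
    | sort -rn \
    | awk '{print $1, $2}'
}
```

Each output line is a count followed by an interface name. A handful of interfaces with high counts would point at repeated flapping, while many interfaces with a count of one or two would look more like ordinary pod churn.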
Depends on: OCPBUGS-4134 configure-ovs: delays boot waiting for patch port to be unmanaged (Closed)