Problem Description:
If ovn-controller is started at the same time as ovs-vswitchd while a flow restoration is in progress, we get the following error:
++ ovs-ofctl -O OpenFlow14 add-tlv-map br-int '{class=0x102,type=0x80,len=4}->tun_metadata0'
2025-04-04T10:27:24Z|00045|connmgr|INFO|br-int<->unix#11: sending NXTTMFC_ALREADY_MAPPED error reply to NXT_TLV_TABLE_MOD message
OFPT_ERROR (OF1.4) (xid=0x2): NXTTMFC_ALREADY_MAPPED
NXT_TLV_TABLE_MOD (OF1.4) (xid=0x2):
 ADD mapping table:
 class   type  length  match field
 ------  ----  ------  --------------
 0x102   0x80  4       tun_metadata0
See the detailed comment in the PR for more context: https://github.com/openstack-k8s-operators/ovn-operator/pull/422#issuecomment-2778314555
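To illustrate the failure mode, here is a minimal sketch of a more tolerant restore step. This is not the actual restore script used by ovn-operator; the bridge name and TLV mapping are taken from the log above, and the assumption is that checking the existing mappings with ovs-ofctl dump-tlv-map before re-adding would avoid the NXTTMFC_ALREADY_MAPPED error when ovn-controller has already installed the mapping:

# Hedged sketch only: skip add-tlv-map if the mapping is already installed
# (e.g. by a concurrently started ovn-controller).
if ! ovs-ofctl -O OpenFlow14 dump-tlv-map br-int | grep -q 'tun_metadata0'; then
    ovs-ofctl -O OpenFlow14 add-tlv-map br-int '{class=0x102,type=0x80,len=4}->tun_metadata0'
fi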
Impact Assessment: Describe the severity and impact (e.g., network down, availability of a workaround, etc.).
This affects updates of the ovn-controller-ovs pods in RHOSO, since the failure prevents us from restoring the flows. We are looking for a workaround (making the ovn-controller pod wait for the ovn-controller-ovs pod); a rough sketch of that idea is shown below.
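The following is only a hypothetical sketch of such a wait, assuming the flow-restore procedure sets other_config:flow-restore-wait=true on the Open_vSwitch table while it restores flows (as in the standard OVS restart/upgrade procedure) and clears it when done; the timeouts are illustrative:

# Hypothetical pre-start hook for the ovn-controller pod: block until
# ovsdb-server/ovs-vswitchd respond, then until flow restoration finishes.
until ovs-vsctl --timeout=5 show >/dev/null 2>&1; do
    echo "waiting for ovs-vswitchd/ovsdb-server..."
    sleep 2
done
while ovs-vsctl --timeout=5 --if-exists get Open_vSwitch . other_config:flow-restore-wait 2>/dev/null | grep -q true; do
    echo "flow restoration still in progress, waiting..."
    sleep 2
done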
Software Versions: Specify the exact versions in use (e.g., openvswitch3.1-3.1.0-147.el8fdp).
Environment: CRC + openstack operators
ovn-controller 24.03.6
Open vSwitch Library 3.3.4
OpenFlow versions 0x6:0x6
SB DB Schema 20.33.0
Issue Type: Indicate whether this is a new issue or a regression (if a regression, state the last known working version).
Reproducibility: Confirm if the issue can be reproduced consistently. If not, describe how often it occurs.
100% in this environment. When a change is introduced in ovn-operator, all pods controlled by ovn-operator are recreated. As a result, ovn-controller and ovn-controller-ovs start at the same time, and the flow restoration in the vswitchd container fails.
Reproduction Steps: Provide detailed steps or scripts to replicate the issue.
Deploy CRC + the operators from openstack.
Clone ovn-operator locally with this PR included: https://github.com/openstack-k8s-operators/ovn-operator/pull/422/commits/f93b2c1f0849196ec9351cc413cc1e28cd9479db
Make a change in ovn-operator so that both pods are recreated.
(I can provide access to this environment)
Expected Behavior: Describe what should happen under normal circumstances.
ovs-vswitchd does not fail and the flow restoration completes.
Observed Behavior: Explain what actually happens.
Instead, the restoration fails, the pod restarts, and all the restoration data is lost.
Troubleshooting Actions: Outline the steps taken to diagnose or resolve the issue so far.
Logs: If you collected logs, please provide them (e.g., sos report, /var/log/openvswitch/*, testpmd console).
vswitchd logs: https://paste.opendev.org/show/b4uVxSJu37NSuE96tOCb/
ovn-controller logs: https://paste.opendev.org/show/b3P1bkvUNQl6012pqPyz/
Relates to: OSPRH-11228 ovs-vswitchd container not logging to console (Closed)