-
Bug
-
Resolution: Won't Do
-
Normal
-
None
-
4.13.0
-
None
-
Important
-
None
-
Rejected
-
False
In a scenario with multiple default routes, such as
default via 10.196.0.1 dev ens3 proto dhcp src 10.196.2.166 metric 100
default via 172.17.5.1 dev ens4 proto dhcp src 172.17.5.223 metric 101
configure-ovs picks the one with the lowest metric and moves it to br-ex, ending up with a default routing configuration like
default via 10.196.0.1 dev br-ex proto dhcp src 10.196.2.166 metric 48
default via 172.17.5.1 dev ens4 proto dhcp src 172.17.5.223 metric 100
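The selection step described above (pick the default route with the lowest metric) can be sketched roughly as follows. This is only an illustration in Python, not the actual configure-ovs logic; the names `DefaultRoute`, `parse_default_routes` and `pick_primary` are hypothetical, and the input is the original route listing from this report.

```python
import re
from typing import List, NamedTuple


class DefaultRoute(NamedTuple):
    gateway: str
    dev: str
    metric: int


# Matches one default route line in the format shown in this report.
_ROUTE_RE = re.compile(r"default via (\S+) dev (\S+).*?metric (\d+)")


def parse_default_routes(text: str) -> List[DefaultRoute]:
    """Extract gateway, device and metric from each default route line."""
    routes = []
    for line in text.splitlines():
        match = _ROUTE_RE.search(line)
        if match:
            routes.append(
                DefaultRoute(match.group(1), match.group(2), int(match.group(3)))
            )
    return routes


def pick_primary(routes: List[DefaultRoute]) -> DefaultRoute:
    """Return the default route with the lowest metric."""
    return min(routes, key=lambda route: route.metric)


ORIGINAL = """\
default via 10.196.0.1 dev ens3 proto dhcp src 10.196.2.166 metric 100
default via 172.17.5.1 dev ens4 proto dhcp src 172.17.5.223 metric 101
"""

primary = pick_primary(parse_default_routes(ORIGINAL))
print(primary.dev)  # ens3
```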
Then, when configure-ovs runs again after a reboot, it tries to rebuild the configuration. This is done in two steps, the first of which is a rollback to the original configuration. After the rollback, it ends up with a configuration like
default via 172.17.5.1 dev ens4 proto dhcp src 172.17.5.223 metric 100
default via 10.196.0.1 dev ens3 proto dhcp src 10.196.2.166 metric 101
Notice how the route with the lowest metric is now the ens4 one, reversed with respect to the original situation. This happens because when the original ens3 profile is activated, it is assigned the next available metric in the range allocated for the device type.
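A simplified model of that metric assignment may help show why the order flips. It assumes the range for the device type starts at 100 (taken from the route listings above) and that the first free value is used; `next_available_metric` is a hypothetical helper, not a NetworkManager API.

```python
from typing import Set


def next_available_metric(base: int, in_use: Set[int]) -> int:
    """Return the first metric at or above `base` that is not already taken."""
    metric = base
    while metric in in_use:
        metric += 1
    return metric


# Before the rollback, ens4 already holds metric 100 (br-ex used 48).
in_use = {100}

# Activating the original ens3 profile therefore lands on the next free value.
ens3_metric = next_available_metric(100, in_use)
print(ens3_metric)  # 101
```

Under this model ens3 can no longer get 100, so ens4 becomes the preferred default route after the rollback.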
In general, this shouldn't be a problem: if a default route is configured as such, it should provide external access. But on QE OpenStack setups this does not seem to be the case, and further device activation can then fail if a dispatcher script requires external access, like resolv-prepender in OCPBUGS-1577.
One idea to solve this is to force a lower metric on the default route that was being used prior to the rollback, probably the same metric we originally used for br-ex, so that after the rollback we end up with something like
default via 10.196.0.1 dev ens3 proto dhcp src 10.196.2.166 metric 48
default via 172.17.5.1 dev ens4 proto dhcp src 172.17.5.223 metric 101
This change should be ephemeral, applied only during the rollback, meaning it should be configured with the iproute2 CLI and not set permanently on the connection profile.
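As a rough sketch of what such an ephemeral change could look like, the snippet below only builds the iproute2 command lines and prints them, without running anything. It assumes the route can be re-added with the forced metric and the higher-metric copy then deleted; `rollback_metric_commands` is a hypothetical helper, and the values are the ones from the example above.

```python
import shlex
from typing import List


def rollback_metric_commands(
    gateway: str, dev: str, src: str, current_metric: int, forced_metric: int
) -> List[List[str]]:
    """Build `ip route` commands that move a default route to a lower metric.

    The route is added at the forced metric first and the old copy removed
    afterwards, so a default route via this device exists at every point.
    """
    route = ["default", "via", gateway, "dev", dev, "proto", "dhcp", "src", src]
    add = ["ip", "route", "add", *route, "metric", str(forced_metric)]
    delete = ["ip", "route", "del", *route, "metric", str(current_metric)]
    return [add, delete]


commands = rollback_metric_commands("10.196.0.1", "ens3", "10.196.2.166", 101, 48)
for command in commands:
    # Printed only; actually running these requires root on the node.
    print(shlex.join(command))
```

Because nothing is written to the connection profile, the forced metric would not survive a later reactivation of the profile, which matches the ephemeral intent described above.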
- relates to: OCPBUGS-1577 Incorrect network configuration in worker node with two interfaces (Closed)