OpenShift Bugs: OCPBUGS-38897

ovs-configuration fails with canceled DHCP transaction error


      The ovs-configuration.service unit fails on every node reboot, with the following errors in the NetworkManager logs.

      Aug 24 15:08:33 example-node NetworkManager[1255]: <info>  [1724512113.7749] dhcp4 (br-ex): activation: beginning transaction (timeout in 45 seconds)
      
      Aug 24 15:08:33 example-node NetworkManager[1255]: <info>  [1724512113.7953] dhcp4 (br-ex): state changed no lease
      
      Aug 24 15:08:33 example-node NetworkManager[1255]: <info>  [1724512113.8228] dhcp4 (br-ex): state changed no lease
      
      Aug 24 15:08:35 example-node NetworkManager[1255]: <info>  [1724512115.8415] dhcp4 (br-ex): state changed no lease
      
      Aug 24 15:08:46 example-node NetworkManager[1255]: <info>  [1724512126.4140] dhcp4 (br-ex): state changed no lease
      
      Aug 24 15:08:54 example-node NetworkManager[1255]: <info>  [1724512134.4321] dhcp4 (br-ex): state changed no lease
      
      Aug 24 15:09:03 example-node NetworkManager[1255]: <warn>  [1724512143.7665] dispatcher: (54) /etc/NetworkManager/dispatcher.d/30-resolv-prepender failed (failed): Script '/etc/NetworkManager/dispatcher.d/30-resolv-prepender' exited with status 1.
      
      Aug 24 15:09:10 example-node NetworkManager[1255]: <info>  [1724512150.4514] dhcp4 (br-ex): state changed no lease
      
      Aug 24 15:09:18 example-node NetworkManager[1255]: <info>  [1724512158.9688] device (br-ex): state change: ip-config -> failed (reason 'ip-config-unavailable', sys-iface-state: 'managed')
      
      Aug 24 15:09:18 example-node NetworkManager[1255]: <info>  [1724512158.9692] manager: NetworkManager state is now CONNECTED_LOCAL
      
      Aug 24 15:09:18 example-node NetworkManager[1255]: <info>  [1724512158.9695] device (br-ex): detaching ovs interface br-ex
      
      Aug 24 15:09:18 example-node NetworkManager[1255]: <info>  [1724512158.9768] dhcp4 (br-ex): canceled DHCP transaction
      
      Aug 24 15:09:18 example-node NetworkManager[1255]: <info>  [1724512158.9768] dhcp4 (br-ex): activation: beginning transaction (timeout in 45 seconds)
      
      Aug 24 15:09:18 example-node NetworkManager[1255]: <info>  [1724512158.9768] dhcp4 (br-ex): state changed no lease
      
      Aug 24 15:09:18 example-node NetworkManager[1255]: <info>  [1724512158.9770] device (br-ex): released from master device br-ex
      
      Aug 24 15:09:18 example-node NetworkManager[1255]: <warn>  [1724512158.9772] device (br-ex): Activation: failed for connection 'ovs-if-br-ex'' 

      These logs show that br-ex cannot obtain an IP address from DHCP, which looks like a DHCP issue. That reading is misleading, however: the primary NIC always gets its IP address from DHCP, and only br-ex fails, each time ovs-configuration.service starts.
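
      To check which interfaces are affected, the NetworkManager log can be summarized per interface. The following is a minimal sketch, not part of the original report: the function name count_no_lease is hypothetical, and it assumes the log lines use the same "dhcp4 (<iface>): state changed no lease" wording as the excerpt above.

```shell
# Hypothetical helper: reads NetworkManager log lines on stdin and prints,
# for each interface, how many "no lease" DHCP state changes were logged.
# Assumes the message format seen in the excerpt above.
count_no_lease() {
    grep -o 'dhcp4 ([^)]*): state changed no lease' |
        sed 's/^dhcp4 (\([^)]*\)).*/\1/' |
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

      It could be fed from the current boot's journal, for example with `journalctl -b -u NetworkManager | count_no_lease`. If only br-ex is listed, that is consistent with the primary NIC obtaining its lease normally.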
       
      The ovs-configuration.service logs show the following errors.

      Aug 24 15:17:04 example-node configure-ovs.sh[663779]: 6: br-int: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
      Aug 24 15:17:04 example-node configure-ovs.sh[663779]:     link/ether x:x:x:x:x:x brd ff:ff:ff:ff:ff:ff promiscuity 1 minmtu 68 maxmtu 65535
      Aug 24 15:17:04 example-node configure-ovs.sh[663779]:     openvswitch numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536
      Aug 24 15:17:04 example-node systemd[1]: ovs-configuration.service: Main process exited, code=exited, status=4/NOPERMISSION
      Aug 24 15:17:04 example-node configure-ovs.sh[627756]: + ip route show
      Aug 24 15:17:04 example-node systemd[1]: ovs-configuration.service: Failed with result 'exit-code'.
      Aug 24 15:17:04 example-node configure-ovs.sh[663780]: default via 10.x.x.x dev ens192 proto dhcp src 10.x.x.x metric 100
      Aug 24 15:17:04 example-node configure-ovs.sh[663780]: 10.x.0.0/14 via 10.129.0.1 dev ovn-k8s-mp0
      Aug 24 15:17:04 example-node configure-ovs.sh[663780]: 10.x.0.0/23 dev ovn-k8s-mp0 proto kernel scope link src 10.x.0.x
      Aug 24 15:17:04 example-node configure-ovs.sh[663780]: 10.x.x.64/26 dev ens192 proto kernel scope link src 10.x.x.x metric 100
      Aug 24 15:17:04 example-node configure-ovs.sh[663780]: 169.x.x.3 via 10.129.0.1 dev ovn-k8s-mp0
      Aug 24 15:17:04 example-node systemd[1]: Failed to start Configures OVS with proper host networking configuration.
      Aug 24 15:17:04 example-node configure-ovs.sh[627756]: + ip -6 route show
      Aug 24 15:17:04 example-node systemd[1]: ovs-configuration.service: Consumed 1.302s CPU time.
      Aug 24 15:17:04 example-node configure-ovs.sh[663781]: ::1 dev lo proto kernel metric 256 pref medium
      Aug 24 15:17:04 example-node configure-ovs.sh[663781]: fe80::/64 dev genev_sys_6081 proto kernel metric 256 pref medium
      Aug 24 15:17:04 example-node configure-ovs.sh[663781]: fe80::/64 dev ens192 proto kernel metric 1024 pref medium
      Aug 24 15:17:04 example-node configure-ovs.sh[627756]: + exit 4 

       
      Rebooting the node and restarting NetworkManager and ovs-configuration.service did not help.
       
      This issue closely matches the following bug, which was reported against a much older version.
      --> https://bugzilla.redhat.com/show_bug.cgi?id=2048352
       
      The issue was resolved by restarting the openvswitch service first and then ovs-configuration.service.
      --> https://bugzilla.redhat.com/show_bug.cgi?id=2048352#c9
      --> $ sudo systemctl restart openvswitch
      --> $ sudo systemctl restart ovs-configuration.service
       
      Still, this is only a temporary workaround: the same issue recurs on every node reboot, and the workaround has to be reapplied each time. In some cases both services need to be restarted several times before br-ex comes up.
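
      Because the restart sequence sometimes has to be repeated, it can be wrapped in a retry loop. The following is a minimal sketch under stated assumptions, not a fix: the function name restart_until_active and the MAX_ATTEMPTS and SLEEP_SECONDS variables are hypothetical, and it assumes that `systemctl restart ovs-configuration.service` returns a non-zero status when the unit fails to start.

```shell
# Hypothetical helper: repeat the workaround (restart openvswitch first,
# then ovs-configuration.service) until the second restart succeeds or
# the attempt limit is reached. Run as root or prefix the calls with sudo.
restart_until_active() {
    max_attempts="${MAX_ATTEMPTS:-5}"     # assumed default, tune as needed
    sleep_seconds="${SLEEP_SECONDS:-10}"  # assumed pause between attempts
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        echo "attempt $attempt/$max_attempts: restarting openvswitch, then ovs-configuration.service"
        systemctl restart openvswitch
        # Assumption: a failed start makes systemctl exit non-zero.
        if systemctl restart ovs-configuration.service; then
            echo "ovs-configuration.service started on attempt $attempt"
            return 0
        fi
        sleep "$sleep_seconds"
        attempt=$((attempt + 1))
    done
    echo "ovs-configuration.service still failing after $max_attempts attempts" >&2
    return 1
}
```

      Calling `restart_until_active` after a reboot would automate the manual steps above. It only masks the underlying problem, which still needs a proper fix.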
       
      I will provide a sosreport from one of the affected nodes.

              sdn-team-bot sdn-team bot
              rhn-support-aygarg Ayush Garg
              Anurag Saxena Anurag Saxena