OpenShift Bugs / OCPBUGS-7569

nodeip-configuration selects IP that does not exist after ovs-configuration reconfigures network

    • Type: Bug
    • Resolution: Won't Do
    • Priority: Normal
    • 4.12
    • Quality / Stability / Reliability

      Description of problem:

      When a node's IP address changes because of a DHCP change, nodeip-configuration can select the old IP. It runs before ovs-configuration performs an nm_rollback, reloads the connections, and acquires new DHCP leases.

      With bonding, the bond MAC can change after a slave link failure, which causes a new DHCP lease to be acquired.

      Because nodeip-configuration chose the old IP, crio is given the old IP and fails to start, and the node becomes NotReady. Rebooting the node a second time recovers it, because nodeip-configuration then selects the new IP.

      If the IP selected for crio is no longer present on the system after ovs-configuration runs, perhaps we should re-run the nodeip-configuration service.

      Feb 05 17:27:09 master-0-0 crio[6692]: time="2023-02-05 17:27:09.919185662Z" level=fatal msg="Failed to start streaming server: listen tcp 192.168.123.123:10010: bind: cannot assign requested address"
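The "is the selected IP still present" check suggested above could be sketched as follows. This is only an illustration in Python, not how nodeip-configuration or ovs-configuration are implemented; the function name `ip_is_assigned` is hypothetical. It attempts the same kind of bind that crio fails on and treats "cannot assign requested address" (EADDRNOTAVAIL) as "IP not present". It assumes the `net.ipv4.ip_nonlocal_bind` sysctl is at its default of 0, since otherwise the bind would succeed for any address.

```python
import errno
import socket


def ip_is_assigned(ip: str) -> bool:
    """Return True if a TCP socket can bind to `ip` on this host.

    Hypothetical helper: a failed bind with EADDRNOTAVAIL is taken to mean
    that the address is not configured on any local interface.
    """
    family = socket.AF_INET6 if ":" in ip else socket.AF_INET
    with socket.socket(family, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((ip, 0))  # port 0: let the kernel pick any free port
        except OSError as exc:
            if exc.errno == errno.EADDRNOTAVAIL:
                return False
            raise  # any other error is unexpected; do not hide it
    return True
```

A caller could run this against the IP that was written for crio once ovs-configuration has finished, and re-run the IP selection only when it returns False.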
      

      Version-Release number of selected component (if applicable):

      4.12.0-0.nightly-2023-02-04-034821
      

      How reproducible:

      
      Always
      

      Steps to Reproduce:

      
      1. Deploy IPI baremetal with DHCP and bonding.
      2. Fail the primary slave link.
      3. Reboot the node.
      
      

      Actual results:
      Node is NotReady.

      crio error in logs:

      
      crio[6692]: time="2023-02-05 17:27:09.919185662Z" level=fatal msg="Failed to start streaming server: listen tcp 192.168.123.123:10010: bind: cannot assign requested address"
      
      
      

      Expected results:

      Node is ready. Cluster may be degraded due to etcd rejecting IP change.

      Additional info:

      Workaround is to reboot again. On the second boot nodeip-configuration selects the new IP address.

      From attached file syslog-192.168.123.130.log (node IP changed from 192.168.123.123 to 192.168.123.130):
      
      Writing Kubelet service override with content [Service]\nEnvironment=\"KUBELET_NODE_IP=192.168.123.123\" \
      
      Feb 05 17:27:09 master-0-0 crio[6692]: time="2023-02-05 17:27:09.919185662Z" level=fatal msg="Failed to start streaming server: listen tcp 192.168.123.123:10010: bind: cannot assign requested address"
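To spot this mismatch from the logs or from the override file itself, the IP written into the Kubelet service override can be compared with the addresses the node currently holds. The sketch below is a hedged illustration: the function names are hypothetical, the override text format is assumed to match the `KUBELET_NODE_IP=...` entry shown in the log line above, and the list of current addresses is passed in by the caller rather than read from the system.

```python
import ipaddress
import re

# Matches the value of KUBELET_NODE_IP as it appears in the override content.
NODE_IP_PATTERN = re.compile(r"KUBELET_NODE_IP=([0-9A-Fa-f:.]+)")


def parse_kubelet_node_ip(override_text):
    """Return the KUBELET_NODE_IP value from override content, or None."""
    match = NODE_IP_PATTERN.search(override_text)
    return match.group(1) if match else None


def node_ip_is_stale(override_text, current_addresses):
    """Return True if the override names an IP not in current_addresses."""
    raw_ip = parse_kubelet_node_ip(override_text)
    if raw_ip is None:
        return False  # nothing was written, so nothing can be stale
    node_ip = ipaddress.ip_address(raw_ip)
    # Normalize both sides so equivalent textual forms compare equal.
    current = {ipaddress.ip_address(addr) for addr in current_addresses}
    return node_ip not in current
```

With the values from this report, an override naming 192.168.123.123 on a node that now holds 192.168.123.130 would be reported as stale.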
      
      

              bnemec@redhat.com Benjamin Nemec
              rbrattai@redhat.com Ross Brattain
              Zhanqi Zhao Zhanqi Zhao