OpenShift Bugs / OCPBUGS-681

ovnkube-node CrashLoopBackOff looking up gw interface: "br-ex"


    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Normal
    • None
    • 4.11
    • None
    • Important
    • None
    • SDN Sprint 231, SDN Sprint 232
    • 2
    • Rejected
    • False

      Description of problem:

      machine-config-controller failed with a pollimmediate error involving /var/run/multus/cni/net.d/10-ovn-kubernetes.conf.

      The ovnkube-node container reports: error looking up gw interface: "br-ex", error: Link not found
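
One way to confirm the "Link not found" condition on an affected node is to check whether the kernel exposes the bridge at all. The sketch below mirrors the `/sys/class/net/br-ex1` test that the ovnkube-node startup command in this report already uses. The `link_exists` helper and the `SYSFS_NET` override are hypothetical names introduced here for illustration; they are not part of OVN-Kubernetes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the named network interface is known to the kernel.
# SYSFS_NET is an assumed override so the check can be pointed at a test directory;
# on a real node it defaults to /sys/class/net.
link_exists() {
  local iface="$1"
  local root="${SYSFS_NET:-/sys/class/net}"
  [ -d "${root}/${iface}" ]
}

# Usage (for example from a debug shell on the node):
#   link_exists br-ex && echo "br-ex present" || echo "br-ex missing"
```

If the check fails for `br-ex`, the node never got its external bridge, which would be consistent with the fatal error ovnkube-node logs at startup.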

      Version-Release number of selected component (if applicable):

      OCP 4.11.0 on OSP 16.2 (PSI)

      How reproducible:

      Unknown

      Steps to Reproduce:

      Install OCP 4.11.0 with OVNKubernetes
      
      

      Actual results:

      ovnkube-node:
          Container ID:  cri-o://61f3a4dc4a2c102f8c4b129fe71e47c8aafa65fa8b64c468f024c174e875b567
          Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4dc0a54cd1e11e92cfefc261305b3d74c9d74c88fa9a98884da8140436ec2ad3
          Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4dc0a54cd1e11e92cfefc261305b3d74c9d74c88fa9a98884da8140436ec2ad3
          Port:          29103/TCP
          Host Port:     29103/TCP
          Command:
            /bin/bash
            -c
            set -xe
            if [[ -f "/env/${K8S_NODE}" ]]; then
              set -o allexport
              source "/env/${K8S_NODE}"
              set +o allexport
            fi
            cp -f /usr/libexec/cni/ovn-k8s-cni-overlay /cni-bin-dir/
            ovn_config_namespace=openshift-ovn-kubernetes
            echo "I$(date "+%m%d %H:%M:%S.%N") - disable conntrack on geneve port"
            iptables -t raw -A PREROUTING -p udp --dport 6081 -j NOTRACK
            iptables -t raw -A OUTPUT -p udp --dport 6081 -j NOTRACK
            ip6tables -t raw -A PREROUTING -p udp --dport 6081 -j NOTRACK
            ip6tables -t raw -A OUTPUT -p udp --dport 6081 -j NOTRACK
            echo "I$(date "+%m%d %H:%M:%S.%N") - starting ovnkube-node"
            
            if [ "shared" == "shared" ]; then
              gateway_mode_flags="--gateway-mode shared --gateway-interface br-ex"
            elif [ "shared" == "local" ]; then
              gateway_mode_flags="--gateway-mode local --gateway-interface br-ex"
            else
              echo "Invalid OVN_GATEWAY_MODE: \"shared\". Must be \"local\" or \"shared\"."
              exit 1
            fi
            
            export_network_flows_flags=
            if [[ -n "${NETFLOW_COLLECTORS}" ]] ; then
              export_network_flows_flags="--netflow-targets ${NETFLOW_COLLECTORS}"
            fi
            if [[ -n "${SFLOW_COLLECTORS}" ]] ; then
              export_network_flows_flags="$export_network_flows_flags --sflow-targets ${SFLOW_COLLECTORS}"
            fi
            if [[ -n "${IPFIX_COLLECTORS}" ]] ; then
              export_network_flows_flags="$export_network_flows_flags --ipfix-targets ${IPFIX_COLLECTORS}"
            fi
            if [[ -n "${IPFIX_CACHE_MAX_FLOWS}" ]] ; then
              export_network_flows_flags="$export_network_flows_flags --ipfix-cache-max-flows ${IPFIX_CACHE_MAX_FLOWS}"
            fi
            if [[ -n "${IPFIX_CACHE_ACTIVE_TIMEOUT}" ]] ; then
              export_network_flows_flags="$export_network_flows_flags --ipfix-cache-active-timeout ${IPFIX_CACHE_ACTIVE_TIMEOUT}"
            fi
            if [[ -n "${IPFIX_SAMPLING}" ]] ; then
              export_network_flows_flags="$export_network_flows_flags --ipfix-sampling ${IPFIX_SAMPLING}"
            fi
            gw_interface_flag=
            # if br-ex1 is configured on the node, we want to use it for external gateway traffic
            if [ -d /sys/class/net/br-ex1 ]; then
              gw_interface_flag="--exgw-interface=br-ex1"
            fi
            
            node_mgmt_port_netdev_flags=
            if [[ -n "${OVNKUBE_NODE_MGMT_PORT_NETDEV}" ]] ; then
              node_mgmt_port_netdev_flags="--ovnkube-node-mgmt-port-netdev ${OVNKUBE_NODE_MGMT_PORT_NETDEV}"
            fi
            
            exec /usr/bin/ovnkube --init-node "${K8S_NODE}" \
              --nb-address "ssl:10.167.1.250:9641,ssl:10.167.2.36:9641,ssl:10.167.3.240:9641" \
              --sb-address "ssl:10.167.1.250:9642,ssl:10.167.2.36:9642,ssl:10.167.3.240:9642" \
              --nb-client-privkey /ovn-cert/tls.key \
              --nb-client-cert /ovn-cert/tls.crt \
              --nb-client-cacert /ovn-ca/ca-bundle.crt \
              --nb-cert-common-name "ovn" \
              --sb-client-privkey /ovn-cert/tls.key \
              --sb-client-cert /ovn-cert/tls.crt \
              --sb-client-cacert /ovn-ca/ca-bundle.crt \
              --sb-cert-common-name "ovn" \
              --config-file=/run/ovnkube-config/ovnkube.conf \
              --loglevel "${OVN_KUBE_LOG_LEVEL}" \
              --inactivity-probe="${OVN_CONTROLLER_INACTIVITY_PROBE}" \
              ${gateway_mode_flags} \
              --metrics-bind-address "127.0.0.1:29103" \
              --ovn-metrics-bind-address "127.0.0.1:29105" \
              --metrics-enable-pprof \
              --export-ovs-metrics \
              --disable-snat-multiple-gws \
              ${export_network_flows_flags} \
              ${gw_interface_flag}
            
          State:       Waiting
            Reason:    CrashLoopBackOff
          Last State:  Terminated
            Reason:    Error
            Message:   p                  : true\n\nup                  : true\n\nup                  : true\n\nup                  : false\n\nup                  : true\n\nup                  : true\n\nup                  : false\n\nup                  : false\n\nup                  : false\n\nup                  : true\n"
      I0829 22:14:47.075161 2328472 ovs.go:204] Exec(3): stderr: ""
      I0829 22:14:47.075172 2328472 node.go:312] Detected support for port binding with external IDs
      I0829 22:14:47.075275 2328472 ovs.go:200] Exec(4): /usr/bin/ovs-vsctl --timeout=15 -- --if-exists del-port br-int k8s-osp-nmanos- -- --may-exist add-port br-int ovn-k8s-mp0 -- set interface ovn-k8s-mp0 type=internal mtu_request=1350 external-ids:iface-id=k8s-osp-nmanos-b2-8w9fj-master-1
      I0829 22:14:47.086558 2328472 ovs.go:203] Exec(4): stdout: ""
      I0829 22:14:47.086585 2328472 ovs.go:204] Exec(4): stderr: ""
      I0829 22:14:47.086599 2328472 ovs.go:200] Exec(5): /usr/bin/ovs-vsctl --timeout=15 --if-exists get interface ovn-k8s-mp0 mac_in_use
      I0829 22:14:47.097350 2328472 ovs.go:203] Exec(5): stdout: "\"f6:b1:f3:e3:76:5a\"\n"
      I0829 22:14:47.097391 2328472 ovs.go:204] Exec(5): stderr: ""
      I0829 22:14:47.097409 2328472 ovs.go:200] Exec(6): /usr/bin/ovs-vsctl --timeout=15 set interface ovn-k8s-mp0 mac=f6\:b1\:f3\:e3\:76\:5a
      I0829 22:14:47.110497 2328472 ovs.go:203] Exec(6): stdout: ""
      I0829 22:14:47.110540 2328472 ovs.go:204] Exec(6): stderr: ""
      I0829 22:14:47.149172 2328472 gateway_init.go:261] Initializing Gateway Functionality
      I0829 22:14:47.149371 2328472 gateway_localnet.go:163] Node local addresses initialized to: map[10.167.3.240:{10.167.0.0 ffff0000} 10.228.0.2:{10.228.0.0 fffffe00} 127.0.0.1:{127.0.0.0 ff000000} ::1:{::1 ffffffffffffffffffffffffffffffff} fe80::6243:84de:f9a2:870c:{fe80:: ffffffffffffffff0000000000000000} fe80::f45a:9fff:fe00:ff62:{fe80:: ffffffffffffffff0000000000000000} fe80::f4b1:f3ff:fee3:765a:{fe80:: ffffffffffffffff0000000000000000}]
      F0829 22:14:47.149441 2328472 ovnkube.go:133] error looking up gw interface: "br-ex", error: Link not found
            Exit Code:    1
            Started:      Mon, 29 Aug 2022 17:14:46 -0500
            Finished:     Mon, 29 Aug 2022 17:14:47 -0500
          Ready:          False
          Restart Count:  794
          Requests:
            cpu:      10m
            memory:   300Mi
          Readiness:  exec [test -f /etc/cni/net.d/10-ovn-kubernetes.conf] delay=5s timeout=1s period=5s #success=1 #failure=3
          Environment:
            KUBERNETES_SERVICE_PORT:          6443
            KUBERNETES_SERVICE_HOST:          api-int.osp-nmanos-b2.devcluster.openshift.com
            OVN_CONTROLLER_INACTIVITY_PROBE:  180000
            OVN_KUBE_LOG_LEVEL:               4
            K8S_NODE:                          (v1:spec.nodeName)
          Mounts:
            /cni-bin-dir from host-cni-bin (rw)
            /env from env-overrides (rw)
            /etc/cni/net.d from host-cni-netd (rw)
            /etc/openvswitch from etc-openvswitch (rw)
            /etc/ovn/ from etc-openvswitch (rw)
            /etc/systemd/system from systemd-units (ro)
            /host from host-slash (ro)
            /ovn-ca from ovn-ca (rw)
            /ovn-cert from ovn-cert (rw)
            /run/netns from host-run-netns (ro)
            /run/openvswitch from run-openvswitch (rw)
            /run/ovn-kubernetes/ from host-run-ovn-kubernetes (rw)
            /run/ovn/ from run-ovn (rw)
            /run/ovnkube-config/ from ovnkube-config (rw)
            /var/lib/cni/networks/ovn-k8s-cni-overlay from host-var-lib-cni-networks-ovn-kubernetes (rw)
            /var/lib/openvswitch from var-lib-openvswitch (rw)
            /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dbwqt (ro)
      Conditions:
        Type              Status
        Initialized       True 
        Ready             False 
        ContainersReady   False 
        PodScheduled      True 
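
The ovnkube log lines above use a klog-style prefix: the first letter encodes severity (I, W, E, F) and is followed by the date as MMDD. A minimal sketch for pulling only the fatal lines out of a saved container log follows. `fatal_lines` is a hypothetical helper name, and the log is assumed to arrive on standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only fatal (F-severity) klog lines read from stdin.
# Assumes the header layout seen in this report: severity letter, MMDD, then a space.
# "|| true" keeps the function from failing when no fatal line is present.
fatal_lines() {
  grep -E '^[[:space:]]*F[0-9]{4} ' || true
}

# Example (the pod name is a placeholder):
#   oc logs -n openshift-ovn-kubernetes <pod> -c ovnkube-node --previous | fatal_lines
```

Applied to the log excerpt in this report, this would leave just the `ovnkube.go:133` line that names `br-ex`.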

      Expected results:

      All OVN pods should report "Running" in the output of `oc get all -n openshift-ovn-kubernetes`
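
The expected state can be checked mechanically. The sketch below is an illustration only: it uses `oc get pods --no-headers` rather than `oc get all`, assumes the default column order (NAME READY STATUS RESTARTS AGE), and `not_running` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `oc get pods --no-headers` output on stdin and print
# the name and status of every pod whose STATUS column is not "Running".
# Lines with fewer than three fields (for example blank lines) are skipped.
not_running() {
  awk 'NF >= 3 && $3 != "Running" { print $1, $3 }'
}

# Example:
#   oc get pods -n openshift-ovn-kubernetes --no-headers | not_running
```

Empty output means every listed pod is in the Running state.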

      Additional info:

      Additional logs attached

              Assignee: Jaime Caamaño Ruiz (jcaamano@redhat.com)
              Reporter: Noam Manos (nmanos@redhat.com)