OpenShift Bugs: OCPBUGS-5840

[4.13] Modifying node_mgmt_port_netdev_flags on OVN-K node will crash

Details

    • Rejected
    • False
    Description

      Description of problem:

      Enabling a hardware-backed management port (e.g. a virtual function) on a BF2 in NIC mode, or on any other MLX NIC (CX-6, CX-5), by setting node_mgmt_port_netdev_flags to a VF in the CNO causes the OVN-K node to crash.

      Version-Release number of selected component (if applicable):

      4.12.0

      How reproducible:

      Always

      Steps to Reproduce:

      Start by enabling OVS hardware offload (HWOL) and configuring a SriovNetworkNodePolicy:
      https://docs.openshift.com/container-platform/4.11/networking/hardware_networks/configuring-hardware-offloading.html
      1. Scale down CNO: oc scale --replicas=0 deploy/network-operator -n openshift-network-operator
      2. Make changes to OVN-K node: oc edit daemonsets ovnkube-node -n openshift-ovn-kubernetes
          a. Find "node_mgmt_port_netdev_flags=" and replace it with something like this:
                node_mgmt_port_netdev_flags=
                if [[ ${K8S_NODE} != *"master"* ]]; then
                      node_mgmt_port_netdev_flags="--ovnkube-node-mgmt-port-netdev=ens1f0v0"
                fi
          b. Additionally, add the "node_mgmt_port_netdev_flags" variable to the 'exec /usr/bin/ovnkube --init-node "${K8S_NODE}"' call in the same script, since it is missing there.
      3. Save the edit.
      4. Observe OVN-K node on baremetal worker nodes.
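
The conditional from step 2a can be tried outside the daemonset as a small standalone function. This is only a sketch: the function name mgmt_port_netdev_flags and its two positional arguments are hypothetical, and the VF name ens1f0v0 is the example value from the steps above, not a required one.

```shell
#!/bin/sh
# Hypothetical helper mirroring the script edit in step 2a.
# $1: node name (stands in for ${K8S_NODE})
# $2: VF netdev to use as the management port (e.g. ens1f0v0)
mgmt_port_netdev_flags() {
  node_name="$1"
  vf_netdev="$2"
  case "$node_name" in
    # Nodes with "master" in their name get no management-port flag.
    *master*) printf '%s\n' "" ;;
    # All other nodes get the VF passed to ovnkube.
    *) printf '%s\n' "--ovnkube-node-mgmt-port-netdev=${vf_netdev}" ;;
  esac
}
```

Calling mgmt_port_netdev_flags "worker-0" "ens1f0v0" prints the flag that step 2b then appends to the ovnkube invocation; for a node whose name contains "master" it prints an empty line.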

      Actual results:

      I0822 14:21:56.250285  496356 ovs.go:204] Exec(3): stderr: ""
      I0822 14:21:56.250290  496356 node.go:310] Detected support for port binding with external IDs
      I0822 14:21:56.250516  496356 management-port-dpu.go:181] Setup management port dpu host: ens1f0v0
      F0822 14:21:56.250568  496356 ovnkube.go:133] failed to set management port name. file exists
      
      Workaround: on the affected node, run: sudo ovs-vsctl del-port br-int ovn-k8s-mp0
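
The workaround can be wrapped so that it is safe to re-run. The sketch below is an assumption-laden convenience, not part of the upstream fix: the function name cleanup_stale_mgmt_port and the OVS_VSCTL override variable are hypothetical, while --if-exists is the standard ovs-vsctl option that makes del-port a no-op when the port is absent.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual workaround.
# $1: bridge name (defaults to br-int)
# $2: stale management port name (defaults to ovn-k8s-mp0)
# OVS_VSCTL: optional override of the command, useful for dry runs (assumption).
cleanup_stale_mgmt_port() {
  bridge="${1:-br-int}"
  port="${2:-ovn-k8s-mp0}"
  # --if-exists: do not fail if the port has already been removed.
  ${OVS_VSCTL:-sudo ovs-vsctl} --if-exists del-port "$bridge" "$port"
}
```

Running cleanup_stale_mgmt_port on the node issues the same del-port command as the manual workaround; setting OVS_VSCTL=echo beforehand prints the arguments instead of executing them.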

      Expected results:

      There should not be any errors when changing node_mgmt_port_netdev_flags to a valid value.

      Additional info:

      Reported here: https://github.com/ovn-org/ovn-kubernetes/pull/3160
      Discussed briefly here: https://issues.redhat.com/browse/OCPBUGS-4098
      Fixed Upstream here: https://github.com/ovn-org/ovn-kubernetes/pull/3251

            People

              wizhao@redhat.com William Zhao
              Huiran Wang Huiran Wang