  1. OpenShift Virtualization
  2. CNV-9090

[1879458] Bridge creation fails on CNV because the default route is lost

    • Important
    • None

      Description of problem:

      When I try to create a bridge for VMs on OCP 4.5 as detailed here (https://docs.openshift.com/container-platform/4.5/virt/node_network/virt-updating-node-network-config.html#virt-creating-interface-on-nodes_virt-updating-node-network-config), the NNCP creation fails with the following error: "rolling back desired state configuration: failed runnig probes after network changes: failed to retrieve default gw at runProbes: timed out waiting for the condition".
      Digging into the code (https://github.com/nmstate/kubernetes-nmstate/blob/master/pkg/probe/probes.go#L98), I found that the failing check is the one on the default route: Get("routes.running.#(destination==\"0.0.0.0/0\").next-hop-address").String(). Both nmstate and ip route show that the node loses its default route when the NNCP is applied, which is why the probe fails. The node also loses connectivity until the rollback is applied.
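
To illustrate why the probe fails once the route disappears, here is a minimal Python sketch of the same lookup. It is not the kubernetes-nmstate implementation (that is Go and uses a gjson query); the function name default_gateway and the two sample states are assumptions made for illustration, and the gateway value is the one reported in this issue.

```python
import json

DEFAULT_DESTINATION = "0.0.0.0/0"


def default_gateway(state: dict) -> str:
    """Return the next hop of the first running default route, or "" if none.

    Mirrors the intent of the query
    routes.running.#(destination=="0.0.0.0/0").next-hop-address
    where a missing match yields an empty string.
    """
    running = state.get("routes", {}).get("running", [])
    for route in running:
        if route.get("destination") == DEFAULT_DESTINATION:
            return route.get("next-hop-address", "")
    return ""


# Hypothetical node state before the NNCP is applied: a default route exists.
before = json.loads("""
{"routes": {"running": [
  {"destination": "0.0.0.0/0",
   "next-hop-address": "10.77.3.193",
   "next-hop-interface": "bond0"}
]}}
""")

# Hypothetical node state after the NNCP is applied: the default route is gone.
after = json.loads('{"routes": {"running": []}}')

print(repr(default_gateway(before)))
print(repr(default_gateway(after)))
```

An empty result corresponds to the "failed to retrieve default gw" condition: the probe keeps retrying until it times out, and the rollback follows.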

      Version-Release number of selected component (if applicable):
      CNV 2.4

      How reproducible:
      Follow docs above with the following bridge definition:

      apiVersion: nmstate.io/v1alpha1
      kind: NodeNetworkConfigurationPolicy
      metadata:
        name: br1-eno1np0
      spec:
        nodeSelector:
          kubernetes.io/hostname: worker-0.ocp4rm.poste.exp
        desiredState:
          interfaces:
          - name: br1
            description: Linux bridge with eno1np0 as a port
            type: linux-bridge
            state: up
            ipv4:
              dhcp: true
              enabled: true
            bridge:
              options:
                stp:
                  enabled: false
              port:
              - name: eno1np0

      Steps to Reproduce:
      1.
      2.
      3.

      Actual results:
      Bridge not created, node lost connectivity

      Expected results:
      Bridge created

      Additional info:
      The node networking is configured via kernel parameters at boot. It is a UPI cluster on bare metal.
      The following workaround solves the issue: define the default route explicitly in the NNCP object:

      routes:
        config:
        - destination: 0.0.0.0/0
          next-hop-address: 10.77.3.193
          next-hop-interface: bond0
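
The workaround above can also be applied programmatically when policies are generated by a script. The following is only a sketch under assumptions: the helper with_default_route is hypothetical, the desired state is represented as a plain dict rather than read from a real NNCP manifest, and the gateway and interface values are the ones from this report.

```python
import copy
import json


def with_default_route(desired_state: dict, gateway: str, interface: str) -> dict:
    """Return a copy of desired_state with an explicit default route added."""
    patched = copy.deepcopy(desired_state)  # leave the caller's dict untouched
    config = patched.setdefault("routes", {}).setdefault("config", [])
    config.append({
        "destination": "0.0.0.0/0",
        "next-hop-address": gateway,
        "next-hop-interface": interface,
    })
    return patched


# Trimmed, illustrative desired state; a real policy carries the full bridge definition.
desired = {"interfaces": [{"name": "br1", "type": "linux-bridge", "state": "up"}]}

patched = with_default_route(desired, "10.77.3.193", "bond0")
print(json.dumps(patched, indent=2))
```

The result is placed under spec.desiredState of the policy, next to the interfaces list.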

              phoracek@redhat.com Petr Horacek
              gcofano@redhat.com Giuseppe Cofano (Inactive)
              Meni Yakove Meni Yakove