CNV-28989 / [2209318] [4.12.z] VM connected to a VLAN is also receiving packets from VLAN 1


      +++ This bug was initially created as a clone of Bug #2179333 +++

      Description of problem:
When a VM is connected to a bridge using a NetworkAttachmentDefinition with a VLAN ID, the VM receives packets tagged with that VLAN ID and also the untagged packets from the bridge (VLAN 1).

      Version-Release number of selected component (if applicable):
      OCP 4.10.51
      OpenShift Virtualization 4.10.8

      How reproducible:
      Always

      Steps to Reproduce:
      1. Create a bridge with a physical interface:

```
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: br-extnet-enp3s0-policy-workers
spec:
  desiredState:
    interfaces:
    - description: enp3s0 interface
      name: enp3s0
      state: up
      type: ethernet
    - bridge:
        options:
          stp:
            enabled: false
        port:
        - name: enp3s0
      description: Linux bridge with enp3s0 as a port
      ipv4:
        dhcp: true
        enabled: true
      name: br-extnet
      state: up
      type: linux-bridge
  nodeSelector:
    node-role.kubernetes.io/worker: ""
```
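Before moving on, it is worth confirming the policy was rolled out (a verification step added here, not part of the original report; the policy name matches the manifest above):

```sh
# STATUS should eventually report Available / SuccessfullyConfigured
oc get nncp br-extnet-enp3s0-policy-workers
```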

      2. Create a NAD using the bridge and a VLAN ID:

```
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: br-extnet-network
  namespace: jorti
spec:
  config: '{
    "cniVersion": "0.3.1",
    "name": "br-extnet-network",
    "type": "cnv-bridge",
    "bridge": "br-extnet",
    "vlan": 1000,
    "macspoofchk": false
  }'
```
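As a sanity check (added here, not in the original report), the CNI configuration carried by the NAD can be printed back to make sure the `vlan` key survived the quoting:

```sh
# Should print the JSON config including "vlan": 1000
oc get net-attach-def br-extnet-network -n jorti -o jsonpath='{.spec.config}'
```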

      3. Create a VM connected to the NAD:

```
domain:
  devices:
    interfaces:
    - bridge: {}
      macAddress: 02:a2:a6:00:00:06
      model: virtio
      name: default
networks:
- multus:
    networkName: br-extnet-network
  name: default
```
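To map the VM to its host-side port for the checks below (an added pointer, not in the original report), list the ports attached to the bridge from a debug shell on the node where the VM runs:

```sh
# Each VM interface shows up as a veth with br-extnet as its master
bridge link show | grep br-extnet
```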

      Actual results:

On the node we can see that the veth of the VM is also a member of VLAN 1:

```
sh-4.4# bridge -d vlan
port          vlan-id
vetheb545c41  1 Egress Untagged         <----------
              1000 PVID Egress Untagged
```

      A tcpdump in the VM reveals packets not belonging to VLAN 1000.
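Until a fixed plugin is available, the stray membership can be removed by hand as a stopgap (a hedged workaround, not from the original report; the veth name is the one from the output above and differs per VM):

```sh
# Drop the implicit VLAN 1 from the VM's port; VLAN 1000 remains the PVID
bridge vlan del dev vetheb545c41 vid 1
# Re-check the port's VLAN table
bridge -d vlan
```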

      Expected results:
      The VM must be connected only to VLAN 1000

      Additional info:
      I see a similar issue reported upstream:
      https://github.com/containernetworking/plugins/issues/667

      — Additional comment from Petr Horáček on 2023-03-17 12:39:27 UTC —

      Juan, thanks for reporting this. The team will look into it ASAP.

      — Additional comment from Petr Horáček on 2023-03-20 13:34:19 UTC —

      We suspect that this is caused by https://github.com/containernetworking/plugins/blob/main/plugins/main/bridge/bridge.go#L373 setting the untagged attribute to `true`. I'm working on an offline reproducer. After that, I will expedite a downstream patch to fix the issue.
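For context, what the plugin configures through netlink at that point is roughly equivalent to the following `bridge` command on the container-side veth (an illustration only, not the actual code path; `vethXYZ` is a placeholder):

```sh
# Attach VLAN 1000 as PVID + egress-untagged on the port. This is added
# on top of the kernel's automatically created VLAN 1 membership.
bridge vlan add dev vethXYZ vid 1000 pvid untagged master
```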

      — Additional comment from Petr Horáček on 2023-03-20 14:59:35 UTC —

      Confirmed with a reproducer.

      Configure the bridge and NAD:
```sh
cat <<EOF | kubectl apply -f -
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: br-extnet
spec:
  desiredState:
    interfaces:
    - bridge:
        options:
          stp:
            enabled: false
        port:
        - name: eth2
      description: Linux bridge with eth2 as a port
      ipv4:
        enabled: true
        dhcp: false
        address:
        - ip: 11.11.11.1
          prefix-length: 24
      name: br-extnet
      state: up
      type: linux-bridge
  nodeSelector:
    kubernetes.io/hostname: node02
---
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: br-extnet-network
spec:
  config: '{
    "cniVersion": "0.3.1",
    "name": "br-extnet-network",
    "type": "bridge",
    "bridge": "br-extnet",
    "vlan": 1000,
    "macspoofchk": false
  }'
EOF
```

      Once the bridge is created, connect a Pod to it:

```sh
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: samplepod1
  annotations:
    k8s.v1.cni.cncf.io/networks: br-extnet-network
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  containers:
  - name: samplepod
    command: ["/bin/sh", "-c", "ip a add 11.11.11.11/24 dev net1; ip l set up net1; sleep 99999"]
    image: itsthenetwork/alpine-tcpdump
    securityContext:
      capabilities:
        add: ["NET_RAW", "NET_ADMIN"]
EOF
```
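Before generating traffic, it is worth confirming the secondary interface came up with the expected address (a verification step added here, not in the original comment):

```sh
# net1 should be UP with 11.11.11.11/24 assigned
kubectl exec samplepod1 -- ip addr show net1
```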

      Start GARP on the node:

      ```sh
      arping -A -I br-extnet 11.11.11.1
      ```

Listen for it in the Pod:

      ```sh
      tcpdump -lni net1 arp
      ```

Although the Pod is assigned to a VLAN, it sees the GARP traffic.

      — Additional comment from Petr Horáček on 2023-03-20 15:32:39 UTC —

Changing the `untagged` attribute to `false` did not help. VLAN 1 is added by the OS itself:

```sh
$ ip link add br11 type bridge
$ ip link add veth1 type veth peer name veth2
$ ip link set veth1 master br11

$ bridge vlan
br11   1 PVID Egress Untagged
veth1  1 PVID Egress Untagged
```
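The kernel's auto-creation of VLAN 1 can itself be suppressed on the bridge; this is shown only to illustrate where VLAN 1 comes from, not as the chosen fix (assumes iproute2's `vlan_default_pvid` bridge option):

```sh
# With no default PVID, ports join the bridge with an empty VLAN table
# instead of the implicit "1 PVID Egress Untagged" entry
ip link add br11 type bridge vlan_default_pvid 0
```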

      — Additional comment from Petr Horáček on 2023-03-20 15:38:19 UTC —

      The solution suggested in https://github.com/containernetworking/plugins/issues/667 addresses the issue. With it, ARP is not visible in the Pod, while traffic between Pods connected to the same VLAN still works.

      — Additional comment from Petr Horáček on 2023-04-13 10:21:14 UTC —

https://github.com/containernetworking/plugins/pull/875 proposes a well-received solution. It is still under review.

      — Additional comment from Rinat Gertzberg on 2023-05-21 13:39:55 UTC —

Hello team,
Is this fix still planned for CNV 4.13.1?
Is there a plan for backporting it to 4.12?

      — Additional comment from Petr Horáček on 2023-05-22 13:54:04 UTC —

Hello. Yes, it is planned for 4.13.1. Once the 4.13.1 build is available, it will be tested by our QE.

      The fix depends on a change that's only available in RHEL 9, so we won't be able to backport the fix to 4.12, which is still based on RHEL 8.

      We could try asking the RHEL team for a backport, but IIUIC the changes there were substantial, so it would be a hard sell. Is it critical for the customer to resolve this in 4.12?

      — Additional comment from Petr Horáček on 2023-05-22 13:55:13 UTC —

      Sorry, my previous comment was meant for a different BZ. Please disregard.

      — Additional comment from Petr Horáček on 2023-05-22 13:57:17 UTC —

Hello. Yes, it is planned for 4.13.1. Once the 4.13.1 build is available, it will be tested by our QE.

      As for a backport, we are now considering our options. I'll update this BZ once we decide on a way forward.
