OpenShift Bugs / OCPBUGS-51300

OCP 4.14.22 - Multus network (network-operator bound) whereabouts MacVLAN allocating duplicate IPs


    • Bug
    • Resolution: Unresolved
    • 4.14
    • Networking / multus
    • Quality / Stability / Reliability
    • Important

      Description of problem:

          OCP 4.14.22: a Network Attachment Definition was created for ODF networking, defining a MacVLAN interface with Whereabouts IPAM. The interface config was defined through the network operator spec, so a reconciler has been created. Two pods were observed with the same IP address on net1. This is causing intermittent packet loss and traffic interruptions between ODF peers.

      Version-Release number of selected component (if applicable):

          4.14.22

      How reproducible:

          Observed once, in a customer environment; no internal reproducer is available.

      Steps to Reproduce:

      MacVLAN NAD definition:
      
      
      fi-918:~$ oc get net-attach-def -n openshift-storage -o yaml
      apiVersion: v1
      items:
      - apiVersion: k8s.cni.cncf.io/v1
        kind: NetworkAttachmentDefinition
        metadata:
          creationTimestamp: "2024-07-31T19:50:46Z"
          generation: 1
          name: ocs-cluster-network
          namespace: openshift-storage
          resourceVersion: "8297200"
          uid: 13516483-ee53-45db-b856-f8d593be7bdf
        spec:
          config: |-
            {
              "cniVersion": "0.3.1",
              "type": "macvlan",
              "master": "tenant-vlan.98",
              "mode": "bridge",
              "ipam": {
                "type": "whereabouts",
                "range": "192.168.255.0/24"
              }
            }
      kind: List
      metadata:
        resourceVersion: ""
      
      
      
      //Pod IPs that are colliding:
      
      OSD-103 on storage-0 
      Name:                 rook-ceph-osd-103-9cfb79ddf-rg8gm                   
                            topology-location-host=storage-0-<redacted>
                                 "name": "openshift-storage/ocs-cluster-network",
                                  "interface": "net1",
                                  "ips": [
                                      "192.168.255.8"
                                  ],
                                  "mac": "1e:a0:ce:4d:3c:8f",
                                  "dns": {}
                              }]
      
      & 
      OSD-71 on storage-1
      Name:                 rook-ceph-osd-71-5ff7687d79-8fhcn
                            topology-location-host=storage-1-<redacted>
                              },{
                                  "name": "openshift-storage/ocs-cluster-network",
                                  "interface": "net1",
                                  "ips": [
                                      "192.168.255.8"
                                  ],
                                  "mac": "f2:3b:2b:04:be:b1",
                                  "dns": {}
                              }]
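
The collision above was found by inspecting individual pods. A minimal sketch of how the same check could be run across the whole namespace, assuming each pod carries the Multus `k8s.v1.cni.cncf.io/network-status` annotation with the `name` and `ips` fields shown in the excerpts. The function name and the script filename below are hypothetical, not part of any Red Hat tooling.

```python
import json
import sys
from collections import defaultdict

# Annotation that Multus writes on each pod; the value is a JSON list of attachments.
NETWORK_STATUS_KEY = "k8s.v1.cni.cncf.io/network-status"


def find_duplicate_ips(pod_list, network="openshift-storage/ocs-cluster-network"):
    """Return {ip: [pod refs]} for every IP on `network` claimed by more than one pod.

    `pod_list` is the parsed output of `oc get pods -o json` (hypothetical helper).
    """
    owners = defaultdict(list)
    for pod in pod_list.get("items", []):
        # Finished pods may still carry a stale annotation; ignore them.
        if pod.get("status", {}).get("phase") in ("Succeeded", "Failed"):
            continue
        meta = pod.get("metadata", {})
        raw = meta.get("annotations", {}).get(NETWORK_STATUS_KEY)
        if not raw:
            continue
        for status in json.loads(raw):
            if status.get("name") != network:
                continue
            for ip in status.get("ips", []):
                owners[ip].append(f"{meta.get('namespace')}/{meta.get('name')}")
    return {ip: pods for ip, pods in owners.items() if len(pods) > 1}


if __name__ == "__main__":
    # Reads the pod list as JSON on stdin and prints any colliding IPs.
    print(json.dumps(find_duplicate_ips(json.load(sys.stdin)), indent=2))
```

Saved as, say, `find_dup_ips.py`, it could be fed with `oc get pods -n openshift-storage -o json | python3 find_dup_ips.py`. An empty result only means no duplicates are visible in the annotations at that moment.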

      Actual results:

          Pods are intermittently dropping packets. We also observe the following ARP queries:
      
      1094967                         Who has 192.168.255.34? Tell 192.168.255.8 (duplicate use of 192.168.255.8 detected!)
      
      The conversation between .34 and .8 is interrupted: packets are periodically dropped or retransmitted when routed to .8 (likely being sent to the wrong backend).

      Expected results:

         Multus should validate and prevent duplicate IP binding from Whereabouts pools across multiple hosts. Reconciliation is also not catching or cleaning this up.
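
To illustrate the consistency check expected from reconciliation, here is a rough sketch that compares a Whereabouts IPPool object against the IPs pods actually report. It assumes the IPPool stores its CIDR in `spec.range` and its reservations in `spec.allocations`, keyed by the offset from the range's first address, each with a `podref` of the form `namespace/name`. That layout has not been verified against the Whereabouts version shipped in 4.14.22, and the helper names are hypothetical.

```python
import ipaddress


def allocations_by_ip(ippool):
    """Translate offset-keyed allocations into {ip: podref} (assumed IPPool layout)."""
    spec = ippool["spec"]
    base = ipaddress.ip_network(spec["range"], strict=False).network_address
    return {
        str(base + int(offset)): alloc.get("podref")
        for offset, alloc in spec.get("allocations", {}).items()
    }


def find_mismatches(ippool, pod_ips):
    """Return (podref, ip, pool_owner) for each pod whose IP the pool does not assign to it.

    `pod_ips` maps "namespace/name" to the IP the pod reports on the attachment.
    `pool_owner` is None when the pool has no reservation for that IP at all.
    """
    allocated = allocations_by_ip(ippool)
    problems = []
    for podref, ip in sorted(pod_ips.items()):
        owner = allocated.get(ip)
        if owner != podref:
            problems.append((podref, ip, owner))
    return problems
```

With the two OSD pods from this report, a pool that records only one of them as the owner of 192.168.255.8 would flag the other pod as a mismatch. That is the kind of state the reconciler would ideally surface or clean up.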

      Additional info:

      This network is required for storage handling.
      
      Workaround exists: forcibly delete and clean up the pods holding the duplicate IP, which forces a new IP allocation and resolves the overlap.

              sdn-team-bot sdn-team bot
              rhn-support-wrussell Will Russell
              Weibin Liang Weibin Liang