  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-12355

BZ#2323873 During live migration of controller between nodes, MAC flaps between source and destination worker resulting in port shutdown [17.1]


      Description of problem:
      When live-migrating a controller from one worker node to another, its MAC address starts flapping between the source and destination workers, which gets the switch port shut down. The customer involved the network hardware vendor, who provided a patch that solved the port shutdown, but there are still between 40 and 80 MAC flaps when live-migrating. There is a fix for that in OCP 4.16, which needs to be configured in the OSP operator as per this:
      ~~~
      From OCP Virt: to solve the MAC flaps during the migration, we have to pass the parameter "disableContainerInterface: true" to the NAD. However, we cannot edit the NADs directly, since they are managed by the director operator. Some OSP pods also use some of the NADs. It looks, though, like the NADs used by the VMs and those used by the other OSP pods are different:

      VMs using NAD without "static" suffix:

      # oc get vm vm-ctl-0 -o yaml | yq '.spec.template.spec.networks'
      - name: default
        pod: {}
      - multus:
          networkName: ctlplane
        name: ctlplane
      - multus:
          networkName: external
        name: external
      - multus:
          networkName: internalapi
        name: internalapi
      - multus:
          networkName: storage
        name: storage
      - multus:
          networkName: tenant
        name: tenant

      The openstackclient pod is using NADs with the "static" suffix:

      # oc get pod openstackclient -o yaml | yq '.metadata.annotations["k8s.v1.cni.cncf.io/networks"]'
      [{"name": "ctlplane-static", "namespace": "openstack", "ips": ["10.10.104.10/22"]},
       {"name": "internalapi-static", "namespace": "openstack", "ips": ["10.10.103.10/23"]},
       {"name": "external-static", "namespace": "openstack", "ips": ["10.10.102.10/28"]}]

      So it may be possible to pass "disableContainerInterface" only on the NADs that are used exclusively by VMs?

      As of now, the customer has increased the MAC flap threshold so it won't get blocklisted, but they are still looking for a solution from us.
      ~~~
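
      For illustration only, here is a minimal sketch of what a VM-only NAD carrying the requested parameter might look like. It assumes the NAD uses the Linux bridge CNI plugin ("type": "bridge"); the bridge name "br-ctlplane" and the "cniVersion" value are hypothetical placeholders, and the actual NADs generated by the director operator may differ. To my understanding of the upstream bridge plugin documentation, "disableContainerInterface" cannot be combined with IPAM, which would be consistent with applying it only to the non-"static" NADs.

      ```yaml
      # Hypothetical sketch, not the operator-generated object.
      apiVersion: k8s.cni.cncf.io/v1
      kind: NetworkAttachmentDefinition
      metadata:
        name: ctlplane          # non-"static" NAD referenced by vm-ctl-0
        namespace: openstack
      spec:
        config: |
          {
            "cniVersion": "0.3.1",
            "name": "ctlplane",
            "type": "bridge",
            "bridge": "br-ctlplane",
            "disableContainerInterface": true
          }
      ```

      Since the NADs are managed by the director operator, a manual edit like this would presumably be reverted on reconciliation, so the setting would have to be exposed through the operator itself.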

      Version-Release number of selected component (if applicable):

      How reproducible:

      Steps to Reproduce:
      1.
      2.
      3.

      Actual results:

      Expected results:

      Additional info:

              Unassigned Unassigned
              jira-bugzilla-migration RH Bugzilla Integration
              rhos-conplat-core-operators