  1. OpenShift Bugs
  2. OCPBUGS-4603

SR-IOV VFs may get reset after being allocated by other pods


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • None
    • 4.12.0
    • Networking / SR-IOV
    • None
    • None
    • CNF Network Sprint 228, CNF Network Sprint 229, CNF Network Sprint 230, CNF Network Sprint 231, CNF Network Sprint 232, CNF Network Sprint 233, CNF Network Sprint 235, CNF Network Sprint 236
    • 8
    • Proposed
    • False

      This is a clone of issue OCPBUGS-4585. The following is the description of the original issue:

      Description of problem:

      When pods with an SR-IOV interface from the same resource pool are created and deleted a few times,
      the underlying VF may end up with the default configuration (i.e., no MAC address and no VLAN) instead of the desired one.
      
      Upstream issue: https://github.com/k8snetworkplumbingwg/sriov-cni/issues/219

      Version-Release number of selected component (if applicable):

      SR-IOV CNI: v2.6
      SR-IOV device plugin: v3.3
      Multus: v3.8
      Kubernetes: v1.22.2

      How reproducible:

      17%

      Steps to Reproduce:

      1. Terminal 1: Monitor the state of the SR-IOV VFs on node 'worker1'.
      2. Terminal 2:
        2.1 Create a pod 'test1' with:
          a nodeSelector targeting node 'worker1'.
          an SR-IOV interface with MAC address '02:02:02:02:02'.
        2.2 Wait for pod 'test1' to be ready.
        2.3 Delete pod 'test1' in the background and immediately create a similar pod 'test2'.
        2.4 Wait for pod 'test2' to be ready.
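
As a rough illustration of step 2.1, the pod could be described as follows. This is a minimal sketch, not the manifest from the linked gist: the network name `sriov-net`, the resource name `example.com/sriov_vfs`, the container image, and the MAC address in the usage example are all placeholders. It assumes the network is requested through the Multus `k8s.v1.cni.cncf.io/networks` annotation.

```python
import json


def build_pod_manifest(name, node, network, mac, resource):
    """Return a pod manifest (as a dict) requesting one SR-IOV VF.

    `network` is the NetworkAttachmentDefinition name and `resource` is the
    SR-IOV device plugin resource name; both are placeholders in this sketch.
    """
    networks = [{"name": network, "mac": mac}]
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # Multus reads the requested networks from this annotation.
            "annotations": {"k8s.v1.cni.cncf.io/networks": json.dumps(networks)},
        },
        "spec": {
            # Pin the pod to the node whose VFs are being monitored.
            "nodeSelector": {"kubernetes.io/hostname": node},
            "containers": [{
                "name": "main",
                "image": "registry.example.com/test-image:latest",  # placeholder
                "command": ["sleep", "infinity"],
                "resources": {
                    "requests": {resource: "1"},
                    "limits": {resource: "1"},
                },
            }],
        },
    }


if __name__ == "__main__":
    pod = build_pod_manifest("test1", "worker1", "sriov-net",
                             "02:00:00:00:00:01", "example.com/sriov_vfs")
    print(json.dumps(pod, indent=2))
```

Because JSON is also valid YAML, the printed manifest can be piped to `kubectl apply -f -`. Building 'test2' with the same arguments apart from the name gives the "similar pod" used in step 2.3.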
      
      After a few iterations, pod 'test2' is Running, but the VF state on the node (Terminal 1) shows that the VF is not configured with the desired MAC address.
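
One way to do the monitoring in step 1 is to parse the per-VF lines of `ip link show <PF>` on the node. The sketch below assumes the usual iproute2 layout (`vf N MAC ...` on older versions, `vf N link/ether ...` on newer ones). The sample text, the PF name `ens1f0`, and the MAC addresses are made up for illustration.

```python
import re

# Matches both "vf 0 MAC aa:bb:..." (older iproute2) and
# "vf 0     link/ether aa:bb:..." (newer iproute2) lines.
VF_LINE = re.compile(
    r"^\s*vf\s+(?P<index>\d+)\s+(?:MAC|link/ether)\s+(?P<mac>[0-9a-fA-F:]{17})"
)
VLAN = re.compile(r"\bvlan\s+(?P<vlan>\d+)")


def parse_vfs(ip_link_output):
    """Map VF index -> {"mac": str, "vlan": int or None}."""
    vfs = {}
    for line in ip_link_output.splitlines():
        match = VF_LINE.match(line)
        if not match:
            continue  # PF header lines and anything else are skipped
        vlan = VLAN.search(line)
        vfs[int(match["index"])] = {
            "mac": match["mac"].lower(),
            "vlan": int(vlan["vlan"]) if vlan else None,
        }
    return vfs


def is_reset(vf):
    """Treat a VF with an all-zero MAC and no VLAN as being in its default state."""
    return vf["mac"] == "00:00:00:00:00:00" and vf["vlan"] is None


# Illustrative sample only; real output varies by driver and iproute2 version.
SAMPLE = """\
5: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT
    link/ether 3c:fd:fe:00:00:01 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 02:00:00:00:00:01 brd ff:ff:ff:ff:ff:ff, vlan 100, spoof checking on, link-state auto, trust off
    vf 1     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
"""

if __name__ == "__main__":
    for index, vf in sorted(parse_vfs(SAMPLE).items()):
        print(index, vf, "RESET" if is_reset(vf) else "configured")
```

Running this in a loop while 'test2' is Running would flag the case described above: a VF that is allocated to a pod but reports the default state.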
      
      Scripts and manifests I used for reproducing the issue: https://gist.github.com/ormergi/3ddbf901ddc95baf316b604994285a69

       

      Actual results:

      The pod ended up with an unexpected MAC address on its VF.

      Expected results:

      The pod's underlying VF is configured correctly.

      Additional info:

      On the KubeVirt CI SR-IOV test lane, some tests fail because VM SR-IOV interfaces end up with an unexpected MAC address.
      
      It seems that when the VM's underlying pod is deleted and the CNI cmdDEL
      command is executed, it resets the VF whether or not the VF has already been
      allocated to another pod.
      Also, it is not guaranteed that a pod's underlying VF is reset as soon as the pod is disposed.
      
      The reset takes 1-2 seconds, which seems odd because I would expect all of the pod's resources to be freed by the time it is gone.
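
The suspected ordering problem can be illustrated with a toy model. This is not the SR-IOV CNI code: the `VF` class, the function names, and the MAC addresses are hypothetical, and the "guarded" variant is only one conceivable mitigation, not the upstream fix.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_MAC = "00:00:00:00:00:00"


@dataclass
class VF:
    mac: str = DEFAULT_MAC
    owner: Optional[str] = None  # pod currently configured on this VF


def cni_add(vf, pod, mac):
    """Configure the VF for `pod` (stands in for the CNI ADD step)."""
    vf.owner = pod
    vf.mac = mac


def cni_del_unconditional(vf, pod):
    """Reset the VF regardless of who holds it now (the suspected behavior)."""
    vf.mac = DEFAULT_MAC
    vf.owner = None


def cni_del_guarded(vf, pod):
    """Hypothetical mitigation: reset only if `pod` still owns the VF."""
    if vf.owner == pod:
        vf.mac = DEFAULT_MAC
        vf.owner = None


def run(delete):
    """Replay the ordering from the report and return the VF's final MAC."""
    vf = VF()
    cni_add(vf, "test1", "02:00:00:00:00:01")
    # 'test1' is deleted in the background; its VF is handed to 'test2'
    # before the delayed DEL for 'test1' has run.
    cni_add(vf, "test2", "02:00:00:00:00:02")
    delete(vf, "test1")  # late DEL for the old pod
    return vf.mac


if __name__ == "__main__":
    print("unconditional reset:", run(cni_del_unconditional))
    print("guarded reset:      ", run(cni_del_guarded))
```

With the unconditional reset, 'test2' is left on a VF with the default MAC, which matches the observed symptom. With the ownership check, the late DEL for 'test1' becomes a no-op.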

       

              carlosgoncalves Carlos Goncalves
              openshift-crt-jira-prow OpenShift Prow Bot
              Zhanqi Zhao Zhanqi Zhao