OpenShift Bugs / OCPBUGS-5237

SRIOV-CNI failed to configure VF "failed to set vf <vf_id> vlan: invalid argument"


    • -
    • Important
    • CNF Network Sprint 231
    • 1
    • Rejected
    • False
    • 1/10: Waiting on Nokia to confirm whether this issue occurs only with the out-of-tree driver

      Description of problem:

      When scaling SR-IOV workload pods after creating a NetworkAttachmentDefinition for an interface named internal, the interface is not bound to the pod's network namespace, and the following errors appear in the events:
      
      2h31m      Warning  FailedCreatePodSandBox  pod/po-ci1-cpnrt-0                                       (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_po-ci1-cpnrt-0_ci1_1ed90c84-2707-45ab-93eb-bd441010df8d_0(ff161a3e0943a399720b9dcc422064fee170f2b5dcde04474c9ce5c7b2334f22): error adding pod ci1_po-ci1-cpnrt-0 to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [ci1/po-ci1-cpnrt-0/1ed90c84-2707-45ab-93eb-bd441010df8d:internal]: error adding container to network "internal": SRIOV-CNI failed to configure VF "failed to set vf 2 vlan: invalid argument"
      55s        Warning  FailedCreatePodSandBox  pod/po-ci1-upue-0                                        (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_po-ci1-upue-0_ci1_d07e541d-aef8-4435-a59c-56d071f07b46_0(8094b0cc005de2d71bfec346e4405a31266851c555c14b80415da78313fc596c): error adding pod ci1_po-ci1-upue-0 to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [ci1/po-ci1-upue-0/d07e541d-aef8-4435-a59c-56d071f07b46:external-u]: error adding container to network "external-u": SRIOV-CNI failed to configure VF "failed to find vf 48"

      Version-Release number of selected component (if applicable):

      SR-IOV Network Operator on OCP 4.11

      How reproducible:

      The customer hits this every time pods are scaled using the NIC below:
      
      8a:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller E810-C for QSFP [8086:1592] (rev 02)
          Subsystem: Intel Corporation Ethernet Network Adapter E810-C-Q2 for OCP3.0 [8086:0006]
              Product Name: Intel(R) Ethernet Network Adapter E810-CQDA2 for OCP 3.0
                  [V1] Vendor specific: Intel(R) Ethernet Network Adapter E810-CQDA2 for OCP 3.0

      Steps to Reproduce:

      1. Configure the VFs using a SriovNetworkNodePolicy.
      2. Create a NetworkAttachmentDefinition using a manifest similar to the one below:
      
      apiVersion: k8s.cni.cncf.io/v1
      kind: NetworkAttachmentDefinition
      metadata:
        annotations:
          k8s.v1.cni.cncf.io/resourceName: openshift.io/sriov_netdevice_ens43f2
          meta.helm.sh/release-name: ci1-prerequisite
          meta.helm.sh/release-namespace: ci1
        creationTimestamp: "2022-12-25T09:12:43Z"
        generation: 1
        labels:
          app.kubernetes.io/managed-by: Helm
        name: internal
        namespace: ci1
        resourceVersion: "5935710"
        uid: 6f3edb7d-f45d-44ab-a349-75b7721bdc43
      spec:
        config: '{ "type": "sriov", "cniVersion": "0.3.0", "trust": "on" }'
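
The spec.config value above is a JSON string, so it can be inspected programmatically. The following is a minimal sketch, not part of the original report, that parses it and reports the fields relevant to this bug. The helper name summarize_cni_config is hypothetical. The only assumption beyond the manifest is that a VLAN, if configured, would appear under a "vlan" key.

```python
import json

# The spec.config string copied from the NetworkAttachmentDefinition above.
NAD_CONFIG = '{ "type": "sriov", "cniVersion": "0.3.0", "trust": "on" }'


def summarize_cni_config(raw):
    """Parse a CNI config string and return the fields of interest.

    Hypothetical helper for illustration only.
    """
    conf = json.loads(raw)
    return {
        "type": conf.get("type"),
        "trust": conf.get("trust"),
        # None means the manifest does not set a VLAN explicitly.
        "vlan": conf.get("vlan"),
    }


summary = summarize_cni_config(NAD_CONFIG)
print(summary)
```

For this manifest the summary shows no explicit VLAN, which may be worth keeping in mind when reading the "failed to set vf ... vlan" error.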
      
      3. Scale the pods with the proper annotation to attach an IP from the VF. The customer used the pod manifest below:
      
      apiVersion: v1
      kind: Pod
      metadata:
        annotations:
          k8s.ovn.org/pod-networks: '{"default":{"ip_addresses":["10.128.0.92/23"],"mac_address":"0a:58:0a:80:00:5c","gateway_ips":["10.128.0.1"],"ip_address":"10.128.0.92/23","gateway_ip":"10.128.0.1"}}'
          k8s.v1.cni.cncf.io/networks: internal@e1
          openshift.io/scc: ci1-cnf5g
        creationTimestamp: "2022-12-25T08:41:17Z"
        generateName: po-ci1-cpnrt- 
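
The k8s.v1.cni.cncf.io/networks annotation in the manifest above uses the short form name@interface. As a rough sketch, assuming only the comma-separated short form (the JSON list form is not handled), the value can be split as follows. The function name parse_network_selection is hypothetical.

```python
def parse_network_selection(annotation):
    """Split a short-form k8s.v1.cni.cncf.io/networks value.

    Each comma-separated item has the shape [<namespace>/]<name>[@<ifname>].
    Hypothetical helper for illustration; the JSON list form is not handled.
    """
    selections = []
    for item in annotation.split(","):
        item = item.strip()
        if not item:
            continue
        # Everything after "@" is the requested interface name, if any.
        name, _, ifname = item.partition("@")
        # Everything before the last "/" is the namespace, if any.
        namespace, sep, short = name.rpartition("/")
        selections.append({
            "namespace": namespace if sep else None,
            "name": short,
            "interface": ifname or None,
        })
    return selections


selection = parse_network_selection("internal@e1")
print(selection)
```

Here the pod requests the network internal with the interface name e1.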

      Actual results:

      The interface is not bound to the pod, and the following errors appear in the events:
      
      2h31m      Warning  FailedCreatePodSandBox  pod/po-ci1-cpnrt-0                                       (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_po-ci1-cpnrt-0_ci1_1ed90c84-2707-45ab-93eb-bd441010df8d_0(ff161a3e0943a399720b9dcc422064fee170f2b5dcde04474c9ce5c7b2334f22): error adding pod ci1_po-ci1-cpnrt-0 to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [ci1/po-ci1-cpnrt-0/1ed90c84-2707-45ab-93eb-bd441010df8d:internal]: error adding container to network "internal": SRIOV-CNI failed to configure VF "failed to set vf 2 vlan: invalid argument"
      55s        Warning  FailedCreatePodSandBox  pod/po-ci1-upue-0                                        (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_po-ci1-upue-0_ci1_d07e541d-aef8-4435-a59c-56d071f07b46_0(8094b0cc005de2d71bfec346e4405a31266851c555c14b80415da78313fc596c): error adding pod ci1_po-ci1-upue-0 to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [ci1/po-ci1-upue-0/d07e541d-aef8-4435-a59c-56d071f07b46:external-u]: error adding container to network "external-u": SRIOV-CNI failed to configure VF "failed to find vf 48"
      2h31m      Warning  FailedCreatePodSandBox  pod/po-ci1-upue-0                                        (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_po-ci1-upue-0_ci1_2b1b514d-d318-40f2-baa8-ed907b2f5fbe_0(28def88525c591b3be758a915d1717a56e1a995f087bdb77177a4eb370efcb83): error adding pod ci1_po-ci1-upue-0 to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [ci1/po-ci1-upue-0/2b1b514d-d318-40f2-baa8-ed907b2f5fbe:external-u]: error adding container to network "external-u": SRIOV-CNI failed to configure VF "failed to find vf 49"
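
When many such events accumulate, it can help to pull out just the network name and VF ID from each one. The sketch below is an illustration only. It assumes every message ends with the same 'error adding container to network "...": SRIOV-CNI failed to configure VF "..."' tail seen above. The function name parse_sriov_failure is hypothetical.

```python
import re

# Matches the tail of the Multus error seen in the events above.
EVENT_RE = re.compile(
    r'error adding container to network "(?P<network>[^"]+)": '
    r'SRIOV-CNI failed to configure VF "(?P<reason>[^"]+)"'
)
# Matches the VF index inside the reason, e.g. "vf 48".
VF_ID_RE = re.compile(r"\bvf (\d+)\b")


def parse_sriov_failure(message):
    """Return (network, vf_id, reason), or None if the message does not match.

    Hypothetical helper for illustration only.
    """
    m = EVENT_RE.search(message)
    if m is None:
        return None
    reason = m.group("reason")
    vf = VF_ID_RE.search(reason)
    vf_id = int(vf.group(1)) if vf else None
    return m.group("network"), vf_id, reason


example = (
    'error adding container to network "internal": '
    'SRIOV-CNI failed to configure VF "failed to set vf 2 vlan: invalid argument"'
)
print(parse_sriov_failure(example))
```

Grouping the results by network and VF ID makes it easier to see which VFs the CNI plugin was handed.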

      Expected results:

      The netdevice interface should be attached to the pod, with an event confirming it:
      
      added interface net1 to the pod-xxxx-xxx

      Additional info:

      Although the SR-IOV config daemon pod logs show no VF initialization errors, some of the VF IDs named in the errors do not exist on the node. For example, one error reads: failed to set vf 18 vlan
      
      VF 18 does not appear in the PF's VF list:
      
      $ ip link show ens43f2
      16: ens43f2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether b4:96:91:b3:1c:92 brd ff:ff:ff:ff:ff:ff
          vf 14     link/ether 6e:11:1a:b3:dd:c9 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
          vf 19     link/ether 4a:61:29:5a:11:52 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
          vf 20     link/ether 36:e0:ac:6a:d6:52 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
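
The comparison described above (VF IDs named in the errors versus VF IDs actually listed on the PF) can be automated. This is a minimal sketch that parses the ip link show output pasted above. The helper names listed_vfs and missing_vfs are hypothetical, and the VF IDs checked (2, 18, 48, 49) are the ones quoted in this report.

```python
import re

# A VF line in `ip link show <pf>` output starts with "vf <index>".
VF_LINE_RE = re.compile(r"^\s*vf (\d+)\s", re.MULTILINE)

# Output copied from the report above.
IP_LINK_OUTPUT = """\
16: ens43f2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether b4:96:91:b3:1c:92 brd ff:ff:ff:ff:ff:ff
    vf 14     link/ether 6e:11:1a:b3:dd:c9 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 19     link/ether 4a:61:29:5a:11:52 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 20     link/ether 36:e0:ac:6a:d6:52 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
"""


def listed_vfs(ip_link_output):
    """Return the sorted VF indexes that appear in the output."""
    return sorted(int(n) for n in VF_LINE_RE.findall(ip_link_output))


def missing_vfs(ip_link_output, requested):
    """Return the requested VF indexes that are not listed on the PF."""
    present = set(listed_vfs(ip_link_output))
    return sorted(set(requested) - present)


# VF IDs quoted in the errors in this report.
print(listed_vfs(IP_LINK_OUTPUT))
print(missing_vfs(IP_LINK_OUTPUT, [2, 18, 48, 49]))
```

None of the VF IDs from the errors appear in this listing, which matches the observation that the CNI plugin is being asked to configure VFs the PF does not report.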

            sscheink@redhat.com Sebastian Scheinkman
            rhn-support-adubey Akash Dubey
            Zhanqi Zhao Zhanqi Zhao