OpenShift Bugs: OCPBUGS-30931

Enabling MAC address support in a SriovNetwork resource does not show the MAC addresses of vfio-pci-based interfaces in the k8s.v1.cni.cncf.io/network-status annotation

Details

    • Bug
    • Resolution: Not a Bug
    • Major
    • None
    • 4.13, 4.12
    • Networking / SR-IOV
    • None
    • No
    • False

    Description

      Description of problem:

      When MAC address support is enabled in a SriovNetwork resource, the MAC addresses of vfio-pci-based interfaces are not shown in the k8s.v1.cni.cncf.io/network-status annotation.
      

      Version-Release number of selected component (if applicable):

      Seen only in OCP 4.12 and OCP 4.13. The issue is not present in OCP >= 4.14.
      The most recent versions where the issue has been observed are the following nightly releases; earlier versions may also be affected:
      - 4.13 nightly 2024-03-13 08:15
      - 4.12 nightly 2024-03-13 03:40
      

      How reproducible:

      The issue has appeared in all tests using the affected versions.
      

      Steps to Reproduce:

      1. Deploy an OCP cluster with 3 master and 4 worker nodes, using the IPI bare-metal installation
      2. Install the SR-IOV Network Operator
      3. Create the following two SriovNetworkNodePolicy resources, where:
      
      - 16 VFs are defined on each of two network interfaces on all worker nodes in the cluster
      - The device type is vfio-pci
      
      $ oc get sriovnetworknodepolicy -n openshift-sriov-network-operator intel-numa0-policy1 -o json
      {
          "apiVersion": "sriovnetwork.openshift.io/v1",
          "kind": "SriovNetworkNodePolicy",
          "metadata": {
              "creationTimestamp": "2024-03-14T09:08:34Z",
              "generation": 1,
              "name": "intel-numa0-policy1",
              "namespace": "openshift-sriov-network-operator",
              "resourceVersion": "583413",
              "uid": "df592c4e-7aeb-4754-a987-b5d21c5e33ac"
          },
          "spec": {
              "deviceType": "vfio-pci",
              "isRdma": false,
              "mtu": 9000,
              "nicSelector": {
                  "deviceID": "158b",
                  "pfNames": [
                      "ens2f0#0-15"
                  ],
                  "vendor": "8086"
              },
              "nodeSelector": {
                  "node-role.kubernetes.io/worker": ""
              },
              "numVfs": 16,
              "priority": 99,
              "resourceName": "intel_numa0_res1"
          }
      }
      
      $ oc get sriovnetworknodepolicy -n openshift-sriov-network-operator intel-numa0-policy2 -o json
      {
          "apiVersion": "sriovnetwork.openshift.io/v1",
          "kind": "SriovNetworkNodePolicy",
          "metadata": {
              "creationTimestamp": "2024-03-14T09:08:35Z",
              "generation": 1,
              "name": "intel-numa0-policy2",
              "namespace": "openshift-sriov-network-operator",
              "resourceVersion": "583429",
              "uid": "92615150-0929-4b47-9d70-9a1023f37b59"
          },
          "spec": {
              "deviceType": "vfio-pci",
              "isRdma": false,
              "mtu": 9000,
              "nicSelector": {
                  "deviceID": "158b",
                  "pfNames": [
                      "ens2f1#0-15"
                  ],
                  "vendor": "8086"
              },
              "nodeSelector": {
                  "node-role.kubernetes.io/worker": ""
              },
              "numVfs": 16,
              "priority": 99,
              "resourceName": "intel_numa0_res2"
          }
      }
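
Both policies select their VFs with a `pfNames` entry of the form `<pf>#<first>-<last>` (`ens2f0#0-15`, `ens2f1#0-15`). A minimal sketch of how such a selector could be split, assuming only that range form. `parse_pf_selector` is a hypothetical helper for illustration, not part of the SR-IOV operator:

```python
from typing import Optional, Tuple


def parse_pf_selector(selector: str) -> Tuple[str, Optional[range]]:
    """Split a pfNames entry such as 'ens2f0#0-15' into (PF name, VF index range).

    Hypothetical helper: returns (name, None) when no '#<first>-<last>' suffix is given.
    """
    pf_name, sep, vf_part = selector.partition("#")
    if not sep:
        return pf_name, None
    first, dash, last = vf_part.partition("-")
    if not dash:
        raise ValueError(f"expected '<first>-<last>' after '#', got {vf_part!r}")
    first_vf, last_vf = int(first), int(last)
    if last_vf < first_vf:
        raise ValueError(f"empty VF range in {selector!r}")
    # The range in the selector is inclusive on both ends.
    return pf_name, range(first_vf, last_vf + 1)


pf, vfs = parse_pf_selector("ens2f0#0-15")
print(pf, len(vfs))
```

With the selectors above, the range covers 16 indexes, which matches `numVfs: 16` in both policies.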
      
      4. Create the following four SriovNetwork resources, where:
      
      - Two networks refer to the first SriovNetworkNodePolicy, and the other two refer to the second SriovNetworkNodePolicy
      - MAC address support is enabled
      - Each network uses a different VLAN (the network devices are configured accordingly, so the switch ports connected to the worker-node interfaces referenced in the SriovNetworkNodePolicy resources carry these VLANs)
      
      $ oc get sriovnetwork -n openshift-sriov-network-operator -o json
      {
          "apiVersion": "v1",
          "items": [
              {
                  "apiVersion": "sriovnetwork.openshift.io/v1",
                  "kind": "SriovNetwork",
                  "metadata": {
                      "annotations": {
                          "operator.sriovnetwork.openshift.io/last-network-namespace": "example-cnf"
                      },
                      "creationTimestamp": "2024-03-14T09:09:23Z",
                      "finalizers": [
                          "netattdef.finalizers.sriovnetwork.openshift.io"
                      ],
                      "generation": 1,
                      "name": "intel-numa0-net1",
                      "namespace": "openshift-sriov-network-operator",
                      "resourceVersion": "707858",
                      "uid": "6bb854aa-3e0f-43fe-9501-7fb0180165ee"
                  },
                  "spec": {
                      "capabilities": "{\"mac\": true}",
                      "logLevel": "info",
                      "networkNamespace": "example-cnf",
                      "resourceName": "intel_numa0_res1",
                      "spoofChk": "off",
                      "trust": "on",
                      "vlan": 407
                  }
              },
              {
                  "apiVersion": "sriovnetwork.openshift.io/v1",
                  "kind": "SriovNetwork",
                  "metadata": {
                      "annotations": {
                          "operator.sriovnetwork.openshift.io/last-network-namespace": "example-cnf"
                      },
                      "creationTimestamp": "2024-03-14T09:09:24Z",
                      "finalizers": [
                          "netattdef.finalizers.sriovnetwork.openshift.io"
                      ],
                      "generation": 1,
                      "name": "intel-numa0-net2",
                      "namespace": "openshift-sriov-network-operator",
                      "resourceVersion": "707862",
                      "uid": "516b7ae4-fb3e-492e-8491-8b5746aa0bb5"
                  },
                  "spec": {
                      "capabilities": "{\"mac\": true}",
                      "logLevel": "info",
                      "networkNamespace": "example-cnf",
                      "resourceName": "intel_numa0_res1",
                      "spoofChk": "off",
                      "trust": "on",
                      "vlan": 408
                  }
              },
              {
                  "apiVersion": "sriovnetwork.openshift.io/v1",
                  "kind": "SriovNetwork",
                  "metadata": {
                      "annotations": {
                          "operator.sriovnetwork.openshift.io/last-network-namespace": "example-cnf"
                      },
                      "creationTimestamp": "2024-03-14T09:09:24Z",
                      "finalizers": [
                          "netattdef.finalizers.sriovnetwork.openshift.io"
                      ],
                      "generation": 1,
                      "name": "intel-numa0-net3",
                      "namespace": "openshift-sriov-network-operator",
                      "resourceVersion": "707866",
                      "uid": "e02344a1-5e83-4e9d-966b-9aa664b20c10"
                  },
                  "spec": {
                      "capabilities": "{\"mac\": true}",
                      "logLevel": "info",
                      "networkNamespace": "example-cnf",
                      "resourceName": "intel_numa0_res2",
                      "spoofChk": "off",
                      "trust": "on",
                      "vlan": 410
                  }
              },
              {
                  "apiVersion": "sriovnetwork.openshift.io/v1",
                  "kind": "SriovNetwork",
                  "metadata": {
                      "annotations": {
                          "operator.sriovnetwork.openshift.io/last-network-namespace": "example-cnf"
                      },
                      "creationTimestamp": "2024-03-14T09:09:25Z",
                      "finalizers": [
                          "netattdef.finalizers.sriovnetwork.openshift.io"
                      ],
                      "generation": 1,
                      "name": "intel-numa0-net4",
                      "namespace": "openshift-sriov-network-operator",
                      "resourceVersion": "707854",
                      "uid": "3014e259-07ea-42d6-ba72-00e44d6d8fe7"
                  },
                  "spec": {
                      "capabilities": "{\"mac\": true}",
                      "logLevel": "info",
                      "networkNamespace": "example-cnf",
                      "resourceName": "intel_numa0_res2",
                      "spoofChk": "off",
                      "trust": "on",
                      "vlan": 411
                  }
              }
          ],
          "kind": "List",
          "metadata": {
              "resourceVersion": ""
          }
      }
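
Note that `spec.capabilities` in the resources above is a JSON document stored as a string, not a nested object. A minimal sketch of checking whether the MAC capability is requested, assuming the `oc get ... -o json` output has already been loaded into a dict. `mac_capability_enabled` is a hypothetical helper name:

```python
import json


def mac_capability_enabled(spec: dict) -> bool:
    """Return True when the SriovNetwork spec requests the 'mac' capability."""
    raw = spec.get("capabilities", "")
    if not raw:
        return False
    # The field holds a JSON string, so it needs a second decode step.
    capabilities = json.loads(raw)
    return capabilities.get("mac") is True


# Spec copied from intel-numa0-net1 above.
spec = {
    "capabilities": "{\"mac\": true}",
    "logLevel": "info",
    "networkNamespace": "example-cnf",
    "resourceName": "intel_numa0_res1",
    "spoofChk": "off",
    "trust": "on",
    "vlan": 407,
}
print(mac_capability_enabled(spec))
```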
      
      5. Create a pod with interfaces attached to the desired SriovNetworks, for example:
      
      $ oc get deployment testpmd-app -n example-cnf -o json
      {
      "apiVersion": "apps/v1",
      "kind": "Deployment",
      "metadata": {
          "annotations": {
              "deployment.kubernetes.io/revision": "1"
          },
          "creationTimestamp": "2024-03-14T09:16:23Z",
          "generation": 1,
          "name": "testpmd-app",
          "namespace": "example-cnf",
          "ownerReferences": [
              {
                  "apiVersion": "examplecnf.openshift.io/v1",
                  "kind": "TestPMD",
                  "name": "testpmd",
                  "uid": "f996edbb-a010-4de3-9781-101f4b9e56a4"
              }
          ],
          "resourceVersion": "731486",
          "uid": "65cda8b8-0bc0-456e-9c9b-f8acf1625c9d"
      },
      "spec": {
          "progressDeadlineSeconds": 600,
          "replicas": 1,
          "revisionHistoryLimit": 10,
          "selector": {
              "matchLabels": {
                  "example-cnf-type": "cnf-app",
                  "restart-on-reboot": "true"
              }
          },
          "strategy": {
              "rollingUpdate": {
                  "maxSurge": "25%",
                  "maxUnavailable": "25%"
              },
              "type": "RollingUpdate"
          },
          "template": {
              "metadata": {
                  "annotations": {
                      "k8s.v1.cni.cncf.io/networks": "[ { \"name\": \"intel-numa0-net1\", \"namespace\": \"example-cnf\" },          { \"name\": \"intel-numa0-net2\", \"namespace\": \"example-cnf\" }        ]"
                  },
                  "creationTimestamp": null,
                  "labels": {
                      "example-cnf-type": "cnf-app",
                      "restart-on-reboot": "true"
                  }
              },
              "spec": {
                  "affinity": {
                      "podAntiAffinity": {
                          "requiredDuringSchedulingIgnoredDuringExecution": [
                              {
                                  "labelSelector": {
                                      "matchExpressions": [
                                          {
                                              "key": "example-cnf-type",
                                              "operator": "In",
                                              "values": [
                                                  "lb-app"
                                              ]
                                          }
                                      ]
                                  },
                                  "topologyKey": "kubernetes.io/hostname"
                              }
                          ]
                      }
                  },
                  "containers": [
                      {
                          "env": [
                              {
                                  "name": "NETWORK_NAME_LIST",
                                  "value": "openshift.io/intel_numa0_res1"
                              },
                              {
                                  "name": "TESTPMD_CPU_COUNT",
                                  "value": "6"
                              },
                              {
                                  "name": "NODE_NAME",
                                  "valueFrom": {
                                      "fieldRef": {
                                          "apiVersion": "v1",
                                          "fieldPath": "spec.nodeName"
                                      }
                                  }
                              },
                              {
                                  "name": "CR_NAME",
                                  "value": "testpmd"
                              },
                              {
                                  "name": "eth_peer",
                                  "value": "0,20:04:0f:f1:89:01;1,20:04:0f:f1:89:02;"
                              },
                              {
                                  "name": "socket_mem",
                                  "value": "1024"
                              },
                              {
                                  "name": "memory_channels",
                                  "value": "4"
                              },
                              {
                                  "name": "rx_queues",
                                  "value": "1"
                              },
                              {
                                  "name": "tx_queues",
                                  "value": "1"
                              },
                              {
                                  "name": "rx_descriptors",
                                  "value": "1024"
                              },
                              {
                                  "name": "tx_descriptors",
                                  "value": "1024"
                              }
                          ],
                          "image": "xxx",
                          "imagePullPolicy": "IfNotPresent",
                          "lifecycle": {
                              "postStart": {
                                  "exec": {
                                      "command": [
                                          "/bin/sh",
                                          "-c",
                                          "echo Hello from the postStart handler"
                                      ]
                                  }
                              },
                              "preStop": {
                                  "exec": {
                                      "command": [
                                          "/bin/sh",
                                          "-c",
                                          "echo Hello from the preStop handler"
                                      ]
                                  }
                              }
                          },
                          "livenessProbe": {
                              "failureThreshold": 3,
                              "httpGet": {
                                  "path": "/healthz",
                                  "port": 8095,
                                  "scheme": "HTTP"
                              },
                              "initialDelaySeconds": 15,
                              "periodSeconds": 10,
                              "successThreshold": 1,
                              "timeoutSeconds": 1
                          },
                          "name": "testpmd",
                          "readinessProbe": {
                              "failureThreshold": 3,
                              "httpGet": {
                                  "path": "/readyz",
                                  "port": 8095,
                                  "scheme": "HTTP"
                              },
                              "initialDelaySeconds": 5,
                              "periodSeconds": 10,
                              "successThreshold": 1,
                              "timeoutSeconds": 1
                          },
                          "resources": {
                              "limits": {
                                  "cpu": "6",
                                  "hugepages-1Gi": "4Gi",
                                  "memory": "1000Mi",
                                  "openshift.io/intel_numa0_res1": "2"
                              },
                              "requests": {
                                  "cpu": "6",
                                  "hugepages-1Gi": "4Gi",
                                  "memory": "1000Mi",
                                  "openshift.io/intel_numa0_res1": "2"
                              }
                          },
                          "securityContext": {
                              "capabilities": {
                                  "add": [
                                      "IPC_LOCK",
                                      "NET_ADMIN"
                                  ]
                              }
                          },
                          "startupProbe": {
                              "failureThreshold": 3,
                              "httpGet": {
                                  "path": "/startz",
                                  "port": 8095,
                                  "scheme": "HTTP"
                              },
                              "initialDelaySeconds": 30,
                              "periodSeconds": 10,
                              "successThreshold": 1,
                              "timeoutSeconds": 1
                          },
                          "terminationMessagePath": "/dev/termination-log",
                          "terminationMessagePolicy": "FallbackToLogsOnError",
                          "volumeMounts": [
                              {
                                  "mountPath": "/dev/hugepages",
                                  "name": "hugepage"
                              },
                              {
                                  "mountPath": "/var/log/testpmd",
                                  "name": "log-dir"
                              },
                              {
                                  "mountPath": "/var/lib/testpmd",
                                  "name": "lib-dir"
                              }
                          ]
                      }
                  ],
                  "dnsPolicy": "ClusterFirst",
                  "restartPolicy": "Always",
                  "schedulerName": "default-scheduler",
                  "securityContext": {},
                  "serviceAccount": "testpmd-account",
                  "serviceAccountName": "testpmd-account",
                  "terminationGracePeriodSeconds": 30,
                  "volumes": [
                      {
                          "emptyDir": {
                              "medium": "HugePages"
                          },
                          "name": "hugepage"
                      },
                      {
                          "emptyDir": {},
                          "name": "log-dir"
                      },
                      {
                          "emptyDir": {},
                          "name": "lib-dir"
                      }
                  ]
              }
          }
      },
      "status": {
          "availableReplicas": 1,
          "conditions": [
              {
                  "lastTransitionTime": "2024-03-14T09:16:23Z",
                  "lastUpdateTime": "2024-03-14T09:17:04Z",
                  "message": "ReplicaSet \"testpmd-app-5dcb4f7775\" has successfully progressed.",
                  "reason": "NewReplicaSetAvailable",
                  "status": "True",
                  "type": "Progressing"
              },
              {
                  "lastTransitionTime": "2024-03-14T10:19:41Z",
                  "lastUpdateTime": "2024-03-14T10:19:41Z",
                  "message": "Deployment has minimum availability.",
                  "reason": "MinimumReplicasAvailable",
                  "status": "True",
                  "type": "Available"
              }
          ],
          "observedGeneration": 1,
          "readyReplicas": 1,
          "replicas": 1,
          "updatedReplicas": 1
      }
      }
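
The pod template requests its secondary interfaces through the `k8s.v1.cni.cncf.io/networks` annotation, which here is a JSON list encoded as a string. A minimal sketch of listing the requested networks, assuming the JSON form used in this deployment (the comma-separated short form of the annotation is not handled). `requested_networks` is a hypothetical helper name:

```python
import json

NETWORKS_KEY = "k8s.v1.cni.cncf.io/networks"


def requested_networks(annotations: dict) -> list:
    """Return '<namespace>/<name>' for each network listed in the annotation."""
    entries = json.loads(annotations.get(NETWORKS_KEY, "[]"))
    return [f"{entry.get('namespace', '')}/{entry['name']}" for entry in entries]


# Annotation value copied from the deployment above.
annotations = {
    NETWORKS_KEY: '[ { "name": "intel-numa0-net1", "namespace": "example-cnf" },'
                  '          { "name": "intel-numa0-net2", "namespace": "example-cnf" }        ]'
}
print(requested_networks(annotations))
```

These are the same names that should later appear in the `name` fields of the network-status annotation.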
      
      

      Actual results:

      In OCP >= 4.14, the pod's network-status annotation shows the MAC address of the interfaces created for the SR-IOV networks. The example below is from OCP 4.14:
      
      "k8s.v1.cni.cncf.io/network-status": "[{
          \"name\": \"ovn-kubernetes\",
          \"interface\": \"eth0\",
          \"ips\": [
              \"10.129.2.50\",
              \"fd02:0:0:6::32\"
          ],
          \"mac\": \"0a:58:0a:81:02:32\",
          \"default\": true,
          \"dns\": {}
      },{
          \"name\": \"example-cnf/intel-numa0-net1\",
          \"interface\": \"net1\",
          \"mac\": \"20:04:0f:f1:89:01\",
          \"dns\": {},
          \"device-info\": {
              \"type\": \"pci\",
              \"version\": \"1.1.0\",
              \"pci\": {
                  \"pci-address\": \"0000:37:02.5\"
              }
          }
      },{
          \"name\": \"example-cnf/intel-numa0-net2\",
          \"interface\": \"net2\",
          \"mac\": \"20:04:0f:f1:89:02\",
          \"dns\": {},
          \"device-info\": {
              \"type\": \"pci\",
              \"version\": \"1.1.0\",
              \"pci\": {
                  \"pci-address\": \"0000:37:02.1\"
              }
          }
      }]"
      
      However, in OCP 4.12 and OCP 4.13, the MAC addresses of the vfio-pci interfaces are not displayed. The example below is from OCP 4.12, with a different pod but the same SR-IOV network setup. The same happens with OCP 4.13:
      
      "k8s.v1.cni.cncf.io/network-status": "[{
          \"name\": \"ovn-kubernetes\",
          \"interface\": \"eth0\",
          \"ips\": [
              \"10.130.2.37\",
              \"fd02:0:0:7::25\"
          ],
          \"mac\": \"0a:58:0a:82:02:25\",
          \"default\": true,
          \"dns\": {}
      },{
          \"name\": \"example-cnf/intel-numa0-net1\",
          \"interface\": \"net1\",
          \"dns\": {},
          \"device-info\": {
              \"type\": \"pci\",
              \"version\": \"1.0.0\",
              \"pci\": {
                  \"pci-address\": \"0000:37:03.0\"
              }
          }
      },{
          \"name\": \"example-cnf/intel-numa0-net2\",
          \"interface\": \"net2\",
          \"dns\": {},
          \"device-info\": {
              \"type\": \"pci\",
              \"version\": \"1.0.0\",
              \"pci\": {
                  \"pci-address\": \"0000:37:02.2\"
              }
          }
      }]"
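
The difference between the two outputs can be checked mechanically. A minimal sketch, assuming the pod's annotations have been loaded into a dict: it reports the secondary interfaces that carry `device-info` but no `mac` field. `interfaces_missing_mac` is a hypothetical helper, and the sample data is trimmed from the two annotations above:

```python
import json

STATUS_KEY = "k8s.v1.cni.cncf.io/network-status"


def interfaces_missing_mac(annotations: dict) -> list:
    """Return the interface names of device-backed entries without a 'mac' field."""
    entries = json.loads(annotations.get(STATUS_KEY, "[]"))
    return [
        entry["interface"]
        for entry in entries
        if "device-info" in entry and not entry.get("mac")
    ]


def pci_entry(name, interface, pci_address, mac=None):
    """Build a trimmed network-status entry for the samples below."""
    entry = {
        "name": name,
        "interface": interface,
        "dns": {},
        "device-info": {"type": "pci", "pci": {"pci-address": pci_address}},
    }
    if mac is not None:
        entry["mac"] = mac
    return entry


# Trimmed from the OCP 4.12 output: no 'mac' on net1/net2.
ocp412 = {STATUS_KEY: json.dumps([
    pci_entry("example-cnf/intel-numa0-net1", "net1", "0000:37:03.0"),
    pci_entry("example-cnf/intel-numa0-net2", "net2", "0000:37:02.2"),
])}

# Trimmed from the OCP 4.14 output: 'mac' present on both.
ocp414 = {STATUS_KEY: json.dumps([
    pci_entry("example-cnf/intel-numa0-net1", "net1", "0000:37:02.5", "20:04:0f:f1:89:01"),
    pci_entry("example-cnf/intel-numa0-net2", "net2", "0000:37:02.1", "20:04:0f:f1:89:02"),
])}

print(interfaces_missing_mac(ocp412))
print(interfaces_missing_mac(ocp414))
```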
      

      Expected results:

      The result should be the same as the one observed in OCP >= 4.14: the MAC address should appear on the interfaces created by the SR-IOV resources.
      

      Additional info:

      The issue can probably be reproduced with just one SriovNetworkNodePolicy and one SriovNetwork; the full test scenario is included here for the sake of completeness.
      Note that the SriovNetwork configuration for the MAC capability has not changed across these OCP releases. In fact, checking the logs, the oldest version I've checked, OCP 4.6, already defined MAC address support in the same way as the latest OCP releases: https://access.redhat.com/documentation/es-es/openshift_container_platform/4.6/html/networking/hardware-networks
      

          People

            sscheink@redhat.com Sebastian Scheinkman
            raperez@redhat.com Ramon Perez
            Zhanqi Zhao Zhanqi Zhao