OCPBUGS-34934

sriov-device-plugin pod ends in a restart loop after deleting a SriovNetworkNodePolicy resource

    • Bug
    • Resolution: Done-Errata
    • Major
    • None
    • 4.16
    • Networking / SR-IOV
    • None
    • Critical
    • No
    • CNF Network Sprint 254, CNF Network Sprint 255
    • 2
    • False
    • KNOWN ISSUE ALREADY DOCUMENTED IN THE 4.16 NOTES

      If you delete a `SriovNetworkNodePolicy` resource for a virtual function with a `vfio-pci` driver type, the SR-IOV Network Operator is unable to reconcile the policy. As a consequence, the `sriov-device-plugin` pod enters a continuous restart loop. As a workaround, delete all remaining policies affecting the physical function, then re-create them. (link:https://issues.redhat.com/browse/OCPBUGS-34934[*OCPBUGS-34934*])
    • Known Issue
    • Done

      Description of problem:

      sriov-device-plugin pod ends in a restart loop after deleting a SriovNetworkNodePolicy resource.    

      Version-Release number of selected component (if applicable):

      4.16.0-rc.3
      sriov-network-operator.v4.16.0-202405301906

      How reproducible:

       100%

      Steps to Reproduce:

      1. On a single-node OpenShift (SNO) cluster with the DU profile, create the following SriovNetworkNodePolicy (SNNP) resources:
      
      apiVersion: sriovnetwork.openshift.io/v1
      kind: SriovNetworkNodePolicy
      metadata:
        name: snnp1
        namespace: openshift-sriov-network-operator
      spec:
        deviceType: vfio-pci
        isRdma: false
        nicSelector:
          pfNames:
          - ens2f3#32-33
        nodeSelector:
          node-role.kubernetes.io/master: ""
        numVfs: 48
        resourceName: snnp1
      ---
      apiVersion: sriovnetwork.openshift.io/v1
      kind: SriovNetworkNodePolicy
      metadata:
        name: snnp2
        namespace: openshift-sriov-network-operator
      spec:
        deviceType: vfio-pci
        isRdma: false
        nicSelector:
          pfNames:
          - ens2f3#34-35
        nodeSelector:
          node-role.kubernetes.io/master: ""
        numVfs: 48
        resourceName: snnp2
      
      
      2. Wait for the resources to appear in the node's allocatable resources:
      
      oc get nodes -o json | jq -r '.items[0].status.allocatable'
      {
        "cpu": "60",
        "ephemeral-storage": "1725943497941",
        "hugepages-1Gi": "32Gi",
        "hugepages-2Mi": "0",
        "intel.com/intel_fec_acc100": "16",
        "management.workload.openshift.io/cores": "64k",
        "memory": "96370036Ki",
        "openshift.io/du_fh": "16",
        "openshift.io/du_mh": "16",
        "openshift.io/pci_sriov_net_f1": "2",
        "openshift.io/snnp1": "2",
        "openshift.io/snnp2": "2",
        "pods": "250"
      }
      
      3. Delete the snnp2 resource:
      
      oc -n openshift-sriov-network-operator delete sriovnetworknodepolicy snnp2
      
      4. Check the pods in the openshift-sriov-network-operator namespace:
      
       oc -n openshift-sriov-network-operator get pods     
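The `pfNames` entries in the two policies above use the `<pf>#<first>-<last>` suffix to give each policy its own slice of the VFs on the same PF. The following is a minimal Python sketch of that partitioning. The helper names (`PfSelector`, `parse_pf_name`, `vf_ids`) are hypothetical and not part of the operator, and the parsing only covers the forms that appear in this report.

```python
import re
from typing import NamedTuple, Optional, Set


class PfSelector(NamedTuple):
    """Hypothetical container for one pfNames entry."""
    pf_name: str
    first_vf: Optional[int]  # None means no range suffix was given
    last_vf: Optional[int]


def parse_pf_name(entry: str) -> PfSelector:
    """Split an entry such as 'ens2f3#32-33' into PF name and VF range."""
    match = re.fullmatch(r"([^#]+)(?:#(\d+)-(\d+))?", entry)
    if match is None:
        raise ValueError(f"unrecognised pfNames entry: {entry!r}")
    name, first, last = match.groups()
    if first is None:
        return PfSelector(name, None, None)
    if int(first) > int(last):
        raise ValueError(f"empty VF range in {entry!r}")
    return PfSelector(name, int(first), int(last))


def vf_ids(selector: PfSelector, num_vfs: int) -> Set[int]:
    """Return the VF indices covered by a selector (all VFs if no range)."""
    if selector.first_vf is None:
        return set(range(num_vfs))
    return set(range(selector.first_vf, selector.last_vf + 1))


# The two policies from the reproduction steps (numVfs: 48 on ens2f3).
snnp1 = parse_pf_name("ens2f3#32-33")
snnp2 = parse_pf_name("ens2f3#34-35")
overlap = vf_ids(snnp1, 48) & vf_ids(snnp2, 48)
print(sorted(vf_ids(snnp1, 48)), sorted(vf_ids(snnp2, 48)), sorted(overlap))
```

The two ranges do not overlap, so deleting snnp2 should only affect VFs 34 and 35; snnp1 and its VFs 32 and 33 should be left untouched.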

      Actual results:

      The sriov-device-plugin pod is restarted continuously:
      
      oc -n openshift-sriov-network-operator get pods
      NAME                                      READY   STATUS              RESTARTS   AGE
      sriov-device-plugin-2ntll                 0/1     Terminating         0          4s
      sriov-device-plugin-4kw94                 0/1     ContainerCreating   0          0s
      sriov-network-config-daemon-59k4c         1/1     Running             0          53m
      sriov-network-operator-58c996d746-nwktl   1/1     Running             0          57m
      
      This also affects the other SR-IOV resources, which now report 0 allocatable:
      
      oc get nodes -o json | jq -r '.items[0].status.allocatable'
      {
        "cpu": "60",
        "ephemeral-storage": "1725943497941",
        "hugepages-1Gi": "32Gi",
        "hugepages-2Mi": "0",
        "intel.com/intel_fec_acc100": "16",
        "management.workload.openshift.io/cores": "64k",
        "memory": "96370036Ki",
        "openshift.io/du_fh": "0",
        "openshift.io/du_mh": "0",
        "openshift.io/pci_sriov_net_f1": "0",
        "openshift.io/snnp1": "0",
        "openshift.io/snnp2": "0",
        "pods": "250"
      }
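To spot this condition quickly, the allocatable map can be scanned for SR-IOV resources that dropped to zero. The sketch below is only an illustration: the function name `zeroed_sriov_resources` is hypothetical, and it assumes the SR-IOV resources use the `openshift.io/` prefix, as they do in the output above.

```python
import json
from typing import Dict, List


def zeroed_sriov_resources(allocatable: Dict[str, str],
                           prefix: str = "openshift.io/") -> List[str]:
    """Return resources under the given prefix whose allocatable count is "0"."""
    return sorted(
        name for name, quantity in allocatable.items()
        if name.startswith(prefix) and quantity == "0"
    )


# Allocatable map copied from the output above (after deleting snnp2).
after_delete = json.loads("""
{
  "cpu": "60",
  "ephemeral-storage": "1725943497941",
  "hugepages-1Gi": "32Gi",
  "hugepages-2Mi": "0",
  "intel.com/intel_fec_acc100": "16",
  "management.workload.openshift.io/cores": "64k",
  "memory": "96370036Ki",
  "openshift.io/du_fh": "0",
  "openshift.io/du_mh": "0",
  "openshift.io/pci_sriov_net_f1": "0",
  "openshift.io/snnp1": "0",
  "openshift.io/snnp2": "0",
  "pods": "250"
}
""")

zeroed = zeroed_sriov_resources(after_delete)
for name in zeroed:
    print(name)
```

In practice the input would be the JSON printed by the `oc get nodes` command shown above, rather than an embedded string.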
      

      Expected results:

      Resources get updated correctly after deletion.    

      Additional info:

      Attaching must-gather.    


            Errata Tool added a comment -

            Since the problem described in this issue should be resolved in a recent advisory, it has been closed.

            For information on the advisory (OpenShift Container Platform 4.16.16 security and extras update), and where to find the updated files, follow the link below.

            If the solution does not work for you, open a new bug report.
            https://access.redhat.com/errata/RHBA-2024:7598


            Franck Baudin added a comment -

            apanatto@redhat.com can you confirm that this is not a regression but a bug that has always been there? Thanks!

            Andrea Panattoni added a comment -

            As a workaround, deleting and re-creating all the remaining SriovNetworkNodePolicies solves the issue. This should help reduce the bug's severity.

            We should make it before 4.16 GA. I'll update the release notes fields as soon as I expect that we will miss the deadline.
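The workaround applies to the policies that still reference the affected physical function. As a rough sketch, assuming the policies have been loaded as dictionaries (for example from `oc get sriovnetworknodepolicy -o json`) and that they select the PF through `nicSelector.pfNames` only, the candidates could be listed as follows. The function name `policies_on_pf` and the second sample policy are hypothetical; policies that select devices by `vendor`, `deviceID`, or `rootDevices` are not handled here.

```python
from typing import Dict, List


def policies_on_pf(policies: List[Dict], pf_name: str) -> List[str]:
    """Return names of policies whose pfNames reference the given PF."""
    selected = []
    for policy in policies:
        entries = (policy.get("spec", {})
                         .get("nicSelector", {})
                         .get("pfNames", []))
        # An entry may carry a VF range suffix, e.g. "ens2f3#32-33".
        if any(entry.split("#", 1)[0] == pf_name for entry in entries):
            selected.append(policy["metadata"]["name"])
    return selected


policies = [
    {   # snnp1 from this report
        "metadata": {"name": "snnp1"},
        "spec": {"nicSelector": {"pfNames": ["ens2f3#32-33"]}},
    },
    {   # hypothetical policy on a different PF, for illustration only
        "metadata": {"name": "other-pf-policy"},
        "spec": {"nicSelector": {"pfNames": ["ens1f0"]}},
    },
]

to_recreate = policies_on_pf(policies, "ens2f3")
print(to_recreate)
```

Each listed policy would then be deleted and created again from its original manifest, as described in the workaround.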

            Andrea Panattoni added a comment -

            TL;DR
            The problem is that the sriov-network-config-daemon cannot reconcile the desired state with the current configuration when a vfio-pci policy is deleted while other policies on the same PF are still in place. Working on a fix.

            Detailed analysis:

            SriovNetworkNodeState
              spec:
                ...
                - name: ens2f3
                  numVfs: 48
                  pciAddress: "0000:12:00.3"
                  vfGroups:
                  - deviceType: vfio-pci
                    policyName: snnp1
                    resourceName: snnp1
                    vfRange: 32-33
            

            Config daemon logs after setting `SriovOperatorConfig.Spec.LogLevel = 2`. VFs 34 and 35 were previously configured by the now-deleted snnp2. snnp1 keeps `numVfs = 48`:

            2024-06-06T10:13:25.89985124Z	LEVEL(-2)	generic/generic_plugin.go:130	generic plugin needDrainNode()	
                {"current": [
                    ...
                    {"name":"ens2f3","mac":"b4:96:91:d2:e6:49","driver":"ice","pciAddress":"0000:12:00.3","vendor":"8086","deviceID":"1593","mtu":1500,"numVfs":48,"linkSpeed":"-1 Mb/s","linkType":"ETH","eSwitchMode":"legacy","totalvfs":64,"Vfs":[
                        ...
                        {"name":"ens2f3v31","mac":"62:1c:60:56:a0:55","driver":"iavf","pciAddress":"0000:12:1c.7","vendor":"8086","deviceID":"1889","mtu":1500,"vfID":31},
                        {"driver":"vfio-pci","pciAddress":"0000:12:1d.0","vendor":"8086","deviceID":"1889","vfID":32},
                        {"driver":"vfio-pci","pciAddress":"0000:12:1d.1","vendor":"8086","deviceID":"1889","vfID":33},
                        {"driver":"vfio-pci","pciAddress":"0000:12:1d.2","vendor":"8086","deviceID":"1889","vfID":34},
                        {"driver":"vfio-pci","pciAddress":"0000:12:1d.3","vendor":"8086","deviceID":"1889","vfID":35},
                        {"name":"ens2f3v36","mac":"ea:09:b2:75:dd:ee","driver":"iavf","pciAddress":"0000:12:1d.4","vendor":"8086","deviceID":"1889","mtu":1500,"vfID":36},
                        ...
                    ]}], 
                    
                    "desired": [
                        ...
                        {"pciAddress":"0000:12:00.3","numVfs":48,"name":"ens2f3","vfGroups":[
                            {"resourceName":"snnp1","deviceType":"vfio-pci","vfRange":"32-33","policyName":"snnp1"}]}]}
            

            VFs 34 and 35 trigger the reconfiguration in NeedToUpdateSriov(...), but they do not get reconfigured in configSriovVFDevices(...).
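The mismatch in the log above can be illustrated with a small Python sketch that compares the current VF list with the desired `vfGroups`. This is not the operator's code (which is written in Go), and the function names `desired_vf_ids` and `stale_vfio_vfs` are hypothetical. It only shows which VFs are still bound to `vfio-pci` although no remaining policy claims them.

```python
from typing import Dict, List, Set


def desired_vf_ids(desired_iface: Dict) -> Set[int]:
    """Collect every VF index covered by the desired vfGroups."""
    ids: Set[int] = set()
    for group in desired_iface.get("vfGroups", []):
        first, last = (int(part) for part in group["vfRange"].split("-"))
        ids.update(range(first, last + 1))
    return ids


def stale_vfio_vfs(current_iface: Dict, desired_iface: Dict) -> List[int]:
    """Return VFs bound to vfio-pci that no desired vfGroup covers."""
    wanted = desired_vf_ids(desired_iface)
    return sorted(
        vf["vfID"] for vf in current_iface.get("Vfs", [])
        if vf.get("driver") == "vfio-pci" and vf["vfID"] not in wanted
    )


# Trimmed to the VFs shown in the log excerpt above.
current = {"name": "ens2f3", "numVfs": 48, "Vfs": [
    {"driver": "iavf", "vfID": 31},
    {"driver": "vfio-pci", "vfID": 32},
    {"driver": "vfio-pci", "vfID": 33},
    {"driver": "vfio-pci", "vfID": 34},
    {"driver": "vfio-pci", "vfID": 35},
    {"driver": "iavf", "vfID": 36},
]}
desired = {"name": "ens2f3", "numVfs": 48, "vfGroups": [
    {"resourceName": "snnp1", "deviceType": "vfio-pci",
     "vfRange": "32-33", "policyName": "snnp1"},
]}

print(stale_vfio_vfs(current, desired))  # → [34, 35]
```

VFs 34 and 35 are the ones left over from snnp2, which matches the analysis: they differ from the desired state but are never reset.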

            The problem does not occur if the deleted SriovNetworkNodePolicy has `spec.deviceType: netdevice`:

            # Starting with no snnp2
            [kni@registry.kni-qe-0 ~]$ oc get sriovnetworknodepolicy
            NAME               AGE
            pci-sriov-net-f1   47h
            snnp1              18h
            sriov-nnp-du-fh    22h
            sriov-nnp-du-mh    22h
            
            # creating vfio-pci snnp2
            [kni@registry.kni-qe-0 ~]$ cat <<EOF | oc create -f -
            > apiVersion: sriovnetwork.openshift.io/v1
            > kind: SriovNetworkNodePolicy
            > metadata:
            >   name: snnp2
            >   namespace: openshift-sriov-network-operator
            > spec:
            >   deviceType: vfio-pci
            >   isRdma: false
            >   nicSelector:
            >     pfNames:
            >     - ens2f3#34-35
            >   nodeSelector:
            >     node-role.kubernetes.io/master: ""
            >   numVfs: 48
            >   resourceName: snnp2
            > EOF
            sriovnetworknodepolicy.sriovnetwork.openshift.io/snnp2 created
            
            # device plugin works correctly
            [kni@registry.kni-qe-0 ~]$ oc get pods
            NAME                                          READY   STATUS    RESTARTS   AGE
            ...
            sriov-device-plugin-7mftg                     1/1     Running   0          2m56s
            
            # deleting the vfio-pci policy
            [kni@registry.kni-qe-0 ~]$ oc delete sriovnetworknodepolicy snnp2
            sriovnetworknodepolicy.sriovnetwork.openshift.io "snnp2" deleted
            
            # device plugin problem occurs
            [kni@registry.kni-qe-0 ~]$ oc get pods
            NAME                                          READY   STATUS        RESTARTS   AGE
            ...
            sriov-device-plugin-sd2p2                     0/1     Pending       0          1s
            sriov-device-plugin-ztzm4                     0/1     Terminating   0          5s
            
            # creating a netdevice policy
            [kni@registry.kni-qe-0 ~]$ cat <<EOF | oc create -f -
            > apiVersion: sriovnetwork.openshift.io/v1
            > kind: SriovNetworkNodePolicy
            > metadata:
            >   name: snnp2-netdevice
            >   namespace: openshift-sriov-network-operator
            > spec:
            >   deviceType: netdevice
            >   isRdma: false
            >   nicSelector:
            >     pfNames:
            >     - ens2f3#34-35
            >   nodeSelector:
            >     node-role.kubernetes.io/master: ""
            >   numVfs: 48
            >   resourceName: snnp2
            > EOF
            sriovnetworknodepolicy.sriovnetwork.openshift.io/snnp2-netdevice created
            
            # device plugin works correctly
            [kni@registry.kni-qe-0 ~]$ oc get pods
            NAME                                      READY   STATUS    RESTARTS   AGE
            sriov-device-plugin-dtv8m                 1/1     Running   0          9s
            ...
            
            # deleting the netdevice policy
            [kni@registry.kni-qe-0 ~]$ oc delete sriovnetworknodepolicy snnp2-netdevice
            sriovnetworknodepolicy.sriovnetwork.openshift.io "snnp2-netdevice" deleted
            
            # device plugin is restarted and works correctly
            [kni@registry.kni-qe-0 ~]$ oc get pods
            NAME                                      READY   STATUS    RESTARTS   AGE
            sriov-device-plugin-bvlxj                 1/1     Running   0          8s
            


              apanatto@redhat.com Andrea Panattoni
              mcornea@redhat.com Marius Cornea
              Marius Cornea Marius Cornea