OCPBUGS-36243

SR-IOV daemonset pods enter termination loop due to policy priority issue


    • Bug
    • Resolution: Done-Errata
    • Major
    • None
    • 4.14.z, 4.15.z, 4.16
    • Networking / SR-IOV
    • None
    • Important
    • Yes
    • CNF Network Sprint 255
    • 1
    • False
    • * Previously, the SR-IOV Network Operator was listing the `SriovNetworkNodePolicies` resources in random order. This caused the `sriov-device-plugin` pod to enter a continuous restart loop. With this release, the SR-IOV Network Operator lists policies in a deterministic order so that the `sriov-device-plugin` pod does not enter a continuous restart loop. (link:https://issues.redhat.com/browse/OCPBUGS-36243[*OCPBUGS-36243*])
    • Bug Fix
    • Done

      Description of problem:

      If the SR-IOV Network Operator processes `SriovNetworkNodePolicies` in an unsorted order, the `sriov-device-plugin` daemonset pods can enter a termination loop.

      A fix is being worked on upstream in https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/721
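
To illustrate what a deterministic order means here, the following is a minimal sketch, not the upstream patch. `Policy` is a hypothetical, simplified stand-in for `SriovNetworkNodePolicy`, and sorting by priority and then by name is an assumption made for illustration; this report does not state which sort key the fix uses. The point is only that two listings of the same policies in different arrival orders yield the same sequence after sorting, so anything derived from the list stays stable between reconciliations.

```go
package main

import (
	"fmt"
	"sort"
)

// Policy is a hypothetical, simplified stand-in for SriovNetworkNodePolicy.
// Only the fields needed for ordering are modeled here.
type Policy struct {
	Name     string
	Priority int
}

// sortPolicies orders policies by Priority and breaks ties by Name, so the
// result does not depend on the order in which the list was returned.
// The choice of sort key is an assumption for this sketch.
func sortPolicies(policies []Policy) {
	sort.SliceStable(policies, func(i, j int) bool {
		if policies[i].Priority != policies[j].Priority {
			return policies[i].Priority < policies[j].Priority
		}
		return policies[i].Name < policies[j].Name
	})
}

func main() {
	// The same three policies, listed in two different arrival orders.
	a := []Policy{{"policy-b", 10}, {"policy-a", 10}, {"policy-c", 5}}
	b := []Policy{{"policy-c", 5}, {"policy-a", 10}, {"policy-b", 10}}

	sortPolicies(a)
	sortPolicies(b)

	// Both slices now hold the policies in the same sequence.
	fmt.Println(a)
	fmt.Println(b)
}
```

Using the name as a tie-breaker matters in this sketch because several policies can share the same priority; without it, equal-priority policies would keep whatever relative order the listing happened to return.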

      Version-Release number of selected component (if applicable):

      v4.14.0-202406060838

      How reproducible:

          

      Steps to Reproduce:

          1. Provision a cluster with a non-trivial number of SR-IOV nodes (> 6)
          2. Install the SR-IOV Network Operator
          3. Create 10+ SriovNetworkNodePolicies using different values for `.spec.nodeSelector`
          

      Actual results:

      sriov-device-plugin daemonset pods enter a termination loop

      Expected results:

      sriov-device-plugin daemonset pods reach and stay in the Running state

      Additional info:

          

            Andrea Panattoni (apanatto@redhat.com)
            John Carroll (rh-ee-jocarrol)
            Zhanqi Zhao