  1. OpenShift Bugs
  2. OCPBUGS-36308

SR-IOV daemonset pods enter termination loop due to policy priority issue


    • Bug
    • Resolution: Done-Errata
    • Major
    • None
    • 4.14.z, 4.15.z, 4.16
    • Networking / SR-IOV
    • None
    • Important
    • Yes
    • CNF Network Sprint 255
    • 1
    • False
    • None
    • Cause: The operator lists SriovNetworkNodePolicies in a random order when it creates the device-plugin DaemonSet.
      Consequence: The device-plugin pods are restarted on every reconcile loop of the operator.
      Fix: The operator now lists the policies in a deterministic order.
    • Bug Fix
    • In Progress

      This is a clone of issue OCPBUGS-36243. The following is the description of the original issue:

      Description of problem:

      If the operator processes SriovNetworkNodePolicies in an unsorted order, the sriov-device-plugin daemonset pods can enter a termination loop.

      A fix is being worked on upstream in https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/721

      Version-Release number of selected component (if applicable):

      v4.14.0-202406060838

      How reproducible:

          

      Steps to Reproduce:

          1. Provision a cluster with a non-trivial number of SR-IOV nodes (more than 6)
          2. Install the SR-IOV Network Operator
          3. Create 10+ SriovNetworkNodePolicies using different values for `.spec.nodeSelector`
          

      Actual results:

      sriov-device-plugin daemonset pods enter a termination loop

      Expected results:

      sriov-device-plugin daemonset pods enter a running state

      Additional info:

          

            apanatto@redhat.com Andrea Panattoni
            openshift-crt-jira-prow OpenShift Prow Bot
            Zhanqi Zhao