OCPBUGS-14360

[4.10.z] MCP fails to update after SR-IOV drains the node

Details

    • Moderate
    • No
    • 5
    • NHE Sprint 237
    • 1
    • False

    Description

      Description of problem:

      After the SR-IOV configuration manifests are applied, the MachineConfigPool (MCP) fails to update.

      Version-Release number of selected component (if applicable):

      4.10.60

      How reproducible:

      100%

      Steps to Reproduce:

      1. Deploy a bare-metal OCP cluster.
      2. Create two MCPs: one for regular worker nodes and one for SR-IOV workloads.
      3. Apply the following manifests:
      ---
      apiVersion: sriovnetwork.openshift.io/v1
      kind: SriovOperatorConfig
      metadata:
        name: default
        namespace: openshift-sriov-network-operator
      spec:
        configDaemonNodeSelector:
          node-role.kubernetes.io/sriov: ""
        enableInjector: true
        enableOperatorWebhook: true 
      
      ---
      apiVersion: sriovnetwork.openshift.io/v1
      kind: SriovNetworkNodePolicy
      metadata:
        name: "sriovleftdpdkmellanox"
        namespace: openshift-sriov-network-operator
      spec:
        resourceName: "sriovleftdpdkmellanox"
        nodeSelector:
          node-role.kubernetes.io/sriov: ""
        mtu: 9000
        numVfs: 4
        nicSelector:
          # consider switching to PCI paths
          pfNames: ['ens4f0', 'ens5f0']
        deviceType: netdevice
        isRdma: True
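
Step 2 does not include the MCP manifest itself. A minimal sketch of what the SR-IOV pool could look like follows. The pool name `sriov` and the `machineConfigSelector` values are assumptions, not taken from the reporter's cluster. Only the `node-role.kubernetes.io/sriov` label comes from the manifests above.

```yaml
# Hypothetical MachineConfigPool for the SR-IOV nodes (step 2).
# The pool name and the role values are assumptions for illustration only.
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: sriov
spec:
  # Render both the base worker configs and any sriov-specific configs into this pool.
  machineConfigSelector:
    matchExpressions:
      - key: machineconfiguration.openshift.io/role
        operator: In
        values: [worker, sriov]
  # Select the same nodes that the SriovOperatorConfig and the policy target.
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/sriov: ""
```

With a layout like this, the SR-IOV nodes are targeted both by the operator's `configDaemonNodeSelector` and by their own pool, so a drain started by the SR-IOV config daemon and an MCP update affect the same set of nodes.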

      Actual results:

      The node is stuck in SchedulingDisabled and the MCP is marked as `paused`.

      Expected results:

      SR-IOV is successfully configured and nodes are rebooted

      Additional info:

      Notes from Sebastian:
      ---------------------
      The problem is that we now also need to drain after a reboot when the number of devices is 0,
      but we cannot take the drain lock because another node already took it while this node was rebooting,
      so we end up in a deadlock.

            People

              wizhao@redhat.com William Zhao
              yprokule@redhat.com Yurii Prokulevych