OpenShift Bugs / OCPBUGS-15407

Can't install the SR-IOV network operator using a subscription on a DPU mode cluster

    • Important
    • No
    • 3
    • NHE Sprint 242, NHE Sprint 243
    • 2
    • Rejected
    • False

      Description of problem:

      On a DPU two-cluster design deployment, when the SR-IOV operator is installed, any job/pods in openshift-marketplace that get scheduled to a DPU host fail to start.

      Version-Release number of selected component (if applicable):

      4.14

      How reproducible:

      always

      Steps to Reproduce:

      1. Set up a DPU two-cluster design cluster.
      2. Install the SR-IOV operator through a Subscription:
      
      OC_VERSION=stable
      echo "${OC_VERSION}"
      echo 'apiVersion: v1
      kind: Namespace
      metadata:
        name: openshift-sriov-network-operator
      ---
      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
        name: sriov-network-operators
        namespace: openshift-sriov-network-operator
      spec:
        targetNamespaces:
        - openshift-sriov-network-operator
      ---
      apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
        name: sriov-network-operator-subsription
        namespace: openshift-sriov-network-operator
      spec:
        channel: '\"${OC_VERSION}\"'
        name: sriov-network-operator
        source: qe-app-registry
        sourceNamespace: openshift-marketplace'  | oc create -f  -
      
      3. Check the pods in openshift-marketplace. The pod `e4664691525d08c33e724a1c120af3d76360d69ba24b4e5ed0669216c2rlgxh` triggers the InstallPlan that sets up the SR-IOV operator. However, it is sometimes scheduled to a DPU host, where it stays in ContainerCreating indefinitely. It only completes once the DPU node is marked as not schedulable, so that the pod is placed on a master/worker node instead.
      
      # oc get pod -n openshift-marketplace -o wide
      NAME                                                              READY   STATUS      RESTARTS   AGE     IP             NODE                     NOMINATED NODE   READINESS GATES
      certified-operators-c87qq                                         1/1     Running     0          3d3h    10.130.0.76    tenantcluster-master-1   <none>           <none>
      community-operators-jpcw5                                         1/1     Running     0          4h25m   10.130.0.83    tenantcluster-master-1   <none>           <none>
      d073ae03ff17c9777345936280ca4af890eb34396f0c26c3591873caa425p9g   0/1     Completed   0          5h49m   10.130.0.29    tenantcluster-master-1   <none>           <none>
      e4664691525d08c33e724a1c120af3d76360d69ba24b4e5ed0669216c2rlgxh   0/1     Completed   0          5d      10.130.0.97    tenantcluster-master-1   <none>           <none> 
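
To spot the affected pod quickly, the wide listing above can be filtered by status. The following is only a sketch: the helper name `pods_in_status` is made up for illustration, and it assumes the default `-o wide` column order (STATUS in column 3, NODE in column 7) with a plain numeric RESTARTS value.

```shell
# Hypothetical helper: read an `oc get pod -o wide` listing on stdin and
# print "NAME NODE" for every pod whose STATUS column equals the argument.
# Assumes the default wide layout: NAME READY STATUS RESTARTS AGE IP NODE ...
# A RESTARTS value such as "1 (5m ago)" would shift the columns.
pods_in_status() {
  awk -v want="$1" 'NR > 1 && $3 == want { print $1, $7 }'
}

# Usage against a live cluster:
#   oc get pod -n openshift-marketplace -o wide | pods_in_status ContainerCreating
```

Any pod this prints together with a DPU node name would match the failure described in step 3.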

      Actual results:

      Normal container pods are scheduled to the DPU host.

      Expected results:

      Normal container pods should not be scheduled to the DPU host.
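
Until that is the case, the workaround from step 3 (keeping new pods off the DPU host) might be scripted roughly as below. This is one possible reading of the workaround, not a confirmed fix: it cordons the DPU nodes with `oc adm cordon`, the helper name `cordon_nodes_by_label` is invented here, and the label selector used to identify DPU nodes is an assumption that has to be replaced with whatever label the cluster actually uses.

```shell
# Hypothetical helper: cordon every node matching a label selector so the
# scheduler stops placing new pods there. Pods already running are untouched.
cordon_nodes_by_label() {
  selector="$1"
  # List matching node names, one per line.
  oc get nodes -l "$selector" \
      -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' |
  while IFS= read -r node; do
    if [ -n "$node" ]; then
      # Redirect stdin so the command cannot consume the node list.
      oc adm cordon "$node" </dev/null
    fi
  done
}

# Usage (the label below is a placeholder, not a real DPU label):
#   cordon_nodes_by_label 'example.com/dpu-node='
```

Cordoning only affects future scheduling, so a bundle pod already stuck in ContainerCreating on a DPU host would still need to be deleted and recreated. `oc adm uncordon <node>` reverses the change.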

      Additional info:

       

            wizhao@redhat.com William Zhao
            zzhao1@redhat.com Zhanqi Zhao
            Zhanqi Zhao Zhanqi Zhao