-
Bug
-
Resolution: Won't Do
-
Normal
-
None
-
4.14
-
Quality / Stability / Reliability
-
False
-
-
3
-
Important
-
No
-
None
-
None
-
Rejected
-
NHE Sprint 242, NHE Sprint 243
-
2
Description of problem:
On a DPU two-cluster design cluster, when the SR-IOV operator is installed, any job/pod in openshift-marketplace that gets scheduled to a DPU host fails to start.
Version-Release number of selected component (if applicable):
4.14
How reproducible:
always
Steps to Reproduce:
1. Set up a DPU two-cluster design cluster.
2. Install the SR-IOV operator via a subscription:
OC_VERSION=stable
echo "${OC_VERSION}"
echo 'apiVersion: v1
kind: Namespace
metadata:
  name: openshift-sriov-network-operator
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: sriov-network-operators
  namespace: openshift-sriov-network-operator
spec:
  targetNamespaces:
  - openshift-sriov-network-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: sriov-network-operator-subsription
  namespace: openshift-sriov-network-operator
spec:
  channel: '\"${OC_VERSION}\"'
  name: sriov-network-operator
  source: qe-app-registry
  sourceNamespace: openshift-marketplace' | oc create -f -
3. Check the pods in openshift-marketplace. The pod `e4664691525d08c33e724a1c120af3d76360d69ba24b4e5ed0669216c2rlgxh` triggers the InstallPlan that sets up the SR-IOV operator. However, it is sometimes scheduled to a DPU host, where it stays in ContainerCreating indefinitely unless the DPU node is marked unschedulable so that the pod is scheduled to a master/worker node instead.
# oc get pod -n openshift-marketplace -o wide
NAME                                                              READY   STATUS      RESTARTS   AGE     IP            NODE                     NOMINATED NODE   READINESS GATES
certified-operators-c87qq                                         1/1     Running     0          3d3h    10.130.0.76   tenantcluster-master-1   <none>           <none>
community-operators-jpcw5                                         1/1     Running     0          4h25m   10.130.0.83   tenantcluster-master-1   <none>           <none>
d073ae03ff17c9777345936280ca4af890eb34396f0c26c3591873caa425p9g   0/1     Completed   0          5h49m   10.130.0.29   tenantcluster-master-1   <none>           <none>
e4664691525d08c33e724a1c120af3d76360d69ba24b4e5ed0669216c2rlgxh   0/1     Completed   0          5d      10.130.0.97   tenantcluster-master-1   <none>           <none>
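The check and workaround in step 3 could be scripted roughly as follows. This is only a sketch: the helper name `stuck_pods_on_node` and the `DPU_NODE` placeholder are made up for illustration and do not appear in the original report, and the filter assumes the column layout of `oc get pod -o wide` shown above (NODE in the seventh column, a plain integer in RESTARTS).

```shell
# Read `oc get pod -o wide` output on stdin and print the names of pods that
# sit on the given node and are neither Running nor Completed.
# Assumption: RESTARTS is a single field (e.g. "0"); a value such as
# "3 (5m ago)" would shift the columns and break this simple filter.
stuck_pods_on_node() {
  node="$1"
  awk -v node="$node" \
    'NR > 1 && $7 == node && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage against a live cluster (DPU_NODE is a placeholder, not a real name):
#   DPU_NODE=<dpu-node-name>
#   oc get pod -n openshift-marketplace -o wide | stuck_pods_on_node "$DPU_NODE"
#   oc adm cordon "$DPU_NODE"    # mark the DPU node unschedulable
```

Cordoning only stops new pods from being placed on the node; a pod already stuck in ContainerCreating there would still need to be deleted so that it is recreated elsewhere.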
Actual results:
Regular container pods are scheduled to the DPU host.
Expected results:
Regular container pods should not be scheduled to the DPU host.
Additional info:
- blocks: NHE-666 Install dpu-network-operator via subscription (Closed)