-
Bug
-
Resolution: Unresolved
-
Undefined
-
None
-
4.17.z
-
None
-
Yes
-
CNF Network Sprint 261, CNF Network Sprint 262, CNF Network Sprint 264
-
3
-
False
-
Description of problem:
After externally managed SR-IOV VFs are deleted and then recreated with an additional VF, a test pod that uses the sriovnetwork is stuck in Pending state.
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. Run automated case OCP-63533, or follow the steps defined in OCP-63533.
2. Create an SR-IOV policy with 2 VFs and create two test pods. Traffic between the two pods runs fine.
3. Delete the SR-IOV policy with the previous 2 VFs, recreate the SR-IOV policy with 3 VFs, then recreate the test pods that use the sriovnetwork. The test pods are stuck in Pending state.
Actual results:
Test pods that use the recreated sriovnetwork are stuck in Pending state, although the nns are shown as available.
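One way to narrow this down is to check whether the nodes still advertise the `openshift.io/sriovnn` extended resource after the policy is recreated. The following is a minimal sketch, not part of OCP-63533: the helper names `allocatable_by_node` and `fetch_node_list` are hypothetical, and it assumes `oc` is on the PATH and logged in to the cluster. It only reads `status.allocatable` from the output of `oc get nodes -o json`.

```python
import json
import subprocess

# Resource name taken from the pod's Limits/Requests in this report.
RESOURCE = "openshift.io/sriovnn"


def allocatable_by_node(node_list: dict, resource: str = RESOURCE) -> dict:
    """Map each node name to its allocatable count for `resource` (0 if absent)."""
    result = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        allocatable = node.get("status", {}).get("allocatable", {})
        # Extended resources are reported as integer strings, e.g. "3".
        result[name] = int(allocatable.get(resource, "0"))
    return result


def fetch_node_list() -> dict:
    """Hypothetical helper: fetch the node list from a live cluster via `oc`."""
    out = subprocess.run(
        ["oc", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


# Usage against a live cluster (not executed here):
#   for name, count in allocatable_by_node(fetch_node_list()).items():
#       print(name, count)
```

If every worker reports 0 for the resource after the 3-VF policy is applied, that would be consistent with the scheduler reporting the resource as insufficient on all schedulable nodes.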
Expected results:
Test pods that use the recreated sriovnetwork should be running, and traffic should pass between the two test pods.
Additional info:
# oc describe pod/sriov-63533-test-pod1
Name:             sriov-63533-test-pod1
Namespace:        e2e-test-sriov-oalqtuey-ng8jq
Priority:         0
Service Account:  default
Node:             <none>
Labels:           app=sriov-63533-test-pod1
Annotations:      k8s.v1.cni.cncf.io/networks:
                    [
                      {
                        "name": "sriovnn",
                        "mac": "20:04:0f:f1:88:01",
                        "ips": ["192.168.10.1/24"]
                      }
                    ]
                  openshift.io/scc: anyuid
Status:           Pending
IP:
IPs:              <none>
Containers:
  sample-container:
    Image:      quay.io/openshifttest/hello-sdn@sha256:c89445416459e7adea9a5a416b3365ed3d74f2491beb904d61dc8d1eb89a72a4
    Port:       <none>
    Host Port:  <none>
    Limits:
      openshift.io/sriovnn:  1
    Requests:
      openshift.io/sriovnn:  1
    Environment:             <none>
    Mounts:
      /etc/podnetinfo from podnetinfo (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-h28jr (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  kube-api-access-h28jr:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
  podnetinfo:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.labels -> labels
      metadata.annotations -> annotations
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  70s   default-scheduler  0/8 nodes are available: 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 5 Insufficient openshift.io/sriovnn. preemption: 0/8 nodes are available: 3 Preemption is not helpful for scheduling, 5 No preemption victims found for incoming pod.

# oc get nns
NAME                                       AGE
master-0                                   94m
master-1                                   94m
master-2                                   94m
openshift-qe-025.lab.eng.rdu2.redhat.com   94m
openshift-qe-029.lab.eng.rdu2.redhat.com   94m
worker-0                                   93m
worker-1                                   94m
worker-2                                   94m
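The FailedScheduling event above accounts for all 8 nodes: 3 are excluded by the master taint and the remaining 5 report insufficient `openshift.io/sriovnn`. A small sketch that splits the message into per-reason node counts makes this explicit. The function name `parse_scheduling_reasons` is hypothetical, and the parsing assumes the message layout shown in this report rather than any guaranteed scheduler format.

```python
import re

# Message copied from the FailedScheduling event in this report.
MESSAGE = (
    "0/8 nodes are available: 3 node(s) had untolerated taint "
    "{node-role.kubernetes.io/master: }, 5 Insufficient openshift.io/sriovnn. "
    "preemption: 0/8 nodes are available: 3 Preemption is not helpful for "
    "scheduling, 5 No preemption victims found for incoming pod."
)


def parse_scheduling_reasons(message: str) -> dict:
    """Return available/total node counts and a {reason: node_count} mapping."""
    # Keep only the filter clause; the preemption clause repeats the totals.
    head = message.split(" preemption:")[0]
    match = re.match(r"(\d+)/(\d+) nodes are available: ", head)
    if match is None:
        raise ValueError("unrecognized scheduler message")
    reasons = {}
    for part in head[match.end():].rstrip(".").split(", "):
        count, reason = part.split(" ", 1)
        reasons[reason] = int(count)
    return {
        "available": int(match.group(1)),
        "total": int(match.group(2)),
        "reasons": reasons,
    }


summary = parse_scheduling_reasons(MESSAGE)
for reason, count in summary["reasons"].items():
    print(f"{count} node(s): {reason}")
```

Since the per-reason counts sum to the node total, no schedulable node is offering the resource, which points at the resource advertisement on the nodes rather than at the pod spec.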