-
Bug
-
Resolution: Not a Bug
-
Undefined
-
None
-
4.17.z
-
None
-
Quality / Stability / Reliability
-
False
-
-
None
-
None
-
Yes
-
None
-
None
-
None
-
CNF Network Sprint 261, CNF Network Sprint 262, CNF Network Sprint 264, CNF Network Sprint 265, CNF Network Sprint 269
-
5
Description of problem:
After externally managed SR-IOV VFs were deleted and then recreated with an additional VF, the test pods that use the sriovnetwork are stuck in Pending state.
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. Ran automated case OCP-63533 (or follow the steps defined in OCP-63533).
2. Created an SR-IOV policy with 2 VFs and two test pods; traffic between the two pods ran fine.
3. Deleted the SR-IOV policy with the previous 2 VFs, recreated the SR-IOV policy with 3 VFs, then recreated the test pods that use the sriovnetwork. The test pods are stuck in Pending state.
Actual results:
Test pods that use the recreated sriovnetwork are stuck in Pending state, although the nns objects are listed as available.
Expected results:
Test pods that use the recreated sriovnetwork should be Running, and traffic should pass between the two test pods.
Additional info:
# oc describe pod/sriov-63533-test-pod1
Name: sriov-63533-test-pod1
Namespace: e2e-test-sriov-oalqtuey-ng8jq
Priority: 0
Service Account: default
Node: <none>
Labels: app=sriov-63533-test-pod1
Annotations: k8s.v1.cni.cncf.io/networks:
[
{
"name": "sriovnn",
"mac": "20:04:0f:f1:88:01",
"ips": ["192.168.10.1/24"]
}
]
openshift.io/scc: anyuid
Status: Pending
IP:
IPs: <none>
Containers:
sample-container:
Image: quay.io/openshifttest/hello-sdn@sha256:c89445416459e7adea9a5a416b3365ed3d74f2491beb904d61dc8d1eb89a72a4
Port: <none>
Host Port: <none>
Limits:
openshift.io/sriovnn: 1
Requests:
openshift.io/sriovnn: 1
Environment: <none>
Mounts:
/etc/podnetinfo from podnetinfo (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-h28jr (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
kube-api-access-h28jr:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
ConfigMapName: openshift-service-ca.crt
ConfigMapOptional: <nil>
podnetinfo:
Type: DownwardAPI (a volume populated by information about the pod)
Items:
metadata.labels -> labels
metadata.annotations -> annotations
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 70s default-scheduler 0/8 nodes are available: 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 5 Insufficient openshift.io/sriovnn. preemption: 0/8 nodes are available: 3 Preemption is not helpful for scheduling, 5 No preemption victims found for incoming pod.
# oc get nns
NAME AGE
master-0 94m
master-1 94m
master-2 94m
openshift-qe-025.lab.eng.rdu2.redhat.com 94m
openshift-qe-029.lab.eng.rdu2.redhat.com 94m
worker-0 93m
worker-1 94m
worker-2 94m
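The FailedScheduling event above says that 5 nodes were rejected for "Insufficient openshift.io/sriovnn", which suggests the recreated VFs may not be advertised as an allocatable resource on the nodes. The following is a minimal sketch for narrowing that down, not part of OCP-63533. The helper names `insufficient_nodes` and `node_allocatable` are hypothetical, and `node_allocatable` assumes a logged-in `oc` session and that the resource name matches the one in the pod spec.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of OCP-63533.

# Read a FailedScheduling message on stdin and print how many nodes the
# scheduler rejected for lack of the given extended resource.
# Prints nothing if the message does not mention that resource.
insufficient_nodes() {
  resource=$1
  sed -n "s|.* \([0-9][0-9]*\) Insufficient ${resource}\..*|\1|p"
}

# Print the allocatable count of an extended resource on one node.
# Assumes `oc` is logged in to the cluster. Dots in the resource name
# are escaped so jsonpath treats the name as a single key.
node_allocatable() {
  node=$1
  escaped=$(printf '%s' "$2" | sed 's/\./\\./g')
  oc get node "$node" -o jsonpath="{.status.allocatable.${escaped}}"
}

# Message copied from the Events section of the pod description above.
msg='0/8 nodes are available: 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 5 Insufficient openshift.io/sriovnn. preemption: 0/8 nodes are available: 3 Preemption is not helpful for scheduling, 5 No preemption victims found for incoming pod.'

printf '%s\n' "$msg" | insufficient_nodes openshift.io/sriovnn   # prints 5
```

On a live cluster, `node_allocatable worker-0 openshift.io/sriovnn` could then be run for each worker. An empty or zero result would be consistent with the scheduler message, but it would not by itself explain why the resource is missing.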