OpenShift Bugs / OCPBUGS-36486

Pod stuck in creating state


    • Critical
    • No
    • SDN Sprint 256
    • 1
    • False
    • Release Note Not Required
    • In Progress

      This is a clone of issue OCPBUGS-33005. The following is the description of the original issue:

      Description of problem:

          Pod stuck in creating state when running performance benchmark
      
      The exact error when describing the pod:
      Events:
        Type     Reason                  Age                    From     Message
        ----     ------                  ----                   ----     -------
        Warning  FailedCreatePodSandBox  45s (x114 over 3h47m)  kubelet  (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_client-1-5c978b7665-n4tds_cluster-density-v2-35_f57d8281-5a79-4c91-9b83-bb3e4b553597_0(5a8d6897ca792d91f1c52054f5f8c596530fbf72d3abb07b19a20fd9c95cc564): error adding pod cluster-density-v2-35_client-1-5c978b7665-n4tds to CNI network "multus-cni-network": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: '&\{ContainerID:5a8d6897ca792d91f1c52054f5f8c596530fbf72d3abb07b19a20fd9c95cc564 Netns:/var/run/netns/e06c9af7-c13d-426f-9a00-73c54441a20b IfName:eth0 Args:IgnoreUnknown=1;K8S_POD_NAMESPACE=cluster-density-v2-35;K8S_POD_NAME=client-1-5c978b7665-n4tds;K8S_POD_INFRA_CONTAINER_ID=5a8d6897ca792d91f1c52054f5f8c596530fbf72d3abb07b19a20fd9c95cc564;K8S_POD_UID=f57d8281-5a79-4c91-9b83-bb3e4b553597 Path: StdinData:[123 34 98 105 110 68 105 114 34 58 34 47 118 97 114 47 108 105 98 47 99 110 105 47 98 105 110 34 44 34 99 104 114 111 111 116 68 105 114 34 58 34 47 104 111 115 116 114 111 111 116 34 44 34 99 108 117 115 116 101 114 78 101 116 119 111 114 107 34 58 34 47 104 111 115 116 47 114 117 110 47 109 117 108 116 117 115 47 99 110 105 47 110 101 116 46 100 47 49 48 45 111 118 110 45 107 117 98 101 114 110 101 116 101 115 46 99 111 110 102 34 44 34 99 110 105 67 111 110 102 105 103 68 105 114 34 58 34 47 104 111 115 116 47 101 116 99 47 99 110 105 47 110 101 116 46 100 34 44 34 99 110 105 86 101 114 115 105 111 110 34 58 34 48 46 51 46 49 34 44 34 100 97 101 109 111 110 83 111 99 107 101 116 68 105 114 34 58 34 47 114 117 110 47 109 117 108 116 117 115 47 115 111 99 107 101 116 34 44 34 103 108 111 98 97 108 78 97 109 101 115 112 97 99 101 115 34 58 34 100 101 102 97 117 108 116 44 111 112 101 110 115 104 105 102 116 45 109 117 108 116 117 115 44 111 112 101 110 115 104 105 102 116 45 115 114 105 111 
118 45 110 101 116 119 111 114 107 45 111 112 101 114 97 116 111 114 34 44 34 108 111 103 76 101 118 101 108 34 58 34 118 101 114 98 111 115 101 34 44 34 108 111 103 84 111 83 116 100 101 114 114 34 58 116 114 117 101 44 34 109 117 108 116 117 115 65 117 116 111 99 111 110 102 105 103 68 105 114 34 58 34 47 104 111 115 116 47 114 117 110 47 109 117 108 116 117 115 47 99 110 105 47 110 101 116 46 100 34 44 34 109 117 108 116 117 115 67 111 110 102 105 103 70 105 108 101 34 58 34 97 117 116 111 34 44 34 110 97 109 101 34 58 34 109 117 108 116 117 115 45 99 110 105 45 110 101 116 119 111 114 107 34 44 34 110 97 109 101 115 112 97 99 101 73 115 111 108 97 116 105 111 110 34 58 116 114 117 101 44 34 112 101 114 78 111 100 101 67 101 114 116 105 102 105 99 97 116 101 34 58 123 34 98 111 111 116 115 116 114 97 112 75 117 98 101 99 111 110 102 105 103 34 58 34 47 118 97 114 47 108 105 98 47 107 117 98 101 108 101 116 47 107 117 98 101 99 111 110 102 105 103 34 44 34 99 101 114 116 68 105 114 34 58 34 47 101 116 99 47 99 110 105 47 109 117 108 116 117 115 47 99 101 114 116 115 34 44 34 99 101 114 116 68 117 114 97 116 105 111 110 34 58 34 50 52 104 34 44 34 101 110 97 98 108 101 100 34 58 116 114 117 101 125 44 34 115 111 99 107 101 116 68 105 114 34 58 34 47 104 111 115 116 47 114 117 110 47 109 117 108 116 117 115 47 115 111 99 107 101 116 34 44 34 116 121 112 101 34 58 34 109 117 108 116 117 115 45 115 104 105 109 34 125]} ContainerID:"5a8d6897ca792d91f1c52054f5f8c596530fbf72d3abb07b19a20fd9c95cc564" Netns:"/var/run/netns/e06c9af7-c13d-426f-9a00-73c54441a20b" IfName:"eth0" Args:"IgnoreUnknown=1;K8S_POD_NAMESPACE=cluster-density-v2-35;K8S_POD_NAME=client-1-5c978b7665-n4tds;K8S_POD_INFRA_CONTAINER_ID=5a8d6897ca792d91f1c52054f5f8c596530fbf72d3abb07b19a20fd9c95cc564;K8S_POD_UID=f57d8281-5a79-4c91-9b83-bb3e4b553597" Path:"" ERRORED: error configuring pod [cluster-density-v2-35/client-1-5c978b7665-n4tds] networking: 
[cluster-density-v2-35/client-1-5c978b7665-n4tds/f57d8281-5a79-4c91-9b83-bb3e4b553597:ovn-kubernetes]: error adding container to network "ovn-kubernetes": CNI request failed with status 400: '[cluster-density-v2-35/client-1-5c978b7665-n4tds 5a8d6897ca792d91f1c52054f5f8c596530fbf72d3abb07b19a20fd9c95cc564 network default NAD default] [cluster-density-v2-35/client-1-5c978b7665-n4tds 5a8d6897ca792d91f1c52054f5f8c596530fbf72d3abb07b19a20fd9c95cc564 network default NAD default] failed to configure pod interface: timed out waiting for OVS port binding (ovn-installed) for 0a:58:0a:83:03:f6 [10.131.3.246/23]
      '
      ': StdinData: {"binDir":"/var/lib/cni/bin","clusterNetwork":"/host/run/multus/cni/net.d/10-ovn-kubernetes.conf","cniVersion":"0.3.1","daemonSocketDir":"/run/multus/socket","globalNamespaces":"default,openshift-multus,openshift-sriov-network-operator","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","namespaceIsolation":true,"type":"multus-shim"}
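
The first StdinData field in the event above is printed as a Go byte slice in decimal, which hides the CNI configuration that was passed to the plugin. The following is a minimal sketch for turning such a slice back into readable JSON. The helper name `decode_go_byte_slice` is hypothetical, and the sample holds only the first bytes of the slice from the event, with a closing brace (125) appended so that the fragment parses on its own.

```python
import json


def decode_go_byte_slice(text: str) -> str:
    """Decode a Go-style decimal byte slice such as "[123 34 125]" into text."""
    numbers = text.strip().strip("[]").split()
    return bytes(int(n) for n in numbers).decode("utf-8")


# Leading bytes of the StdinData slice from the event, truncated for brevity.
# The trailing 125 ("}") is NOT at this position in the original slice; it is
# added here only so the shortened sample is valid JSON.
sample = (
    "[123 34 98 105 110 68 105 114 34 58 34 47 118 97 114 47 108 105 98 "
    "47 99 110 105 47 98 105 110 34 125]"
)

decoded = decode_go_byte_slice(sample)
config = json.loads(decoded)
print(json.dumps(config, indent=2))
```

Pasting the full slice from the event in place of `sample` should yield the complete multus-shim configuration, provided the slice is copied without the line break that the page inserted in the middle of it.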
      

      Version-Release number of selected component (if applicable):

          4.16.0-ec.5

      How reproducible:

          50-60%
      
      It seems to be related to the number of times I have run our test on a single cluster. Many of our performance tests run on ephemeral clusters: we build the cluster, run the test, and tear it down. Currently I have a long-lived cluster (1 week old), and I have been running many performance tests against it serially. After each test, the previous resources are cleaned up.

      Steps to Reproduce:

          1. Use the following command line as an example.
          2. ./bin/amd64/kube-burner-ocp cluster-density-v2 --iterations 90
          3. Repeat until the issue arises (usually after 3-4 attempts).

      Actual results:

          client-1-5c978b7665-n4tds    0/1     ContainerCreating   0          4h14m
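
To spot this condition between benchmark runs, the pod listing can be scanned for pods that have stayed in ContainerCreating for a long time. The sketch below assumes the default column layout of `oc get pods` (NAME, READY, STATUS, RESTARTS, AGE) as seen in the line above. The function names and the 10-minute threshold are hypothetical choices for illustration, not part of any tool mentioned in this report.

```python
import re
from typing import List

# Seconds per unit used in the AGE column (for example "4h14m" or "45s").
AGE_UNITS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def age_to_seconds(age: str) -> int:
    """Convert an AGE value such as "4h14m" into seconds."""
    return sum(int(n) * AGE_UNITS[u] for n, u in re.findall(r"(\d+)([dhms])", age))


def stuck_pods(get_pods_output: str, min_age_seconds: int = 600) -> List[str]:
    """Return names of pods in ContainerCreating for at least min_age_seconds."""
    stuck = []
    for line in get_pods_output.splitlines():
        fields = line.split()
        # Skip blank lines, short lines, and the header row.
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        # AGE is always the last column, even when RESTARTS contains spaces.
        name, status, age = fields[0], fields[2], fields[-1]
        if status == "ContainerCreating" and age_to_seconds(age) >= min_age_seconds:
            stuck.append(name)
    return stuck


listing = """\
NAME                         READY   STATUS              RESTARTS   AGE
client-1-5c978b7665-n4tds    0/1     ContainerCreating   0          4h14m
"""
print(stuck_pods(listing))
```

In practice the listing would come from running `oc get pods -n <namespace>` against the cluster under test; here it is inlined so the sketch runs without a cluster.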
      

      Expected results:

          The benchmark should not get stuck waiting for this pod.

      Additional info:

          Looking at the ovnkube-controller pod logs, grepping for the pod that was stuck:
      
      oc logs -n openshift-ovn-kubernetes ovnkube-node-qpkws -c ovnkube-controller | grep client-1-5c978b7665-n4tds
      
      W0425 13:12:09.302395    6996 base_network_controller_policy.go:545] Failed to get get LSP for pod cluster-density-v2-35/client-1-5c978b7665-n4tds NAD default for networkPolicy allow-from-openshift-ingress, err: logical port cluster-density-v2-35/client-1-5c978b7665-n4tds for pod cluster-density-v2-35_client-1-5c978b7665-n4tds not found in cache
      I0425 13:12:09.302412    6996 obj_retry.go:370] Retry add failed for *factory.localPodSelector cluster-density-v2-35/client-1-5c978b7665-n4tds, will try again later: unable to get port info for pod cluster-density-v2-35/client-1-5c978b7665-n4tds NAD default
      W0425 13:12:09.908446    6996 helper_linux.go:481] [cluster-density-v2-35/client-1-5c978b7665-n4tds 7f80514901cbc57517d263f1a5aa143d2c82f470132c01f8ba813c18f3160ee4] pod uid f57d8281-5a79-4c91-9b83-bb3e4b553597: timed out waiting for OVS port binding (ovn-installed) for 0a:58:0a:83:03:f6 [10.131.3.246/23]
      I0425 13:12:09.963651    6996 cni.go:279] [cluster-density-v2-35/client-1-5c978b7665-n4tds 7f80514901cbc57517d263f1a5aa143d2c82f470132c01f8ba813c18f3160ee4 network default NAD default] ADD finished CNI request [cluster-density-v2-35/client-1-5c978b7665-n4tds 7f80514901cbc57517d263f1a5aa143d2c82f470132c01f8ba813c18f3160ee4 network default NAD default], result "", err failed to configure pod interface: timed out waiting for OVS port binding (ovn-installed) for 0a:58:0a:83:03:f6 [10.131.3.246/23]
      I0425 13:12:09.988397    6996 cni.go:258] [cluster-density-v2-35/client-1-5c978b7665-n4tds 7f80514901cbc57517d263f1a5aa143d2c82f470132c01f8ba813c18f3160ee4 network default NAD default] DEL starting CNI request [cluster-density-v2-35/client-1-5c978b7665-n4tds 7f80514901cbc57517d263f1a5aa143d2c82f470132c01f8ba813c18f3160ee4 network default NAD default]
      W0425 13:12:09.996899    6996 helper_linux.go:697] Failed to delete pod "cluster-density-v2-35/client-1-5c978b7665-n4tds" interface 7f80514901cbc57: failed to lookup link 7f80514901cbc57: Link not found
      I0425 13:12:10.009234    6996 cni.go:279] [cluster-density-v2-35/client-1-5c978b7665-n4tds 7f80514901cbc57517d263f1a5aa143d2c82f470132c01f8ba813c18f3160ee4 network default NAD default] DEL finished CNI request [cluster-density-v2-35/client-1-5c978b7665-n4tds 7f80514901cbc57517d263f1a5aa143d2c82f470132c01f8ba813c18f3160ee4 network default NAD default], result "{\"dns\":{}}", err <nil>
      I0425 13:12:10.059917    6996 cni.go:258] [cluster-density-v2-35/client-1-5c978b7665-n4tds 7f80514901cbc57517d263f1a5aa143d2c82f470132c01f8ba813c18f3160ee4 network default NAD default] DEL starting CNI request [cluster-density-v2-35/client-1-5c978b7665-n4tds 7f80514901cbc57517d263f1a5aa143d2c82f470132c01f8ba813c18f3160ee4 network default NAD default]
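
When the same check has to be repeated across many nodes or runs, the port binding timeouts can be pulled out of the ovnkube-controller log programmatically. This is a rough sketch that assumes the message format of the `helper_linux.go` warning shown above; other ovn-kubernetes versions may word it differently. `PortBindingTimeout` and `parse_timeout` are hypothetical names introduced only for this example.

```python
import re
from typing import NamedTuple, Optional


class PortBindingTimeout(NamedTuple):
    timestamp: str  # klog timestamp, e.g. "0425 13:12:09.908446"
    pod: str        # namespace/name of the affected pod
    mac: str        # MAC address assigned to the pod interface
    ip: str         # pod IP with prefix length


# Matches the warning format seen in the excerpt above (an assumption about
# the log layout, not a documented interface).
PATTERN = re.compile(
    r"^[IWEF](?P<ts>\d{4} \d{2}:\d{2}:\d{2}\.\d+)\s+\d+ \S+\] "
    r"\[(?P<pod>\S+) \S+\] pod uid \S+: "
    r"timed out waiting for OVS port binding \(ovn-installed\) "
    r"for (?P<mac>[0-9a-f:]{17}) \[(?P<ip>[^\]]+)\]"
)


def parse_timeout(line: str) -> Optional[PortBindingTimeout]:
    """Return the timeout details from one log line, or None if it is another message."""
    match = PATTERN.match(line.strip())
    if match is None:
        return None
    return PortBindingTimeout(match["ts"], match["pod"], match["mac"], match["ip"])


line = (
    "W0425 13:12:09.908446    6996 helper_linux.go:481] "
    "[cluster-density-v2-35/client-1-5c978b7665-n4tds "
    "7f80514901cbc57517d263f1a5aa143d2c82f470132c01f8ba813c18f3160ee4] "
    "pod uid f57d8281-5a79-4c91-9b83-bb3e4b553597: timed out waiting for "
    "OVS port binding (ovn-installed) for 0a:58:0a:83:03:f6 [10.131.3.246/23]"
)
print(parse_timeout(line))
```

Feeding each line of the `oc logs` output through `parse_timeout` and counting results per pod gives a quick view of how often the binding wait expires during a run.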
      
      
      
      

              jcaamano@redhat.com Jaime Caamaño Ruiz
              openshift-crt-jira-prow OpenShift Prow Bot
              Anurag Saxena Anurag Saxena