OCPBUGS-15750

pods always in ContainerCreating on OVN IC cluster


    • Quality / Stability / Reliability
    • Important
    • No
    • 7/6: telco review pending discussion at bug scrub
    • Proposed
    • SDN Sprint 239

      Description of problem:

      During pre-merge testing of two PRs, openshift/cluster-network-operator#1838 and openshift/ovn-kubernetes#1728,
      
      some pods never become ready and report the following error:
      
        Warning  FailedCreatePodSandBox  18m                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_network-metrics-daemon-967zc_openshift-multus_b3f0ad8c-4c71-4310-b890-38ce0a3c53c4_0(a510c4cd22e708ff850a2b1e14af54a82680563e4e4e352197ce492f0fe38f05): error adding pod openshift-multus_network-metrics-daemon-967zc to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [openshift-multus/network-metrics-daemon-967zc/b3f0ad8c-4c71-4310-b890-38ce0a3c53c4:ovn-kubernetes]: error adding container to network "ovn-kubernetes": CNI request failed with status 400: '[openshift-multus/network-metrics-daemon-967zc a510c4cd22e708ff850a2b1e14af54a82680563e4e4e352197ce492f0fe38f05 network default NAD default] [openshift-multus/network-metrics-daemon-967zc a510c4cd22e708ff850a2b1e14af54a82680563e4e4e352197ce492f0fe38f05 network default NAD default] failed to configure pod interface: failed to add pod route 10.128.0.0/14 via 10.130.2.1: file exists
      '
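
The key part of the event above is the trailing `failed to add pod route 10.128.0.0/14 via 10.130.2.1: file exists`. "file exists" is the text Go prints for the `EEXIST` errno, which suggests the route was already present when the CNI tried to add it. The following is a minimal Python sketch for pulling the route details out of such a message when triaging many events; the function name `parse_route_error` and the returned field names are illustrative only and are not part of ovn-kubernetes.

```python
import ipaddress
import re

# Matches the tail of the CNI error shown in the event above.
ROUTE_ERROR = re.compile(
    r"failed to add pod route (?P<dest>\S+) via (?P<gw>\S+): (?P<reason>.+)"
)


def parse_route_error(message):
    """Return route details from a 'failed to add pod route' message, or None."""
    match = ROUTE_ERROR.search(message)
    if match is None:
        return None
    destination = ipaddress.ip_network(match.group("dest"), strict=False)
    gateway = ipaddress.ip_address(match.group("gw"))
    reason = match.group("reason").strip()
    return {
        "destination": destination,
        "gateway": gateway,
        # True when the gateway address lies inside the destination prefix.
        "gateway_in_destination": gateway in destination,
        # "file exists" is how Go renders EEXIST.
        "already_exists": reason == "file exists",
    }


if __name__ == "__main__":
    event = (
        "failed to configure pod interface: "
        "failed to add pod route 10.128.0.0/14 via 10.130.2.1: file exists"
    )
    print(parse_route_error(event))
```

This only classifies the message; it does not explain why the route already existed on the node.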
      
      # oc get pod -A -o wide | grep ContainerCreating
      openshift-dns                                      dns-default-lzltr                                                    0/2     ContainerCreating   2              48m    <none>           openshift-qe-024.lab.eng.rdu2.redhat.com   <none>           <none>
      openshift-dns                                      dns-default-stxxc                                                    0/2     ContainerCreating   2              50m    <none>           openshift-qe-027.lab.eng.rdu2.redhat.com   <none>           <none>
      openshift-ingress-canary                           ingress-canary-tfkwc                                                 0/1     ContainerCreating   1              48m    <none>           openshift-qe-024.lab.eng.rdu2.redhat.com   <none>           <none>
      openshift-ingress-canary                           ingress-canary-zvzkh                                                 0/1     ContainerCreating   1              50m    <none>           openshift-qe-027.lab.eng.rdu2.redhat.com   <none>           <none>
      openshift-multus                                   network-metrics-daemon-c4dlh                                         0/2     ContainerCreating   2              50m    <none>           openshift-qe-027.lab.eng.rdu2.redhat.com   <none>           <none>
      openshift-network-diagnostics                      network-check-target-8khhn                                           0/1     ContainerCreating   1              50m    <none>           openshift-qe-027.lab.eng.rdu2.redhat.com   <none>           <none>
      openshift-network-diagnostics                      network-check-target-ppp47                                           0/1     ContainerCreating   1              49m    <none>           openshift-qe-024.lab.eng.rdu2.redhat.com   <none>           <none>
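
To confirm that the stuck pods are confined to the scaled-up workers, the listing above can be grouped by node. This is a rough Python sketch that assumes the whitespace-separated column layout shown above (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE, IP, NODE, ...) and a plain integer in the RESTARTS column; the helper name `stuck_pods_by_node` is hypothetical.

```python
from collections import defaultdict


def stuck_pods_by_node(listing, status="ContainerCreating"):
    """Group 'namespace/name' of pods in the given status by node name."""
    by_node = defaultdict(list)
    for line in listing.splitlines():
        fields = line.split()
        # Expect at least: namespace, name, ready, status, restarts, age, ip, node.
        if len(fields) < 8 or fields[3] != status:
            continue
        by_node[fields[7]].append(f"{fields[0]}/{fields[1]}")
    return dict(by_node)


if __name__ == "__main__":
    # Two rows copied from the listing above, with column padding collapsed.
    sample = """\
openshift-dns dns-default-lzltr 0/2 ContainerCreating 2 48m <none> openshift-qe-024.lab.eng.rdu2.redhat.com <none> <none>
openshift-dns dns-default-stxxc 0/2 ContainerCreating 2 50m <none> openshift-qe-027.lab.eng.rdu2.redhat.com <none> <none>
"""
    for node, pods in stuck_pods_by_node(sample).items():
        print(node, pods)
```

In the listing above, every affected pod is on either openshift-qe-024 or openshift-qe-027.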
      
      
      

      Version-Release number of selected component (if applicable):

       

      How reproducible:

       

      Steps to Reproduce:

      1. Build a payload image with openshift/cluster-network-operator#1838 and openshift/ovn-kubernetes#1728.
      2. Set up a dual-stack cluster using IPI on bare metal with that payload image.
      3. Scale up the cluster with two bare-metal servers.
      4. Observe that pods on the two scaled-up workers never become ready.

      Actual results:

      Nodes openshift-qe-024.lab.eng.rdu2.redhat.com and openshift-qe-027.lab.eng.rdu2.redhat.com are the scaled-up workers.
      
      # oc get node
      NAME                                       STATUS   ROLES                  AGE    VERSION
      master-0.sriov.openshift-qe.sdn.com        Ready    control-plane,master   124m   v1.27.3+ab0b8ee
      master-1.sriov.openshift-qe.sdn.com        Ready    control-plane,master   128m   v1.27.3+ab0b8ee
      master-2.sriov.openshift-qe.sdn.com        Ready    control-plane,master   124m   v1.27.3+ab0b8ee
      openshift-qe-024.lab.eng.rdu2.redhat.com   Ready    sriov,worker           52m    v1.27.3+ab0b8ee
      openshift-qe-027.lab.eng.rdu2.redhat.com   Ready    sriov,worker           53m    v1.27.3+ab0b8ee
      worker-0.sriov.openshift-qe.sdn.com        Ready    worker                 92m    v1.27.3+ab0b8ee
      worker-1.sriov.openshift-qe.sdn.com        Ready    worker                 91m    v1.27.3+ab0b8ee
      
      ##### Some pods are not ready on those two workers #####
      
      
      
      

      Expected results:

       

      Additional info:

      This looks like a timing issue, possibly a race condition.
      
      If I recreate the pods, they become ready.
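
The workaround of recreating the pods can be sketched as follows. This is an assumption-laden illustration, not a recommended fix: it only builds `oc delete pod <name> -n <namespace>` command lines and relies on each pod's controller to recreate it afterwards. The helper name `delete_commands` is hypothetical, and nothing is executed.

```python
import shlex


def delete_commands(pods):
    """Build one 'oc delete pod' argv list per (namespace, name) pair."""
    return [["oc", "delete", "pod", name, "-n", namespace] for namespace, name in pods]


if __name__ == "__main__":
    # Two of the stuck pods from the listing in the description.
    stuck = [
        ("openshift-dns", "dns-default-lzltr"),
        ("openshift-multus", "network-metrics-daemon-c4dlh"),
    ]
    for argv in delete_commands(stuck):
        # Print for review instead of running against a cluster.
        print(shlex.join(argv))
```

Printing the commands first keeps the step reviewable before anything is deleted.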

              ffernand@redhat.com Flavio Fernandes (Inactive)
              zzhao1@redhat.com Zhanqi Zhao
              Anurag Saxena Anurag Saxena