OpenShift Bugs / OCPBUGS-7062

OVN-k pods occasionally fail to start


    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • Affects Version: 4.13.0
    • Fix Version: 4.13.0
    • Quality / Stability / Reliability

      Description of problem:

      MicroShift fails to start because the OVN-k pods do not come up.

      Version-Release number of selected component (if applicable):

      $ microshift version
      MicroShift Version: 4.13.0_0.nightly_2023_01_27_165107_20230205163228_c6e90108
      Base OCP Version: 4.13.0-0.nightly-2023-01-27-165107

      How reproducible:

      Occasionally

      Steps to Reproduce:

      echo 1 | cleanup-all-microshift-data
      sudo systemctl start microshift
      watch sudo $(which oc) --kubeconfig /var/lib/microshift/resources/kubeadmin/kubeconfig get pods -A

      Actual results:

      $ sudo $(which oc) --kubeconfig /var/lib/microshift/resources/kubeadmin/kubeconfig get pods -A
      NAMESPACE                  NAME                                  READY   STATUS    RESTARTS      AGE
      openshift-dns              node-resolver-67628                   1/1     Running   0             33m
      openshift-ingress          router-default-5d4d6d9cf9-fpf2z       0/1     Pending   0             33m
      openshift-ovn-kubernetes   ovnkube-master-xm6hp                  4/5     Running   1 (32m ago)   32m
      openshift-service-ca       service-ca-66dc44968-xx5jn            0/1     Pending   0             33m
      openshift-storage          topolvm-controller-78c647d9c9-js9mm   0/4     Pending   0             33m

      Expected results:

      All OVN pods to start normally

      Additional info:

      Full ovnkube-master logs attached.
      
      $ ls -l /var/run/ovn
      total 16
      srwxr-x---. 1 root root 0 Feb  6 07:47 ovn-controller.60446.ctl
      srwxr-x---. 1 root root 0 Feb  6 07:47 ovn-controller.61066.ctl
      -rw-r--r--. 1 root root 6 Feb  6 07:47 ovn-controller.pid
      srwxr-x---. 1 root root 0 Feb  6 07:47 ovnnb_db.ctl
      -rw-r--r--. 1 root root 6 Feb  6 07:47 ovnnb_db.pid
      srwxr-x---. 1 root root 0 Feb  6 07:47 ovnnb_db.sock
      srwxr-x---. 1 root root 0 Feb  6 07:47 ovn-northd.60542.ctl
      -rw-r--r--. 1 root root 6 Feb  6 07:47 ovn-northd.pid
      srwxr-x---. 1 root root 0 Feb  6 07:47 ovnsb_db.ctl
      -rw-r--r--. 1 root root 6 Feb  6 07:47 ovnsb_db.pid
      srwxr-x---. 1 root root 0 Feb  6 07:47 ovnsb_db.sock
      
      $ oc logs -n openshift-ovn-kubernetes -f  ovnkube-master-xm6hp -c ovn-controller -p
      2023-02-06T07:47:23Z|00001|vlog|INFO|opened log file /var/log/ovn/acl-audit-log.log
      2023-02-06T07:47:23.658Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
      2023-02-06T07:47:23.658Z|00003|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
      2023-02-06T07:47:23.676Z|00004|main|INFO|OVN internal version is : [22.12.1-20.27.0-70.6]
      2023-02-06T07:47:23.676Z|00005|main|INFO|OVS IDL reconnected, force recompute.
      2023-02-06T07:47:23.676Z|00006|reconnect|INFO|unix:/var/run/ovn/ovnsb_db.sock: connecting...
      2023-02-06T07:47:23.676Z|00007|reconnect|INFO|unix:/var/run/ovn/ovnsb_db.sock: connection attempt failed (No such file or directory)
      2023-02-06T07:47:23.676Z|00008|main|INFO|OVNSB IDL reconnected, force recompute.
      2023-02-06T07:47:24.678Z|00009|reconnect|INFO|unix:/var/run/ovn/ovnsb_db.sock: connecting...
      2023-02-06T07:47:24.678Z|00010|reconnect|INFO|unix:/var/run/ovn/ovnsb_db.sock: connection attempt failed (No such file or directory)
      2023-02-06T07:47:24.678Z|00011|reconnect|INFO|unix:/var/run/ovn/ovnsb_db.sock: waiting 2 seconds before reconnect
      2023-02-06T07:47:26.679Z|00012|reconnect|INFO|unix:/var/run/ovn/ovnsb_db.sock: connecting...
      2023-02-06T07:47:26.680Z|00013|reconnect|INFO|unix:/var/run/ovn/ovnsb_db.sock: connected
      2023-02-06T07:47:26.695Z|00014|features|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting to switch
      2023-02-06T07:47:26.695Z|00015|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
      2023-02-06T07:47:26.703Z|00016|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
      2023-02-06T07:47:26.703Z|00017|features|INFO|OVS Feature: ct_zero_snat, state: supported
      2023-02-06T07:47:26.703Z|00018|main|INFO|OVS feature set changed, force recompute.
      2023-02-06T07:47:26.703Z|00019|ofctrl|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting to switch
      2023-02-06T07:47:26.703Z|00020|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
      2023-02-06T07:47:26.703Z|00021|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected

              Assignee: Zenghui Shi (zshi@redhat.com)
              Reporter: Gregory Giguashvili (ggiguash@redhat.com)
              QA Contact: John George