  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-17501

ovs-ovn migration failure leads to connectivity break. All VMs are unreachable


    • Bug
    • Resolution: Unresolved
    • Major
    • rhos-17.1.9
    • None
    • openstack-neutron
    • None
    • 2
    • False
    • None
    • False
    • ?
    • openstack-neutron-1:18.6.1-17.1.20250627040747.85ff760.el9ost
    • None
    • Critical

      The customer lost connectivity to all VMs. The neutron-server logs suggest that the connection to ovsdb was lost, because nothing is listening on port 6642. Expected: ovsdb listening on port 6642.

      +++
      2025-06-15 04:00:21.123 7 INFO neutron.plugins.ml2.managers [req-d8f1bc4e-77fe-42e5-aadf-362856e7f83f - - - - -] Initializing mechanism driver 'ovn'
      2025-06-15 04:00:21.123 7 INFO neutron.plugins.ml2.drivers.ovn.mech_driver.mech_driver [req-d8f1bc4e-77fe-42e5-aadf-362856e7f83f - - - - -] Starting OVNMechanismDriver
      2025-06-15 04:00:21.125 7 ERROR ovsdbapp.backend.ovs_idl.idlutils [req-d8f1bc4e-77fe-42e5-aadf-362856e7f83f - - - - -] Unable to open stream to tcp:192.168.20.23:6642 to retrieve schema: Connection refused
      2025-06-15 04:00:21.126 7 ERROR ovsdbapp.backend.ovs_idl.idlutils [req-d8f1bc4e-77fe-42e5-aadf-362856e7f83f - - - - -] Unable to open stream to tcp:192.168.20.14:6642 to retrieve schema: Connection refused
      2025-06-15 04:00:21.127 7 ERROR ovsdbapp.backend.ovs_idl.idlutils [req-d8f1bc4e-77fe-42e5-aadf-362856e7f83f - - - - -] Unable to open stream to tcp:192.168.20.24:6642 to retrieve schema: Connection refused
      2025-06-15 04:00:21.127 7 ERROR neutron.service [req-d8f1bc4e-77fe-42e5-aadf-362856e7f83f - - - - -] Unrecoverable error: please check log for details.: Exception: Could not retrieve schema from tcp:192.168.20.23:6642,tcp:192.168.20.14:6642,tcp:192.168.20.24:6642
      2025-06-15 04:00:21.127 7 ERROR neutron.service Traceback (most recent call last):
      2025-06-15 04:00:21.127 7 ERROR neutron.service File "/usr/lib/python3.9/site-packages/neutron/service.py", line 88, in serve_wsgi
      2025-06-15 04:00:21.127 7 ERROR neutron.service service.start()

      grep 6642 0100-sosreport-oscar22ctr001-04170571-2025-06-15-gjfllfj.tar.xz/sosreport-oscar22ctr001-04170571-2025-06-15-gjfllfj/sos_commands/networking/netstat_-W_-neopa
      +++
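
The "Connection refused" errors above can be cross-checked from any host with access to the same network. The following is a minimal sketch, not part of the original report: it splits the connection string from the neutron-server error into endpoints and attempts a plain TCP connect to each. The function names `parse_ovsdb_connection`, `is_listening`, and `report` are hypothetical, and the parser assumes IPv4 `tcp:HOST:PORT` entries only.

```python
import socket


def parse_ovsdb_connection(connection):
    """Split 'tcp:HOST:PORT,tcp:HOST:PORT' into a list of (host, port) tuples.

    Assumption: IPv4 addresses and the 'tcp' scheme only, as in the log above.
    """
    endpoints = []
    for item in connection.split(","):
        scheme, host, port = item.strip().split(":")
        if scheme != "tcp":
            raise ValueError(f"unsupported scheme in {item!r}")
        endpoints.append((host, int(port)))
    return endpoints


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, no route to host, and timeouts.
        return False


def report(connection, timeout=2.0):
    """Print one status line per endpoint in the connection string."""
    for host, port in parse_ovsdb_connection(connection):
        state = "listening" if is_listening(host, port, timeout) else "not reachable"
        print(f"{host}:{port} {state}")
```

Calling `report("tcp:192.168.20.23:6642,tcp:192.168.20.14:6642,tcp:192.168.20.24:6642")` prints one line per endpoint. A successful connect only shows that something accepts TCP connections on the port; it does not confirm that the ovsdb schema can be retrieved.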

      As far as the ovn-migration is concerned, it failed because one of the nodes is faulty.

      +++

      2025-06-15 02:57:42,878 p=60025 u=stack n=ansible | fatal: [localhost]: FAILED! => {"changed": true, "cmd": "bash -eo pipefail /home/stack/common_templates/overcloud-deploy-ovn.sh 2>&1 > /home/stack/common_templates/overcloud-deploy-ovn.sh.log\n", "delta": "1:06:33.320598", "end": "2025-06-15 02:57:42.838850", "msg": "non-zero return code", "rc": 1, "start": "2025-06-15 01:51:09.518252", "stderr": "", "stderr_lines": [], "stdout": "/usr/lib/python3.9/site-packages/heatclient/common/template_utils.py:206: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working\n if isinstance(v, collections.Mapping):\nHost oscar22.tc.corp not found in /home/stack/.ssh/known_hosts", "stdout_lines": ["/usr/lib/python3.9/site-packages/heatclient/common/template_utils.py:206: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working", " if isinstance(v, collections.Mapping):", "Host oscar22.tc.corp not found in /home/stack/.ssh/known_hosts"]}

      2025-06-15 04:14:02.414462 | 9440c987-a400-6181-5218-000000000034 | OK | Wait for connection to become available | 192.168.2.8
      2025-06-15 04:14:02.415116 | 9440c987-a400-6181-5218-000000000034 | TIMING | Wait for connection to become available | 192.168.2.8 | 0:00:48.298476 | 32.59s
      2025-06-15 04:14:03.935874 | 9440c987-a400-6181-5218-000000000034 | OK | Wait for connection to become available | 192.168.1.157
      2025-06-15 04:14:03.936580 | 9440c987-a400-6181-5218-000000000034 | TIMING | Wait for connection to become available | 192.168.1.157 | 0:00:49.819938 | 32.27s
      [WARNING]: Unhandled error in Python interpreter discovery for host 192.168.0.207: Failed to connect to the host via ssh: ssh: connect to host 192.168.0.207 port 22: No route to host
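
The warning above names the one host that could not be reached over SSH. To find such hosts in a longer saved deploy log, a small scan along the following lines may help. This is a hedged sketch rather than a tool used in the report: the function names are hypothetical, and it assumes the log may contain ANSI color codes (either as real escape bytes or as the literal text `ESC`) and that a warning may be wrapped across lines.

```python
import re

# Matches ANSI color sequences, written either with a real escape byte
# or with the literal text "ESC" as seen in some pasted logs.
ANSI_ESCAPE = re.compile(r"(?:\x1b|ESC)\[[0-9;]*m")

# Matches the ssh failure text and captures the IPv4 address and port.
NO_ROUTE = re.compile(
    r"connect to host\s+(\d{1,3}(?:\.\d{1,3}){3}) port (\d+): No route to host"
)


def strip_ansi(text):
    """Remove ANSI color sequences from text."""
    return ANSI_ESCAPE.sub("", text)


def unreachable_hosts(log_text):
    """Return the distinct hosts reported as 'No route to host', in order."""
    # Collapse all whitespace so that warnings wrapped across lines
    # are matched as a single sentence.
    flattened = " ".join(strip_ansi(log_text).split())
    hosts = []
    for match in NO_ROUTE.finditer(flattened):
        host = match.group(1)
        if host not in hosts:
            hosts.append(host)
    return hosts
```

Passing the full log text to `unreachable_hosts` returns a list of addresses; hosts that failed for other reasons (for example authentication errors) are not covered by this pattern.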

              ykarel@redhat.com Yatin Karel
              rhn-support-ravsingh Ravi Singh
              Renjing Xiao Renjing Xiao
              rhos-dfg-networking-squad-neutron