Red Hat OpenStack Services on OpenShift / OSPRH-16922

FFU to 17.1 causes a network outage if it fails before OVNDB recreation

    • openstack-tripleo-heat-templates-14.3.1-17.1.20250704160748.e7c7ce3.el9ost openstack-tripleo-heat-templates-14.3.1-17.1.20250704163743.e7c7ce3.el8ost
    • rhos-ops-day1day2-upgrades
      .Fixes forced restart of ovn-controller on every upgrade run
      Before this update, during an upgrade from RHOSP 16.2 to 17.1, tasks that were reused from an earlier environment forced ovn-controller to restart on every upgrade run. As a result, if the ovn-dbs were already taken down, the ovn-controller restart caused an outage. With this update, ovn-controller no longer restarts after every upgrade run.
    • Bug Fix
    • Done
    • RHOS Upgrades 2025 Sprint 6, RHOS Upgrades 2025 Sprint 7, Pending Compose
    • 3
    • Important

      To Reproduce

      Steps to reproduce the behavior:

      1. Run the overcloud upgrade without CephAnsibleRepo; it fails at the task:
        Fail if ceph-ansible doesn't belong to the specified repo
      2. Check that the OVN DBs were removed by the task:
        Remove OVNDBs from pacemaker
      3. The ovn-controller instances are restarted, which causes a network outage because the database is unavailable:
        2025-05-00T00:00:00.000Z|00033|reconnect|INFO|tcp:10.10.10.22:6642: connection attempt failed (Connection refused)
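
The symptom in step 3 can be spotted by scanning the ovn-controller log for failed connection attempts. The following is a minimal sketch, not part of the original report: it assumes the pipe-separated log layout seen in the line above, handles only IPv4 addresses or host names, and the function name `failed_connections` is hypothetical.

```python
import re

# Layout assumed from the log line in the report:
# <timestamp>|<sequence>|<module>|<level>|<message>
LOG_LINE = re.compile(
    r"^(?P<timestamp>[^|]+)\|(?P<seq>\d+)\|(?P<module>[^|]+)"
    r"\|(?P<level>[^|]+)\|(?P<message>.*)$"
)

# Message text of a failed connection attempt, as it appears in the report.
# Assumption: the host is an IPv4 address or a host name (no bracketed IPv6).
FAILED = re.compile(
    r"^(?P<proto>tcp|ssl):(?P<host>[^:]+):(?P<port>\d+): "
    r"connection attempt failed \((?P<reason>[^)]*)\)$"
)


def failed_connections(lines):
    """Yield (host, port, reason) for each failed connection attempt."""
    for line in lines:
        record = LOG_LINE.match(line.strip())
        if not record or record["module"] != "reconnect":
            continue
        failure = FAILED.match(record["message"])
        if failure:
            yield failure["host"], int(failure["port"]), failure["reason"]


sample = (
    "2025-05-00T00:00:00.000Z|00033|reconnect|INFO|"
    "tcp:10.10.10.22:6642: connection attempt failed (Connection refused)"
)

for host, port, reason in failed_connections([sample]):
    print(f"{host}:{port} unreachable: {reason}")
```

Repeated "Connection refused" entries for port 6642 after a failed upgrade run suggest that the OVN DBs were removed and not yet recreated.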

      Expected behavior

      • The network continues to work if the upgrade fails in a task that is not related to networking.

      Device Info (please complete the following information):

      • OS Version: RHEL9.2
      • Red Hat OpenStack Platform release 17.1.4

      Bug impact

      • Clients were not able to connect to their external IPs.

      Known workaround

      • Fix the issue that made the upgrade fail, then run the overcloud upgrade again.
      • A workaround is still needed for cases where fixing the upgrade issue takes longer; recreating the databases may be enough.
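
Until the upgrade is rerun, it may help to confirm whether the database endpoint is reachable before anything restarts ovn-controller. The following is a rough sketch only, not the fix shipped for this issue: the function name `sb_db_reachable` is hypothetical, and the address and port 6642 are taken from the log line in the reproduction steps.

```python
import socket


def sb_db_reachable(host, port=6642, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Address from the log line in the report; adjust for the environment.
    if sb_db_reachable("10.10.10.22", timeout=1.0):
        print("database endpoint reachable")
    else:
        print("database endpoint unreachable; avoid restarting ovn-controller")
```

A successful TCP connection only shows that something is listening on the port; it does not verify that the database itself is healthy.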

      Additional context

      • The ovn-controller restart can also be triggered by another overcloud upgrade run that fails.

              rhn-engineering-lbezdick Lukas Bezdicka
              rh-ee-cgussobo Conrado Gusso Bozza
              rhos-dfg-upgrades