Red Hat OpenStack Services on OpenShift: OSPRH-2912

BZ#2218465 Change default neutron gateway port scheduler behavior to deny scheduling to an empty AZ



      +++ This bug was initially created as a clone of Bug #2209092 +++

      +++ This bug was initially created as a clone of Bug #2209090 +++

      This bz is to follow up on the action items listed in https://bugzilla.redhat.com/show_bug.cgi?id=2195898#c9

      Copying here for reference.

      ===

      The issue with the router ports is related to the AZs. AZs are used in Neutron to define a set of resources for high availability. If a router is created within an AZ (or several AZs), the L3 OVN scheduler ("OVNGatewayLeastLoadedScheduler" by default) uses this AZ information to select only the GW chassis with a matching AZ tag [1].

      In ML2/OVS, if there is no L3 agent in the requested AZ, the router port is not scheduled. ML2/OVN applies the same AZ filtering, but the scheduler has a "default behavior" that causes this issue: if no chassis candidates are provided, it uses all of them. In other words, if no GW chassis belongs to the requested AZs, the L3 OVN scheduler lists all chassis (GW and non-GW) and schedules the LRP randomly on chassis from this list. This is why, in this customer case, you see the following [2]:

      • No chassis candidates for GLS AZ.
      • The LRP is scheduled in 5 chassis.
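The fallback described above can be illustrated with a minimal sketch. This is not the Neutron implementation: `Chassis`, `select_candidates`, `schedule_lrp_legacy` and the `max_chassis=5` limit are hypothetical names and values chosen to mirror the report (the limit is taken from the "5 chassis" observation), and the load-based ranking of the real scheduler is omitted.

```python
import random
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Chassis:
    # Hypothetical stand-in for an OVN chassis record.
    name: str
    is_gateway: bool
    azs: frozenset = field(default_factory=frozenset)


def select_candidates(all_chassis, router_azs):
    """Return the GW chassis whose AZ tags match the router AZs."""
    gateways = [c for c in all_chassis if c.is_gateway]
    if not router_azs:
        return gateways
    wanted = set(router_azs)
    return [c for c in gateways if c.azs & wanted]


def schedule_lrp_legacy(all_chassis, router_azs, max_chassis=5, rng=random):
    """Sketch of the reported behaviour, including the problematic fallback."""
    candidates = select_candidates(all_chassis, router_azs)
    if candidates:
        # The real scheduler ranks candidates by load; that step is omitted here.
        return candidates[:max_chassis]
    # Fallback from the report: with no candidates, every chassis
    # (GW or not) becomes eligible and the LRP is placed randomly.
    fallback = list(all_chassis)
    return rng.sample(fallback, min(max_chassis, len(fallback)))
```

With a router requesting an AZ that no GW chassis carries, `select_candidates` returns an empty list and `schedule_lrp_legacy` ends up picking from the whole fleet, compute nodes included, which matches the two symptoms listed above.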

      Actions:

      • For current deployments (OSP16 and OSP17): keep the current behaviour but print a warning message in the server logs and the CLI. This warning message will inform the user that the LRP has been scheduled outside the router AZs, using any existing chassis in the environment (GW or not).
      • For newer versions (OSP18+): prevent this behaviour (and document it). First, revert the current "default behaviour": if no GW chassis is available, do not schedule the LRP. This applies to calls with and without an AZ filter. In addition, because LRP scheduling is executed during the API call, raise an exception in the server and the CLI.
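The two action items can be contrasted in a short sketch. It is only an illustration under stated assumptions: `schedule_lrp`, the `strict` flag and `NoGatewayChassisAvailable` are hypothetical names, not Neutron APIs, and the report ties the behaviour to the product version rather than to a flag.

```python
import logging

LOG = logging.getLogger(__name__)


class NoGatewayChassisAvailable(Exception):
    """Hypothetical error surfaced to the API caller and the CLI."""


def schedule_lrp(gw_candidates, all_chassis, strict, max_chassis=5):
    """Pick chassis for an LRP.

    strict=False sketches the OSP16/OSP17 action: keep the fallback, but warn.
    strict=True sketches the OSP18+ action: refuse to schedule and raise.
    """
    if gw_candidates:
        return list(gw_candidates)[:max_chassis]
    if strict:
        # Scheduling runs during the API call, so the error reaches the user.
        raise NoGatewayChassisAvailable(
            "No GW chassis available for the requested AZs; LRP not scheduled"
        )
    LOG.warning(
        "No GW chassis match the router AZs; scheduling the LRP on any "
        "existing chassis (GW or not), outside the router AZs"
    )
    return list(all_chassis)[:max_chassis]
```

Calling `schedule_lrp([], ["compute-0"], strict=False)` returns `["compute-0"]` after logging the warning, while the same call with `strict=True` raises instead of placing the port outside the requested AZs.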

      — Additional comment from RHEL Program Management on 2023-05-22 15:19:20 UTC —

      This bugzilla has been removed from the release since it does not have an acked release flag. For details, see https://mojo.redhat.com/docs/DOC-1144661#jive_content_id_OSP_Release_Planning.

      — Additional comment from Ihar Hrachyshka on 2023-05-22 15:52:29 UTC —

      FYI the OSP 18 Jira ticket can be found here: https://issues.redhat.com/browse/OSP-25256

      — Additional comment from RHEL Program Management on 2023-05-29 14:41:01 UTC —

      This bugzilla has been removed from the release since it does not have an acked release flag. For details, see https://mojo.redhat.com/docs/DOC-1144661#jive_content_id_OSP_Release_Planning.

      — Additional comment from RHEL Program Management on 2023-05-29 14:47:01 UTC —

      This item has been properly Triaged and planned for the release, and Target Release is now set to match the release flag.

              rodolfo_alonso Rodolfo Alonso
              jira-bugzilla-migration RH Bugzilla Integration
              rhos-dfg-networking-squad-neutron