  1. RHOS Request for Features
  2. RHOSRFE-231

[Octavia/OVN] Preserve FIP reachability across VRRP failover

    • Feature Request
    • Resolution: Unresolved
    • Major
    • Neutron, Octavia LBaaS
    • None
    • Moderate
    • False
    • False

      Feature Request Overview 

      In RHOSO environments that use Octavia Amphora (VRRP) with Neutron ML2/OVN, enable_distributed_floating_ip = True, and ovn-bgp-agent, a VRRP failover between amphorae can leave the Floating IP (FIP) associated with the VIP unreachable even though internal VIP connectivity remains intact. Root cause: Octavia intentionally creates the VIP “reservation” port with admin_state_up = False. When the OVN Logical_Switch_Port reflects this as enabled = false, Neutron’s OVN driver drops neutron:host_id on LSP update, so ovn-bgp-agent can no longer locate the chassis on which to (re-)announce the FIP after failover. This is tracked upstream as Launchpad Bug #2125573 and has been acknowledged by Octavia and Neutron maintainers. We request that Red Hat (a) deliver a supported fix path (a backport or a config-gated change) and (b) document a supported workaround for RHOSO.
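The failure mode described above can be illustrated with a small toy model. This is only a sketch for discussion, not Neutron or ovn-bgp-agent code: the dict layout, the function names, the `keep_host_id` switch, and the host name `compute-0` are all hypothetical. Only the `neutron:host_id` key and the `enabled` flag come from the description above.

```python
from typing import Optional

HOST_ID_KEY = "neutron:host_id"  # external_ids key named in this request


def chassis_for_fip(lsp: dict) -> Optional[str]:
    """Return the host an announcer could map this LSP to, or None if unknown."""
    return lsp.get("external_ids", {}).get(HOST_ID_KEY) or None


def update_lsp(lsp: dict, admin_state_up: bool, keep_host_id: bool) -> dict:
    """Hypothetical model of an LSP update for a Neutron port.

    keep_host_id=False mimics the reported behavior (host mapping lost when
    the port is disabled); keep_host_id=True mimics the requested behavior.
    """
    external_ids = dict(lsp.get("external_ids", {}))
    if not admin_state_up and not keep_host_id:
        external_ids.pop(HOST_ID_KEY, None)
    return {**lsp, "enabled": admin_state_up, "external_ids": external_ids}


# Hypothetical VIP reservation port that currently maps to host "compute-0".
vip_lsp = {
    "name": "vip-port",
    "enabled": True,
    "external_ids": {HOST_ID_KEY: "compute-0"},
}

reported = update_lsp(vip_lsp, admin_state_up=False, keep_host_id=False)
requested = update_lsp(vip_lsp, admin_state_up=False, keep_host_id=True)

print(chassis_for_fip(reported))   # host mapping lost, nothing to announce from
print(chassis_for_fip(requested))  # host mapping preserved
```

In both cases the port stays disabled. The only difference is whether the host mapping survives the update, which is the property this request asks to preserve.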

       

      Business justification 

      • Customers using RHOSO with OVN and ovn-bgp-agent expect zero-touch HA during amphora VRRP failover. Losing FIP reachability on failover defeats the LB’s HA purpose and causes visible production outages. This degrades trust in RHOSO/OVN for external traffic handling and forces ops teams into disruptive “bounce the LB/amphora” playbooks to restore reachability.
      • The underlying behaviors are well-documented: Octavia’s VIP reservation port is intentionally created admin_state_up: False (historically to keep it unbound and avoid DVR bottlenecks), and Amphora VRRP depends on allowed-address-pairs. With OVN/DFIP + ovn-bgp-agent, disabling the VIP port leads OVN to treat the LSP as disabled (ingress/egress dropped) and to lose host mapping, which breaks FIP advertising. Red Hat shipping a supported mitigation reduces escalation load and aligns RHOSO with current OVN + ovn-bgp-agent deployments.

      Functional requirements

      1. On the Neutron side, ensure that the OVN ML2 driver retains neutron:host_id for disabled ports that back Octavia VIP reservations (or otherwise preserves the chassis mapping used by ovn-bgp-agent) so that DFIP announcements survive VRRP failover.
      2. Add a tempest/CI scenario for RHOSO that:
        A. Creates an Amphora LB with a VIP+FIP (OVN+DFIP+ovn-bgp-agent).
        B. Triggers a VRRP failover.
        C. Verifies FIP reachability within a bounded time without recreating amphorae.
      3. Update documentation to explain Octavia VIP reservation semantics (VIP port disabled vs base VRRP ports), OVN LSP enabled behavior, and ovn-bgp-agent reliance on neutron:host_id, and publish clear deployment guidance for OVN + DFIP + ovn-bgp-agent with Octavia Amphora.
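Step C of the proposed CI scenario (bounded-time reachability) could be checked along the following lines. This is a minimal sketch, not an existing tempest helper: `wait_until_reachable`, `tcp_probe`, and `FakeClock` are hypothetical names, and a plain TCP connect is assumed to be an acceptable reachability signal for the FIP.

```python
import socket
import time
from typing import Callable, Optional


def tcp_probe(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def wait_until_reachable(
    probe: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll probe() until it succeeds; return elapsed seconds, or None on timeout."""
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(interval_s)


class FakeClock:
    """Deterministic clock so the polling logic can be exercised without waiting."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


# Simulated failover: the FIP answers again on the third probe.
fake = FakeClock()
answers = iter([False, False, True])
recovered_after = wait_until_reachable(
    lambda: next(answers), timeout_s=10, interval_s=1,
    clock=fake.time, sleep=fake.sleep,
)

# Simulated regression: the FIP never comes back within the bound.
fake2 = FakeClock()
never = wait_until_reachable(
    lambda: False, timeout_s=3, interval_s=1,
    clock=fake2.time, sleep=fake2.sleep,
)
```

In a real scenario the probe would be something like `lambda: tcp_probe(fip_address, listener_port)`, with both values taken from the test's own fixtures, and the test would fail if the result is None or exceeds the agreed bound.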

      Describe the customer impact

      During amphora VRRP failover, external traffic to the LB VIP via the FIP is blackholed even though the LB is otherwise healthy. Internal VIP traffic keeps working, and a full amphora/LB restart restores reachability, but the HA promise for internet-facing services is broken at the exact moment it’s needed. This impacts availability SLAs and can force emergency operational workarounds (failing over entire LBs or recreating amphorae), prolonging incidents. Upstream maintainers confirm the current design trade-offs (disabled VIP reservation port) and are discussing whether Neutron/OVN behavior has changed enough to warrant Octavia changes; Red Hat needs a supported path for customers now.

      Point of contact

      Additional links

      https://bugs.launchpad.net/octavia/%2Bbug/2125573

       

              michjohn@redhat.com Michael Johnson
              gprocuni@redhat.com Greg Procunier
              Votes: 2
              Watchers: 2