  1. Fast Datapath Product
  2. FDP-807

Persistent DB replica for ovs idl

    • Story
    • Resolution: Unresolved
    • Normal
    • None
    • None
    • openvswitch3.1
    • False
    • False

      Given the ovn-controller is connected to the Southbound database (SB DB) and the OVS IDL has an in-memory replica of the database,

      When the SB DB becomes unavailable and OVS IDL detects a disconnection,

      Then, the OVS IDL should persist the current in-memory replica of the database to a local file on disk.

      Given the SB DB is still unavailable,

      When the ovn-controller is restarted,

      Then, OVS IDL should load the previously saved database replica from the local file on disk and make it available to ovn-controller. This will allow the ovn-controller to continue handling requests based on the last known state.
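
      The two scenarios above can be sketched as a small simulation. This is only an illustration under stated assumptions: `ReplicaHolder`, its method names, the JSON file format, and the file name are all hypothetical and are not part of the OVS IDL API; the row data is made up.

```python
import json
import os
import tempfile


class ReplicaHolder:
    """Hypothetical stand-in for an IDL client; NOT the OVS IDL API."""

    def __init__(self, path):
        self.path = path        # local file that holds the persisted replica
        self.replica = {}       # table name -> {row uuid -> column values}
        self.connected = False

    def on_update(self, table, uuid, row):
        # Normal operation: updates from the server land in the in-memory replica.
        self.connected = True
        self.replica.setdefault(table, {})[uuid] = row

    def on_disconnect(self):
        # Scenario 1: the connection drops, so persist the current replica.
        self.connected = False
        with open(self.path, "w") as f:
            json.dump(self.replica, f)

    def start(self, db_available):
        # Scenario 2: on restart with the database still down,
        # fall back to the last persisted replica if one exists.
        if not db_available and os.path.exists(self.path):
            with open(self.path) as f:
                self.replica = json.load(f)
        return self.replica


# Illustrative run: one process loses the connection, a new one starts later.
path = os.path.join(tempfile.mkdtemp(), "sb-replica.json")  # assumed file name
before = ReplicaHolder(path)
before.on_update("Port_Binding", "uuid-1", {"logical_port": "vm1"})
before.on_disconnect()

after = ReplicaHolder(path)              # simulates the restarted process
restored = after.start(db_available=False)
```

      In this sketch the restarted process ends up with the same rows the previous process held at disconnect time, which is the behaviour the criteria ask for. How a real implementation would detect the disconnection or serialize rows is left open here.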

    • rhel-sst-network-fastdatapath
    • ssg_networking

      I would like to propose a feature for ovn-controller that covers the case where it loses connectivity to the database. Currently, if the ovn-controller process disconnects from the database, it can still handle upcalls in pinctrl because it holds a local DB replica in memory. However, if the ovn-controller service is restarted while the SB DB is still down, the replica is lost and ovn-controller can no longer handle requests such as DHCP renewals.

      A potential solution would be to store the DB blob in a file on the local filesystem. On restart, the content would be loaded from that file to reconstruct the local replica.
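
      One way the file handling could look is sketched below. It assumes the replica can be serialized as JSON and uses a write-to-temporary-file-then-rename pattern so that a crash during the save does not leave a truncated file behind. The function names, the file name, and the row data are hypothetical; none of this is existing OVS or OVN code.

```python
import json
import os
import tempfile


def save_replica(replica, path):
    """Write the replica to `path` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename below
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".replica-")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(replica, f)
            f.flush()
            os.fsync(f.fileno())     # push the data to disk before renaming
        os.replace(tmp_path, path)   # replaces any previous copy in one step
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)      # do not leave temporary files behind
        raise


def load_replica(path):
    """Return the saved replica, or None if it is missing or unreadable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return None


# Illustrative round trip with made-up data.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "sb-replica.json")  # assumed file name
save_replica({"Port_Binding": {"uuid-1": {"logical_port": "vm1"}}}, path)
loaded = load_replica(path)
```

      Returning None for a missing or corrupt file lets the caller start with an empty replica, which matches today's behaviour after a restart.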

      This won't solve the issue of stale data, but it will keep the state of ovn-controller the same as it was prior to the restart.

      This request comes from an OpenStack customer escalation in which ovn-controller was restarted (caused by a bug in the OpenStack installer) during an upgrade procedure while the OVN DBs were not yet up. It led to a long outage for the customer's workloads because the instances lost their IP assignments.

      Customer case for reference: https://access.redhat.com/support/cases/#/case/03877136

              ovsdpdk-triage ovsdpdk triage
              jlibosva Jakub Libosvar