Fast Datapath Product / FDP-144

[OVSDB] [RAFT] Inactivity interval too low while getting initial snapshot / joining the cluster


    • Bug
    • Resolution: Unresolved
    • openvswitch3.3

      The inactivity probe in RAFT currently depends on the election timer. However, a server that is just joining a cluster does not know that cluster's desired election timer, because the election timer value is stored as part of the database. If the cluster is loaded heavily enough that it cannot send the full initial snapshot in a short time, the new server may disconnect before it learns the desired election timer value for that cluster.
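
To make the failure mode concrete, here is a minimal Python model of it. It is only a sketch and not the OVS code: the rule that the probe interval equals the election timer, the 1000 ms fallback, and the names `probe_interval_ms` and `joiner_disconnects` are all assumptions made for illustration.

```python
# Assumed fallback used by a joiner before it learns the cluster's real value.
DEFAULT_ELECTION_TIMER_MS = 1000


def probe_interval_ms(election_timer_ms: int) -> int:
    """Hypothetical rule: the probe interval simply tracks the election timer."""
    return election_timer_ms


def joiner_disconnects(snapshot_transfer_ms: int, known_election_timer_ms: int) -> bool:
    """Return True if the joiner would give up before the snapshot arrives.

    Simplification: nothing else arrives during the transfer, so the
    connection looks idle for the whole snapshot_transfer_ms.
    """
    return snapshot_transfer_ms > probe_interval_ms(known_election_timer_ms)


# The cluster uses a 16 s election timer, but the joiner does not know that
# yet and falls back to the default.
cluster_timer_ms = 16000
transfer_ms = 5000

print(joiner_disconnects(transfer_ms, DEFAULT_ELECTION_TIMER_MS))  # joiner's view
print(joiner_disconnects(transfer_ms, cluster_timer_ms))           # with the real value
```

With these made-up numbers the joiner drops the connection even though the same transfer would fit comfortably inside the cluster's configured timer.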

      A potential solution is to disable the inactivity probe until the data is updated from the cluster. This has to be done carefully, though, to avoid waiting on a dead connection indefinitely.
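
One way this could look, sketched in Python under stated assumptions: the probe is ignored while the election timer is unknown, and a separate overall join deadline keeps a dead connection from being held forever. The class `JoinProbePolicy`, the `join_deadline_ms` parameter, and the method names are hypothetical and do not correspond to anything in the OVS sources.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JoinProbePolicy:
    """Hypothetical policy for a server that is still joining a cluster."""

    join_deadline_ms: int                    # assumed hard cap on the whole join
    election_timer_ms: Optional[int] = None  # unknown until cluster data arrives

    def on_cluster_data(self, election_timer_ms: int) -> None:
        # Called once the joiner has learned the cluster's election timer.
        self.election_timer_ms = election_timer_ms

    def should_disconnect(self, idle_ms: int, since_join_start_ms: int) -> bool:
        if self.election_timer_ms is None:
            # Probe disabled: idleness alone never drops the connection,
            # but the overall deadline still bounds the wait.
            return since_join_start_ms > self.join_deadline_ms
        # Normal operation: back to the timer-based probe.
        return idle_ms > self.election_timer_ms


policy = JoinProbePolicy(join_deadline_ms=60000)
print(policy.should_disconnect(idle_ms=5000, since_join_start_ms=5000))
print(policy.should_disconnect(idle_ms=5000, since_join_start_ms=61000))
```

The deadline is what addresses the "dead connection" concern. How large it should be is left open here.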

      An alternative might be to check whether the election timer value can be communicated before the bulk of the database data, for example by sending a small, empty update that carries only the cluster-related information.
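
The ordering idea can be sketched as follows. This is an illustration only: the message shapes, the `join_messages` helper, and the notion of splitting the snapshot into chunks are assumptions, not the actual RAFT RPC format used by OVSDB.

```python
from typing import Any, Dict, Iterator, List


def join_messages(election_timer_ms: int,
                  snapshot_chunks: List[bytes]) -> Iterator[Dict[str, Any]]:
    """Yield messages for a joining server, cluster metadata first."""
    # Small update carrying only cluster-related information, no database data.
    yield {"type": "cluster_info", "election_timer_ms": election_timer_ms}
    # The bulk of the database follows afterwards.
    for index, chunk in enumerate(snapshot_chunks):
        yield {"type": "snapshot_chunk", "index": index, "data": chunk}


messages = list(join_messages(16000, [b"part-0", b"part-1"]))
print(messages[0]["type"])
```

Because the first message is tiny, the joiner could adopt the cluster's election timer before the expensive transfer starts.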

            imaximet@redhat.com Ilya Maximets
            twilson@redhat.com Terry Wilson