Infinispan / ISPN-13220

Initial server list switch should increment topology age


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • 13.0.0.Final
    • 9.4.23.Final, 11.0.11.Final, 12.1.7.Final, 13.0.0.Final
    • Hot Rod
    • None

      When ChannelFactory switches to an alternative cluster after all the servers are marked as failed (or, in older versions, once max-retries attempts have failed with a transport error), it increments topologyAge to prevent concurrent switching:

      1. Other threads deciding to switch clusters have an older age and give up.
      2. Topology updates from the old cluster also have an older age (the age sent in the Hot Rod request header), so they are ignored.
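The two checks above can be sketched as a small age-guarded holder. This is only an illustration of the idea, not the actual ChannelFactory code: the class name `TopologyAgeSketch` and its methods are hypothetical, and server addresses are plain strings for brevity.

```java
import java.util.List;

// Hypothetical sketch of an age-guarded server list (not the real ChannelFactory).
final class TopologyAgeSketch {
    private int topologyAge;        // bumped on every cluster switch
    private List<String> servers;   // current server list

    TopologyAgeSketch(List<String> initialServers) {
        this.servers = List.copyOf(initialServers);
    }

    synchronized int currentAge() {
        return topologyAge;
    }

    synchronized List<String> servers() {
        return servers;
    }

    // ageSeen is the age the caller observed when it decided to switch.
    // A caller holding an older age lost the race and gives up.
    synchronized boolean trySwitchCluster(int ageSeen, List<String> newServers) {
        if (ageSeen < topologyAge) {
            return false;
        }
        topologyAge++;
        servers = List.copyOf(newServers);
        return true;
    }

    // requestAge is the age that was sent with the request whose response
    // carried this update. Updates tagged with an older age are ignored.
    synchronized boolean applyTopologyUpdate(int requestAge, List<String> newServers) {
        if (requestAge < topologyAge) {
            return false;
        }
        servers = List.copyOf(newServers);
        return true;
    }
}
```

In this sketch, two callers that both observed age 0 cannot both switch: the first one bumps the age to 1, and the second one is rejected because its observed age is now older.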

      Switching to the initial server list is very similar to switching to another cluster, but it does not currently increment topologyAge. As a result, multiple threads can attempt the switch in parallel, and a late switch can revert the latest topology update received from the servers.
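The race can be illustrated with a simplified model. The class below is a hypothetical sketch, not Infinispan code: `InitialListSwitchSketch`, its methods, and the `incrementAgeOnReset` flag are all assumptions made for illustration. The flag toggles between the behaviour described in this issue (`false`) and the proposed behaviour (`true`).

```java
import java.util.List;

// Hypothetical model of resetting to the initial server list, with and
// without the age increment requested by this issue.
final class InitialListSwitchSketch {
    private final List<String> initialServers;
    private final boolean incrementAgeOnReset; // false models the reported behaviour
    private int topologyAge;
    private List<String> servers;

    InitialListSwitchSketch(List<String> initialServers, boolean incrementAgeOnReset) {
        this.initialServers = List.copyOf(initialServers);
        this.incrementAgeOnReset = incrementAgeOnReset;
        this.servers = this.initialServers;
    }

    synchronized int currentAge() {
        return topologyAge;
    }

    synchronized List<String> servers() {
        return servers;
    }

    // ageSeen is the age the caller observed when it decided to reset.
    synchronized boolean resetToInitialServers(int ageSeen) {
        if (ageSeen < topologyAge) {
            return false; // another thread already switched
        }
        if (incrementAgeOnReset) {
            topologyAge++; // makes every concurrent decision stale
        }
        servers = initialServers;
        return true;
    }

    // Updates tagged with an older age than the current one are ignored.
    synchronized boolean applyTopologyUpdate(int requestAge, List<String> newServers) {
        if (requestAge < topologyAge) {
            return false;
        }
        servers = List.copyOf(newServers);
        return true;
    }
}
```

Suppose threads A and B both observe age 0, A resets, and the servers then send a topology update. With `incrementAgeOnReset` set to `false`, B's late reset is still accepted and discards that update. With it set to `true`, B's reset is rejected and the updated server list is kept.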

      The impact is higher in versions that do not include the ISPN-12598 fix, because the repeated switching and closing of connections is enough to get one operation to exhaust max-retries and switch to the initial server list again.

            dberinde@redhat.com Dan Berindei (Inactive)