Uploaded image for project: 'Red Hat Data Grid'
  1. Red Hat Data Grid
  2. JDG-4860

[8.3] Hot Rod java client retries too many times


      The Java Hot Rod client has a maxRetries configuration option which tells it how many times to retry an operation after a failure (default: 10).

      When the number of retries is exceeded, the client does not fail immediately: instead, it tries to switch to another site, and tries maxRetries times on the new site as well. The client doesn't keep track of the clusters it switched off of, so it seems possible to go in an infinite loop, switching from one site to the next.

      If the client cannot switch to another site (e.g. because it was configured with a single site), it logs a debug message (`Cluster might have completely shut down, try resetting transport layer and topology id`) and tries the current site again for maxRetries times. So the actual number of retries with a single site is 2 * maxRetries (or 1, if maxRetries == 0).

      Maybe automatic site switching is a good idea in some cases, but I'm not convinced it should be the default behaviour. At the very least, site switching should be decided at the remote cache manager level, when the client fails to open a new connection to any server in the current site, and not based on the number of retries done for any particular operation.

            dberinde@redhat.com Dan Berindei (Inactive)
            dberinde@redhat.com Dan Berindei (Inactive)
            0 Vote for this issue
            1 Start watching this issue