Type: Task
Resolution: Done
Priority: Minor
Fix Version/s: 8.2.1.Final, 9.0.0.Final
Affects Version/s: None
Component/s: Core, Server, Test Suite
Labels: None

Git Pull Request: https://github.com/infinispan/infinispan/pull/4188, https://github.com/infinispan/infinispan/pull/4199

GMS.join_timeout is used by JGroups for two purposes:

1. Wait for FIND_INITIAL_MBRS responses. If other nodes are running, but they don't answer within join_timeout ms, the node will start a new partition by itself.
   If no other nodes are running when the request is sent, but another node starts and sends its own discovery request within join_timeout, the initial cluster view will contain both nodes, but this isn't really useful in Infinispan (we have gcb.transport().initialClusterSize() instead).
2. Once a coordinator is located, the node sends a join request and waits for a response for join_timeout ms. After a timeout, the node re-sends the join request (up to a maximum of max_join_attempts, which defaults to 10).
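To make the two waits concrete, here is a rough sketch of the timing they imply. It is a simplified model built only from the description above, not JGroups code: the class and method names are hypothetical, and it assumes each join attempt waits the full join_timeout before being re-sent.

```java
import java.time.Duration;

/** Simplified, illustrative model of the two waits governed by GMS.join_timeout. */
final class JoinTimeoutModel {

    /** Purpose 1: with no discovery responses, the node waits one join_timeout before starting alone. */
    static Duration discoveryWait(long joinTimeoutMs) {
        return Duration.ofMillis(joinTimeoutMs);
    }

    /** Purpose 2: upper bound on waiting for a join response, assuming every attempt times out. */
    static Duration maxJoinWait(long joinTimeoutMs, int maxJoinAttempts) {
        return Duration.ofMillis(joinTimeoutMs).multipliedBy(maxJoinAttempts);
    }

    public static void main(String[] args) {
        int maxJoinAttempts = 10; // default quoted in the description
        for (long joinTimeoutMs : new long[] {15_000L, 2_000L}) {
            System.out.printf(
                "join_timeout=%d ms: first node waits %d s, join retries bounded by %d s%n",
                joinTimeoutMs,
                discoveryWait(joinTimeoutMs).getSeconds(),
                maxJoinWait(joinTimeoutMs, maxJoinAttempts).getSeconds());
        }
    }
}
```

Under this model the first node of a cluster always pays one full join_timeout, which is the delay this issue is about.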

The default GMS.join_timeout in Infinispan is 15000 ms, vs. 2000 ms in JGroups (actually 3000 ms in GMS itself, but 2000 ms in the example configurations).

The higher timeout only helps us when a node is running but inaccessible (e.g. because of a long GC pause) at the exact time another node is joining. I'd argue that applications that can tolerate multi-second pauses would be better served by gcb.transport().initialClusterSize(2) and/or an external discovery mechanism (e.g. FILE_PING, or something based on the WildFly domain controller). For most applications, the current default just means a 15s delay every time the cluster is (re)started.

In particular, because our integration tests use the default configuration, it means a delay of 15s for every test that starts a cluster.
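As a back-of-the-envelope illustration of the test suite cost, the sketch below multiplies the per-start delay by the number of cluster starts. The count of 200 cluster-starting tests is a made-up figure for illustration, not a measurement of the Infinispan test suite, and the class name is hypothetical.

```java
/** Rough estimate of the time a test suite spends waiting for join_timeout. */
final class StartupDelayEstimate {

    /** Each cluster start is assumed to cost one full join_timeout for the first node. */
    static long totalDelaySeconds(int clusterStarts, long joinTimeoutMs) {
        return clusterStarts * joinTimeoutMs / 1000;
    }

    public static void main(String[] args) {
        int clusterStarts = 200; // hypothetical number of tests that start a cluster

        long withInfinispanDefault = totalDelaySeconds(clusterStarts, 15_000L);
        long withJGroupsExample = totalDelaySeconds(clusterStarts, 2_000L);

        System.out.println("join_timeout=15000 ms: " + withInfinispanDefault + " s");
        System.out.println("join_timeout=2000 ms:  " + withJGroupsExample + " s");
        System.out.println("saved: " + (withInfinispanDefault - withJGroupsExample) + " s");
    }
}
```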

is related to: WFLY-1066 Automatic configuration of 'Initial_hosts' for a cluster using JGroups TCP-stack in domain mode (aka DOMAIN_PING) (Open)

Assignee: Dan Berindei (Inactive)
Reporter: Dan Berindei (Inactive)
Archiver: Amol Dongare
Created: 2016/03/18 9:22 AM
Updated: 2019/11/22 4:36 AM
Resolved: 2016/03/31 4:11 AM
Archived: 2024/11/28 6:21 AM
