Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: JDG 7.3.1 ER2
Affects Version/s: JDG 7.2.3 GA, JDG 7.3 ER3
Component/s: Clustering
Labels: None

Affects: Release Notes
CDW devel_ack:
CDW docs_ack:
CDW pm_ack:
CDW qa_ack:
CDW release:
Target Release: JDG 7.3.1 GA
Workaround: Workaround Exists
Workaround Description: Increase the number of segments configured for the cache.
Git Pull Request: https://github.com/infinispan/infinispan/pull/6704
Steps to Reproduce:
1. Set segments to 1 and add the machine, rack, and site settings to the transport.

<distributed-cache name="default" segments="1" />
...
<stack name="udp">
    <transport type="UDP" socket-binding="jgroups-udp" machine="${jboss.jgroups.transport.machine:machine1}" rack="${jboss.jgroups.transport.rack:rack1}" site="${jboss.jgroups.transport.site:site1}" />
</stack>

2. Start 3 nodes.
3. The 3rd node fails with a replication timeout once the state transfer timeout expires.

The log and clustered.xml are attached as logs.zip.

Sprint: JDG Sprint #25

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

When a cache is configured with a small number of segments and server hinting is used, a node cannot start and fails with the following error [1].
This can be reproduced with RHDG 7.2.3 and 7.3 ER2.

[1]

ERROR [org.jboss.msc.service.fail] (MSC service thread 1-4) MSC000001: Failed to start service jboss.datagrid-infinispan.clustered.test: org.jboss.msc.service.StartException in service jboss.datagrid-infinispan.clustered.test: Failed to start service
...
Caused by: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.start() throws java.lang.Exception on object of type StateTransferManagerImpl
...
Caused by: org.infinispan.util.concurrent.TimeoutException: Replication timeout for svr01 (flags=0), site-id=site1, rack-id=rack1, machine-id=machine1)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:916)
...

For example, with the following settings the 3rd node fails to start in a 3-node cluster.
When segments is set to 20 (the 6.6.2 default), the 6th node fails to start with the same timeout.
Nodes seem unable to finish the initial state transfer, and startup fails, if the number of segments is too small relative to the number of nodes:

<distributed-cache name="default" segments="1" />
...
<stack name="udp">
    <transport type="UDP" socket-binding="jgroups-udp" machine="${jboss.jgroups.transport.machine:machine1}" rack="${jboss.jgroups.transport.rack:rack1}" site="${jboss.jgroups.transport.site:site1}" />
</stack>
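
The workaround recorded on this issue is to increase the number of segments. A minimal sketch of that change, assuming the same cache definition as in the reproducer; the value 256 is an illustrative assumption, not a value taken from this report, and would need to be sized against the expected number of nodes:

```xml
<!-- Assumed value for illustration only: raise segments well above the node count -->
<distributed-cache name="default" segments="256" />
```

The transport settings (machine, rack, site) stay as they are; only the segments attribute differs from the reproducer.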

Attachments:
logs.zip (16 kB, 2019/01/29 12:39 AM, Hiroki Daicho)
reproducer.zip (150 kB, 2019/02/04 3:53 AM, Hiroki Daicho)

Issue Links:
is cloned by ISPN-9908 Cache startup failure with server hinting and insufficient segments (Closed)

Assignee: Dan Berindei (Inactive)
Reporter: Hiroki Daicho (Inactive)
Votes: 0
Watchers: 4
Created: 2019/01/29 12:31 AM
Updated: 2023/05/08 4:49 PM
Resolved: 2019/03/08 6:14 AM
