-
Bug
-
Resolution: Done
-
Major
-
7.4.2.GA
-
None
-
False
-
False
Environment:
- JBoss EAP 7.2
- 8-node cluster
Issue:
They started the cluster nodes one by one and hit "ISPN000073: Unexpected error while replicating", caused by the following ConcurrentModificationException on node3:
2021-10-18 20:20:16,081 ERROR [org.infinispan.remoting.rpc.RpcManagerImpl] (transport-thread--p6-t17) ISPN000073: Unexpected error while replicating: java.util.ConcurrentModificationException
    at java.util.HashMap$HashIterator.nextNode(HashMap.java:1445)
    at java.util.HashMap$EntryIterator.next(HashMap.java:1479)
    at java.util.HashMap$EntryIterator.next(HashMap.java:1477)
    at org.wildfly.clustering.marshalling.spi.util.MapExternalizer.writeMap(MapExternalizer.java:56)
    at org.wildfly.clustering.marshalling.spi.util.MapExternalizer.writeObject(MapExternalizer.java:51)
    at org.wildfly.clustering.marshalling.spi.util.MapExternalizer.writeObject(MapExternalizer.java:38)
    at org.wildfly.clustering.marshalling.spi.DefaultExternalizer.writeObject(DefaultExternalizer.java:179)
    at org.wildfly.clustering.marshalling.jboss.ExternalizerObjectTable$ExternalizerWriter.writeObject(ExternalizerObjectTable.java:126)
    at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:137)
    at org.jboss.marshalling.AbstractObjectOutput.writeObject(AbstractObjectOutput.java:58)
    at org.jboss.marshalling.AbstractMarshaller.writeObject(AbstractMarshaller.java:111)
    at org.wildfly.clustering.marshalling.jboss.SimpleMarshalledValue.getBytes(SimpleMarshalledValue.java:78)
    at org.wildfly.clustering.marshalling.jboss.SimpleMarshalledValueExternalizer.writeObject(SimpleMarshalledValueExternalizer.java:51)
    at org.wildfly.clustering.marshalling.jboss.SimpleMarshalledValueExternalizer.writeObject(SimpleMarshalledValueExternalizer.java:36)
    at org.wildfly.clustering.marshalling.infinispan.AdvancedExternalizerAdapter.writeObject(AdvancedExternalizerAdapter.java:51)
    at org.infinispan.marshall.core.ExternalExternalizers$ForeignAdvancedExternalizer.writeObject(ExternalExternalizers.java:78)
    at org.infinispan.marshall.core.GlobalMarshaller.writeExternal(GlobalMarshaller.java:652)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNonNullableObject(GlobalMarshaller.java:406)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNullableObject(GlobalMarshaller.java:355)
    at org.infinispan.marshall.core.BytesObjectOutput.writeObject(BytesObjectOutput.java:26)
    at org.infinispan.container.entries.ImmortalCacheEntry$Externalizer.writeObject(ImmortalCacheEntry.java:126)
    at org.infinispan.container.entries.ImmortalCacheEntry$Externalizer.writeObject(ImmortalCacheEntry.java:122)
    at org.infinispan.marshall.core.GlobalMarshaller.writeInternal(GlobalMarshaller.java:638)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNonNullableObject(GlobalMarshaller.java:402)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNullableObject(GlobalMarshaller.java:355)
    at org.infinispan.marshall.core.BytesObjectOutput.writeObject(BytesObjectOutput.java:26)
    at org.infinispan.commons.marshall.MarshallUtil.marshallCollection(MarshallUtil.java:243)
    at org.infinispan.commons.marshall.MarshallUtil.marshallCollection(MarshallUtil.java:221)
    at org.infinispan.marshall.exts.CollectionExternalizer.writeObject(CollectionExternalizer.java:75)
    at org.infinispan.marshall.exts.CollectionExternalizer.writeObject(CollectionExternalizer.java:27)
    at org.infinispan.marshall.core.GlobalMarshaller.writeInternal(GlobalMarshaller.java:638)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNonNullableObject(GlobalMarshaller.java:402)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNullableObject(GlobalMarshaller.java:355)
    at org.infinispan.marshall.core.BytesObjectOutput.writeObject(BytesObjectOutput.java:26)
    at org.infinispan.statetransfer.StateChunk$Externalizer.writeObject(StateChunk.java:80)
    at org.infinispan.statetransfer.StateChunk$Externalizer.writeObject(StateChunk.java:65)
    at org.infinispan.marshall.core.GlobalMarshaller.writeInternal(GlobalMarshaller.java:638)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNonNullableObject(GlobalMarshaller.java:402)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNullableObject(GlobalMarshaller.java:355)
    at org.infinispan.marshall.core.BytesObjectOutput.writeObject(BytesObjectOutput.java:26)
    at org.infinispan.commons.marshall.MarshallUtil.marshallCollection(MarshallUtil.java:243)
    at org.infinispan.commons.marshall.MarshallUtil.marshallCollection(MarshallUtil.java:221)
    at org.infinispan.statetransfer.StateResponseCommand.writeTo(StateResponseCommand.java:131)
    at org.infinispan.marshall.exts.ReplicableCommandExternalizer.writeCommandParameters(ReplicableCommandExternalizer.java:71)
    at org.infinispan.marshall.exts.CacheRpcCommandExternalizer.marshallParameters(CacheRpcCommandExternalizer.java:121)
    at org.infinispan.marshall.exts.CacheRpcCommandExternalizer.writeObject(CacheRpcCommandExternalizer.java:117)
    at org.infinispan.marshall.exts.CacheRpcCommandExternalizer.writeObject(CacheRpcCommandExternalizer.java:67)
    at org.infinispan.marshall.core.GlobalMarshaller.writeInternal(GlobalMarshaller.java:638)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNonNullableObject(GlobalMarshaller.java:402)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNullableObject(GlobalMarshaller.java:355)
    at org.infinispan.marshall.core.GlobalMarshaller.writeObjectOutput(GlobalMarshaller.java:183)
    at org.infinispan.marshall.core.GlobalMarshaller.writeObjectOutput(GlobalMarshaller.java:176)
    at org.infinispan.marshall.core.GlobalMarshaller.objectToBuffer(GlobalMarshaller.java:305)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.marshallRequest(JGroupsTransport.java:1007)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.sendCommand(JGroupsTransport.java:990)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeCommand(JGroupsTransport.java:826)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.performSyncRemoteInvocation(JGroupsTransport.java:1120)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotelyAsync(JGroupsTransport.java:249)
    at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotelyAsync(RpcManagerImpl.java:291)
    at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:323)
    at org.infinispan.statetransfer.OutboundTransferTask.sendEntries(OutboundTransferTask.java:266)
    at org.infinispan.statetransfer.OutboundTransferTask.run(OutboundTransferTask.java:205)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread(ClassLoaderThreadFactory.java:47)
    at java.lang.Thread.run(Thread.java:748)
Caused by: an exception which occurred:
    in object java.util.HashMap@7f3d222
What can cause this ConcurrentModificationException? How can we resolve or avoid this?
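One possible cause, offered as an assumption rather than a diagnosis: the trace shows MapExternalizer.writeMap iterating a java.util.HashMap on a transport thread, and HashMap iterators are fail-fast, so a structural change to the same map (for example by application code on a request thread) during that iteration would raise this exception. The sketch below reproduces that pattern in isolation. `writeMap` and `failsWithCme` are hypothetical helpers, not WildFly or Infinispan APIs, and the mutation is injected through a callback instead of a second thread so that the outcome is deterministic.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MapMarshallingDemo {

    // Hypothetical stand-in for a map serializer: writes the size, then each entry.
    // The callback runs after every entry and plays the role of another thread
    // touching the map while it is being written.
    static void writeMap(DataOutputStream out, Map<String, String> map, Runnable afterEachEntry)
            throws IOException {
        out.writeInt(map.size());
        for (Map.Entry<String, String> entry : map.entrySet()) {
            out.writeUTF(entry.getKey());
            out.writeUTF(entry.getValue());
            afterEachEntry.run();
        }
    }

    // Returns true if writing the given (initially empty) map fails with
    // ConcurrentModificationException when one new key is added mid-iteration.
    static boolean failsWithCme(Map<String, String> map) {
        map.put("a", "1");
        map.put("b", "2");
        map.put("c", "3");
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        boolean[] mutated = {false};
        try {
            writeMap(out, map, () -> {
                if (!mutated[0]) {
                    mutated[0] = true;
                    map.put("late", "x"); // structural modification during iteration
                }
            });
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap fails: " + failsWithCme(new HashMap<>()));
        System.out.println("ConcurrentHashMap fails: " + failsWithCme(new ConcurrentHashMap<>()));
    }
}
```

If the map turns out to be an application-owned object stored in the session, switching it to a ConcurrentHashMap or storing an immutable copy are two options to consider. Whether that applies here depends on where HashMap@7f3d222 comes from, which the trace alone does not show.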
In addition, node4 failed to start (that is, the application failed to deploy) with "Initial state transfer timed out" after the above error:
2021-10-18 20:24:15,951 ERROR [org.jboss.msc.service.fail] (ServerService Thread Pool -- 83) MSC000001: Failed to start service org.wildfly.clustering.infinispan.cache.web.example.war: org.jboss.msc.service.StartException in service org.wildfly.clustering.infinispan.cache.web.example.war: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.Exception on object of type StateTransferManagerImpl
    at org.wildfly.clustering.service.FunctionalService.start(FunctionalService.java:70)
    at org.wildfly.clustering.service.AsyncServiceConfigurator$AsyncService.lambda$start(AsyncServiceConfigurator.java:117)
    at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
    at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1985)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1487)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1378)
    at java.lang.Thread.run(Thread.java:748)
    at org.jboss.threads.JBossThread.run(JBossThread.java:485)
Caused by: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.Exception on object of type StateTransferManagerImpl
    at org.infinispan.commons.util.SecurityActions.lambda$invokeAccessibly(SecurityActions.java:83)
    at org.infinispan.commons.util.SecurityActions.doPrivileged(SecurityActions.java:71)
    at org.infinispan.commons.util.SecurityActions.invokeAccessibly(SecurityActions.java:76)
    at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:185)
    at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:968)
    at org.infinispan.factories.AbstractComponentRegistry.lambda$invokePrioritizedMethods(AbstractComponentRegistry.java:703)
    at org.infinispan.factories.SecurityActions.lambda$run(SecurityActions.java:72)
    at org.infinispan.security.Security.doPrivileged(Security.java:44)
    at org.infinispan.factories.SecurityActions.run(SecurityActions.java:71)
    at org.infinispan.factories.AbstractComponentRegistry.invokePrioritizedMethods(AbstractComponentRegistry.java:696)
    at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:689)
    at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:607)
    at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:244)
    at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:1051)
    at org.infinispan.cache.impl.AbstractDelegatingCache.start(AbstractDelegatingCache.java:421)
    at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:646)
    at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:591)
    at org.infinispan.manager.DefaultCacheManager.internalGetCache(DefaultCacheManager.java:477)
    at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:463)
    at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:449)
    at org.jboss.as.clustering.infinispan.DefaultCacheContainer.getCache(DefaultCacheContainer.java:86)
    at org.wildfly.clustering.infinispan.spi.service.CacheServiceConfigurator.get(CacheServiceConfigurator.java:77)
    at org.wildfly.clustering.infinispan.spi.service.CacheServiceConfigurator.get(CacheServiceConfigurator.java:55)
    at org.wildfly.clustering.service.FunctionalService.start(FunctionalService.java:67)
    ... 7 more
Caused by: org.infinispan.commons.CacheException: Initial state transfer timed out for cache example.war on node14(site-id=null, rack-id=null, machine-id=MACHINE1)
    at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:233)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.infinispan.commons.util.SecurityActions.lambda$invokeAccessibly(SecurityActions.java:79)
    ... 30 more
As far as I know, an initial state transfer timeout happens when another cluster node is unresponsive or slow (e.g. because of a long GC pause or system resource exhaustion). However, the GC log looks healthy around the time of the issue. (We do not yet have data on their system resources, so we have requested a sosreport. We will update this issue once we have the data.)
I'm not sure how "ISPN000073: Unexpected error while replicating: ConcurrentModificationException" relates to the initial state transfer timeout. Could the above ConcurrentModificationException be the cause of the initial state transfer timeout?
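To make the question concrete, here is a purely conceptual sketch of how the two errors could be linked. It is not Infinispan's implementation. It assumes, without confirmation from this report, that a joining node waits a bounded time for state from existing members and that a chunk which fails to marshal on the sender is not delivered. The class and method names are hypothetical.

```java
import java.util.ConcurrentModificationException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class StateTransferTimeoutSketch {

    // Models a joiner waiting up to timeoutMillis for one state chunk.
    // Returns true if the chunk arrived in time, false if the wait timed out.
    static boolean joinerReceivesState(boolean senderFails, long timeoutMillis) {
        CountDownLatch chunkReceived = new CountDownLatch(1);
        ExecutorService sender = Executors.newSingleThreadExecutor();
        try {
            sender.submit(() -> {
                if (senderFails) {
                    // Stand-in for the marshalling failure on the sending node;
                    // the exception stays inside the Future and nothing is sent.
                    throw new ConcurrentModificationException("marshalling failed");
                }
                chunkReceived.countDown(); // chunk delivered to the joiner
            });
            return chunkReceived.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            sender.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("healthy sender: " + joinerReceivesState(false, 2000));
        System.out.println("failing sender: " + joinerReceivesState(true, 200));
    }
}
```

In this model the joiner times out even though no node is slow or paused, which would be consistent with a clean GC log. Whether Infinispan retries or re-requests a failed chunk in this version is exactly what needs to be confirmed.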